
Principal Engineer - Data Layer (SOF05837)

Software Engineering
Bengaluru
About Us
Zycus, recognized by leading analyst firms in procurement technology, empowers teams to unlock deep value through its comprehensive Source-to-Pay (S2P) solutions. At the heart of our S2P solution is the Merlin Agentic Platform, which orchestrates intelligent AI agents to deliver simplified, efficient, and compliant processes.
The Merlin Intake Agent offers business users unparalleled ease of use, increasing adoption rates and significantly reducing non-compliant spending. For procurement teams, the Merlin Autonomous Negotiation Agent handles tail spend autonomously, securing additional savings; the Merlin Contract Agent helps draft compliant contracts and reduces risks by actively monitoring them; and the Merlin AP Agent further enhances efficiency by automating invoice processing with exceptional speed and accuracy.
We Are An Equal Opportunity Employer:
Zycus is committed to providing equal opportunities in employment and creating an inclusive work environment. We do not discriminate against applicants on the basis of race, color, religion, gender, sexual orientation, national origin, age, disability, or any other legally protected characteristic. All hiring decisions will be based solely on qualifications, skills, and experience relevant to the job requirements.
Job Description
We are seeking a hands-on, highly technical Data Professional to lead the architecture, development, and rollout of a next-generation AI-powered Analytical Data Platform built on Lakehouse architecture.
You will not just guide teams — you’ll roll up your sleeves, build proofs of concept, and demonstrate “do-and-show” leadership in Engineering & Product Management to deliver robust, high-performance, secure, and scalable data platforms.
This role is ideal for a leader who thrives at the intersection of deep technical problem-solving, strategic vision, and team empowerment — capable of driving solutions from architecture to production with precision, quality, and speed.

Key Responsibilities: 

  • Architect, Design & Build: Define and implement enterprise-grade analytical data platforms following Lakehouse and Data-as-a-Service (DaaS) principles.
  Hands-On Engineering Leadership:
  • Personally develop and validate key platform components and data flows.
  • Create POCs and “show-by-doing” implementations to accelerate team understanding and delivery.
  • Optimize complex data integrations, transformations, and performance bottlenecks.
  • Lead by example in design, development, and debugging efforts, serving as a "player-coach" for the team.
  Data Platform Expertise:
  • Design data storage and retrieval layers using ClickHouse (preferred), Apache Druid, Apache Doris, or similar OLAP stores, alongside PostgreSQL/MS SQL, MongoDB, and Elasticsearch.
  • Build and optimize data pipelines (Apache SeaTunnel, AWS Glue, etc.) for large-scale, high-volume data processing.
  • Model analytical structures using OLAP principles, semantic layers, and data visualization frameworks such as Apache Superset (preferred) or similar open-source tools.
  • Scalability & Performance: Deliver reliable and performant data systems that handle diverse, massive, and complex datasets.
  • Quality & Consistency: Set and enforce standards for data accuracy, governance, security, and performance.
  • Execution Ownership: Manage Sprints, perform SMART task breakdowns, and ensure on-time, high-quality team deliverables.
  • ClickHouse: Manage and optimize this high-performance OLAP database for real-time analytics on massive datasets. ClickHouse experience is preferred; experience with comparable open-source OLAP databases such as Apache Druid or Apache Doris is also considered.
  • Data Integration: Utilize Apache SeaTunnel (or similar tools) for efficient data ingestion and synchronization across various sources.
  • AI/ML & Agentic AI: Lead the development and integration of AI models, algorithms, and Agentic AI solutions to solve complex business problems and automate processes.
  • Databases: Manage and optimize both PostgreSQL (relational) and NoSQL databases to support diverse data storage needs.
  • Data Visualization/BI: Implement and manage data visualization and exploration tools like Apache Superset to deliver actionable insights to stakeholders.
  • Infrastructure: Oversee deployment and orchestration using technologies such as Docker and Kubernetes and, where applicable to the company's tech stack, MCP (Model Context Protocol) servers.
  • Data Governance & Quality: Ensure robust data governance, integrity, privacy, and security standards are maintained across the platform.
Job Requirements
  • Data Systems & Storage: ClickHouse (Parquet), PostgreSQL, MS SQL Server, MongoDB, Elasticsearch
  • ETL / Data Pipelines: Apache SeaTunnel, AWS Glue, Apache Airflow (preferred), custom ETL frameworks
  • Modeling & Semantics: OLAP modeling, Cube.js, dbt (nice to have)
  • Data Visualization: Apache Superset, Tableau, Metabase, or similar BI tools
  • Architecture Patterns: Lakehouse, Data Mesh, DaaS, Data Governance, Security & Access Control
  • Languages & Tools: Java 21+, Angular, Node.js, Python, SQL, shell scripting
  • Code Repositories & Build: Bitbucket, Git, CI/CD, Docker/Kubernetes (for data workloads)
  • Agentic AI: AI platforms, MCP servers, MCP tools, agentic workflows, AutoGen, LibreChat (or equivalent), etc.
  • Cloud Platforms: AWS, GCP, or Azure (S3, Redshift, BigQuery, etc.)

Why Join Us

  • Work on cutting-edge applications powered by AI-assisted development workflows.
  • Collaborate with a forward-thinking engineering team that values innovation and continuous learning.
  • Access to premium AI development tools to boost productivity and code quality.
  • Competitive compensation, flexible work arrangements, and opportunities for career advancement.