4,088 Data Pipelines jobs in India
Data Pipelines Architect
Posted today
Job Description
We're revolutionizing the future of healthcare analytics by crafting pipelines that are reliable, observable, and continuously improving in production.
This role is fully remote, open to candidates based in Europe or India, with periodic team gatherings.
- Design scalable ETL data pipelines using Python (Pandas, PySpark) and SQL, orchestrated with Airflow to deliver high-quality insights (an illustrative sketch follows this list).
- Develop and maintain a robust SAIVA Data Lake/Lakehouse on AWS, ensuring quality, governance, scalability, and accessibility.
- Run and optimize distributed data processing jobs with Spark on AWS EMR and/or EKS for enhanced performance.
- Implement batch and streaming ingestion frameworks (APIs, databases, files, event streams) to collect and process diverse data sources.
- Enforce validation and quality checks to ensure reliable analytics and ML readiness.
- Monitor and troubleshoot pipelines with CloudWatch, integrating observability tools like Grafana, Prometheus, or Datadog for proactive issue resolution.
- Automate infrastructure provisioning with Terraform, following AWS best practices for efficient deployment.
- Manage SQL Server, PostgreSQL, and Snowflake integrations into the Lakehouse for seamless data exchange.
- Participate in an on-call rotation to support pipeline health and resolve incidents quickly, ensuring minimal downtime.
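Below is a minimal sketch of the kind of Airflow-orchestrated ETL described in the responsibilities above. The DAG id, S3 paths, and column names are placeholders invented for illustration, and the tasks assume pandas plus s3fs are available on the workers; this is not the team's actual pipeline.

```python
from datetime import datetime

import pandas as pd
from airflow import DAG
from airflow.operators.python import PythonOperator


def extract_raw(**_):
    # Pull a raw extract from a placeholder S3 location into local staging.
    df = pd.read_csv("s3://example-raw-bucket/admissions.csv")
    df.to_parquet("/tmp/admissions_raw.parquet", index=False)


def validate_and_load(**_):
    # Basic quality gate before publishing to the (hypothetical) lake bucket.
    df = pd.read_parquet("/tmp/admissions_raw.parquet")
    if df["patient_id"].isna().any():
        raise ValueError("patient_id contains nulls; failing the run")
    df.to_parquet("s3://example-lake-bucket/bronze/admissions.parquet", index=False)


with DAG(
    dag_id="admissions_etl",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    extract = PythonOperator(task_id="extract_raw", python_callable=extract_raw)
    load = PythonOperator(task_id="validate_and_load", python_callable=validate_and_load)
    extract >> load
```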
Data Pipelines Specialist
Posted today
Job Description
Job Role: Backend Engineer - Data Pipelines Specialist
The Opportunity:
We are investing in cutting-edge technology to develop innovative data products that drive business growth and transformation. Our team is focused on delivering automotive forecasting solutions, leveraging data integration and advanced analytics.
The Impact:
As a Backend Engineer specializing in data pipelines, you will play a crucial role in designing, developing, and maintaining scalable data infrastructure, ensuring seamless data flow and high-quality data products.
Responsibilities:
- Design and develop robust data pipelines using Python or similar technologies
- Build and maintain UI backend services, ensuring responsiveness and high performance
- Ensure data quality and integrity through rigorous validation processes
- Collaborate with cross-functional teams to gather data requirements and optimize data flow
- Develop a strong understanding of data modeling concepts and lead data integration projects
Mastering Scalable Data Pipelines
Posted today
Job Description
Senior Data Engineer Position
We are seeking a highly skilled Senior Data Engineer to join our team and play a key role in designing, building, and operating the backbone of our AI-powered applications.
About the Role:
- This is an exciting opportunity for a seasoned data engineer to lead the development of event-driven data pipelines that support real-time clinical workflows.
- You will be responsible for integrating with EHR/EMR systems using standards like FHIR and HL7, ensuring seamless data exchange between different systems (see the FHIR sketch after this list).
- The successful candidate will have expertise in developing reliable systems to ingest, track, and update clinical events in real-time, providing high-quality structured data for downstream ML systems.
- A strong background in workflow orchestration and event-driven systems is essential, as well as proficiency with relational databases and SQL.
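As a rough illustration of the FHIR integration work mentioned above, the sketch below flattens a FHIR R4 Observation resource into a single record suitable for downstream ML features. The field paths follow the public FHIR specification, but the output schema and sample payload are invented for this example.

```python
import json


def flatten_observation(resource: dict) -> dict:
    """Map a FHIR R4 Observation into a flat record (illustrative schema)."""
    coding = (resource.get("code", {}).get("coding") or [{}])[0]
    quantity = resource.get("valueQuantity", {})
    return {
        "observation_id": resource.get("id"),
        "patient_ref": resource.get("subject", {}).get("reference"),
        "code": coding.get("code"),
        "code_system": coding.get("system"),
        "value": quantity.get("value"),
        "unit": quantity.get("unit"),
        "effective": resource.get("effectiveDateTime"),
    }


# Example payload trimmed to the fields used above (heart-rate Observation).
raw = json.dumps({
    "resourceType": "Observation",
    "id": "obs-123",
    "subject": {"reference": "Patient/456"},
    "code": {"coding": [{"system": "http://loinc.org", "code": "8867-4"}]},
    "valueQuantity": {"value": 72, "unit": "beats/minute"},
    "effectiveDateTime": "2024-01-01T08:00:00Z",
})
print(flatten_observation(json.loads(raw)))
```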
Requirements:
- 5+ years of experience building production-grade data pipelines, preferably in a cloud environment.
- Familiarity with distributed systems and ability to operate effectively in a fast-paced environment.
- Excellent communication skills and ability to collaborate with cross-functional teams, including ML engineers and developers.
- A strong understanding of HIPAA and healthcare data security/privacy requirements.
Bonus Skills:
- Familiarity with FHIR/HL7 data modeling in production settings and experience supporting machine learning inference pipelines in real-world applications.
- Ability to establish observability, monitoring, and error handling for robust pipeline performance.
Expert Data Pipelines Specialist
Posted today
Job Description
As a data engineer, you will be responsible for designing and implementing data ingestion pipelines to support business intelligence initiatives.
Key requirements include:
- Hands-on experience in building and managing data ingestion pipelines using cloud-based platforms.
- Proven expertise in data engineering with exposure to orchestration tools.
- Strong analytical and problem-solving skills to ensure data quality and integrity.
The ideal candidate will have a strong foundation in programming languages such as Python or SQL, knowledge of data warehousing and business intelligence tools, and experience with cloud-based data platforms.
We are seeking a skilled data engineer to join our team. If you are passionate about working with data and want to make a meaningful impact, we encourage you to apply.
Expert Data Pipelines Architect
Posted today
Job Description
We are seeking a highly skilled Data Engineer to design and implement cutting-edge data pipelines that transform and move data from Cloudera (HDFS/Hive/Impala) into Apache Iceberg tables, integrating seamlessly with Snowflake and Databricks. The ideal candidate will have hands-on expertise in modern data lakehouse architectures and play a critical role in enabling scalable, governed, and high-performance data platforms.
Key Responsibilities:
Data Ingestion & Pipeline Development
- Design, configure, and maintain NiFi data flows to extract, transform, and load data from Cloudera into Iceberg tables.
- Implement streaming and batch ingestion pipelines with NiFi processors and custom scripting where needed.
- Optimize NiFi workflows for scalability, reliability, and monitoring.
Data Lakehouse Enablement
- Build and manage Apache Iceberg-based datasets for structured, semi-structured, and unstructured data (an illustrative Spark/Iceberg sketch follows this list).
- Ensure schema evolution, partitioning, and metadata management in Iceberg.
- Develop integration flows from Iceberg to Snowflake and Databricks for analytics, ML, and reporting use cases.
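The sketch below illustrates the Iceberg enablement work described above using Spark. The catalog name (`lake`), warehouse path, namespace, and table schema are placeholders for this example, and the session assumes the Iceberg Spark runtime jar is on the classpath; it is not a description of the team's actual platform.

```python
from pyspark.sql import SparkSession

# Catalog name and warehouse location are placeholders for this sketch.
spark = (
    SparkSession.builder.appName("iceberg-sketch")
    .config("spark.sql.catalog.lake", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.lake.type", "hadoop")
    .config("spark.sql.catalog.lake.warehouse", "s3a://example-warehouse/")
    .getOrCreate()
)

spark.sql("CREATE NAMESPACE IF NOT EXISTS lake.curated")

# Create a partitioned Iceberg table (no-op if it already exists).
spark.sql("""
    CREATE TABLE IF NOT EXISTS lake.curated.claims (
        claim_id STRING,
        member_id STRING,
        amount DOUBLE,
        claim_date DATE
    )
    USING iceberg
    PARTITIONED BY (days(claim_date))
""")

# Append a batch read from a Hive/Impala-managed source table (placeholder name).
source_df = spark.table("source_db.claims_raw")
source_df.writeTo("lake.curated.claims").append()
```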
Integration & Orchestration
- Work with Snowflake to ingest curated data from Iceberg for enterprise reporting and commercial insights.
- Collaborate with Databricks teams to enable advanced analytics and machine learning use cases.
- Integrate NiFi pipelines with orchestration tools (Airflow, Oozie, or AWS/Azure/GCP schedulers).
Performance, Security & Governance
- Tune NiFi flows and Snowflake/Databricks ingestion for performance and cost optimization.
- Implement role-based security and ensure compliance (HIPAA, GDPR, SOX if applicable).
- Work with governance teams to enable lineage, metadata tracking, and auditability.
Qualifications:
- Bachelor's degree in Computer Science, Information Systems, or related field.
- 5+ years of data engineering experience, with at least 2 years working with Apache NiFi.
- Strong experience with Cloudera ecosystem (HDFS, Hive, Impala, Spark).
- Hands-on expertise with Apache Iceberg (schema evolution, time travel, partitioning, compaction).
- Working knowledge of Snowflake and Databricks integration patterns.
- Proficiency in SQL and one programming language (Python, Java, or Scala).
- Understanding of data lakehouse architectures and ETL/ELT best practices.
Building Scalable Data Pipelines
Posted today
Job Description
As a Data Engineer, you will be responsible for designing and implementing scalable data pipelines to integrate with Microsoft Fabric.
Key responsibilities:
- Configure data mirroring from relational sources into OneLake using Fabric Mirroring.
- Design and implement lakehouse folder structures and naming conventions for efficient Bronze ingestion.
- Implement change data capture, including soft deletes and backfills, using watermarking and reconciliation checks (see the watermark sketch after this list).
- Develop resilient ingestion pipelines with Fabric Data Factory and/or notebooks.
- Automate schema drift detection and evolution, publishing change notes and guardrails.
- Optimise batch sizes, file counts, partitions, parallelism, and capacity usage to balance speed, reliability, and cost.
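The change-data-capture item above can be approximated with a high-water-mark pattern. This sketch uses generic PySpark against a JDBC source rather than any Fabric-specific API; the table, column, connection string, and Bronze path are placeholders, and soft-delete reconciliation is only noted in a comment.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("watermark-ingest").getOrCreate()

BRONZE_PATH = "Tables/bronze_orders"   # placeholder lakehouse location
WATERMARK_COL = "last_modified_at"

# 1. Read the previous high-water mark from the Bronze table, if it exists yet.
try:
    previous_max = (
        spark.read.format("delta").load(BRONZE_PATH)
        .agg(F.max(WATERMARK_COL).alias("wm"))
        .collect()[0]["wm"]
    )
except Exception:
    previous_max = "1900-01-01T00:00:00"

# 2. Pull only rows changed since the last run from the relational source.
incremental = (
    spark.read.format("jdbc")
    .option("url", "jdbc:sqlserver://example-host;databaseName=sales")  # placeholder
    .option("query", f"SELECT * FROM dbo.orders WHERE {WATERMARK_COL} > '{previous_max}'")
    .load()
)

# 3. Append the delta; soft deletes arrive as rows flagged is_deleted = 1 and are
#    reconciled downstream (not shown here).
incremental.write.format("delta").mode("append").save(BRONZE_PATH)
```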
Requirements:
- Strong understanding of SQL Server and T-SQL fundamentals.
- Proficiency with Microsoft Fabric Mirroring and knowledge of OneLake/Lakehouse.
- Experience with schema drift detection and management.
- Hands-on experience with large relational databases.
- Proficiency in Python/Scala/Spark for ingestion and validation.
This role requires 6–10 years of experience in data engineering, including 2+ years on Azure data platforms and hands-on Microsoft Fabric experience.
Responsibilities include developing and maintaining large-scale data processing systems, working closely with cross-functional teams to design and implement new features, and ensuring the overall quality and scalability of our data infrastructure.
Our ideal candidate has excellent problem-solving skills, strong communication skills, and experience working in a fast-paced environment.
Highly Skilled Data Pipelines Architect
Posted today
Job Description
About the Role
We're seeking a senior data engineer with deep expertise in building scalable data pipelines that support real-time clinical workflows.
This is a critical role where you'll design, build, and operate the backbone of our AI-powered applications. You'll work closely with software engineers, ML specialists, and clinical partners to bring our vision to life.
Key Responsibilities:
- Design and operate event-driven data pipelines that integrate healthcare data from EHR/EMR systems using standards such as FHIR and HL7.
- Develop reliable systems to ingest, track, and update clinical events in real-time (see the upsert sketch after this list).
- Collaborate with ML engineers to align data flows with model needs.
- Continuously improve pipeline design to maximize scalability, resilience, and automation.
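One common building block for the "ingest, track, and update clinical events" responsibility above is an idempotent upsert keyed on the event identifier. This sketch assumes PostgreSQL with psycopg2; the table, columns, DSN, and sample event are invented for illustration.

```python
import psycopg2

UPSERT_SQL = """
    INSERT INTO clinical_events (event_id, patient_id, event_type, payload, updated_at)
    VALUES (%(event_id)s, %(patient_id)s, %(event_type)s, %(payload)s, now())
    ON CONFLICT (event_id) DO UPDATE
    SET payload = EXCLUDED.payload,
        event_type = EXCLUDED.event_type,
        updated_at = now();
"""


def upsert_event(conn, event: dict) -> None:
    # Re-processing the same event is safe: the conflict target makes the write idempotent.
    with conn.cursor() as cur:
        cur.execute(UPSERT_SQL, event)
    conn.commit()


if __name__ == "__main__":
    conn = psycopg2.connect("dbname=clinical user=etl")  # placeholder DSN
    upsert_event(conn, {
        "event_id": "adt-0001",
        "patient_id": "pat-42",
        "event_type": "ADT_A01",
        "payload": '{"ward": "ICU"}',
    })
```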
Senior Data Engineer - Data Pipelines
Posted today
Job Description
Job Summary:
We are seeking a skilled Data Engineering Specialist to join our team. As a key member of our data engineering team, you will be responsible for designing, developing, and maintaining scalable data pipelines, including pipelines that implement complex algorithms.
About the Role:
- Design, develop, and maintain scalable data pipelines, ensuring high performance and responsiveness.
- Develop and maintain UI backend services using Python or C#, collaborating with cross-functional teams to gather requirements.
- Evaluate and improve data quality through robust validation processes.
- Collaborate with data scientists and analysts to optimize data flow and storage for advanced analytics.
- Take ownership of your projects, delivering high-quality results on time.
Requirements:
- Bachelor's degree in computer science or a related field.
- Strong analytical and problem-solving skills.
- 7+ years of experience in Data Engineering/Advanced Analytics.
- Proficiency in Python and experience with Flask for backend development (a minimal Flask sketch follows this list).
- Strong knowledge of object-oriented programming.
- Proficiency with AWS is highly valued, particularly ECR and container-based deployments.
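As a minimal sketch of the Flask backend work referenced above, the endpoint below exposes pipeline-run metadata. The route, pipeline names, and in-memory store are hypothetical stand-ins for a real metadata database.

```python
from flask import Flask, jsonify

app = Flask(__name__)

# Placeholder in-memory store standing in for a real metadata database.
PIPELINE_RUNS = {
    "daily_sales_load": {"status": "succeeded", "rows_processed": 120_453},
}


@app.route("/pipelines/<name>/latest-run")
def latest_run(name: str):
    run = PIPELINE_RUNS.get(name)
    if run is None:
        return jsonify({"error": "unknown pipeline"}), 404
    return jsonify(run)


if __name__ == "__main__":
    app.run(port=5000)
```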
Chief Architect of Data Pipelines
Posted today
Job Description
Technical Engineering Director
We are seeking a highly skilled Technical Engineering Director to lead the development of our pharmacovigilance signal detection solutions. This is a unique opportunity for an experienced professional to bring their expertise and passion for advancing drug safety technologies.
About the Role
The ideal candidate will have a strong background in statistical programming and healthcare-related datasets, and will be an expert in Python, JSON, PL/SQL, and SQL. They will be responsible for developing robust data pipelines and advanced configurations using Python, JSON, and SQL to implement client signal detection strategies.
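As a hedged illustration of combining Python, JSON, and SQL for configurable signal detection, the sketch below loads a hypothetical threshold configuration and builds a parameterized query. The table, columns, metric (PRR), and thresholds are invented for this example and do not describe any specific product.

```python
import json

# Hypothetical per-client configuration for a disproportionality screen.
config = json.loads("""
{
    "product": "drug_x",
    "min_case_count": 3,
    "prr_threshold": 2.0
}
""")

# Parameterized SQL keeps client-specific values out of the statement text.
SIGNAL_QUERY = """
    SELECT reaction_term, case_count, prr
    FROM signal_scores
    WHERE product = %(product)s
      AND case_count >= %(min_case_count)s
      AND prr >= %(prr_threshold)s
    ORDER BY prr DESC
"""

# With a DB-API driver that uses the pyformat parameter style (e.g. psycopg2),
# this would run as: cursor.execute(SIGNAL_QUERY, config)
print(SIGNAL_QUERY.strip())
print("parameters:", config)
```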
Responsibilities
- Develop and oversee data pipeline architecture to meet client needs.
- Lead a team of technical engineers providing guidance, mentorship, and support.
- Collaborate with cross-functional teams to ensure alignment with business and regulatory requirements.
- Review and assure the quality of work, ensuring the accuracy, reliability, and performance of signalling data pipelines and associated technical documentation.
- Stay current with industry trends and emerging technologies to drive innovation.
- Support client-facing design teams to ensure feasibility and effectiveness of desired solutions.
Qualifications
- Bachelor's or Master's degree in Computer Science, Software Engineering, Informatics, or a related field.
- Proven track record of leading technical teams and delivering complex projects.
- Experience in statistical programming and data analysis.
- Expertise in Python, JSON, PL/SQL, and SQL.
- Familiarity with cloud-based solutions and big data technologies.
This role offers a unique combination of technical leadership and hands-on development experience. If you are passionate about advancing drug safety technologies and have the skills and qualifications we are looking for, please apply for this exciting opportunity.
Senior Backend Engineer, Data Pipelines
Posted today
Job Description
About the Role:
Experience: 7–10 years of relevant experience.
We are looking for a Senior Backend Engineer with strong Kafka expertise and a proven track record in Java, Flink, and Python. You will build scalable, high-performance backend services, optimize real-time data pipelines, and work with AWS cloud infrastructure.
Key Responsibilities:
- Develop and maintain backend services using Java, Flink, Python, and Kafka.
- Build real-time streaming pipelines and event-driven architectures (see the Kafka consumer sketch after this list).
- Work with AWS services (PostgreSQL, Aurora, DynamoDB).
- Automate workflows with Airflow, monitor with New Relic & Splunk.
- Deploy and manage applications using Kubernetes & Docker.
- Optimize performance and troubleshoot distributed systems.
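Below is a minimal sketch of the consumption side of the streaming pipelines listed above, using the kafka-python client. The broker address, topic, group id, and message fields are placeholders; the downstream write to Aurora/DynamoDB is only indicated in a comment.

```python
import json

from kafka import KafkaConsumer

# Placeholder connection details for this sketch.
consumer = KafkaConsumer(
    "orders-events",
    bootstrap_servers="localhost:9092",
    group_id="orders-pipeline",
    value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
    enable_auto_commit=False,
    auto_offset_reset="earliest",
)

for message in consumer:
    event = message.value
    # Downstream write (e.g. to Aurora or DynamoDB) would go here; commit only
    # after the write succeeds so failures are reprocessed rather than lost.
    print(message.topic, message.partition, message.offset, event.get("order_id"))
    consumer.commit()
```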
Must-Have Qualifications:
- Highly skilled in backend development with strong Kafka expertise.
- Strong proficiency in Java, Python, and AWS cloud services.
- Hands-on experience with event-driven architectures and microservices.
- Experience with Infrastructure-as-Code (IaC) tools like Terraform or CloudFormation.
- Hands-on experience with New Relic, Splunk, Kubernetes, and Docker.
- Strong problem-solving skills and a DevOps mindset.
Tech Stack –
- Languages & Frameworks: Java, Python, Flink
- Cloud & Databases: AWS (PostgreSQL, DynamoDB)
- Streaming & Messaging: Kafka
- Infrastructure-as-Code: Terraform
- Orchestration & Monitoring: Airflow, New Relic, Splunk
- Containerization & Deployment: Kubernetes, Docker