9 Data Pipelines jobs in India
Software Developer - Risk Data Pipelines
Posted today
Job Description
Department: Risk Technology
Position Overview:
Risk Technology develops core services and systems required by Squarepoint’s systematic and quantitative trading strategies, such as real-time risk controls, position/inventory/P&L monitoring, internal order routing, and various pre- and post-trading services.
Risk Data Pipelines develops software on top of the core Risk Technology platform to handle market or asset-class specific processing, including:
This role provides an opportunity to learn many aspects of the way hedge funds operate through close collaboration with trading and quantitative teams, and as a developer you will:
Required Qualifications:
Nice to Have:
Software Developer - Data Pipelines (Python)
Posted today
Job Description
Team: Development - Alpha Data
Position Overview:
We are seeking an experienced Python developer to join our Alpha Data team, responsible for delivering a vast quantity of data served to users worldwide. You will be a cornerstone of a growing Data team, becoming a technical subject matter expert and developing strong working relationships with quant researchers, traders, and fellow colleagues across our Technology organisation.
Alpha Data teams are able to deploy valuable data to the rest of the Squarepoint business at speed. Ingestion pipelines and data transformation jobs are resilient and highly maintainable, while the data models are carefully designed in close collaboration with our researchers for efficient query construction and alpha generation.
We achieve an economy of scale through building new frameworks, libraries, and services used to increase the team's quality of life, throughput, and code quality. Teamwork and collaboration are encouraged, excellence is rewarded and diversity of thought and creative solutions are valued. Our emphasis is on a culture of learning, development, and growth.
Required Qualifications:
Nice to haves
Staff Software Engineer, Data Pipelines & APIs (India)
Posted today
Job Description
About Mixpanel
Mixpanel is an event analytics platform for builders who need answers from their data at their fingertips—no SQL required. When everyone in the organization can see and learn from the impact of their work on product, marketing, and company revenue metrics, they are poised to make better decisions.
Over 9,000 paid customers, including companies like Netflix, Pinterest, Sweetgreen, and Samsara, use Mixpanel to understand their customers and measure progress. Our commitment is to provide the most comprehensive and reliable analytics platform accessible and trusted by all.
Mixpanel is powered by a custom distributed database. This system ingests more than 1 trillion user-generated events every month while keeping end-to-end latencies under a minute, and queries typically scan more than 1 quadrillion events over the span of a month. Over the last year, our inbound traffic has doubled. As our existing customers grow in volume and we add new ones, we expect this growth in traffic to continue. The Distributed Systems engineering teams are responsible for adding new capabilities and ensuring the smooth operation of the underlying systems.
About the Team
The Data Pipeline & API team is responsible for the Data APIs, Pipelines, and Integrations that power real-time movement of customer data and support trillions of requests each month. These systems are critical to Mixpanel’s product and business, enabling core workflows for external customers. This includes data APIs, which are foundational to customer onboarding and must be highly reliable and scalable; data export APIs, which help us maintain feature parity and support the needs of scaling customers; and integrations like warehouse connectors and cohort exports, which reduce friction, drive adoption, and expand our partner ecosystem. This role ensures these systems are robust, efficient, and aligned with a product strategy that emphasizes impact, growth, and long-term reliability.
Responsibilities
As our first engineer in India you'll be responsible for:
A typical project requires a thorough understanding of how not just your service works but also how it interacts with other components. Here are some projects we've worked on in the past to give you an idea of what to expect.
If projects like the ones listed above excite you, the Distributed Systems engineering team will be a great fit.
We're Looking For Someone Who Has
Benefits and Perks
*please note that benefits and perks for contract positions will vary*
Culture Values
Python Developer (ETL Pandas, SQL, Data Pipelines & Automation)
Posted today
Job Description
Job Summary
Synechron is seeking a detail-oriented and analytical Python Developer to join our data team. In this role, you will design, develop, and optimize data pipelines, analysis tools, and workflows that support key business and analytical functions. Your expertise in data manipulation, database management, and scripting will enable the organization to enhance data accuracy, efficiency, and insights. This position offers an opportunity to work closely with data analysts and scientists to build scalable, reliable data solutions that contribute directly to business decision-making and operational excellence.
Software Requirements
Required Skills:
Preferred Skills:
Overall Responsibilities
Technical Skills (By Category)
Programming Languages:
Databases/Data Management:
Cloud Technologies:
Frameworks and Libraries:
Development Tools and Methodologies:
Security Protocols:
Experience Requirements
Day-to-Day Activities
Qualifications
Professional Competencies
Diversity & Inclusion are fundamental to our culture, and Synechron is proud to be an equal opportunity workplace and an affirmative action employer. Our Diversity, Equity, and Inclusion (DEI) initiative ‘Same Difference’ is committed to fostering an inclusive culture – promoting equality, diversity and an environment that is respectful to all. We strongly believe that, as a global company, a diverse workforce helps build stronger, more successful businesses. We encourage applicants of all backgrounds, races, ethnicities, religions, ages, marital statuses, genders, sexual orientations, and disabilities to apply. We empower our global workforce by offering flexible workplace arrangements, mentoring, internal mobility, learning and development programs, and more.
All employment decisions at Synechron are based on business needs, job requirements and individual qualifications, without regard to the applicant’s gender, gender identity, sexual orientation, race, ethnicity, disabled or veteran status, or any other characteristic protected by law.
Senior Database Infrastructure Engineer - Cassandra, DataStax, Big Data Pipelines
Posted today
Job Description
HEROIC Cybersecurity (HEROIC) is seeking a Senior Data Infrastructure Engineer with deep expertise in DataStax Enterprise (DSE) and Apache Cassandra to help architect, scale, and maintain the data infrastructure that powers our cybersecurity intelligence platforms.
You will be responsible for designing and managing fully automated, big data pipelines that ingest, process, and serve hundreds of billions of breached and leaked records sourced from the surface, deep, and dark web. You'll work with DSE Cassandra, Solr, and Spark, helping us move toward a 99% automated pipeline for data ingestion, enrichment, deduplication, and indexing — all built for scale, speed, and reliability.
This position is critical in ensuring our systems are fast, reliable, and resilient as we ingest thousands of unique datasets daily from global threat intelligence sources.
What you will do:
- Design, deploy, and maintain high-performance Cassandra clusters using DataStax Enterprise (DSE)
- Architect and optimize automated data pipelines to ingest, clean, enrich, and store billions of records daily
- Configure and manage DSE Solr and Spark to support search and distributed processing at scale
- Automate dataset ingestion workflows from unstructured surface, deep, and dark web sources
- Cluster management, replication strategy, capacity planning, and performance tuning
- Ensure data integrity, availability, and security across all distributed systems
- Write and manage ETL processes, scripts, and APIs to support data flow automation
- Monitor systems for bottlenecks, optimize queries and indexes, and resolve production issues
- Research and integrate third-party data tools or AI-based enhancements (e.g., smart data parsing, deduplication, ML-based classification)
- Collaborate with engineering, data science, and product teams to support HEROIC’s AI-powered cybersecurity platform
Requirements
- Minimum 5 years experience with Cassandra / DataStax Enterprise in production environments
- Hands-on experience with DSE Cassandra, Solr, Apache Spark, CQL, and data modeling at scale
- Strong understanding of NoSQL architecture, sharding, replication, and high availability
- Advanced knowledge of Linux/Unix, shell scripting, and automation tools (e.g., Ansible, Terraform)
- Proficient in at least one programming language: Python, Java, or Scala
- Experience building large-scale automated data ingestion systems or ETL workflows
- Solid grasp of AI-enhanced data processing, including smart cleaning, deduplication, and classification
- Excellent written and spoken English communication skills
- Prior experience with cybersecurity or dark web data (preferred but not required)
Benefits
- Position Type: Full-time
- Location: Pune, India (Remote – Work from anywhere)
- Compensation: Competitive salary based on experience
- Benefits: Paid Time Off + Public Holidays
- Professional Growth: Amazing upward mobility in a rapidly expanding company.
- Innovative Culture: Fast-paced, innovative, and mission-driven. Be part of a team that leverages AI and cutting-edge technologies.
About Us: HEROIC Cybersecurity (HEROIC) is building the future of cybersecurity. Unlike traditional cybersecurity solutions, HEROIC takes a predictive and proactive approach to intelligently secure our users before an attack or threat occurs. Our work environment is fast-paced, challenging and exciting. At HEROIC, you’ll work with a team of passionate, engaged individuals dedicated to intelligently securing the technology of people all over the world.
Position Keywords: DataStax Enterprise (DSE), Apache Cassandra, Apache Spark, Apache Solr, AWS, Jira, NoSQL, CQL (Cassandra Query Language), Data Modeling, Data Replication, ETL Pipelines, Data Deduplication, Data Lake, Linux/Unix Administration, Bash, Docker, Kubernetes, CI/CD, Python, Java, Distributed Systems, Cluster Management, Performance Tuning, High Availability, Disaster Recovery, AI-based Automation, Artificial Intelligence, Big Data, Dark Web Data
Senior Database Infrastructure Engineer - Cassandra, DataStax, Big Data Pipelines
Posted today
Job Description
HEROIC Cybersecurity (HEROIC.com) is seeking a Senior Data Infrastructure Engineer with deep expertise in DataStax Enterprise (DSE) and Apache Cassandra to help architect, scale, and maintain the data infrastructure that powers our cybersecurity intelligence platforms.
You will be responsible for designing and managing fully automated, big data pipelines that ingest, process, and serve hundreds of billions of breached and leaked records sourced from the surface, deep, and dark web. You'll work with DSE Cassandra, Solr, and Spark, helping us move toward a 99% automated pipeline for data ingestion, enrichment, deduplication, and indexing — all built for scale, speed, and reliability.
This position is critical in ensuring our systems are fast, reliable, and resilient as we ingest thousands of unique datasets daily from global threat intelligence sources.
What you will do:
- Design, deploy, and maintain high-performance Cassandra clusters using DataStax Enterprise (DSE)
- Architect and optimize automated data pipelines to ingest, clean, enrich, and store billions of records daily
- Configure and manage DSE Solr and Spark to support search and distributed processing at scale
- Automate dataset ingestion workflows from unstructured surface, deep, and dark web sources
- Cluster management, replication strategy, capacity planning, and performance tuning
- Ensure data integrity, availability, and security across all distributed systems
- Write and manage ETL processes, scripts, and APIs to support data flow automation
- Monitor systems for bottlenecks, optimize queries and indexes, and resolve production issues
- Research and integrate third-party data tools or AI-based enhancements (e.g., smart data parsing, deduplication, ML-based classification)
- Collaborate with engineering, data science, and product teams to support HEROIC’s AI-powered cybersecurity platform
Requirements
- Minimum 5 years experience with Cassandra / DataStax Enterprise in production environments
- Hands-on experience with DSE Cassandra, Solr, Apache Spark, CQL, and data modeling at scale
- Strong understanding of NoSQL architecture, sharding, replication, and high availability
- Advanced knowledge of Linux/Unix, shell scripting, and automation tools (e.g., Ansible, Terraform)
- Proficient in at least one programming language: Python, Java, or Scala
- Experience building large-scale automated data ingestion systems or ETL workflows
- Solid grasp of AI-enhanced data processing, including smart cleaning, deduplication, and classification
- Excellent written and spoken English communication skills
- Prior experience with cybersecurity or dark web data (preferred but not required)
Benefits
- Position Type: Full-time
- Location: Pune, India (Remote – Work from anywhere)
- Compensation: Competitive salary based on experience
- Benefits: Paid Time Off + Public Holidays
- Professional Growth: Amazing upward mobility in a rapidly expanding company.
- Innovative Culture: Fast-paced, innovative, and mission-driven. Be part of a team that leverages AI and cutting-edge technologies.
About Us: HEROIC Cybersecurity (HEROIC.com) is building the future of cybersecurity. Unlike traditional cybersecurity solutions, HEROIC takes a predictive and proactive approach to intelligently secure our users before an attack or threat occurs. Our work environment is fast-paced, challenging and exciting. At HEROIC, you’ll work with a team of passionate, engaged individuals dedicated to intelligently securing the technology of people all over the world.
Position Keywords: DataStax Enterprise (DSE), Apache Cassandra, Apache Spark, Apache Solr, AWS, Jira, NoSQL, CQL (Cassandra Query Language), Data Modeling, Data Replication, ETL Pipelines, Data Deduplication, Data Lake, Linux/Unix Administration, Bash, Docker, Kubernetes, CI/CD, Python, Java, Distributed Systems, Cluster Management, Performance Tuning, High Availability, Disaster Recovery, AI-based Automation, Artificial Intelligence, Big Data, Dark Web Data
Requirements
- Minimum 8 years of full-stack PHP development experience, with 3 years in Laravel
- Deep expertise in PHP, Laravel, MySQL, JavaScript (Vue.js or similar), Git, and RESTful APIs
- Experience with server and database management (Linux, Apache/Nginx, MySQL/PostgreSQL, Cassandra)
- Strong familiarity with AI-enhanced coding tools and modern DevOps workflows (CI/CD, GitHub Actions)
- Experience in a security-focused or SaaS product environment is a strong plus
- Excellent English communication skills (written and verbal)
- Comfortable working independently during U.S. hours (9:00am-6:00pm Pacific Time) and owning mission-critical systems
Senior Database Infrastructure Engineer - Cassandra, DataStax, Big Data Pipelines
Posted 22 days ago
Job Description
HEROIC Cybersecurity (HEROIC.com) is seeking a Senior Data Infrastructure Engineer with deep expertise in DataStax Enterprise (DSE) and Apache Cassandra to help architect, scale, and maintain the data infrastructure that powers our cybersecurity intelligence platforms.
You will be responsible for designing and managing fully automated, big data pipelines that ingest, process, and serve hundreds of billions of breached and leaked records sourced from the surface, deep, and dark web. You'll work with DSE Cassandra, Solr, and Spark, helping us move toward a 99% automated pipeline for data ingestion, enrichment, deduplication, and indexing — all built for scale, speed, and reliability.
This position is critical in ensuring our systems are fast, reliable, and resilient as we ingest thousands of unique datasets daily from global threat intelligence sources.
What you will do:
- Design, deploy, and maintain high-performance Cassandra clusters using DataStax Enterprise (DSE)
- Architect and optimize automated data pipelines to ingest, clean, enrich, and store billions of records daily
- Configure and manage DSE Solr and Spark to support search and distributed processing at scale
- Automate dataset ingestion workflows from unstructured surface, deep, and dark web sources
- Cluster management, replication strategy, capacity planning, and performance tuning
- Ensure data integrity, availability, and security across all distributed systems
- Write and manage ETL processes, scripts, and APIs to support data flow automation
- Monitor systems for bottlenecks, optimize queries and indexes, and resolve production issues
- Research and integrate third-party data tools or AI-based enhancements (e.g., smart data parsing, deduplication, ML-based classification)
- Collaborate with engineering, data science, and product teams to support HEROIC’s AI-powered cybersecurity platform
Requirements
- Minimum 5 years experience with Cassandra / DataStax Enterprise in production environments
- Hands-on experience with DSE Cassandra, Solr, Apache Spark, CQL, and data modeling at scale
- Strong understanding of NoSQL architecture, sharding, replication, and high availability
- Advanced knowledge of Linux/Unix, shell scripting, and automation tools (e.g., Ansible, Terraform)
- Proficient in at least one programming language: Python, Java, or Scala
- Experience building large-scale automated data ingestion systems or ETL workflows
- Solid grasp of AI-enhanced data processing, including smart cleaning, deduplication, and classification
- Excellent written and spoken English communication skills
- Prior experience with cybersecurity or dark web data (preferred but not required)
Benefits
- Position Type: Full-time
- Location: Pune, India (Remote – Work from anywhere)
- Compensation: Competitive salary based on experience
- Benefits: Paid Time Off + Public Holidays
- Professional Growth: Amazing upward mobility in a rapidly expanding company.
- Innovative Culture: Fast-paced, innovative, and mission-driven. Be part of a team that leverages AI and cutting-edge technologies.
About Us: HEROIC Cybersecurity (HEROIC.com) is building the future of cybersecurity. Unlike traditional cybersecurity solutions, HEROIC takes a predictive and proactive approach to intelligently secure our users before an attack or threat occurs. Our work environment is fast-paced, challenging and exciting. At HEROIC, you’ll work with a team of passionate, engaged individuals dedicated to intelligently securing the technology of people all over the world.
Position Keywords: DataStax Enterprise (DSE), Apache Cassandra, Apache Spark, Apache Solr, AWS, Jira, NoSQL, CQL (Cassandra Query Language), Data Modeling, Data Replication, ETL Pipelines, Data Deduplication, Data Lake, Linux/Unix Administration, Bash, Docker, Kubernetes, CI/CD, Python, Java, Distributed Systems, Cluster Management, Performance Tuning, High Availability, Disaster Recovery, AI-based Automation, Artificial Intelligence, Big Data, Dark Web Data
Senior Group Data Engineering Manager (Data Pipelines, ADF, ADB, Python, SQL)
Posted today
Job Description
We’re AtkinsRéalis, a world-leading Design, Engineering and Project Management organization. Created by the integration of long-standing organizations dating back to 1911, we are a world-leading professional services and project management company dedicated to engineering a better future for our planet and its people. We create sustainable solutions that connect people, data and technology to transform the world's infrastructure and energy systems. We deploy global capabilities locally to our clients and deliver unique end-to-end services across the whole life cycle of an asset including consulting, advisory & environmental services, intelligent networks & cybersecurity, design & engineering, procurement, project & construction management, operations & maintenance, decommissioning and capital. The breadth and depth of our capabilities are delivered to clients in key strategic sectors such as Engineering Services, Nuclear, Operations & Maintenance and Capital.
News and information are available at or follow us on LinkedIn.
Our teams are proud to deliver on some of the most prestigious projects across the world. It's thanks to our talented people and their diverse thinking, expertise, and knowledge. Join us and you'll be part of our genuinely collaborative environment, where everyone is supported to make the most of their talents and expertise.
When it comes to work-life balance, AtkinsRéalis is a great place to be. So, let's discuss how our flexible and remote working policies can support your priorities. We're passionate about our work while valuing each other equally. So, ask us about some of our recent pledges for Women's Equality and being a 'Disability Confidence' and 'Inclusive Employer'.
EAI-AtkinsRéalis is a vibrant and continuously growing team. It is an important part of GTC-AtkinsRéalis and is widely recognized for its high-quality project deliveries. This is a vital role in taking EAI a step forward in providing data solutions to our business and clients. The role holder will work on multiple projects simultaneously and will plan, design and deliver data-driven projects. Effective communication and teamwork are important for this role.
Key Activities for This Role
Experience & Skills Required:
What We Can Offer You
Why work for AtkinsRéalis?
At AtkinsRéalis, we are committed to developing our people both personally and professionally. Our colleagues have access to a wide-ranging training portfolio and development activities designed to help make the best of individuals' abilities and talents. We also actively support staff in achieving corporate membership of relevant institutions.
Meeting Your Needs
To help you get the most out of life in and outside of work, we offer employees ‘Total Reward’.
Making sure you're supported is important to us. So, if you identify as having a disability, tell us ahead of your interview, and we’ll discuss any adjustments you might need.
Additional Information
We are an equal opportunity, drug-free employer committed to promoting a diverse and inclusive community - a place where we can all be ourselves, thrive and develop. To help embed inclusion for all, from day one, we offer a range of family friendly, inclusive employment policies, flexible working arrangements and employee networks to support staff from different backgrounds. As an Equal Opportunities Employer, we value applications from all backgrounds, cultures and abilities.
We care about your privacy and are committed to protecting it. Please consult our Privacy Notice on our Careers site to know more about how we collect, use and transfer your Personal Data.
Link: Equality, diversity & inclusion | Atkins India (atkinsrealis.com)
Worker Type
Employee
Job Type
Regular
At AtkinsRéalis, we seek to hire individuals with diverse characteristics, backgrounds and perspectives. We strongly believe that world-class talent makes no distinctions based on gender, ethnic or national origin, sexual identity and orientation, age, religion or disability, but enriches itself through these differences.
Data Scientist (Cloud Management, SQL, Building cloud data pipelines, Python, Power BI, GCP)
Posted today
Job Description
Explore your next opportunity at a Fortune Global 500 organization. Envision innovative possibilities, experience our rewarding culture, and work with talented teams that help you become better every day. We know what it takes to lead UPS into tomorrow—people with a unique combination of skill + passion. If you have the qualities and drive to lead yourself or teams, there are roles ready to cultivate your skills and take you to the next level.
Job Description:
Job Summary
The UPS Marketing team is looking for a talented and driven Data Scientist to drive its strategic objectives in the areas of pricing, revenue management, market analysis and evidence/data-based decision making. This role will work across multiple channels and teams to drive tangible results in the organization. You will focus on developing metrics for multiple channels and markets, applying advanced statistical modeling where appropriate and pioneering new analytical methods in a variety of fast-paced and rapidly evolving consumer channels. This high-visibility position will work with multiple levels of the organization, including senior leadership, to bring analytical capabilities to the forefront of pricing, rate setting, and optimization of our go-to-market offers. You will contribute to rapidly evolving UPS Marketing analytical capabilities by working within a collaborative team of Data Scientists, Analysts and multiple business stakeholders.
Responsibilities:
Become a subject matter expert on UPS business processes, data and analytical capabilities to help define and solve business needs using data and advanced statistical methods
Analyze and extract insights from large-scale structured and unstructured data utilizing multiple platforms and tools.
Understand and apply appropriate methods for cleaning and transforming data
Work across multiple stakeholders to develop, maintain and improve models in production
Take the initiative to create and execute analyses in a proactive manner
Deliver complex analyses and visualizations to broader audiences including upper management and executives
Deliver analytics and insights to support strategic decision making
Understand the application of AI/ML when appropriate to solve complex business problems
Qualifications
Expertise in R, SQL, Python.
Strong analytical skills and attention to detail.
Able to engage key business and executive-level stakeholders to translate business problems into a high-level analytics solution approach.
Expertise with statistical techniques, machine learning or operations research and their application in business applications.
Deep understanding of data management pipelines and experience in launching moderate-scale advanced analytics projects in production.
Proficient in Azure and Google Cloud environments
Experience implementing open-source technologies and cloud services; with or without the use of enterprise data science platforms.
Solid oral and written communication skills, especially around analytical concepts and methods.
Ability to communicate data through a story framework to convey data-driven results to technical and non-technical audiences.
Master’s Degree in a quantitative field of mathematics, computer science, physics, economics, engineering, statistics (operations research, quantitative social science, etc.), international equivalent, or equivalent job experience.
Bonus Qualifications
Experience with pricing methodologies and revenue management
Experience using PySpark, Azure Databricks, Google BigQuery and Vertex AI
Creating and implementing NLP/LLM projects
Experience utilizing and applying neural networks and other AI methodologies
Familiarity with Data architecture and engineering
Employee Type:
UPS is committed to providing a workplace free of discrimination, harassment, and retaliation.