9 Data Pipelines jobs in India

Software Developer - Risk Data Pipelines

Bengaluru, Karnataka Squarepoint Capital

Posted today

Job Description

Department: Risk Technology

Position Overview:

Risk Technology develops core services and systems required by Squarepoint’s systematic and quantitative trading strategies, such as real-time risk controls, position/inventory/P&L monitoring, internal order routing, and various pre and post-trading services.

Risk Data Pipelines develops software on top of the core Risk Technology platform to handle market or asset-class specific processing, including:

  • Trade Data Ingress - Normalize and stream trade data to Squarepoint's systems from trading platforms like Bloomberg, Fidessa and SpiderRock.
  • Trade Data Egress - Feeds to 3rd party platforms to ensure trade booking correctness and regulatory compliance.
  • Services and analytics used by investment and quant teams to understand Risk exposure, P&L and to improve capital efficiency.
  • Automation of trading operations to support the growth of the business, such as the management of options and futures expiry and automated trade, position and P&L reconciliation.
This role provides an opportunity to learn many aspects of how hedge funds operate through close collaboration with trading and quantitative teams. As a developer, you will:

  • Design, develop and deliver high quality and maintainable business critical software.
  • Work closely with stakeholders and colleagues to capture requirements, define architecture and technologies, identify and resolve bottlenecks, and deliver functionality.
  • Lead and contribute to design discussions and mentor other team members.
  • Participate in level 2 support.
Required Qualifications:

  • Bachelor’s degree in computer science or related subject.
  • At least 4 years of Python experience in the financial industry.
  • Team player with excellent communication skills.
  • Experience with database management systems and related technologies such as SQL.
  • Knowledge of traded financial instruments (Equity, FX, Credit or Rates).
Nice to Have:

  • Experience of FIX protocol
  • Experience developing in data engineering (using Python with pandas, R, Julia)
  • Experience with KDB+/q

    Software Developer - Data Pipelines (Python)

    Bengaluru, Karnataka Squarepoint Capital

    Posted today

    Job Description

    Team: Development - Alpha Data

    Position Overview:

    We are seeking an experienced Python developer to join our Alpha Data team, responsible for delivering a vast quantity of data served to users worldwide. You will be a cornerstone of a growing Data team, becoming a technical subject matter expert and developing strong working relationships with quant researchers, traders, and fellow colleagues across our Technology organisation.

    Alpha Data teams are able to deploy valuable data to the rest of the Squarepoint business at speed. Ingestion pipelines and data transformation jobs are resilient and highly maintainable, while the data models are carefully designed in close collaboration with our researchers for efficient query construction and alpha generation.

    We achieve an economy of scale through building new frameworks, libraries, and services used to increase the team's quality of life, throughput, and code quality. Teamwork and collaboration are encouraged, excellence is rewarded and diversity of thought and creative solutions are valued. Our emphasis is on a culture of learning, development, and growth.

  • Take part ownership of our ever-growing estate of data pipelines,
  • Propose and contribute to new abstractions and improvements - make a real positive impact across our team globally,
  • Design, implement, test, optimize and troubleshoot our data pipelines, frameworks, and services,
  • Collaborate with researchers to onboard new datasets,
  • Regularly take the lead on production support operations - during normal working hours only.
    Required Qualifications:

  • 4+ years of experience coding to a high standard in Python,
  • Bachelor's degree in a STEM subject,
  • Experience with and knowledge of SQL, and one or more common RDBMS systems (we mostly use Postgres),
  • Practical knowledge of commonly used protocols and tools used to transfer data (e.g. FTP, SFTP, HTTP APIs, AWS S3),
  • Excellent communication skills.
    Nice to Have:

  • Experience with big data frameworks, databases, distributed systems, or Cloud development.
  • Experience with any of these: C++, kdb+/q, Rust.

    Staff Software Engineer, Data Pipelines & APIs (India)

    Bengaluru, Karnataka Mixpanel

    Posted today

    Job Description

    About Mixpanel

    Mixpanel is an event analytics platform for builders who need answers from their data at their fingertips—no SQL required. When everyone in the organization can see and learn from the impact of their work on product, marketing, and company revenue metrics, they are poised to make better decisions.

    Over 9,000 paid customers, including companies like Netflix, Pinterest, Sweetgreen, and Samsara, use Mixpanel to understand their customers and measure progress. Our commitment is to provide the most comprehensive and reliable analytics platform accessible and trusted by all.

    About Mixpanel

    Mixpanel is powered by a custom distributed database. This system ingests more than 1 trillion user-generated events every month while keeping end-to-end latencies under a minute, and queries typically scan more than 1 quadrillion events over the span of a month. Over the last year, our inbound traffic has doubled. As our existing customers grow in volume and we add new ones, we expect this growth in traffic to continue. The Distributed Systems engineering teams are responsible for adding new capabilities and ensuring the smooth operation of the underlying systems.

    About the Team

    The Data Pipeline & API team is responsible for the Data APIs, Pipelines, and Integrations that power real-time movement of customer data and support trillions of requests each month. These systems are critical to Mixpanel’s product and business, enabling core workflows for external customers. This includes data APIs, which are foundational to customer onboarding and must be highly reliable and scalable; data export APIs, which help us maintain feature parity and support the needs of scaling customers; and integrations like warehouse connectors and cohort exports, which reduce friction, drive adoption, and expand our partner ecosystem. This role ensures these systems are robust, efficient, and aligned with a product strategy that emphasizes impact, growth, and long-term reliability.

    Responsibilities

    As our first engineer in India you'll be responsible for:

  • Starting and building the India Software Engineering Team
  • Working with other engineers to build distributed systems that can handle data at scale
  • Debugging production issues across multiple services and all levels of our infrastructure stack
  • Ensuring reliability and uptime of the services you're responsible for
  • Keeping an eye on how much your service costs every month and removing inefficiencies wherever possible
  • Improving engineering standards and holding a high bar for code quality and simplicity
  • Pushing the boundaries on how our customers analyze their product data
    A typical project requires a thorough understanding of not just how your service works but also how it interacts with other components. Here are some projects we've worked on in the past to give you an idea of what to expect.

  • Mixpanel's infrastructure runs on Google Cloud Platform. We rely on Kubernetes and Docker for orchestration and containerization of our services. We primarily use Golang for writing services, and all internal communication happens via gRPC. We use a combination of C and C++ wherever Golang doesn't meet our performance goals.
  • Most of the systems in our stack provide at-least-once semantics. As a result, we risk duplicating events that flow through them. To overcome this limitation, we added support for event deduplication that can work at our scale. Typical approaches for deduplication don't perform well on large amounts of data, so we had to do something highly custom for our stack. We wrote about this on our engineering blog.
  • Back in 2019, we migrated our ingestion API service from Python to Golang for better performance and type safety. We had to do this while ensuring that both systems handle data the same way. Because we had to compare both HTTP responses and transformed payloads, nothing out of the box worked for us. talks about how we did the actual migration without any customer-visible downtime.
  • In 2021, as our traffic grew almost 100%, the cost of storing data became untenable. Our engineers worked on to eventually realize almost $3000 in savings per month.
  • Warehouse Connector Mirror mode solves for data immutability by overcoming historic data model limitations. Mixpanel now syncs with customer data warehouses in real time and will reflect all changes, including additions, updates, or deletions.
    If projects like the ones listed above excite you, the Distributed Systems engineering team will be a great fit.
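
    As a rough illustration of the event deduplication project mentioned above, the following is a minimal Python sketch of dropping repeat deliveries in a pipeline with at-least-once semantics. It is not Mixpanel's implementation, which the listing describes as highly custom. The `event_id` field, the `BoundedDeduplicator` class, and the `max_ids` window size are hypothetical names chosen for this example.

```python
from collections import OrderedDict


class BoundedDeduplicator:
    """Remembers the most recently seen event IDs, up to max_ids (hypothetical helper)."""

    def __init__(self, max_ids=100_000):
        self.max_ids = max_ids
        self._seen = OrderedDict()  # event_id -> None, least recently seen first

    def is_new(self, event_id):
        """Return True the first time an ID is seen within the window."""
        if event_id in self._seen:
            # Refresh recency so frequently repeated IDs stay in the window.
            self._seen.move_to_end(event_id)
            return False
        self._seen[event_id] = None
        if len(self._seen) > self.max_ids:
            # Evict the least recently seen ID to bound memory use.
            self._seen.popitem(last=False)
        return True


def deduplicate(events, dedup):
    """Keep only events whose (assumed) 'event_id' has not been seen yet."""
    return [e for e in events if dedup.is_new(e["event_id"])]


# An at-least-once transport may deliver the same event twice.
events = [
    {"event_id": "a", "name": "signup"},
    {"event_id": "b", "name": "purchase"},
    {"event_id": "a", "name": "signup"},  # redelivery
]
unique = deduplicate(events, BoundedDeduplicator(max_ids=1000))
```

    A bounded in-memory window like this only catches duplicates that arrive close together, and it does not survive restarts or span multiple machines, which is one reason simple approaches tend to struggle at large scale.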

    We're Looking For Someone Who Has

  • A strong grasp of computer science fundamentals when it comes to dealing with distributed systems and networks. You'll routinely run into issues where "one in a million" chances actually happen in production
  • A knack for problem-solving and thinking from first principles. You don't shy away from any problem, no matter the scale or impact
  • A bias towards shipping early and iterating. We believe in making small incremental changes to existing systems instead of large multi-quarter undertakings
  • Experience building distributed systems, or an eagerness to build that skill set

    Benefits and Perks

  • Comprehensive Medical, Vision, and Dental Care
  • Mental Wellness Benefit
  • Generous Vacation Policy & Additional Company Holidays
  • Enhanced Parental Leave
  • Volunteer Time Off
  • Additional US Benefits: Pre-Tax Benefits including 401(K), Wellness Benefit, Holiday Break
  • *please note that benefits and perks for contract positions will vary*

    Culture Values

  • Make Bold Bets: We choose courageous action over comfortable progress.
  • Innovate with Insight: We tackle decisions with rigor and judgment - combining data, experience and collective wisdom to drive powerful outcomes.
  • One Team: We collaborate across boundaries to achieve far greater impact than any of us could accomplish alone.
  • Candor with Connection: We build meaningful relationships that enable honest feedback and direct conversations.
  • Champion the Customer: We seek to deeply understand our customers’ needs, ensuring their success is our north star.
  • Powerful Simplicity: We find elegant solutions to complex problems, making sophisticated things accessible.

    Python Developer (ETL Pandas, SQL, Data Pipelines & Automation)

    Bellandur, Karnataka Synechron

    Posted today

    Job Description

    Job Summary

    Synechron is seeking a detail-oriented and analytical Python Developer to join our data team. In this role, you will design, develop, and optimize data pipelines, analysis tools, and workflows that support key business and analytical functions. Your expertise in data manipulation, database management, and scripting will enable the organization to enhance data accuracy, efficiency, and insights. This position offers an opportunity to work closely with data analysts and scientists to build scalable, reliable data solutions that contribute directly to business decision-making and operational excellence.

    Software Requirements

    Required Skills:

  • Python (version 3.7 or higher) with experience in data processing and scripting
  • Pandas library (experience in large dataset manipulation and analysis)
  • SQL (proficiency in writing performant queries for data extraction and database management)
  • Data management tools and databases such as MySQL, PostgreSQL, or similar relational databases
    Preferred Skills:

  • Experience with cloud data services (AWS RDS, Azure SQL, GCP Cloud SQL)
  • Knowledge of additional Python libraries such as NumPy, Matplotlib, or Jupyter Notebooks for data analysis and visualization
  • Data pipeline orchestration tools (e.g., Apache Airflow)
  • Version control tools like Git
    Overall Responsibilities

  • Develop, test, and maintain Python scripts for ETL processes and data workflows
  • Utilize Pandas to clean, analyze, and transform large datasets efficiently
  • Write, optimize, and troubleshoot SQL queries for data extraction, updates, and management
  • Collaborate with data analysts and scientists to create data-driven analytic tools and solutions
  • Automate repetitive data workflows to increase operational efficiency and reduce errors
  • Maintain detailed documentation of data processes, pipelines, and procedures
  • Troubleshoot data discrepancies, pipeline failures, and database-related issues efficiently
  • Support ongoing data quality initiatives by identifying and resolving data inconsistencies
    Technical Skills (By Category)

    Programming Languages:

  • Required: Python (3.7+), proficiency with data manipulation and scripting
  • Preferred: Additional scripting languages such as R or familiarity with other programming environments
    Databases/Data Management:

  • Relational databases: MySQL, PostgreSQL, or similar
  • Experience with query optimization and database schema design
    Cloud Technologies:

  • Preferred: Basic experience with cloud data services (AWS, Azure, GCP) for data storage and processing
    Frameworks and Libraries:

  • Pandas, NumPy, Matplotlib, Jupyter Notebooks for data analysis and visualization
  • Airflow or similar orchestration tools (preferred)
    Development Tools and Methodologies:

  • Git or similar version control tools
  • Agile development practices and collaborative workflows
    Security Protocols:

  • Understanding of data privacy, confidentiality, and secure coding practices
    Experience Requirements

  • 3+ years of experience in Python development with a focus on data processing and management
  • Proven hands-on experience in building and supporting ETL workflows and data pipelines
  • Strong experience working with SQL and relational databases
  • Demonstrated ability to analyze and manipulate large datasets efficiently
  • Familiarity with cloud data services is advantageous but not mandatory
    Day-to-Day Activities

  • Write and enhance Python scripts to perform ETL, data transformation, and automation tasks
  • Design and optimize SQL queries for data extraction and updates
  • Collaborate with data analysts, scientists, and team members during daily stand-ups and planning sessions
  • Investigate and resolve data quality issues or pipeline failures promptly
  • Document data pipelines, workflows, and processes for clarity and future maintenance
  • Assist in developing analytical tools and dashboards for business insights
  • Review code changes through peer reviews and ensure adherence to best practices
  • Participate in continuous improvement initiatives related to data workflows and processing techniques
    Qualifications

  • Bachelor’s degree in Computer Science, Data Science, Information Technology, or a related field
  • Relevant certifications or training in Python, data engineering, or database management are a plus
  • Proven track record of working on data pipelines, analysis, and automation projects
    Professional Competencies

  • Strong analytical and problem-solving skills with attention to detail
  • Effective communication skills, able to collaborate across teams and explain technical concepts clearly
  • Ability to work independently and prioritize tasks effectively
  • Continuous learner, eager to adopt new tools, techniques, and best practices in data processing
  • Adaptability to changing project requirements and proactive in identifying process improvements
  • Focused on delivering high-quality work with a results-oriented approach
    Diversity & Inclusion are fundamental to our culture, and Synechron is proud to be an equal opportunity workplace and is an affirmative action employer. Our Diversity, Equity, and Inclusion (DEI) initiative ‘Same Difference’ is committed to fostering an inclusive culture – promoting equality, diversity and an environment that is respectful to all. We strongly believe that a diverse workforce helps build stronger, successful businesses as a global company. We encourage applicants from across diverse backgrounds, race, ethnicities, religion, age, marital status, gender, sexual orientations, or disabilities to apply. We empower our global workforce by offering flexible workplace arrangements, mentoring, internal mobility, learning and development programs, and more.


    All employment decisions at Synechron are based on business needs, job requirements and individual qualifications, without regard to the applicant’s gender, gender identity, sexual orientation, race, ethnicity, disabled or veteran status, or any other characteristic protected by law.

    Senior Database Infrastructure Engineer- Cassandra, DataStax, Big Data Pipelines

    Pune, Maharashtra HEROIC.com

    Posted today

    Job Description

    HEROIC Cybersecurity (HEROIC) is seeking a Senior Data Infrastructure Engineer with deep expertise in DataStax Enterprise (DSE) and Apache Cassandra to help architect, scale, and maintain the data infrastructure that powers our cybersecurity intelligence platforms.

    You will be responsible for designing and managing fully automated, big data pipelines that ingest, process, and serve hundreds of billions of breached and leaked records sourced from the surface, deep, and dark web. You'll work with DSE Cassandra, Solr, and Spark, helping us move toward a 99% automated pipeline for data ingestion, enrichment, deduplication, and indexing — all built for scale, speed, and reliability.

    This position is critical in ensuring our systems are fast, reliable, and resilient as we ingest thousands of unique datasets daily from global threat intelligence sources.

    What you will do: 

    • Design, deploy, and maintain high-performance Cassandra clusters using DataStax Enterprise (DSE)
    • Architect and optimize automated data pipelines to ingest, clean, enrich, and store billions of records daily
    • Configure and manage DSE Solr and Spark to support search and distributed processing at scale
    • Automate dataset ingestion workflows from unstructured surface, deep, and dark web sources
    • Handle cluster management, replication strategy, capacity planning, and performance tuning
    • Ensure data integrity, availability, and security across all distributed systems
    • Write and manage ETL processes, scripts, and APIs to support data flow automation
    • Monitor systems for bottlenecks, optimize queries and indexes, and resolve production issues
    • Research and integrate third-party data tools or AI-based enhancements (e.g., smart data parsing, deduplication, ML-based classification)
    • Collaborate with engineering, data science, and product teams to support HEROIC’s AI-powered cybersecurity platform




    Requirements
    • Minimum 5 years' experience with Cassandra / DataStax Enterprise in production environments
    • Hands-on experience with DSE Cassandra, Solr, Apache Spark, CQL, and data modeling at scale
    • Strong understanding of NoSQL architecture, sharding, replication, and high availability
    • Advanced knowledge of Linux/Unix, shell scripting, and automation tools (e.g., Ansible, Terraform)
    • Proficient in at least one programming language: Python, Java, or Scala
    • Experience building large-scale automated data ingestion systems or ETL workflows
    • Solid grasp of AI-enhanced data processing, including smart cleaning, deduplication, and classification
    • Excellent written and spoken English communication skills
    • Prior experience with cybersecurity or dark web data (preferred but not required)



    Benefits
    • Position Type: Full-time
    • Location: Pune, India  (Remote – Work from anywhere)
    • Compensation: Competitive salary based on experience
    • Benefits: Paid Time Off + Public Holidays
    • Professional Growth: Amazing upward mobility in a rapidly expanding company.
    • Innovative Culture: Fast-paced, innovative, and mission-driven. Be part of a team that leverages AI and cutting-edge technologies. 

       

    About Us: HEROIC Cybersecurity (HEROIC) is building the future of cybersecurity. Unlike traditional cybersecurity solutions, HEROIC takes a predictive and proactive approach to intelligently secure our users before an attack or threat occurs. Our work environment is fast-paced, challenging and exciting. At HEROIC, you’ll work with a team of passionate, engaged individuals dedicated to intelligently securing the technology of people all over the world.

    Position Keywords: DataStax Enterprise (DSE), Apache Cassandra, Apache Spark, Apache Solr, AWS, Jira, NoSQL, CQL (Cassandra Query Language), Data Modeling, Data Replication, ETL Pipelines, Data Deduplication, Data Lake, Linux/Unix Administration, Bash, Docker, Kubernetes, CI/CD, Python, Java, Distributed Systems, Cluster Management, Performance Tuning, High Availability, Disaster Recovery, AI-based Automation, Artificial Intelligence, Big Data, Dark Web Data



    Requirements
    • Minimum 8 years of full-stack PHP development experience, with 3 years in Laravel
    • Deep expertise in PHP, Laravel, MySQL, JavaScript (Vue.js or similar), Git, and RESTful APIs
    • Experience with server and database management (Linux, Apache/Nginx, MySQL/PostgreSQL, Cassandra)
    • Strong familiarity with AI-enhanced coding tools and modern DevOps workflows (CI/CD, GitHub Actions)
    • Experience in a security-focused or SaaS product environment is a strong plus
    • Excellent English communication skills (written and verbal)
    • Comfortable working independently during 9:00am-6:00pm (Pacific Time) U.S. hours and owning mission-critical systems

    Senior Database Infrastructure Engineer- Cassandra, DataStax, Big Data Pipelines

    411001 Pune, Maharashtra HEROIC.com

    Posted 22 days ago

    Job Description

    Permanent

    HEROIC Cybersecurity (HEROIC.com) is seeking a Senior Data Infrastructure Engineer with deep expertise in DataStax Enterprise (DSE) and Apache Cassandra to help architect, scale, and maintain the data infrastructure that powers our cybersecurity intelligence platforms.

    You will be responsible for designing and managing fully automated, big data pipelines that ingest, process, and serve hundreds of billions of breached and leaked records sourced from the surface, deep, and dark web. You'll work with DSE Cassandra, Solr, and Spark, helping us move toward a 99% automated pipeline for data ingestion, enrichment, deduplication, and indexing — all built for scale, speed, and reliability.

    This position is critical in ensuring our systems are fast, reliable, and resilient as we ingest thousands of unique datasets daily from global threat intelligence sources.

    What you will do: 

    • Design, deploy, and maintain high-performance Cassandra clusters using DataStax Enterprise (DSE)
    • Architect and optimize automated data pipelines to ingest, clean, enrich, and store billions of records daily
    • Configure and manage DSE Solr and Spark to support search and distributed processing at scale
    • Automate dataset ingestion workflows from unstructured surface, deep, and dark web sources
    • Cluster management, replication strategy, capacity planning, and performance tuning
    • Ensure data integrity, availability, and security across all distributed systems
    • Write and manage ETL processes, scripts, and APIs to support data flow automation
    • Monitor systems for bottlenecks, optimize queries and indexes, and resolve production issues
    • Research and integrate third-party data tools or AI-based enhancements (e.g., smart data parsing, deduplication, ML-based classification)
    • Collaborate with engineering, data science, and product teams to support HEROIC’s AI-powered cybersecurity platform

    Requirements
    • Minimum 5 years experience with Cassandra / DataStax Enterprise in production environments
    • Hands-on experience with DSE Cassandra, Solr, Apache Spark, CQL, and data modeling at scale
    • Strong understanding of NoSQL architecture, sharding, replication, and high availability
    • Advanced knowledge of Linux/Unix, shell scripting, and automation tools (e.g., Ansible, Terraform)
    • Proficient in at least one programming language: Python, Java, or Scala
    • Experience building large-scale automated data ingestion systems or ETL workflows
    • Solid grasp of AI-enhanced data processing, including smart cleaning, deduplication, and classification
    • Excellent written and spoken English communication skills
    • Prior experience with cybersecurity or dark web data (preferred but not required)

    Benefits
    • Position Type: Full-time
    • Location: Pune, India (Remote – Work from anywhere)
    • Compensation: Competitive salary based on experience
    • Benefits: Paid Time Off + Public Holidays
    • Professional Growth: Amazing upward mobility in a rapidly expanding company.
    • Innovative Culture: Fast-paced, innovative, and mission-driven. Be part of a team that leverages AI and cutting-edge technologies.

    About Us: HEROIC Cybersecurity (HEROIC.com) is building the future of cybersecurity. Unlike traditional cybersecurity solutions, HEROIC takes a predictive and proactive approach to intelligently secure our users before an attack or threat occurs. Our work environment is fast-paced, challenging and exciting. At HEROIC, you’ll work with a team of passionate, engaged individuals dedicated to intelligently securing the technology of people all over the world.

    Position Keywords: DataStax Enterprise (DSE), Apache Cassandra, Apache Spark, Apache Solr, AWS, Jira, NoSQL, CQL (Cassandra Query Language), Data Modeling, Data Replication, ETL Pipelines, Data Deduplication, Data Lake, Linux/Unix Administration, Bash, Docker, Kubernetes, CI/CD, Python, Java, Distributed Systems, Cluster Management, Performance Tuning, High Availability, Disaster Recovery, AI-based Automation, Artificial Intelligence, Big Data, Dark Web Data

    Senior Group Data Engineering Manager(Data Pipelines, ADF, ADB, Python, SQL)

    Bengaluru, Karnataka AtkinsRéalis

    Posted today

    Job Description

    We’re AtkinsRéalis, a world-leading Design, Engineering and Project Management organization. Created by the integration of long-standing organizations dating back to 1911, we are a world-leading professional services and project management company dedicated to engineering a better future for our planet and its people. We create sustainable solutions that connect people, data and technology to transform the world's infrastructure and energy systems. We deploy global capabilities locally to our clients and deliver unique end-to-end services across the whole life cycle of an asset including consulting, advisory & environmental services, intelligent networks & cybersecurity, design & engineering, procurement, project & construction management, operations & maintenance, decommissioning and capital. The breadth and depth of our capabilities are delivered to clients in key strategic sectors such as Engineering Services, Nuclear, Operations & Maintenance and Capital.

    News and information are available at or follow us on LinkedIn.

    Our teams are proud to deliver on some of the most prestigious projects across the world. It's thanks to our talented people and their diverse thinking, expertise, and knowledge. Join us and you'll be part of our genuinely collaborative environment, where everyone is supported to make the most of their talents and expertise.
    When it comes to work-life balance, AtkinsRéalis is a great place to be. So, let's discuss how our flexible and remote working policies can support your priorities. We're passionate about our work while valuing each other equally. So, ask us about some of our recent pledges for Women's Equality and being a 'Disability Confident' and 'Inclusive Employer'.

    EAI-AtkinsRéalis is a vibrant and continuously growing team. It is an important part of GTC-AtkinsRéalis and widely recognized for its high-quality project deliveries. This is a vital role in taking EAI one step forward in providing data solutions to our business and clients. The role works on multiple projects simultaneously and covers the planning, design and delivery of data-driven projects. Effective communication and teamwork are important characteristics for this role.

    Key Activities for This Role

  • Technical guide for a team of Lead Data Engineers.
  • Develop, configure, deploy, and optimize Microsoft Azure based Data solutions.
  • Collaborate with other team members to develop and enhance deliverables.
  • Continuously improve team processes to ensure information is of the highest quality, contributing to the overall effectiveness of the team.
  • Stay abreast of industry changes, especially in the areas of cloud data and analytics technologies.
  • Ability to work and deliver on more than one project simultaneously in an individual contributor role.
  • Ability to work on multiple areas like Data pipeline ETL, Data modelling & design, writing complex SQL queries etc.
  • Capable of planning and executing on both short-term and long-term goals on your own and with the team.
  • Partner with other Data Engineers, Data architects, domain experts, data analysts and other teams to build foundational data sets that are trusted, well understood, aligned with business strategy and enable self-service.
  • Guide and mentor Data Engineers and Sr Data Engineers on data architecture, data models, implementation techniques and technologies.

    Experience & Skills Required:

  • 12+ years of experience designing, developing, architecting and deploying data solutions using Power BI and the Azure platform.
  • Experience in designing data pipelines (ETL/ELT), data warehouses and data marts.
  • Hands-on expert with real-time data processing and analytics, data ingestion (batched and streamed), and data storage solutions.
  • Hands-on experience with Azure Analysis Services and Power BI; experience with other tools is good to have.
  • Hands-on experience with Data Factory, Data Lake Storage, Databricks, Data Explorer, Machine Learning, and Azure Synapse Analytics is good to have.
  • Expert at creating data dissemination diagrams, data flow diagrams, data lifecycle diagrams, data migration diagrams, and data security diagrams etc.
  • Hands-on experience with a data presentation tool such as Power BI or Tableau.
  • A proven expert in writing optimized SQL to deal with large data volumes.
  • Hands-on coding in Python along with its main data libraries such as Pandas, NumPy and Beautiful Soup.
  • Good to have ML exposure.
  • Good to have AWS experience.
  • Good to have GCP experience.

    What We Can Offer You

  • Varied, interesting and meaningful work.
  • A hybrid working environment with flexibility and great opportunities.
  • Opportunities for training and, as the team grows, career progression or sideways moves.
  • An opportunity to work within a large global multi-disciplinary consultancy on a mission to change the ways we approach business as usual.

    Why work for AtkinsRéalis?

    We at AtkinsRéalis are committed to developing our people both personally and professionally. Our colleagues have access to a wide-ranging training portfolio and development activities designed to help make the best of individuals' abilities and talents. We also actively support staff in achieving corporate membership of relevant institutions.

    Meeting Your Needs

    To help you get the most out of life in and outside of work, we offer employees ‘Total Reward’.
    Making sure you're supported is important to us. So, if you identify as having a disability, tell us ahead of your interview, and we’ll discuss any adjustments you might need.
    Additional Information
    We are an equal opportunity, drug-free employer committed to promoting a diverse and inclusive community - a place where we can all be ourselves, thrive and develop. To help embed inclusion for all, from day one, we offer a range of family friendly, inclusive employment policies, flexible working arrangements and employee networks to support staff from different backgrounds. As an Equal Opportunities Employer, we value applications from all backgrounds, cultures and abilities.

    We care about your privacy and are committed to protecting it. Please consult our Privacy Notice on our Careers site to know more about how we collect, use and transfer your Personal Data.

    Link: Equality, diversity & inclusion | Atkins India (atkinsrealis.com)

    Worker Type

    Employee

    Job Type

    Regular

    At AtkinsRéalis, we seek to hire individuals with diverse characteristics, backgrounds and perspectives. We strongly believe that world-class talent makes no distinctions based on gender, ethnic or national origin, sexual identity and orientation, age, religion or disability, but enriches itself through these differences.

    Data Scientist (Cloud Management, SQL, Building cloud data pipelines, Python, Power BI, GCP)

    Chennai, Tamil Nadu UPS India

    Posted today

    Job Description

    Explore your next opportunity at a Fortune Global 500 organization. Envision innovative possibilities, experience our rewarding culture, and work with talented teams that help you become better every day. We know what it takes to lead UPS into tomorrow—people with a unique combination of skill + passion. If you have the qualities and drive to lead yourself or teams, there are roles ready to cultivate your skills and take you to the next level.

    Job Summary
    The UPS Marketing team is looking for a talented and driven Data Scientist to drive its strategic objectives in the areas of pricing, revenue management, market analysis and evidence/data-based decision making. This role will work across multiple channels and teams to drive tangible results in the organization. You will focus on developing metrics for multiple channels and markets, applying advanced statistical modeling where appropriate and pioneering new analytical methods in a variety of fast-paced and rapidly evolving consumer channels. This high-visibility position will work with multiple levels of the organization, including senior leadership, to bring analytical capabilities to the forefront of pricing, rate setting, and optimization of our go-to-market offers. You will contribute to rapidly evolving UPS Marketing analytical capabilities by working within a collaborative team of Data Scientists, Analysts and multiple business stakeholders.

    Responsibilities:

    • Become a subject matter expert on UPS business processes, data and analytical capabilities to help define and solve business needs using data and advanced statistical methods

    • Analyze and extract insights from large-scale structured and unstructured data utilizing multiple platforms and tools.

    • Understand and apply appropriate methods for cleaning and transforming data

    • Work across multiple stakeholders to develop, maintain and improve models in production

    • Take the initiative to create and execute analyses in a proactive manner

    • Deliver complex analyses and visualizations to broader audiences including upper management and executives

    • Deliver analytics and insights to support strategic decision making

    • Understand the application of AI/ML when appropriate to solve complex business problems


    Qualifications

    • Expertise in R, SQL, Python.

    • Strong analytical skills and attention to detail. 

    • Able to engage key business and executive-level stakeholders to translate business problems to high level analytics solution approach.

    • Expertise with statistical techniques, machine learning or operations research and their application in business applications.

    • Deep understanding of data management pipelines and experience in launching moderate-scale advanced analytics projects in production at scale.

    • Proficient in Azure and Google Cloud environments

    • Experience implementing open-source technologies and cloud services; with or without the use of enterprise data science platforms.

    • Solid oral and written communication skills, especially around analytical concepts and methods. 

    • Ability to communicate data through a story framework to convey data-driven results to technical and non-technical audiences.

    • Master’s Degree in a quantitative field of mathematics, computer science, physics, economics, engineering, statistics (operations research, quantitative social science, etc.), international equivalent, or equivalent job experience.


    Bonus Qualifications

    • Experience with pricing methodologies and revenue management

    • Experience using PySpark, Azure Databricks, Google BigQuery and Vertex AI

    • Creating and implementing NLP/LLM projects

    • Experience utilizing and applying neural networks and other AI methodologies

    • Familiarity with Data architecture and engineering


    Employee Type:
     

    Permanent


    UPS is committed to providing a workplace free of discrimination, harassment, and retaliation.
