970 Big Data Technologies jobs in Bangalore
Director Data Science & Data Engineering
Posted today
Job Description
At eBay, we're more than a global ecommerce leader — we’re changing the way the world shops and sells. Our platform empowers millions of buyers and sellers in more than 190 markets around the world. We’re committed to pushing boundaries and leaving our mark as we reinvent the future of ecommerce for enthusiasts.
Our customers are our compass, authenticity thrives, bold ideas are welcome, and everyone can bring their unique selves to work — every day. We're in this together, sustaining the future of our customers, our company, and our planet.
Join a team of passionate thinkers, innovators, and dreamers — and help us connect people and build communities to create economic opportunity for all.
Director – Data Science & Data Engineering
Shape the Future of AI-Driven eCommerce Discovery
About the Role
We're reimagining how people discover products in eCommerce—and we're looking for a visionary leader who blends technical depth with product intuition. If you're passionate about structured data, large language models, and building high-impact data products, this role is tailor-made for you.
As Director of Data Science & Data Engineering, you’ll lead a talented team of data scientists, analysts, and engineers working at the cutting edge of AI/ML, product analytics, and taxonomy design. Your mission? Drive innovation in product discovery through smarter data, scalable infrastructure, and breakthrough AI-powered solutions.
You’ll join the Product Knowledge org and play a key role in designing the backbone of next-gen search, recommendations, and generative AI experiences.
This is a high-impact, high-agency role—perfect for a hands-on leader who thrives in fast-paced, collaborative environments.
What You’ll Work On
Lead and inspire a cross-functional team to:
Transform Product Data into Insights
Conduct deep-dive SQL and Python analyses to uncover opportunities in taxonomy, ontology, and catalog structure that enhance discovery and user experience.
Harness the Power of Generative AI
Use prompt engineering and LLMs to create innovative tools for classification, taxonomy validation, and data enrichment (a hypothetical sketch follows this list).
Build & Evaluate AI/ML Models
Design frameworks to evaluate product knowledge models, semantic embeddings, and ML-based categorization systems.
Drive Data-Informed Strategy
Translate complex findings into clear, actionable insights for Product and Engineering teams. Influence roadmap decisions on entity resolution, catalog optimization, and knowledge graph development.
Partner Across Functions
Collaborate closely with Applied Research, Engineering, and Product teams to build and deploy high-impact data and AI solutions at scale.
Experiment & Innovate Fast
Prototype quickly, validate hypotheses, and iterate on structured data and AI-driven solutions that push boundaries.
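To ground the prompt-engineering work described above, here is a minimal, hypothetical sketch of LLM-assisted catalog classification. The `call_llm` stub, the sample taxonomy, and the JSON answer contract are illustrative assumptions, not eBay's actual stack; in practice the stub would be replaced by a real LLM client.

```python
# Illustrative sketch of LLM-driven product categorization (hypothetical).
# `call_llm` is a stand-in for whatever completion API the team uses.

import json

TAXONOMY = ["Electronics > Headphones", "Home & Garden > Lighting", "Fashion > Sneakers"]

def build_prompt(title: str, description: str) -> str:
    """Assemble a constrained classification prompt over a fixed candidate set."""
    categories = "\n".join(f"- {c}" for c in TAXONOMY)
    return (
        "Classify the product into exactly one category from the list.\n"
        f"Categories:\n{categories}\n\n"
        f"Title: {title}\nDescription: {description}\n"
        'Answer as JSON: {"category": "...", "confidence": 0.0}'
    )

def call_llm(prompt: str) -> str:
    # Placeholder: swap in a real LLM client here.
    return '{"category": "Electronics > Headphones", "confidence": 0.93}'

def classify(title: str, description: str) -> dict:
    raw = call_llm(build_prompt(title, description))
    result = json.loads(raw)
    # Validate the model's answer against the known taxonomy before trusting it.
    if result.get("category") not in TAXONOMY:
        result = {"category": "UNMAPPED", "confidence": 0.0}
    return result

print(classify("Wireless over-ear headphones", "Bluetooth 5.3, 40h battery"))
```

Constraining the model to a fixed candidate set and validating its answer against the taxonomy keeps misclassifications detectable downstream.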
What You Bring
12+ years of experience in data science or analytics roles, including 5+ years leading teams
Proven track record building data products, knowledge graphs, and scalable data pipelines
Deep understanding of eCommerce search, recommendation systems, and product analytics
Hands-on experience with LLMs, prompt engineering, and RAG techniques (preferred)
Strong communication skills and ability to influence cross-functional stakeholders
Experience evaluating ML models with custom metrics and robust frameworks
Startup mindset—comfortable with ambiguity, bias for action, and fast iteration
Why Join Us
Be at the forefront of AI-powered product discovery in eCommerce
Own high-impact initiatives in a startup-style culture with real autonomy
Work alongside world-class talent across AI, Product, and Engineering
Build solutions that scale—serving millions of users and shaping the future of shopping
Ready to lead the next wave of AI + Data innovation in commerce? Let’s build the future together.
eBay is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, national origin, sex, sexual orientation, gender identity, veteran status, disability, or other legally protected status. If you have a need that requires accommodation, please contact us at . We will make every effort to respond to your request for accommodation as soon as possible. View our to learn more about eBay's commitment to ensuring digital accessibility for people with disabilities.
Data Engineering
Posted today
Job Description
Job Title: Data Engineering Senior Associate – Microsoft Fabric, Azure (Databricks & ADF), PySpark
Experience: 4–10 Years
Location: PAN India
Job Summary:
We are looking for a skilled and experienced Data Engineer with 4-10 years of experience in building scalable data solutions on the Microsoft Azure ecosystem. The ideal candidate must have strong hands-on experience with Microsoft Fabric and Azure Databricks, along with strong PySpark, Python, and SQL expertise. Familiarity with Data Lake and Data Warehouse concepts, and with end-to-end data pipelines, is essential.
Key Responsibilities:
· Requirement gathering and analysis
· Design and implement data pipelines using Microsoft Fabric & Databricks
· Extract, transform, and load (ETL) data from various sources into Azure Data Lake Storage
· Implement data security and governance measures
· Monitor and optimize data pipelines for performance and efficiency
· Troubleshoot and resolve data engineering issues
· Provide optimized solutions for data engineering problems
· Work with a variety of sources such as relational databases, APIs, file systems, real-time streams, CDC, etc.
· Strong knowledge of Databricks and Delta tables (a minimal pipeline sketch follows this list)
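As a point of reference for the responsibilities above, here is a minimal PySpark sketch of the extract-transform-load pattern into Azure Data Lake Storage. It assumes a Databricks or Fabric runtime where Delta Lake is available; the storage paths and columns are invented for illustration.

```python
# Minimal PySpark ETL sketch of the pipeline pattern described above.
# Paths, container names, and columns are illustrative assumptions.

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders-etl").getOrCreate()

# Extract: land raw files from an ADLS path (hypothetical URI).
raw = (spark.read
       .option("header", True)
       .csv("abfss://raw@examplelake.dfs.core.windows.net/orders/"))

# Transform: type casting, basic cleansing, and a derived partition column.
orders = (raw
          .withColumn("order_ts", F.to_timestamp("order_ts"))
          .withColumn("amount", F.col("amount").cast("double"))
          .filter(F.col("amount").isNotNull())
          .withColumn("order_date", F.to_date("order_ts")))

# Load: write a Delta table partitioned for downstream consumption.
(orders.write
 .format("delta")
 .mode("overwrite")
 .partitionBy("order_date")
 .save("abfss://curated@examplelake.dfs.core.windows.net/orders_delta/"))
```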
Required Skills:
· 4–10 years of experience in Data Engineering or related roles.
· Hands-on experience in Microsoft Fabric
· Hands-on experience in Azure Databricks
· Proficiency in PySpark for data processing and scripting.
· Strong command of Python & SQL – writing complex queries, performance tuning, etc.
· Experience working with Azure Data Lake Storage and Data Warehouse concepts (e.g., dimensional modeling, star/snowflake schemas).
· Hands-on experience in performance tuning & optimization on Databricks & MS Fabric.
· Ability to ensure alignment with overall system architecture and data flow.
· Understanding of CI/CD practices in a data engineering context.
· Excellent problem-solving and communication skills.
· Exposure to BI tools like Power BI, Tableau, or Looker.
Good to Have:
· Experience with Azure DevOps.
· Familiarity with data security and compliance in the cloud.
· Experience with databases such as Synapse, SQL DB, Snowflake, etc.
Data Engineering
Posted today
Job Description
Job Title: Middleware Engineer
Position: Data Engineer
Experience: 5-6yrs
Category: IT Infrastructure
Main location: India, Karnataka, Bangalore
Employment Type: Full Time
Qualification: Bachelor's degree in Computer Science or related field or higher.
Roles and Responsibilities
Data Engineer - 5-6 years of experience.
Responsibilities
===
Design, develop, and maintain data architectures, pipelines, and workflows for the collection, processing, storage, and retrieval of large volumes of structured and unstructured data from multiple sources.
Collaborate with cross-functional teams to identify and prioritize data engineering requirements and to develop and deploy data-driven solutions to address business challenges.
Build and maintain scalable data storage and retrieval systems (e.g., data lakes, data warehouses, databases), fault-tolerant, and high-performance data platforms on cloud infrastructure such as AWS, Azure, or Google Cloud Platform.
Develop and maintain ETL workflows, data pipelines, and data transformation processes to prepare data for machine learning and AI applications.
Implement and optimize distributed computing frameworks such as Hadoop, Spark, or Flink to support high-performance and scalable processing of large data sets (a streaming ingestion sketch follows this list).
Build and maintain monitoring, alerting, and logging systems to ensure the availability, reliability, and performance of data pipelines and data platforms.
Collaborate with Data Scientists and Machine Learning Engineers to deploy models on production environments and ensure their scalability, reliability, and accuracy.
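A compact sketch of the Spark-plus-Kafka pattern named in the responsibilities above: Structured Streaming ingestion from a Kafka topic into a checkpointed sink. The broker, topic, schema, and S3 paths are placeholders, and the job assumes the spark-sql-kafka connector is on the classpath.

```python
# Sketch of a Kafka -> Spark Structured Streaming ingestion job.
# Broker address, topic, schema, and sink paths are assumptions.

from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

spark = SparkSession.builder.appName("events-stream").getOrCreate()

schema = StructType([
    StructField("event_id", StringType()),
    StructField("user_id", StringType()),
    StructField("value", DoubleType()),
])

events = (spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker:9092")  # hypothetical broker
          .option("subscribe", "events")
          .load()
          .select(F.from_json(F.col("value").cast("string"), schema).alias("e"))
          .select("e.*"))

# Write to a fault-tolerant sink; the checkpoint enables recovery after failure.
query = (events.writeStream
         .format("parquet")
         .option("path", "s3a://example-lake/events/")           # hypothetical path
         .option("checkpointLocation", "s3a://example-lake/_chk/events/")
         .start())
query.awaitTermination()
```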
Requirements:
===
Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
At least 5-6 years of experience in data engineering, with a strong background in machine learning, cloud computing and big data technologies.
Experience with at least one major cloud platform (AWS, Azure, GCP).
Proficiency in programming languages like Python, Java, and SQL.
Experience with distributed computing technologies such as Hadoop, Spark, and Kafka.
Familiarity with database technologies such as SQL, NoSQL, NewSQL.
Experience with data warehousing and ETL tools such as Redshift, Snowflake, or Airflow.
Strong problem-solving and analytical skills.
Excellent communication and teamwork skills.
Preferred qualification:
===
Experience with DevOps practices and tools such as Docker, Kubernetes, or Ansible, Terraform.
Experience with data visualization tools such as Tableau, Superset, Power BI, or Plotly, D3.js.
Experience with stream processing frameworks such as Kafka, Pulsar or Kinesis.
Experience with data governance, data security, and compliance.
Experience with software engineering best practices and methodologies such as Agile or Scrum.
Must Have Skills
===
Data engineer with expertise in machine learning, cloud computing, and big data technologies
Data engineering experience on multiple clouds, preferably including GCP
Data lakes, data warehouses, databases
ETL workflows, data pipelines, data platforms
Hadoop, Spark, or Flink
Hadoop, Spark, and Kafka
SQL, NoSQL, NewSQL
Redshift, Snowflake, or Airflow
Data Engineering
Posted 7 days ago
Job Description
Responsibilities:
- Work with stakeholders to understand the data requirements to design, develop, and maintain complex ETL processes.
- Create the data integration and data diagram documentation.
- Lead the data validation, UAT and regression test for new data asset creation.
- Create and maintain data models, including schema design and optimization.
- Create and manage data pipelines that automate the flow of data, ensuring data quality and consistency.
Qualifications and Skills:
- Strong knowledge of Python and PySpark
- Ability to write PySpark scripts for developing data workflows
- Strong knowledge of SQL, Hadoop, Hive, Azure, Databricks, and Greenplum
- Ability to write SQL to query metadata and tables from different data management systems such as Oracle, Hive, Databricks, and Greenplum
- Familiarity with big data technologies like Hadoop, Spark, and distributed computing frameworks
- Ability to use Hue, run Hive SQL queries, and schedule Apache Oozie jobs to automate data workflows (a brief PySpark-on-Hive sketch follows this list)
- Good working experience communicating with stakeholders and collaborating effectively with the business team on data testing
- Strong problem-solving and troubleshooting skills
- Ability to establish comprehensive data quality test cases and procedures, and to implement automated data validation processes
- Degree in Data Science, Statistics, Computer Science, or other related fields, or an equivalent combination of education and experience
- 3-7 years of experience as a Data Engineer
- Proficiency in programming languages commonly used in data engineering, such as Python, PySpark, and SQL
- Experience with the Azure cloud computing platform, such as developing ETL processes using Azure Data Factory and big data processing and analytics with Azure Databricks
- Strong communication, problem-solving, and analytical skills, with the ability to manage time and multi-task with attention to detail and accuracy
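A brief, hypothetical PySpark-on-Hive sketch, referenced in the qualifications above: query a Hive-managed table through Spark SQL, then run a trivial partition-level data-quality check. The database and table names are invented.

```python
# Hedged sketch: query a Hive table with Spark SQL and validate partitions.
# sales_db.daily_orders is a hypothetical table name.

from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("hive-metadata-check")
         .enableHiveSupport()   # lets spark.sql() see the Hive metastore
         .getOrCreate())

# Pull row counts per partition from the (hypothetical) Hive table.
counts = spark.sql("""
    SELECT load_date, COUNT(*) AS row_cnt
    FROM sales_db.daily_orders
    GROUP BY load_date
    ORDER BY load_date DESC
""")
counts.show(10)

# Trivial validation: flag empty partitions as a data-quality failure.
empty = counts.filter("row_cnt = 0").count()
assert empty == 0, f"{empty} empty partitions found in sales_db.daily_orders"
```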
Data Engineering Consultant
Posted 1 day ago
Job Description
**Primary Responsibilities:**
+ Provide database administration for mission-critical custom and packaged software applications
+ Design storage strategies around backup and recovery for complex database environments, physical structures, and specialized database applications - Enterprise Rapid Recovery
+ Uphold enterprise policy guidelines and recommend new/improved guidelines
+ Partner with project teams and interact with customers to find solutions for projects and operational issues for existing and proposed databases
+ Act as business liaison serving as primary point of contact between application business segments and database administrators
+ Demonstrate the knowledge and ability to perform in all of the basic database management skills of database administration, Web connectivity, physical structure, overall architecture, and database analysis
+ Provide standardization and consistency across environments
+ Ensure a stable and secure database environment
+ Apply database management consulting skills and gather user requirements
+ Implement and monitor database functionality to ensure stable environments
+ Identify and initiate resolutions to user problems and concerns associated with database server equipment (hardware and software)
+ Comply with the terms and conditions of the employment contract, company policies and procedures, and any and all directives (such as, but not limited to, transfer and/or re-assignment to different work locations, change in teams and/or work shifts, policies in regards to flexibility of work benefits and/or work environment, alternative work arrangements, and other decisions that may arise due to the changing business environment). The Company may adopt, vary or rescind these policies and directives in its absolute discretion and without any limitation (implied or otherwise) on its ability to do so
**Required Qualifications:**
+ Bachelor's degree in Computer Science or related field or equivalent work experience
+ 7+ years of PostgreSQL DBA experience
+ 5+ years of MySQL database experience
+ 4+ years of experience in Azure Cloud
+ Hands-on experience administering Percona XtraDB Cluster
+ Hands-on experience with MySQL and PostgreSQL replication tools
+ Experience configuring and setting up REPMGR or similar technology for high-availability solutions
+ Solid knowledge of MySQL and PostgreSQL architecture and administration on both on-prem and cloud platforms
+ Good understanding of PostgreSQL database architecture and repmgr cluster architecture in Azure cloud
+ Expertise in MySQL and PostgreSQL database refresh and restore activities
+ Expertise in Azure cloud architecture
+ Proven ability to install, monitor, and maintain PostgreSQL & MySQL software, and to implement monitoring and alerting
+ Proven ability to provide system and SQL performance tuning and assist in business process integration with various data sources
+ Proven ability to fulfill user requests ranging from access control, backup, restore, and refresh-to-non-production to performance tuning
+ Proven ability to provide high availability and disaster recovery solutions
**Preferred Qualifications:**
+ Experience in setting up PostgreSQL clusters and handling switchover/failover activities; knowledge of repmgr (a replication-monitoring sketch follows this list)
+ Work experience in backup and recovery processes and database refresh between different environments; knowledge of Percona XtraBackup
+ Knowledge of Oracle Database administration
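As a hedged illustration of the replication-monitoring duties above, here is a small psycopg2 sketch that reads `pg_stat_replication` on a PostgreSQL 10+ primary. The DSN and the alert threshold are placeholders, not values from this posting.

```python
# Illustrative monitoring sketch: check standby replication lag on a primary.
# The connection string below is a hypothetical placeholder.

import psycopg2

DSN = "host=primary.example dbname=postgres user=monitor password=secret"  # hypothetical

def replication_lag_bytes():
    """Return (client_addr, state, lag in bytes) for every attached standby."""
    with psycopg2.connect(DSN) as conn, conn.cursor() as cur:
        cur.execute("""
            SELECT client_addr, state,
                   pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS lag_bytes
            FROM pg_stat_replication
        """)
        return cur.fetchall()

for addr, state, lag in replication_lag_bytes():
    print(f"standby={addr} state={state} lag_bytes={lag}")
    if lag is not None and lag > 64 * 1024 * 1024:   # assumed alert threshold: 64 MiB
        print("  WARNING: standby lag above threshold")
```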
_At UnitedHealth Group, our mission is to help people live healthier lives and make the health system work better for everyone. We believe everyone-of every race, gender, sexuality, age, location and income-deserves the opportunity to live their healthiest life. Today, however, there are still far too many barriers to good health which are disproportionately experienced by people of color, historically marginalized groups and those with lower incomes. We are committed to mitigating our impact on the environment and enabling and delivering equitable care that addresses health disparities and improves health outcomes - an enterprise priority reflected in our mission._
Technologist, Data Engineering
Posted 1 day ago
Job Description
We are seeking a highly experienced and visionary **Senior Data Scientist** at the **Technologist level** to lead strategic AI/ML and GenAI initiatives. This role demands deep technical expertise, leadership in complex projects, and a passion for innovation in data science and advanced analytics.
**Key Responsibilities**
+ Lead the **end-to-end data science lifecycle** : problem definition, data acquisition, model development, deployment, and monitoring.
+ Architect and implement **scalable AI/ML solutions** using modern frameworks, cloud platforms, and MLOps best practices.
+ Drive **GenAI initiatives** including fine-tuning, prompt engineering, and integration into enterprise applications.
+ Provide **strategic direction and thought leadership** on advanced analytics adoption across the business.
+ Mentor, coach, and upskill a team of data scientists and engineers; foster a culture of **innovation and collaboration** .
+ Partner with cross-functional teams (engineering, product, factory operations, IT) to translate business needs into data-driven solutions.
+ Ensure model **robustness, fairness, interpretability** , and compliance with ethical AI standards.
+ Design and oversee **experimentation frameworks** (A/B testing, causal inference, statistical modeling) for data-driven decision making.
+ Stay ahead of **emerging trends** in AI, ML, and big data technologies; evaluate their potential for business impact.
+ Present insights, models, and strategies to **senior leadership and non-technical stakeholders** in clear, actionable terms.
**Qualifications**
+ MS/ME/MTech/PhD in Data Science, Statistics, Computer Science, or related fields.
+ ~15 years of experience in data science, AI/ML, or advanced analytics, including leadership in complex projects.
+ Proven expertise in:
+ Machine Learning, Deep Learning, and Statistical Modeling
+ Optimization techniques for solving complex, high-dimensional problems.
+ GenAI applications including architectures like RAG, fine-tuning, and LLMOps (a toy retrieval sketch follows this list).
+ Synthetic data generation and handling highly imbalanced and high-volume datasets.
+ Experience with **anomaly detection, pretrained transformers** , and **custom embedding models** .
+ Strong proficiency in **Python and SQL** for data wrangling, analysis, and modeling.
+ Hands-on experience with **TensorFlow, PyTorch, Pyspark** , and related AI/ML frameworks.
+ Deep understanding of **Big Data platforms** (e.g., Spark, Hadoop, distributed databases, cloud data warehouses).
+ Experience in **MLOps** : model deployment, monitoring, versioning, and lifecycle management.
+ Strong knowledge of **data architecture, pipelines** , and feature engineering at scale.
+ Familiarity with **data visualization tools** : Tableau, Power BI, Matplotlib, Plotly.
+ Excellent communication and stakeholder management skills, with the ability to influence at senior levels.
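To make the RAG qualification above concrete, here is a toy retrieval step: rank documents by cosine similarity of embeddings. The deterministic fake `embed` function exists only so the example runs end to end; a real system would use a trained embedding model and a vector store.

```python
# Toy retrieval step of a RAG pipeline: rank documents by embedding similarity.
# The embedding function is a deterministic stand-in, not a real model.

import hashlib
import numpy as np

def embed(text: str, dim: int = 64) -> np.ndarray:
    """Deterministic fake embedding so the example is reproducible."""
    seed = int(hashlib.md5(text.encode()).hexdigest()[:8], 16)
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(dim)
    return v / np.linalg.norm(v)

DOCS = [
    "Spark handles distributed data processing.",
    "RAG augments an LLM prompt with retrieved context.",
    "Power BI is used for dashboarding.",
]
doc_vecs = np.stack([embed(d) for d in DOCS])

def retrieve(query: str, k: int = 2) -> list:
    scores = doc_vecs @ embed(query)            # cosine similarity (unit vectors)
    top = np.argsort(scores)[::-1][:k]
    return [DOCS[i] for i in top]

# The retrieved passages would then be concatenated into the LLM prompt.
print(retrieve("How does retrieval-augmented generation work?"))
```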
**Additional Information**
All your information will be kept confidential according to EEO guidelines.
Director, Data Engineering
Posted 10 days ago
Job Description
**Job Purpose and Impact**
The Director, Data Engineering job leads a data engineering team responsible for the execution of the tactical and strategic plans related to design, development and maintenance of robust data pipelines and solutions. This job provides guidance to the team that ensures the efficient processing and availability of data for analysis and reporting.
**Key Accountabilities**
+ Establishes and maintains robust data systems that support large and complex data products, ensuring reliability and accessibility for partners.
+ Leads the development of technical products and solutions using big data and cloud based technologies, ensuring they are designed and built to be scalable, sustainable and robust.
+ Oversees and guides the design and development of data pipelines that facilitate the movement of data from various sources to internal databases.
+ Handles the construction and optimization of data infrastructure, resolving appropriate data formats to ensure data readiness for analysis.
+ Examines and settles on appropriate data formats to optimize data usability and accessibility across the organization.
+ Liaises with partners to understand data needs and ensure alignment with organizational objectives.
+ Champions development standards and brings forward prototypes to test new data framework concepts and architecture patterns supporting efficient data processing and analysis and promoting standard methodologies in data management.
+ Leads the creation and maintenance of automated reporting systems that provide timely insights and facilitate data driven decision making.
+ Oversees data modeling to ensure the preparation of data in databases for use in various analytics tools and to configure and develop data pipelines to move and improve data assets.
+ Manages team members to achieve the organization's goals, by ensuring productivity, communicating performance expectations, creating goal alignment, giving and seeking feedback, providing coaching, measuring progress and holding people accountable, supporting employee development, recognizing achievement and lessons learned, and developing enabling conditions for talent to thrive in an inclusive team culture.
**Qualifications**
Minimum requirement of 6 years of relevant work experience. Typically reflects 10 years or more of relevant experience.
Preferred Work Experience
+ Prior experience as a data/ software engineer performing data modeling and data pipeline engineering leveraging advanced cloud technologies and diverse coding languages
+ Leading geographically distributed engineering teams across a large global organization
+ Developing and managing strategic partnerships across both digital and business facing stakeholders
+ Track record of leading architecture strategies and execution across a diverse digital and data technology landscape
+ Experience developing and leading transformation strategies spanning people, process, and technology
+ Thorough understanding of industry trends and best practices related to data engineering of robust, performant, and cost effective solutions
+ Proven record of helping drive the adoption of new technologies and methods within the functional data and analytics team, and of serving as a role model and mentor for data engineers.
Data Engineering Consultant

Posted 13 days ago
Job Description
**Primary Responsibilities:**
+ Support the full data engineering lifecycle including research, proof of concepts, design, development, testing, deployment, and maintenance of data management solutions
+ Utilize knowledge of various data management technologies to drive data engineering projects
+ Lead data acquisition efforts to gather data from various structured or semi-structured source systems of record to hydrate client data warehouse and power analytics across numerous health care domains
+ Leverage a combination of ETL/ELT methodologies to pull complex relational and dimensional data to support loading data marts and reporting aggregates
+ Eliminate unwarranted complexity and unneeded interdependence
+ Detect data quality issues, identify root causes, implement fixes, and manage data audits to mitigate data challenges (see the audit sketch after this list)
+ Implement, modify, and maintain data integration efforts that improve data efficiency, reliability, and value
+ Leverage and facilitate the evolution of best practices for data acquisition, transformation, storage, and aggregation that solve current challenges and reduce the risk of future challenges
+ Effectively create data transformations that address business requirements and other constraints
+ Partner with the broader analytics organization to make recommendations for changes to data systems and the architecture of data platforms
+ Support the implementation of a modern data framework that facilitates business intelligence reporting and advanced analytics
+ Prepare high level design documents and detailed technical design documents with best practices to enable efficient data ingestion, transformation and data movement
+ Leverage DevOps tools to enable code versioning and code deployment
+ Leverage data pipeline monitoring tools to detect data integrity issues before they result into user visible outages or data quality issues
+ Leverage processes and diagnostics tools to troubleshoot, maintain and optimize solutions and respond to customer and production issues
+ Continuously support technical debt reduction, process transformation, and overall optimization
+ Leverage and contribute to the evolution of standards for high quality documentation of data definitions, transformations, and processes to ensure data transparency, governance, and security
+ Ensure that all solutions meet the business needs and requirements for security, scalability, and reliability
+ Comply with the terms and conditions of the employment contract, company policies and procedures, and any and all directives (such as, but not limited to, transfer and/or re-assignment to different work locations, change in teams and/or work shifts, policies in regards to flexibility of work benefits and/or work environment, alternative work arrangements, and other decisions that may arise due to the changing business environment). The Company may adopt, vary or rescind these policies and directives in its absolute discretion and without any limitation (implied or otherwise) on its ability to do so
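A sketch of the kind of automated data-quality audit described in the responsibilities above, using PySpark to compute row counts, duplicate keys, and null rates. The storage path, table, and key columns are illustrative assumptions.

```python
# Sketch of an automated data-quality audit: row-count, null-rate, and
# duplicate checks on a loaded DataFrame. Names and paths are illustrative.

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("dq-audit").getOrCreate()

claims = spark.read.parquet("abfss://curated@examplelake.dfs.core.windows.net/claims/")

def audit(df, key_cols, required_cols):
    """Return a dict of data-quality findings for downstream alerting."""
    total = df.count()
    dupes = total - df.dropDuplicates(key_cols).count()
    null_rates = {
        c: df.filter(F.col(c).isNull()).count() / max(total, 1)
        for c in required_cols
    }
    return {"rows": total, "duplicate_keys": dupes, "null_rates": null_rates}

findings = audit(claims, key_cols=["claim_id"], required_cols=["member_id", "claim_id"])
print(findings)
# A real pipeline would publish these metrics to monitoring before promoting data.
```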
**Required Qualifications:**
+ Graduate degree or equivalent experience
+ Bachelor's Degree (preferably in information technology, engineering, math, computer science, analytics, or another related field)
+ 5+ years of combined experience in data engineering, ingestion, normalization, transformation, aggregation, structuring, and storage
+ 5+ years of combined experience working with industry standard relational, dimensional or non-relational data storage systems
+ 5+ years of experience in designing ETL/ELT solutions using tools like Informatica, DataStage, SSIS, PL/SQL, T-SQL, etc.
+ 5+ years of experience in managing data assets using SQL, Python, Scala, or other similar querying/coding languages
+ 3+ years of experience working with healthcare data or data to support healthcare organizations
+ 3+ years of experience in Microsoft Azure Cloud, Azure Data Factory, Databricks, Spark, Scala/Python, ADO
**Preferred Qualifications:**
+ Certification in Azure Cloud preferably DP-203
+ Experience in Machine Learning Pipelines and AI in healthcare
+ Experience in Data Visualization and BI Tools (Power BI, Tableau) for healthcare reporting
+ Exposure in containerization and orchestration (Docker, Kubernetes, Airflow)
+ Familiarity with Data Tokenization, anonymization and synthetic data generation for research
_At UnitedHealth Group, our mission is to help people live healthier lives and make the health system work better for everyone. We believe everyone-of every race, gender, sexuality, age, location and income-deserves the opportunity to live their healthiest life. Today, however, there are still far too many barriers to good health which are disproportionately experienced by people of color, historically marginalized groups and those with lower incomes. We are committed to mitigating our impact on the environment and enabling and delivering equitable care that addresses health disparities and improves health outcomes - an enterprise priority reflected in our mission._
Data Engineering Manager
Posted 1 day ago
Job Description
iMerit is a leading AI data solutions company that transforms unstructured data into structured intelligence for advanced machine learning and analytics. Our customers span autonomous mobility, medical AI, agriculture, and more; we deliver high-quality data services that power next-generation AI systems.
About the Role
We are looking for a seasoned Engineering Lead to architect, scale, and continuously evolve our analytics and observability platform—a system deeply integrated with annotation tools and ML pipelines. This platform powers real-time visibility, operational insights, and automation across large-scale data operations.
In this role, you will not only lead and mentor a team but also set the technical vision for high-throughput streaming systems and modern data lake/warehouse architectures. You will bring proven expertise in high-velocity, high-volume data engineering, driving innovation in how we process, curate, and surface data to support mission-critical AI workflows.
Key Responsibilities
- Lead & Inspire: Build and mentor a high-performing data engineering team, fostering innovation, accountability, and technical excellence
- Architect at Scale: Design and implement high-volume batch and real-time data pipelines across structured and unstructured sources
- Build and maintain real-time data lakes with streaming ingestion, ensuring data quality, lineage, and availability.
- Curate, transform, and optimize datasets into high-performance data warehouses (e.g., Redshift, Snowflake) for downstream analytics
- Deep Streaming Expertise: Drive adoption and optimization of Kafka for messaging, event streaming, and system integration, ensuring high throughput and low latency
- Advanced Processing: Leverage PySpark for distributed data processing and complex transformations, delivering scalable ETL/ELT pipelines
- Orchestration & Automation: Utilize AWS Glue and related cloud services to orchestrate data workflows, automate schema management, and scale pipelines seamlessly (a Glue job skeleton follows this list)
- Continuous Improvement: Oversee platform upgrades, schema evolution, and performance tuning, ensuring the platform meets growing data and user demands
- Observability & Insights: Implement metrics, dashboards, and alerting for key KPIs (annotation throughput, quality, latency), ensuring operational excellence
- Cross-Functional Collaboration: Work closely with product, platform, and customer teams to define event models, data contracts, and integration strategies
- Innovation and R&D: Research emerging technologies in data streaming, lakehouse architectures, and observability, bringing forward new approaches and prototypes
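As a reference for the AWS Glue orchestration responsibility above, here is a skeleton of a Glue PySpark job: read from the Glue Data Catalog, deduplicate, and write curated output. The database, table, and bucket names are placeholders; the surrounding boilerplate follows the shape of Glue's standard generated scripts.

```python
# Skeleton of an AWS Glue PySpark job. Catalog database/table and the S3
# bucket below are hypothetical placeholders.

import sys
from awsglue.utils import getResolvedOptions
from awsglue.context import GlueContext
from awsglue.job import Job
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
sc = SparkContext()
glueContext = GlueContext(sc)
spark = glueContext.spark_session
job = Job(glueContext)
job.init(args["JOB_NAME"], args)

# Read from the Glue Data Catalog (hypothetical database/table).
events = glueContext.create_dynamic_frame.from_catalog(
    database="analytics_raw", table_name="annotation_events"
)

# Convert to a Spark DataFrame for transformations, then write curated parquet.
df = events.toDF().dropDuplicates(["event_id"])
df.write.mode("overwrite").parquet("s3://example-curated/annotation_events/")

job.commit()
```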
Minimum Qualifications
- 10+ years of experience in data engineering or backend engineering, with at least 2–3 years in a leadership or team-lead role
- Proven track record in building and operating data pipelines at scale—including both batch ETL/ELT and real-time streaming
- Expert-level experience with Kafka for high-throughput data ingestion, streaming transformations, and integrations
- Strong hands-on experience with PySpark for distributed data processing and advanced transformations
- In-depth knowledge of AWS Glue (or similar) for orchestrating workflows, managing metadata, and automating ETL pipelines
- Demonstrated success in upgrading and maintaining real-time data lakes, curating and transforming datasets into performant data warehouses
- Familiarity with lakehouse and warehouse patterns (e.g., Delta Lake, Redshift, Snowflake) and schema versioning
- Experience with cloud-native data services (S3, Kinesis, Lambda, RDS) and infrastructure-as-code deployments
Preferred Qualifications
- Experience with Databricks and Snowflake solutions, including developing on lakehouse architectures and optimizing warehouse performance
- Exposure to annotation platforms, ML workflows, or model validation pipelines
- Experience with observability tools (Prometheus, Grafana, OpenTelemetry)
- Knowledge of data governance, RBAC, and compliance in large-scale analytics environments
- Comfort working in Agile, distributed teams with Git, JIRA, and Slack.
Why Join Us?
At iMerit, you will lead a team at the cutting edge of AI data infrastructure—building and evolving platforms that are explainable, auditable, and scalable. You will play a key role in upgrading and maintaining our streaming data lake and transforming it into analytics-ready warehouses, directly shaping how AI systems are built and trusted at scale.
Architect - Data Engineering
Posted today
Job Description
Education Qualification: Engineer - B.E / B.Tech / MCA
Skills:
Primary -> Technology | Data Analytics Activities | Data Integration | 4 - Advanced
Primary -> Technology | Data Analytics Activities | Data Mining | 4 - Advanced
Secondary -> Technology | Big Data Tools / Systems | Streams | 4 - Advanced
Secondary -> Technology | Data Analytics Activities | Data Processing | 4 - Advanced
Tertiary -> Functional | Pre Sales Support Activities | Responding to RFPs | 4 - Advanced
Tertiary -> Technology | Data Analytics Activities | Data Analysis | 4 - Advanced
Tertiary -> Technology | BI, DWH, ETL Roles | DWH Architect | 4 - Advanced
Certification: Technology | IT Certifications | Microsoft Certification | Perform Data Engineering on Microsoft HD Insight
Details:
The Professional will be responsible for architecting methods to improve data reliability and quality, and for combining raw information from different sources into consistent, machine-readable formats. They will also develop and test architectures that enable data extraction and transformation for predictive or prescriptive modeling.
- Analyze and organize raw data
- Build data systems and pipelines
- Evaluate business needs and objectives
- Interpret trends and patterns
- Conduct complex data analysis and report on results
- Prepare data for prescriptive and predictive modeling
- Build algorithms and prototypes
- Combine raw information from different sources (see the sketch after this list)
- Explore ways to enhance data quality and reliability
- Identify opportunities for data acquisition
- Develop analytical tools and programs
- Collaborate with data scientists and architects on several projects
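A small pandas sketch of "combining raw information from different sources" into one consistent, machine-readable format, as the responsibilities above describe; the two source frames and the column mappings are invented for illustration.

```python
# Sketch: normalize two hypothetical sources with mismatched schemas into one
# canonical, machine-readable table. All data below is invented.

import pandas as pd

# Two hypothetical sources with different column conventions.
crm = pd.DataFrame({"CustID": [1, 2], "FullName": ["Ada", "Linus"]})
web = pd.DataFrame({"customer_id": [2, 3], "name": ["Linus T.", "Grace"]})

# Normalize both to one canonical schema before merging.
crm_norm = crm.rename(columns={"CustID": "customer_id", "FullName": "name"})
combined = (pd.concat([crm_norm, web], ignore_index=True)
              .drop_duplicates(subset="customer_id", keep="first")
              .sort_values("customer_id"))

print(combined.to_string(index=False))
```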
Location: India - Bengaluru
Created On: 11-Aug-2025