21 Data Engineering jobs in Indore
Data Engineering Role
Posted 4 days ago
Job Description
Minimum Requirements:
- At least 3 years of professional experience in Data Engineering
- Demonstrated end-to-end ownership of ETL pipelines
- Deep, hands-on experience with AWS services: EC2, Athena, Lambda, and Step Functions (non-negotiable)
- Strong proficiency in MySQL (non-negotiable)
- Working knowledge of Docker: setup, deployment, and troubleshooting
Highly Preferred Skills:
- Experience with orchestration tools such as Airflow or similar
- Hands-on with PySpark
- Familiarity with the Python data ecosystem: SQLAlchemy, DuckDB, PyArrow, Pandas, NumPy
- Exposure to DLT (Data Load Tool)
Ideal Candidate Profile:
The role demands a builder’s mindset over a maintainer’s. Independent contributors with clear, efficient communication thrive here. Those who excel tend to embrace fast-paced startup environments, take true ownership, and are motivated by impact—not just lines of code. Candidates are expected to include the phrase Red Panda in their application to confirm they’ve read this section in full.
Key Responsibilities:
- Architect, build, and optimize scalable data pipelines and workflows
- Manage AWS resources end-to-end: from configuration to optimization and debugging (a minimal Lambda/Athena sketch follows this list)
- Work closely with product and engineering to enable high-velocity business impact
- Automate and scale data processes—manual workflows are not part of the culture
- Build foundational data systems that drive critical business decisions
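As a hedged illustration of the AWS stack named above (Lambda, Athena, Step Functions), here is a minimal boto3 Lambda handler that starts an Athena query; the database name, output bucket, and query are placeholder assumptions, not part of the posting:

```python
# Minimal AWS Lambda handler sketch -- database, bucket, and query are placeholders.
import boto3

athena = boto3.client("athena")

def handler(event, context):
    """Start an Athena query; a Step Functions state could poll for completion."""
    response = athena.start_query_execution(
        QueryString="SELECT order_date, COUNT(*) FROM orders GROUP BY order_date",
        QueryExecutionContext={"Database": "analytics"},
        ResultConfiguration={"OutputLocation": "s3://my-query-results/"},
    )
    # Return the execution id so a downstream step can track the query.
    return {"query_execution_id": response["QueryExecutionId"]}
```

A Step Functions state machine could invoke this handler and poll Athena with the returned execution id before moving to the next pipeline stage.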
Compensation range: ₹8.4–12 LPA (fixed base), excluding equity, performance bonus, and revenue share components.
Data Engineering (Azure Databricks)
Posted 12 days ago
Job Description
The Data Engineer (DE) Consultant is responsible for designing, developing, and maintaining data assets and data-related products by liaising with multiple stakeholders.
Responsibilities:
- Work with stakeholders to understand data requirements, and design, develop, and maintain complex ETL processes.
- Create data integration and data diagram documentation.
- Lead data validation, UAT, and regression testing for new data asset creation.
- Create and maintain data models, including schema design and optimization.
- Create and manage data pipelines that automate the flow of data, ensuring data quality and consistency.
Qualifications and Skills:
- Strong knowledge of Python and PySpark, with the ability to write PySpark scripts for developing data workflows (a minimal sketch follows this list).
- Strong knowledge of SQL, Hadoop, Hive, Azure, Databricks, and Greenplum, with the ability to write SQL to query metadata and tables from different data management systems such as Oracle, Hive, Databricks, and Greenplum.
- Familiarity with big data technologies like Hadoop, Spark, and distributed computing frameworks.
- Ability to use Hue to run Hive SQL queries and to schedule Apache Oozie jobs that automate data workflows.
- Good working experience communicating with stakeholders and collaborating effectively with the business team on data testing.
- Strong problem-solving and troubleshooting skills.
- Ability to establish comprehensive data quality test cases and procedures, and to implement automated data validation processes.
- Degree in Data Science, Statistics, Computer Science, or another related field, or an equivalent combination of education and experience.
- 4–7 years of experience as a Data Engineer.
- Proficiency in programming languages commonly used in data engineering, such as Python, PySpark, and SQL.
- Experience with the Azure cloud computing platform, such as developing ETL processes using Azure Data Factory and performing big data processing and analytics with Azure Databricks.
- Strong communication, problem-solving, and analytical skills, with the ability to manage time and multitask with attention to detail and accuracy.
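To make the PySpark expectation above concrete, here is a minimal sketch of a scripted data workflow; the table and column names are hypothetical, and it assumes a Spark environment with Hive support (as on Databricks):

```python
# Minimal PySpark workflow sketch -- table/column names are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("daily_orders_rollup")
    .enableHiveSupport()          # lets spark.sql() reach Hive-managed tables
    .getOrCreate()
)

# Query an upstream table (Hive and Databricks both accept Spark SQL).
orders = spark.sql("SELECT order_id, customer_id, amount, order_date FROM sales.orders")

# A simple aggregation step of the kind a scheduled workflow might run daily.
daily = (
    orders.groupBy("order_date")
    .agg(F.countDistinct("customer_id").alias("customers"),
         F.sum("amount").alias("revenue"))
)

# Persist the result for downstream consumers.
daily.write.mode("overwrite").saveAsTable("sales.daily_orders_rollup")
```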
Big data Lead
Posted 2 days ago
Job Description
About Impetus
Impetus Technologies is a digital engineering company focused on delivering expert services and products to help enterprises achieve their transformation goals. We solve the analytics, AI, and cloud puzzle, enabling businesses to drive unmatched innovation and growth.
Founded in 1991, we are cloud and data engineering leaders providing solutions to Fortune 100 enterprises. Headquartered in Los Gatos, California, we have development centers in Noida, Indore, Gurugram, Bengaluru, Pune, and Hyderabad, with over 3,000 global team members. We also have offices in Canada and Australia and collaborate with a number of established companies, including American Express, Bank of America, Capital One, Toyota, United Airlines, and Verizon.
Roles & Responsibilities
In-depth understanding of Big Data concepts and distributed systems
Proficiency in Python and writing code in PySpark
Good understanding of SQL concepts and proficiency in writing complex SQL queries
Thorough understanding of optimization and performance improvement for Spark-based workloads (a brief tuning sketch follows this list)
Experience working on at least one cloud platform is a plus
Conceptual knowledge of, and aptitude for, Data Structures & Algorithms
In-depth knowledge of various data processing patterns (batch/NRT/RT processing) and the ability to provide design and architecture for typical business problems
Knowledge and experience with NoSQL databases (Cassandra/HBase/MongoDB/CouchDB/Neo4j) and SQL databases (MySQL/Oracle)
Kafka, Redis, and distributed message queues, along with distributed caching
Proficient understanding of build tools (Maven/Gradle) and code versioning tools (Git), with continuous integration
Strong technical development experience writing performant code using best coding practices
Good communication skills and aptitude
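As a brief, hedged sketch of the Spark workload tuning mentioned above, the snippet below shows two routine optimizations; the table names are hypothetical:

```python
# Sketch of two common Spark optimizations -- table names are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql.functions import broadcast

spark = SparkSession.builder.appName("tuning_sketch").getOrCreate()

facts = spark.table("events")          # large fact table
dims = spark.table("event_types")      # small lookup table

# Broadcast the small side so the join happens map-side, avoiding a full shuffle.
joined = facts.join(broadcast(dims), on="event_type_id")

# Cache only because the result is reused by more than one action below.
joined.cache()
print(joined.count())                                  # first action materializes the cache
joined.groupBy("event_type_id").count().show()         # second action reuses it
```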
Big Data Developer
Posted 1 day ago
Job Description
We are seeking a highly skilled Full Stack Big Data Engineer to join our team. The ideal candidate will have strong expertise in big data technologies, cloud platforms, microservices, and system design, with the ability to build scalable and efficient data-driven applications. This role requires hands-on experience across data engineering, backend development, and cloud deployment, along with a strong foundation in modern DevOps and monitoring practices.
Key Responsibilities:
- Design, build, and optimize big data pipelines using Scala, PySpark, Spark SQL, Spark Streaming, and Databricks.
- Develop and maintain real-time data processing solutions using Kafka Streams or similar event-driven platforms (a minimal streaming sketch follows this list).
- Implement cloud-based solutions on Azure, leveraging services such as Azure Data Factory (ADF) and Azure Functions.
- Build scalable microservices with Core Java (8+) and Spring Boot.
- Collaborate on system design, including API development and event-driven architecture.
- Contribute to front-end development (JavaScript, React) as needed.
- Ensure application reliability through monitoring tools such as Grafana, New Relic, or similar.
- Utilize modern CI/CD tools (Git, Jenkins, Kubernetes, ArgoCD, etc.) for deployment and version control.
- Work cross-functionally with data engineers, software developers, and architects to deliver high-quality solutions.
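As a loose illustration of the real-time processing responsibility above (sketched here with Spark Structured Streaming rather than Kafka Streams, since both consume Kafka topics), here is a minimal job; the broker address and topic name are placeholders, and it assumes the spark-sql-kafka connector package is available:

```python
# Minimal Spark Structured Streaming sketch -- broker/topic names are placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("kafka_stream_sketch").getOrCreate()

# Read events from Kafka as they arrive.
raw = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "clickstream")
    .load()
)

# Kafka values arrive as bytes; decode before processing.
events = raw.select(F.col("value").cast("string").alias("payload"))

# Write the running stream out (the console sink keeps the sketch self-contained).
query = (
    events.writeStream.outputMode("append")
    .format("console")
    .start()
)
query.awaitTermination()  # runs until interrupted, as streaming jobs do
```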
Qualifications:
- 5+ years of professional experience as a Software/Data Engineer or Full Stack Engineer.
- Strong programming skills in Scala, Python, and Java.
- Experience with Databricks, Spark SQL, Spark Streaming, and PySpark.
- Hands-on experience with Azure cloud services and data engineering tools.
- Solid knowledge of microservices development with Spring Boot.
- Familiarity with event-driven platforms such as Kafka.
- Experience with CI/CD pipelines and containerization/orchestration tools.
- Strong problem-solving and communication skills.
- Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field (preferred).
Nice to Have:
- Experience with API design and event-driven architecture.
- Frontend development experience with React and JavaScript.
Big Data Engineer
Posted 2 days ago
Job Description
Who we are
Alef Education began with a bold idea: that every learner deserves a personalised and meaningful education experience. What started in 2016 as a small pilot programme in Abu Dhabi has evolved into one of the world’s most dynamic EdTech companies—reshaping how millions of students engage with learning across the globe.
Today, Alef is proudly headquartered in the UAE, working hand-in-hand with ministries of education, schools, and teachers to bring smart, data-powered platforms into classrooms in over 14,000 schools.
Supporting over 1.1 million students and 50,000 teachers across the UAE, Indonesia, and Morocco, our AI-driven platforms generate 16+ million data points every day, helping drive smarter learning decisions. Whether it’s improving national exam results, boosting classroom engagement, or supporting educators with world-class tools, Alef is committed to impact at scale.
In 2024, Alef made history as the first EdTech company to list on the Abu Dhabi Securities Exchange (ADX), cementing our role as a regional innovator with global reach.
About The Role
As an Alef Big Data Engineer, you will have a strong understanding of big data technologies and an exceptional ability to code. You will provide technical leadership, working closely with the wider team to ensure high-quality code is delivered in line with project goals and delivery cycles. You will work closely with other teams to deliver rapid prototypes as well as production code, for which you will ensure high accessibility standards are upheld. We expect familiarity with modern frameworks and languages, as well as working practices such as Clean Code, TDD, BDD, continuous integration, continuous delivery, and DevOps.
Key Responsibilities
Defining and developing services and solutions
- Define, design, and develop services and solutions around large-scale data ingestion, storage, and management, such as with RDBMS, NoSQL DBs, log files, and events.
- Define, design, and run robust data pipelines/batch jobs in a production environment.
- Architect highly scalable, highly concurrent, and low-latency systems.
Maintain, support, and enhance current systems.
- Contribute to paying down technical debt and use development approaches that minimize the growth of new technical debt.
- Contribute feedback to improve the quality, readability, and testability of the code base within your team.
- Mentor and train other developers in a non-line management capacity.
- Use build tools (one of SBT, Gradle, Maven).
- Ensure all software built is robust and scalable.
Collaborating with Internal and external stakeholders
- Participating in sprint planning to work with developers and project teams to ensure projects are deployable and monitorable from the outside.
- Work with third-party and other internal providers to support a variety of integrations.
- As part of the team, you may be expected to participate in some of the 2nd line in-house support and Out-of-Hours support rotas.
- Proactively advise on best practices.
To Be The Right Fit, You'll Need
- Degree in Computer Science, Software Engineering or related preferred
- Minimum of 5 years' experience in a Big Data role
- Follow Clean Code/SOLID principles
- Adhere to and use TDD/BDD (a small test-first sketch follows this list).
- Outstanding ability to develop efficient, readable, highly optimized/maintainable and clear code.
- Highly proficient in functional Java, Scala, or Python
- Knowledge of Azure Big Data/Analytics services – ADLS (Azure Data Lake Storage), HDInsight, Azure Data Factory, Azure Synapse Analytics, Azure Fabric, Azure Event Hubs, Azure Stream Analytics, Azure Databricks
- Experience of Storing Data in systems such as Hadoop HDFS, ADLS, Event Hubs
- Experience of designing, setting up and running big data tech stacks such as Hadoop, Azure Databricks, Spark and distributed datastores such as Cassandra, DocumentDBs, MongoDB, Event Hubs
- In-depth knowledge of the Hadoop technology ecosystem: HDFS, Spark, Hive, HBase, Event Hubs, Flume, Sqoop, Oozie, Avro, Parquet
- Experience debugging a complex multi-server service.
- In-depth knowledge of and experience with IaaS/PaaS solutions (e.g., AWS infrastructure hosting and managed services)
- Familiarity with network protocols - TCP/IP, HTTP, SSL, etc.
- Knowledge of relational and non-relational database systems
- Understanding continuous integration and delivery.
- Mocking (any of the following: Mockito, ScalaTest, Spock, Jasmine, Mocha).
- IDEs: IntelliJ or Eclipse.
- Build tools (one of SBT, Gradle, Maven).
- An ability to communicate technical concepts to a non-technical audience.
- Working knowledge of Unix-like operating systems such as Linux and/or macOS.
- Knowledge of the git version control system.
- Ability to quickly research and learn new programming tools and techniques.
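To make the TDD/BDD expectation in the list above concrete, here is a minimal test-first sketch using pytest with a local PySpark session; the transformation function and its column names are invented for illustration:

```python
# Hypothetical test-first sketch: the transformation under test is small and
# pure, so it runs against a local Spark session inside pytest.
import pytest
from pyspark.sql import SparkSession
from pyspark.sql import functions as F


def add_full_name(df):
    """Unit under test: derive full_name from the first/last name columns."""
    return df.withColumn("full_name", F.concat_ws(" ", "first_name", "last_name"))


@pytest.fixture(scope="module")
def spark():
    # local[1] keeps the test lightweight and deterministic
    return SparkSession.builder.master("local[1]").appName("tdd_sketch").getOrCreate()


def test_add_full_name(spark):
    df = spark.createDataFrame([("Ada", "Lovelace")], ["first_name", "last_name"])
    assert add_full_name(df).first()["full_name"] == "Ada Lovelace"
```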
Senior Engineering Manager-Big Data, Generative AI
Posted 2 days ago
Job Description
Over 50,000 customers globally trust our end-to-end, cloud-driven networking solutions. They rely on our top-rated services and support to accelerate their digital transformation efforts and deliver unprecedented progress. With double-digit growth year over year, no provider is better positioned to deliver scalable outcomes than Extreme.
Inclusion is one of our core values and in our DNA. We are committed to fostering an inclusive workplace that embraces our differences and creates an atmosphere where all our employees thrive because of their differences, not in spite of them.
Become part of Something big with Extreme! As a global networking leader, learn why there’s no better time to join the Extreme team.
Senior Engineering Manager (Big Data, Gen AI):
Experience: 12+ years
Location: Bangalore, India
Remote
Are you energized by the idea of innovating with Generative AI? Do you want to create global impact while tackling challenges at the forefront of Artificial Intelligence? Do you dream of building ground-breaking products that define the future of AI-driven network management? Then come, advance with us at Extreme.
This is a greenfield opportunity to shape next-gen networking experiences at the cutting edge of Generative AI, Machine Learning, Big Data, and Cloud Computing. You will help define every aspect of the user journey, product vision, and technical roadmap, and you will drive innovation from concept to delivery.
There has never been a better time to join Extreme. With multiple acquisitions expanding our portfolio and market strategy, we are experiencing unprecedented growth worldwide. Recognized as a Technology Leader in the Gartner Magic Quadrant and a multi-year Best Employer award winner, Extreme is committed to a culture of diversity, inclusion, and equality, where every employee thrives because of their differences, not despite them.
Our AI Core group is pioneering platforms and solutions for Generative AI, including AI Agents, RAG, Knowledge Bases, Data Mining, Anomaly Detection, and LLM fine-tuning. These innovations power flagship Extreme products while enabling entirely new offerings. Together, we are driving a fundamental shift in how businesses manage networks by building intelligent, high-performance multi-agent systems that perceive, learn, and act in real time.
At Extreme, innovation is not just encouraged, it is expected. Advance with us and help shape the future of network intelligence.
Job Responsibilities
· Provide technical leadership and vision, driving innovation in distributed systems, large-scale data pipelines, and ML solutions
· Lead the full software development lifecycle, including design, architecture, testing, deployment, and operations
· Architect and deliver high-performance, scalable microservices and real-time inferencing systems using modern ML infrastructure
· Mentor and grow engineering talent, establish technical direction, and foster a culture of excellence and collaboration
· Champion engineering rigor, operational excellence, and process improvements to deliver resilient, scalable systems
Basic Qualifications
· Degree in Computer Science, Mathematics, or a related discipline
· 8+ years of experience across the full SDLC including design, coding, reviews, testing, deployment, and operations
· 4+ years of experience managing engineering teams with a proven track record of delivery
· 4+ years of experience building distributed Big Data solutions such as Spark, Kafka, Debezium, Hudi, Flink, or Glue
· 4+ years of experience designing and architecting large-scale distributed systems on cloud platforms such as AWS, Azure, or GCP
· Proven ability to optimize Big Data workflows and improve system performance at scale
· Proficiency with Docker, Kubernetes, and modern CI/CD practices
· Experience serving as a mentor, tech lead, or people manager in engineering organizations
Preferred Qualifications
· MS or PhD in Computer Science or a related field
· Experience with Graph ML and Graph technologies such as GNNs
· Experience building Generative AI solutions such as RAG, AI Agents, or LLM fine-tuning
Extreme Networks, Inc. (EXTR) creates effortless networking experiences that enable all of us to advance. We push the boundaries of technology leveraging the powers of machine learning, artificial intelligence, analytics, and automation. Over 50,000 customers globally trust our end-to-end, cloud-driven networking solutions and rely on our top-rated services and support to accelerate their digital transformation efforts and deliver progress like never before. For more information, visit Extreme's website or follow us on Twitter, LinkedIn, and Facebook.
We encourage people from underrepresented groups to apply. Come Advance with us! In keeping with our values, no employee or applicant will face discrimination/harassment based on: race, color, ancestry, national origin, religion, age, gender, marital domestic partner status, sexual orientation, gender identity, disability status, or veteran status. Above and beyond discrimination/harassment based on “protected categories,” Extreme Networks also strives to prevent other, subtler forms of inappropriate behavior (e.g., stereotyping) from ever gaining a foothold in our organization. Whether blatant or hidden, barriers to success have no place at Extreme Networks.
Big Data Java Developer
Posted 2 days ago
Job Description
At ClearTrail, work is more than ‘just a job’. Our calling is to develop solutions that empower those dedicated to keeping their people, places and communities safe.
For over 23 years, law enforcement & federal agencies across the globe have trusted ClearTrail as their committed partner in safeguarding nations & enriching lives. We are envisioning the future of intelligence gathering by developing artificial intelligence and machine learning based lawful interception & communication analytics solutions that solve the world’s most challenging problems.
Job: Big Data Java Developer
Location: Indore
Years of experience: 2-4 years
Requisition Description
1. Problem solving and analytical skills
2. Good verbal and written communication skills
Roles and Responsibilities
1. Design and develop high-performance, scalable applications using Java and big data technologies
2. Build and maintain efficient data pipelines to process large volumes of structured and unstructured data (a minimal pipeline sketch follows this list)
3. Develop microservices, APIs, and distributed systems
4. Work on Spark, HDFS, Ceph, Solr/Elasticsearch, Kafka, and Delta Lake
5. Mentor and guide junior team members
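The role above is Java-centric, but the pipeline responsibility it lists (including Delta Lake) can be sketched compactly in PySpark. This minimal upsert example assumes the delta-spark package is configured, and all table and column names are hypothetical:

```python
# Hypothetical upsert step in a Delta Lake pipeline (requires the delta-spark package).
from delta.tables import DeltaTable
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("delta_upsert_sketch").getOrCreate()

updates = spark.table("staging_events")          # freshly ingested batch
target = DeltaTable.forName(spark, "events")     # curated Delta table

# Merge new records into the target: update matches, insert the rest.
(
    target.alias("t")
    .merge(updates.alias("u"), "t.event_id = u.event_id")
    .whenMatchedUpdateAll()
    .whenNotMatchedInsertAll()
    .execute()
)
```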
Big Data/Hadoop Administrator
Posted 1 day ago
Job Description
At ClearTrail, work is more than ‘just a job’. Our calling is to develop solutions that empower those dedicated to keeping their people, places and communities safe. For over 23 years, law enforcement & federal agencies across the globe have trusted ClearTrail as their committed partner in safeguarding nations & enriching lives. We are envisioning the future of intelligence gathering by developing artificial intelligence and machine learning-based lawful interception & communication analytics solutions that solve the world’s most challenging problems.
Role – Big Data/Hadoop Administrator
Location – Indore, MP
Experience Required – 3 to 5 Years
What is your Role?
You will work in a multi-functional role combining expertise in system and Hadoop administration. You will work in a team that often interacts with customers on various aspects of technical support for the deployed system. You will be deputed at customer premises to assist customers with issues related to system and Hadoop administration. You will interact with the QA and Engineering teams to coordinate issue resolution within the SLA promised to the customer.
What will you do?
- Deploying and administering the Hortonworks, Cloudera, and Apache Hadoop/Spark ecosystems.
- Installing the Linux operating system and configuring networking.
- Writing Unix shell/Ansible scripts for automation (a small health-check sketch follows this list).
- Maintaining core components such as Zookeeper, Kafka, NiFi, HDFS, YARN, Redis, Spark, HBase, etc.
- Taking care of the day-to-day running of Hadoop clusters using Ambari/Cloudera Manager/other monitoring tools, ensuring the Hadoop cluster is up and running at all times.
- Maintaining HBase clusters and capacity planning.
- Maintaining SOLR clusters and capacity planning.
- Work closely with the database team, network team and application teams to make sure that all the big data applications are highly available and performing as expected.
- Manage KVM Virtualization environment.
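As a small, hedged illustration of the automation work listed above, here is a Python health check of the kind a shell or Ansible task might wrap; it assumes the hdfs CLI is on the PATH and that the report output includes the usual "Dead datanodes" section:

```python
# Minimal cluster health-check sketch -- assumes the hdfs CLI is on PATH
# and the script runs as a user permitted to query the cluster.
import subprocess
import sys

def hdfs_report() -> str:
    """Run `hdfs dfsadmin -report` and return its stdout."""
    result = subprocess.run(
        ["hdfs", "dfsadmin", "-report"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout

def main() -> int:
    report = hdfs_report()
    # The report contains a line like "Dead datanodes (N):" when nodes are down.
    for line in report.splitlines():
        if line.startswith("Dead datanodes"):
            print(f"WARNING: {line.strip()}")
            return 1
    print("OK: no dead datanodes reported")
    return 0

if __name__ == "__main__":
    sys.exit(main())
```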
Must-Have Skills:
- Technical domain: Linux administration, Hadoop infrastructure and administration, SOLR, configuration management (Ansible, etc.)
- Linux administration
- Experience in Python and shell scripting
- Deploying and administering the Hortonworks, Cloudera, and Apache Hadoop/Spark ecosystems
- Knowledge of Hadoop core components such as Zookeeper, Kafka, NiFi, HDFS, YARN, Redis, Spark, etc.
- Knowledge of HBase clusters
- Working knowledge of SOLR and Elasticsearch
Good to Have Skills:
- Experience with networking concepts
- Experience with virtualization technologies (KVM, OLVM)