115 DevOps Engineer jobs in Jaipur

Site Reliability Engineer

Jaipur, Rajasthan CodeKarma

Posted 3 days ago

Job Description

Site Reliability Engineer (Multi-Cloud Deployments)

Location: Bangalore / Remote

Experience: 4–10 years

Type: Full-time (6-month probation)

About CodeKarma

CodeKarma is redefining how engineering teams understand and evolve complex systems — bringing production context directly into the developer’s workflow.

Our platform runs both as SaaS and as sub-account / on-prem deployments within our customers’ cloud environments.

We’re looking for engineers who can take ownership of these deployments end-to-end — from setup to monitoring, upgrades, and ongoing reliability.

What You’ll Do

You’ll be responsible for managing CodeKarma’s distributed deployments across client environments — ensuring reliability, security, and performance at scale.

Deploy and manage CodeKarma clusters across AWS, GCP, and Azure customer sub-accounts.
Monitor, upgrade, and maintain Kubernetes clusters and related infrastructure.
Implement observability, alerting, and disaster recovery for each deployment.
Handle CI/CD automation for platform releases, patches, and version upgrades.
Work closely with client engineering teams to adapt deployments to their environments, policies, and security constraints.
Diagnose and resolve environment-specific issues across networking, storage, and configuration layers.
Build and maintain infrastructure playbooks, Helm charts, and Terraform modules for standardized deployment.

What We’re Looking For

Strong experience managing Kubernetes clusters (EKS, GKE, AKS, or on-prem equivalents).
Deep understanding of Kubernetes internals, Helm, ingress controllers, networking, and storage classes.
Hands-on experience with CI/CD tools (GitHub Actions, ArgoCD, or similar).
Familiarity with monitoring and alerting stacks (Prometheus, Grafana, Loki, ELK, etc.).
Working knowledge of cloud infrastructure across AWS / GCP / Azure.
Ability to work directly with client engineering and DevOps teams, understanding their constraints and helping them integrate CodeKarma.
Strong debugging and communication skills — you’ll often be the bridge between CodeKarma and client infrastructure.

Why Join Us

Manage real, large-scale production environments across multiple enterprises.
Work directly with founders and senior engineers to shape how CodeKarma scales across clients.
High ownership, fast-moving environment, and exposure to deep-tech systems.

How to Apply

Please share:

A short summary of your Kubernetes experience (cluster management, scaling, debugging, etc.).
Any automation or deployment tooling you’ve built or maintained.
Links to your GitHub / GitLab / blog posts (if available).

Site Reliability Engineer

Jaipur, Rajasthan o9 Solutions, Inc.

Posted 3 days ago

Job Description

Be part of something revolutionary

At o9 Solutions, our mission is clear: be the Most Valuable Platform (MVP) for enterprises. With our AI-driven platform — the o9 Digital Brain — we integrate global enterprises’ siloed planning capabilities, helping them capture millions and, in some cases, billions of dollars in value leakage. But our impact doesn’t stop there. Businesses that plan better and faster also reduce waste, which drives better outcomes for the planet, too.

We're on the lookout for the brightest, most committed individuals to join us on our mission. Along the journey, we’ll provide you with a nurturing environment where you can be part of something truly extraordinary and make a real difference for companies and the planet.

Site Reliability Engineer

Shift: Remote (WFH), night shift (6 PM to 2 AM)

About the role.

This SRE professional will have the opportunity to work for an AI-based unicorn that is recognized as one of the fastest-growing companies on the Inc. 5000 list. This role will give you the opportunity to deploy, maintain, and support the o9 Digital Brain platform across the world on AWS, Azure, GCP, and Samsung Cloud using state-of-the-art CI/CD tools. This role will empower you to continuously challenge the status quo and implement your ideas to create value for o9 clients.

The major focus is on deployments, provisioning, upgrades, resizing, migrations, vulnerability fixes, security bug fixes, and patching at the infrastructure level. The team handles upgrades primarily on weekends, particularly Saturdays, and works in shifts that cover different days of the week, including a Tuesday-to-Saturday shift to support weekend upgrades.

What you will do in this role:

Deploy, maintain, and support o9 Digital Brain SaaS environments on all major clouds
Monitor availability and keep systems in good health
Build software and systems to manage platform infrastructure and applications
Improve reliability, quality, cost, time-to-deploy, and time-to-upgrade
Monitor, measure, and optimize system performance
Provide on-call support on a rotation basis
Work flexibly with teams globally, across time zones

What you’ll have.

Education: Bachelor’s degree in Computer Science, Software Engineering, Information Technology, Industrial Engineering, or Engineering Management
Cloud (at least one) and Kubernetes administration certification
Experience: 4-8 years of experience in an SRE role, deploying and maintaining applications, performance tuning, conducting application upgrades and patches, and supporting continuous integration and deployment tooling
4+ years of experience deploying and maintaining applications in any one of the clouds (AWS, Azure, GCP)
Experience with Docker or similar and with Kubernetes or similar
Skills: Ability to debug issues and solve problems
Working knowledge of Jenkins, Ansible, Terraform, and ArgoCD
Knowledge of at least one scripting language (Bash, shell, PowerShell, Python, etc.)
Working knowledge of Linux and Windows operating systems
Primary Skill: Strong in operating system concepts, Linux, and troubleshooting
Secondary Skill: Automation and cloud
Characteristics: Passion to learn and adapt to new technology
We really value team spirit: Transparency and frequent communication are key. At o9, this is not limited by hierarchy, distance, or function

What we’ll do for you:

Flat organization: With a very strong entrepreneurial culture (and no corporate politics).
Great people and unlimited fun at work.
Possibility to really make a difference in a scale-up environment.
Support network: Work with a team you can learn from every day.
Diversity: We pride ourselves on our international working environment.
Work-Life Balance:
Feel part of A team:

How the process works.

Respond with your interest to us.
We’ll contact you by video call or phone call, whichever you prefer, with the schedule for the next steps.
HackerEarth Online Assessment - Domain specific
During the interview phase, you will meet with the technical panel for 60 minutes. We will contact you after the interview to let you know if we’d like to progress your application.
There will be 2 rounds of technical discussion followed by a Managerial round.
We will let you know if you’re the successful candidate.

Good luck!

Site Reliability Engineer

Jaipur, Rajasthan Amicon Hub Services

Posted 4 days ago

Job Description

Key Responsibilities

Manage and scale production systems hosted on Google Cloud Platform (GCP)
Implement SRE best practices: monitoring, alerting, SLAs, SLOs, and error budgets
Automate operational tasks using Infrastructure as Code (IaC) tools like Terraform
Improve system reliability and reduce manual interventions through automation
Collaborate with development teams to ensure new services are production-ready
Incident response and post-mortem analysis to prevent recurring issues
Design and implement CI/CD pipelines for rapid and safe deployments
Manage GCP resources: IAM, VPC, Compute Engine, GKE, Cloud Functions, Pub/Sub, BigQuery, etc.
Ensure security, compliance, and cost optimization on the cloud infrastructure

Required Skills & Qualifications

5+ years of experience in SRE, DevOps, or Cloud Infrastructure roles
Strong hands-on experience with Google Cloud Platform (GCP) services
Proficiency with Terraform or other IaC tools
Solid knowledge of Kubernetes (GKE), containerization, and microservices
Strong scripting skills in Python, Go, or Shell
Familiarity with incident response and post-mortem culture
Knowledge of networking, security, and cloud cost management

Preferred Qualifications

GCP certifications (e.g., Professional Cloud DevOps Engineer)
Prior experience working with e-commerce or high-scale platforms
Familiarity with SRE tooling like Chaos Engineering, Service Mesh (Istio), etc.

Soft Skills

Strong communication and stakeholder management
Problem-solving mindset with a focus on reliability and automation
Ability to work independently in a distributed, outsourced team model

Senior Site Reliability Engineer - ELK Expert

Jaipur, Rajasthan iVedha Inc.

Posted today

Job Description

Senior Site Reliability Engineer (SRE) – ELK Expert | Platform Engineering Practice

Location: India (Remote) - Must be available to work in the EST (US/Canada) Time Zone.

Role Summary:

Are you a Senior Site Reliability Engineer (SRE) with deep ELK expertise, ready to take ownership of large-scale observability infrastructure?

We're looking for an SRE with 7+ years of experience, including 4+ years specializing in the ELK stack (Elasticsearch, Logstash, Kibana), to join our Platform Engineering Practice. In this role, you'll design, manage, and scale ELK clusters ingesting 2–3+ TB/day, enhance reliability across distributed systems, and drive automation within Azure cloud environments. This is a high-impact engineering opportunity focused on performance, observability, and operational excellence at scale.

Why Join Us

Career Growth: Work alongside industry experts on cutting-edge cloud technologies
Competitive Compensation and Benefits: We recognize and reward top talent
Exciting, Impactful Work: Design and build scalable, resilient cloud environments
Strategic Platform Role: Contribute to the foundation of next-gen observability and reliability infrastructure

What You Will Do

Design and Optimize Cloud Infrastructure: Architect scalable, fault-tolerant systems on Microsoft Azure
Automate Everything: Use Terraform, Ansible, and GitHub Actions to streamline deployment and configuration
Ensure Reliability and Performance: Proactively monitor, troubleshoot, and resolve production issues using Prometheus, Grafana, and Azure Monitor
Enhance Security and Compliance: Implement security best practices across DevOps workflows
Collaborate and Innovate: Work closely with engineering, security, and operations teams to drive automation and efficiency
Manage and scale large ELK clusters handling 2–3+ TB/day log volumes, ensuring high availability and performance
Optimize ELK architecture: Implement efficient index lifecycle policies, shard strategies, and hot-warm-cold tiered storage
Build and tune log pipelines: Scale Logstash and Beats pipelines across distributed environments
Support Kibana observability layers: Create dashboards, visualizations, and custom alerting frameworks (e.g., Watcher, ElastAlert)

What You Bring

7+ years of experience in Site Reliability Engineering, DevOps, or Cloud Engineering
4+ years of dedicated, hands-on experience with ELK (Elasticsearch, Logstash, Kibana)
Strong experience managing large-scale ELK clusters in production with heavy ingestion (multi-TB/day)
Deep knowledge of index tuning, shard allocation, ILM policies, and scaling ELK components
Expertise in GitHub Actions, Terraform, Ansible, and Infrastructure as Code (IaC)
Proficiency in Python, Go, or Bash for automation and scripting
Deep understanding of Kubernetes, Docker, and cloud-native architectures
Experience with observability tools such as Prometheus, Grafana, Azure Monitor
Ability to work in a fast-paced, collaborative environment and solve complex operational issues

Education

Bachelor's or Master's degree in Computer Science, Information Technology, or a related field

Certifications (Nice to Have)

Microsoft Azure certifications: AZ-104, AZ-400

Senior Site Reliability Engineer - ELK Expert

Jaipur, Rajasthan iVedha Inc.

Posted 27 days ago

Job Description

Senior Site Reliability Engineer (SRE) – ELK Expert | Platform Engineering Practice

Location: India (Remote) - Must be available to work in the EST (US/Canada) Time Zone.

Role Summary:

Are you a Senior Site Reliability Engineer (SRE) with deep ELK expertise, ready to take ownership of large-scale observability infrastructure?

Why Join Us

Career Growth: Work alongside industry experts on cutting-edge cloud technologies
Competitive Compensation and Benefits: We recognize and reward top talent
Exciting, Impactful Work: Design and build scalable, resilient cloud environments
Strategic Platform Role: Contribute to the foundation of next-gen observability and reliability infrastructure

What You Will Do

Design and Optimize Cloud Infrastructure: Architect scalable, fault-tolerant systems on Microsoft Azure
Automate Everything: Use Terraform, Ansible, and GitHub Actions to streamline deployment and configuration
Ensure Reliability and Performance: Proactively monitor, troubleshoot, and resolve production issues using Prometheus, Grafana, and Azure Monitor
Enhance Security and Compliance: Implement security best practices across DevOps workflows
Collaborate and Innovate: Work closely with engineering, security, and operations teams to drive automation and efficiency
Manage and scale large ELK clusters handling 2–3+ TB/day log volumes, ensuring high availability and performance
Optimize ELK architecture: Implement efficient index lifecycle policies, shard strategies, and hot-warm-cold tiered storage
Build and tune log pipelines: Scale Logstash and Beats pipelines across distributed environments
Support Kibana observability layers: Create dashboards, visualizations, and custom alerting frameworks (e.g., Watcher, ElastAlert)

What You Bring

7+ years of experience in Site Reliability Engineering, DevOps, or Cloud Engineering
4+ years of dedicated, hands-on experience with ELK (Elasticsearch, Logstash, Kibana)
Strong experience managing large-scale ELK clusters in production with heavy ingestion (multi-TB/day)
Deep knowledge of index tuning, shard allocation, ILM policies, and scaling ELK components
Expertise in GitHub Actions, Terraform, Ansible, and Infrastructure as Code (IaC)
Proficiency in Python, Go, or Bash for automation and scripting
Deep understanding of Kubernetes, Docker, and cloud-native architectures
Experience with observability tools such as Prometheus, Grafana, Azure Monitor
Ability to work in a fast-paced, collaborative environment and solve complex operational issues

Education

Bachelor’s or Master’s degree in Computer Science, Information Technology, or a related field

Certifications (Nice to Have)

Microsoft Azure certifications: AZ-104, AZ-400

Cloud Engineer

Jaipur, Rajasthan Applicantz

Posted today

Job Description

THIS IS A LONG-TERM CONTRACT POSITION WITH ONE OF THE LARGEST GLOBAL TECHNOLOGY LEADERS.

The client's R&D team is looking for a talented and highly passionate individual to join its product development team and develop world-class cloud-based software products and solutions that solve interesting problems in the construction industry. If you are a software developer who is proficient in web-based technologies, has strong design and coding skills, and is passionate about critical thinking while solving problems, we would love to hear from you. You will be part of an agile team of smart and highly motivated engineers building highly scalable, secure, cloud-based products and services. You will work in a global team and collaborate with local and remote colleagues from disciplines such as business, engineering, operations, and support. You will work with the latest technologies in a flexible environment.

Minimum Qualifications:

5+ years of overall experience with good knowledge of data structures, algorithms, and object-oriented programming.
Solid understanding of typical web architecture (data, application, and web tiers, etc.).
Proficiency in Python-based technologies and a conceptual understanding of web and RESTful APIs.
Experience with AWS networking services such as VPC and subnets.
Experience with monitoring tools such as Splunk or CloudWatch.
Ability to work with a team in an Agile environment.

Responsibilities:

Be involved in all aspects of software development.
Design and develop highly scalable, reliable, and fault-tolerant systems with minimal guidance.
Ensure the best possible performance, quality, and responsiveness of the applications.
Identify bottlenecks and bugs, and devise solutions to these problems.
Help maintain code quality, organization, and automation.
Take on full-stack development whenever necessary.
Write and maintain code with high attention to detail, perform peer code reviews, and participate in technical design discussions.

Build & Release Automation:

Provide continuous integration and continuous deployment (CI/CD) solutions for medium-sized projects.
Manage medium to large projects.
Guide the team in solving build and deployment automation issues.
Design and implement release orchestration solutions for medium or large projects.
Participate in the discovery phase of large projects to produce a high-level design.

Infrastructure Automation:

Develop playbooks/cookbooks for configuration management for medium to large projects.
Install and optimize tools on cloud infrastructure.
Design and implement solutions for infrastructure automation on cloud infrastructure.
Implement optimized complex networking configurations.
Implement optimized complex storage setups.
Develop Terraform scripts for deploying AWS resources.
Develop Python scripts for automating DevOps tasks.
Develop shell scripts for infrastructure automation.

Must Have Skills:

- AWS Cloud and DevOps.

- Jenkins CI/CD.

- Git.

- Docker containerization and administration.

- Linux administration.

- Python.

Our large, Fortune Technology client is ranked as one of the best companies in the world to work for. As a global leader in 3D design, engineering, and entertainment software, they foster a progressive culture, creativity, and a flexible work environment. They use cutting-edge technologies to stay ahead of the curve. Diversity in all aspects is respected. Integrity, experience, honesty, people, humanity, and a passion for excellence are some of the other values that define this global technology leader.

Cloud Engineer

Jaipur, Rajasthan EXL

Posted today

Job Description

We're looking for a highly skilled and experienced Cloud AI Engineer to join our dynamic team. In this role, you'll be instrumental in designing, developing, and deploying cutting-edge artificial intelligence and machine learning solutions leveraging the full suite of Google Cloud Platform (GCP) services.

Objectives of this role

Lead the end-to-end development cycle of AI applications, from conceptualization and prototyping to deployment and optimization, with a core focus on LLM-driven solutions.
Architect and implement highly performant and scalable AI services, effectively integrating with GCP's comprehensive AI/ML ecosystem.
Collaborate closely with product managers, data scientists, and MLOps engineers to translate complex business requirements into tangible, AI-powered features.
Continuously research and apply the latest advancements in LLM technology, prompt engineering, and AI frameworks to enhance application capabilities and performance.

Responsibilities

Develop and deploy production-grade AI applications and microservices primarily using Python and FastAPI, ensuring robust API design, security, and scalability.
Design and implement end-to-end LLM pipelines, encompassing data ingestion, processing, model inference, and output generation.
Utilize Google Cloud Platform (GCP) services extensively, including Vertex AI (Generative AI, Model Garden, Workbench), Cloud Functions, Cloud Run, Cloud Storage, and BigQuery, to build, train, and deploy LLMs and AI models.
Expertly apply prompt engineering techniques and strategies to optimize LLM responses, manage context windows, and reduce hallucinations.
Implement and manage embeddings and vector stores for efficient information retrieval and Retrieval-Augmented Generation (RAG) patterns.
Work with advanced LLM orchestration frameworks such as LangChain, LangGraph, Google ADK, and CrewAI to build sophisticated multi-agent systems and complex AI workflows.
Integrate AI solutions with other enterprise systems and databases, ensuring seamless data flow and interoperability.
Participate in code reviews, establish best practices for AI application development, and contribute to a culture of technical excellence.
Keep abreast of the latest advancements in GCP AI/ML services and broader AI/ML technologies, evaluating and recommending new tools and approaches.

Required skills and qualifications

Two or more years of hands-on experience as an AI Engineer with a focus on building and deploying AI applications, particularly those involving Large Language Models (LLMs).
Strong programming proficiency in Python, with significant experience in developing web APIs using FastAPI.
Demonstrable expertise with Google Cloud Platform (GCP), specifically with services like Vertex AI (Generative AI, AI Platform), Cloud Run/Functions, and Cloud Storage.
Proven experience in prompt engineering, including advanced techniques like few-shot learning, chain-of-thought prompting, and instruction tuning.
Practical knowledge and application of embeddings and vector stores for semantic search and RAG architectures.
Hands-on experience with at least one major LLM orchestration framework (e.g., LangChain, LangGraph, CrewAI).
Solid understanding of software engineering principles, including API design, data structures, algorithms, and testing methodologies.
Experience with version control systems (Git) and CI/CD pipelines.

Preferred skills and qualifications

Bachelor's or Master's degree in Computer Science

Good to have:

Experience with MLOps practices for deploying, monitoring, and maintaining AI models in production.

Understanding of distributed computing and data processing technologies.

Contributions to open-source AI projects or a strong portfolio showcasing relevant AI/LLM applications.

Excellent analytical and problem-solving skills with a keen attention to detail.

Strong communication and interpersonal skills, with the ability to explain complex technical concepts to non-technical stakeholders.
