34 Site Reliability Engineer jobs in Coimbatore

Site Reliability Engineer

Coimbatore, Tamil Nadu Yum! India Global Services Private Limited

Posted today

Job Description

Roles & Responsibilities

Design, test, implement, deploy, and support continuous integration pipelines that build and deploy to cloud-based environments (development, stage/testing, production). In this role, you will help us build the foundation for our future technology platforms by setting standards around cloud-native patterns, which will inform and drive our system-wide strategic cloud transformation journey.

Write infrastructure code and test-case code.
Deploy and configure monitoring and logging tools used for generating alerts about the health of our systems and applications.
Help with the design, building, and automation of cloud-based infrastructure, and provide guidance to development teams on how they can continually improve their applications' cost, performance, and reliability through investigation, analysis, and best-practice recommendations.
Work in a cloud-native infrastructure environment built to host and support true microservices architecture applications.

Mandatory Skills

6-12 years of experience in a professional cloud computing role with Kubernetes, Docker and Infra-as-Code.
A BA/BS in Computer Science or equivalent work experience.
Experience in design, implementation, and deployment of large-scale, highly available, cloud-based infrastructure utilizing AWS, GCP, Azure or other public cloud providers.
Strong collaboration skills with multiple IT functions, business leaders and vendors, exhibiting excellent teamwork and strong verbal and written communication skills along with expert troubleshooting and analytical skills.
In-depth knowledge of cloud-native application development paradigms and tools.
Strong know-how with the current trends of large-scale infrastructure environments with proven success in the deployment of resilient cloud-native solutions.
Experience with source code control systems, branching and merging strategies, automated unit testing frameworks, automated build tools, and automated deploy frameworks.
Deep working knowledge of serverless and container-based technologies such as Lambda, Docker, Kubernetes, and container platforms such as Rancher Labs or RedHat OpenShift.

Site Reliability Engineer

Coimbatore, Tamil Nadu ElevenX Capital

Posted today

Job Description

About the Role:

We are looking for a skilled Site Reliability Engineer (SRE) to join our team and help us ensure the reliability, scalability, and performance of our critical systems. As an SRE, you will work closely with development and operations teams to build and maintain highly available services, automate operational tasks, and monitor system health.

Key Responsibilities:

Design, implement, and maintain scalable and reliable infrastructure for production systems.
Automate repetitive operational tasks, deployments, and monitoring.
Collaborate with software engineering teams to build reliable and efficient services.
Develop and maintain tools for system monitoring, alerting, and incident response.
Participate in on-call rotations and manage incident response to minimize downtime.
Analyze system performance and identify bottlenecks or failure points.
Develop and implement disaster recovery and backup strategies.
Advocate for reliability, availability, and performance best practices throughout the engineering teams.
Document processes, architecture, and troubleshooting guides.

Required Qualifications:

Bachelor’s degree in Computer Science, Engineering, or related field, or equivalent experience.
5+ years of proven experience in Site Reliability Engineering with tools like Grafana and Prometheus.
Experience with Docker, Kubernetes, and Terraform
Strong knowledge of Linux/Unix systems and networking fundamentals.
Proficiency with scripting and programming languages such as Python, Go, Bash, or Ruby.
Experience with cloud platforms like AWS, GCP, or Azure.

Site Reliability Engineer

Coimbatore, Tamil Nadu Times Internet

Posted today

Job Description

Role: Site Reliability Engineer

Experience: 8-14 years

Location: Sector 16, Noida

Notice Period: Immediate / Serving only

About Times Internet

At Times Internet, we create premium digital products that simplify and enhance the lives of millions. As India’s largest digital products company, we have a significant presence across a wide range of categories, including News, Sports, Fintech, and Enterprise solutions.

Our portfolio features market-leading and iconic brands such as TOI, ET, NBT, Cricbuzz, Times Prime, Times Card, Indiatimes, Whatshot, Abound, Willow TV, Techgig, and Times Mobile among many more. Each of these products is crafted to enrich your experiences and bring you closer to your interests and aspirations.

As an equal opportunity employer, Times Internet strongly promotes inclusivity and diversity. We are proud to have achieved overall gender pay parity in 2018, verified by an independent audit conducted by Aon Hewitt.

We are driven by the excitement of new possibilities and are committed to bringing innovative products, ideas, and technologies to help people make the most of every day. Join us and take us to the next level!

Job Description

We are looking for a Site Reliability Engineer (SRE) to join our News Team. The SRE will be responsible for maintaining the reliability, scalability, and performance of our critical infrastructure, ensuring high availability for our services.

Job Role:

As a Site Reliability Engineer (SRE) in the News Team, you will be responsible for ensuring the stability, performance, and scalability of our systems. You will play a key role in various migration activities, including Kubernetes cluster upgrades and application re-platforming. A significant part of your role will involve migrating applications into Kubernetes, ensuring seamless deployment, high availability, and minimal downtime.

Additionally, you will be responsible for configuring and maintaining Elasticsearch and Kafka clusters, ensuring optimal performance, availability, and reliability. You will work on tuning Elasticsearch for efficient search and indexing, managing Kafka for real-time data streaming, and troubleshooting any issues related to these services.

You will work on automating operational tasks, optimizing infrastructure, and proactively resolving issues to maintain system reliability. Additionally, you will collaborate with development, DevOps, and infrastructure teams to implement best practices for security, observability, and scalability. Your expertise will be crucial in improving deployment pipelines, incident response, and overall system performance.

Job Responsibilities:

● Ensure IT services and infrastructure uptime.

● Implement monitoring, alerting, and incident response processes

● Automate repetitive ops tasks (deployments, scaling, failover).

● Respond to outages and production incidents (on-call duties).

● Perform root cause analysis (RCA) and drive postmortems.

● Measure and optimize system performance (latency, throughput, resource usage).

● Support reliable and safe code releases

● Ensure systems are patched, hardened, and compliant with standards.

● Collaborate with technology teams for new requirements and deliver them

Technical Skills Required:

● 8+ years of experience in Site Reliability Engineering, or a related role.

● Proficiency in Kubernetes, Docker, and container orchestration.

● Experience with CI/CD tools.

● Strong knowledge of Linux systems and scripting (Bash, Python).

● Familiarity with configuration management tools like Ansible, Helm.

● Experience with monitoring and logging tools (ELK Stack, or NewRelic).

● Strong troubleshooting skills and incident management experience.

● Experience with Elasticsearch and Kafka

● Knowledge of networking concepts, load balancers, and DNS.

● Experience in performance tuning and optimization.

Soft Skills Required:

● Systems & OS Knowledge

● Linux/Unix administration (process management, system tuning, networking)

● Understanding of filesystems, memory, CPU, kernel basics (CentOS / Ubuntu)

● Scripting for automation: Bash, Python

● Knowledge of cloud platforms: AWS, GCP, Azure

● Networking and Protocols

● TCP/IP, DNS, HTTP/HTTPS, CDN concepts

● Debugging latency, connectivity, and routing issues

● CI/CD and DevOps Practices

● Jenkins, GitHub Actions, GitLab CI, BitBucket, Git

● Working knowledge of Apache, Tomcat, Nginx

● Knowledge of DNS, Load Balancer, WAF, Firewall.

● Working knowledge of Monitoring tools and ELK

● Knowledge of hypervisors like VMware.

● Strong in virtualization technologies, Docker, Kubernetes

● Knowledge of Database concepts

Qualifications - Education & Experience:

● Bachelor’s degree in Electronic and Telecom, Computer Science, Information Technology, or a related field.

● 8+ years of experience in Site Reliability Engineering

Site Reliability Engineer

Coimbatore, Tamil Nadu Employ

Posted today

Job Description

Role - Site Reliability Engineer (SRE)/ Platform Engineering/ or DevOps Engineering roles

Location – Bangalore/ Remote

Type - Contract

Work Ex - 4-6 yrs

We’re working with an AI product company that’s building the next generation of GenAI-powered developer platforms.

We’re looking for an experienced Site Reliability Engineer to join their Platform Engineering team. This role is perfect for someone who thrives at the intersection of software engineering and systems operations, and wants to build infrastructure that powers millions of AI-driven code reviews at scale.

What We’re Looking For

4–6 years in SRE, Platform Engineering, or DevOps roles.
Strong hands-on experience with GCP (or AWS), Kubernetes, Docker, and Terraform.
Proficiency in Node.js / TypeScript for automation and tooling.
Strong background in Linux/Unix systems, networking, and CI/CD pipelines.
Familiarity with observability platforms (Datadog, Prometheus, Grafana, ELK).

Nice to Have

AI/ML infrastructure exposure
Experience running high-traffic, distributed systems
Open-source contributions
Knowledge of compliance (SOC 2, ISO 27001) and cost optimization

Why This Role?

Work on cutting-edge AI systems with massive real-world developer impact
Join a collaborative, high-growth product team
Competitive salary + equity + benefits
Shape infrastructure that supports millions of real-time code reviews

Site Reliability Engineer

Coimbatore, Tamil Nadu MyRemoteTeam Inc

Posted today

Job Description

About Us

MyRemoteTeam, Inc is a fast-growing distributed workforce enabler, helping companies scale with top global talent. We empower businesses by providing world-class software engineers, operations support, and infrastructure to help them grow faster and better.

Job Title: AWS SRE Engineer

Mandatory skills: Java, Cloud (AWS or Docker/Kubernetes), production support knowledge, ServiceNow (SNOW) tool

Exp: 8+ Yrs

Candidate needs to work from client office.

Detailed JD:

We are seeking a skilled and proactive engineer with expertise in Kubernetes, Java-based applications, and cloud platforms (AWS/Azure/GCP), along with experience in ServiceNow for support ticket management. The ideal candidate will be responsible for maintaining cloud-native applications, troubleshooting production issues, and ensuring smooth operations through effective ticket handling and resolution.

Key Responsibilities:

Kubernetes & Cloud Operations:

Deploy, manage, and monitor containerized applications using Kubernetes.
Maintain and optimize cloud infrastructure (AWS, Azure, or GCP).
Automate deployments and infrastructure using CI/CD pipelines and Infrastructure as Code (IaC) tools like Terraform or Helm.
Monitor system performance, availability, and security.

Java Application Support:

Troubleshoot and debug Java-based microservices and APIs.
Collaborate with development teams to resolve application issues.
Participate in code reviews and suggest performance improvements.

ServiceNow (SNOW) Support:

Handle incident, problem, and change management via ServiceNow.
Raise, track, and resolve support tickets in coordination with internal and external teams.
Document root cause analysis (RCA) and resolution steps for recurring issues.

Collaboration & Documentation:

Work closely with DevOps, QA, and development teams.
Maintain technical documentation, runbooks, and knowledge base articles.
Participate in on-call rotations and provide timely support for critical issues.

Required Skills:

Strong hands-on experience with Kubernetes and container orchestration.
Proficiency in Java and related frameworks (Spring Boot, REST APIs).
Experience with cloud platforms (AWS, Azure, or GCP).
Familiarity with ServiceNow or similar ITSM tools.
Good understanding of CI/CD tools (Jenkins, GitLab CI, etc.).
Knowledge of monitoring tools (Prometheus, Grafana, ELK, etc.)

Site Reliability Engineer

Coimbatore, Tamil Nadu Xebia

Posted today

Job Description

We are looking for a highly skilled AWS Engineer with strong Python development and Chaos Engineering expertise to design, build, and validate resilient, scalable, and automated cloud-native environments. The ideal candidate will combine cloud engineering, DevOps, and chaos experimentation to improve reliability, fault tolerance, and operational efficiency of critical systems.

Key Responsibilities

Cloud Engineering (AWS):

Architect, implement, and manage secure, scalable, and cost-efficient AWS infrastructure (EC2, Lambda, EKS, S3, RDS, IAM, CloudFront, etc.).
Automate infrastructure provisioning and configuration using Terraform / CloudFormation and AWS SDKs.
Manage containerized workloads (Docker, Kubernetes, EKS).

Python Development:

Build automation scripts, deployment utilities, and infrastructure tooling using Python (Boto3, Flask, FastAPI, etc.).
Develop custom monitoring/alerting integrations with APIs, SDKs, and third-party observability platforms.
Implement self-healing and resilience-focused automation scripts.

Chaos Engineering & Resiliency:

Design and execute chaos experiments (fault injection, latency, outages, resource failures) to validate system resilience.
Use tools like Gremlin, Litmus, Chaos Mesh, or AWS Fault Injection Simulator.
Partner with SRE and development teams to define SLIs, SLOs, and error budgets.
Document learnings from chaos tests and improve incident response & recovery playbooks.

DevOps & Observability:

Build and maintain CI/CD pipelines for automated deployments (Jenkins, GitHub Actions, GitLab CI, AWS CodePipeline).
Integrate observability frameworks (Prometheus, Grafana, ELK/EFK, CloudWatch, Datadog) for monitoring and tracing.
Ensure proactive alerting and real-time visibility into system health.

Security & Compliance:

Apply AWS security best practices for IAM, networking, and data protection.
Ensure compliance with internal and external regulatory frameworks (SOC2, ISO, GDPR, etc.).

Required Skills & Qualifications

6–10 years of experience in Cloud, DevOps, or SRE roles.
Strong hands-on expertise in AWS Cloud (certifications preferred: AWS DevOps Engineer / Solutions Architect).
Advanced Python development skills for automation and tooling (Boto3 a must).
Experience designing and running chaos experiments (Gremlin, AWS FIS, Litmus, Chaos Mesh, or custom Python-based fault injection).
Solid knowledge of IaC (Terraform / CloudFormation).
Proficiency in containers & orchestration (Docker, Kubernetes, EKS).
Strong background in monitoring, observability, and incident management.
Familiarity with DevOps toolchain (CI/CD, Git, Jenkins, GitLab, CodePipeline).
Good understanding of resilient architectures, reliability principles, and disaster recovery.

Preferred Skills

Knowledge of Go / Shell scripting in addition to Python.
Experience with chaos testing in production-like environments.
Exposure to multi-cloud or hybrid-cloud environments.
Strong problem-solving and debugging skills.

What We Offer

Opportunity to lead cloud reliability & chaos engineering initiatives.
Culture focused on automation, resilience, and continuous improvement.
Growth opportunities through certifications, R&D projects, and leadership roles.

Site Reliability Engineer

Coimbatore, Tamil Nadu Uplers

Posted today

Job Description

Uplers is hiring for one of the clients. It is a remote opportunity.

Role Details:

Position: SRE (Oracle Cloud Infrastructure)
Type: 10-month contract (possible extension)
Mode: Remote | Mon–Fri | 10:30 AM – 7:30 PM IST
Policy: Use of personal device required
Experience: 7–10 yrs (min. 7–8 yrs in OCI)
Skills: OCI, Terraform, GitLab
Rounds: 2

About the Role:

SRE Engineer: build and manage our OCI cloud infrastructure using Terraform and GitLab CI/CD, ensuring its stability, security, and scalability.

You will automate build and deployments of various apps, implement monitoring and logging, and collaborate with development teams while staying current with OCI best practices.

Experience with OCI, Terraform, GitLab CI/CD, and cloud-native principles is essential.

FS banking experience is preferred.

Site Reliability Engineer

Coimbatore, Tamil Nadu CES

Posted today

Job Description

We are seeking a hands-on SRE with expertise in infrastructure automation, cloud scalability, and performance optimization. You’ll design, manage, and monitor large-scale AWS environments, ensuring high availability, security, and reliability for our SaaS platforms.

Key Responsibilities

Develop and execute UI automation using Cypress with TypeScript.
Conduct performance testing using K6.
Perform API testing with Postman.
Run accessibility testing using Wave, AudioEye, and similar tools.
Manage and optimize AWS infrastructure at scale (EC2, S3, ELB, Lambda, Route 53, ECS, SQS, CloudWatch).
Package, deploy, and manage containerized workloads (Docker, Kubernetes).
Automate workflows using Terraform, CDK, Chef.
Implement CI/CD pipelines (TeamCity, Octopus Deploy, GitHub, Jenkins, Codefresh).
Monitor and troubleshoot using ELK stack, Dynatrace, New Relic, Nagios.
Manage and optimize IIS and web farms in high-traffic SaaS environments.

Key Skills & Experience

3+ years with IaaC & DSC tools (Terraform, CDK, Chef).
3+ years managing containerized workloads on PaaS (Docker, Kubernetes).
Strong scripting/automation skills (PowerShell, Ruby, Go, Python, Bash).
Experience with large-scale monitoring & reporting.
Solid understanding of .NET application architecture.
Proven problem-solving & troubleshooting skills in DevOps/SRE environments.
