Site Reliability Engineer

Coimbatore, Tamil Nadu Vertex Agility

Posted today

Job Description

Site Reliability Engineer (SRE) – Remote (India)

We’re Vertex Agility – a remote-first consultancy delivering impactful cloud, DevOps, and engineering solutions for global leaders.

We’re hiring a Site Reliability Engineer to join a high-profile initiative with one of the world’s leading management consulting firms. You’ll play a key role in building scalable, cloud-native systems that drive innovation at a global scale—combining engineering excellence with SRE principles that matter.

What You’ll Work With:


  • AWS and Azure (primary cloud platforms)
  • Kubernetes, Terraform, Docker, Helm
  • Python or Golang
  • Modern DevOps: CI/CD, TDD, Chaos Engineering, SLOs/SLIs, Prometheus, Grafana, Splunk


What We’re Looking For:


  • 8+ years of cloud and infrastructure experience
  • Deep hands-on skills with IaC, containers, and automation tooling
  • Passion for engineering reliable systems and scaling operations
  • Strong DevOps and SRE mindset—automation-first, blameless culture.


Why Join Vertex Agility?


We’re a remote-first, high-caliber consultancy solving meaningful technical challenges. You’ll lead transformation projects, grow fast, and work with smart, collaborative teams building what’s next.

#SRE #DevOpsJobs #Kubernetes #AWS #Terraform #Python #CloudNative #RemoteJobs #VertexAgility #EngineeringExcellence

Sound like you? Let’s talk.

DM me or apply directly at


Let’s build something reliable—and brilliant.

Site Reliability Engineer

Coimbatore, Tamil Nadu Celsior

Posted today

Job Description

SRE – Staff Augmentation – Azure Cloud Heavy


Technical Skills:

  • Programming: Proficiency in languages such as Python, Bash, or Java is essential.
  • Operating Systems: Deep understanding of Linux/Windows operating systems and networking concepts.
  • Cloud Technologies: Experience with AWS and Azure, including services, architecture, and best practices.
  • Containerization and Orchestration: Hands-on experience with Docker, Kubernetes, and related tools.
  • Infrastructure as Code (IaC): Familiarity with tools such as Terraform, CloudFormation, or Azure CLI.
  • Monitoring and Observability: Experience with tools such as Splunk, New Relic, or Azure Monitor.
  • CI/CD: Experience with continuous integration and continuous delivery pipelines, GitHub, and GitHub Actions.
  • Knowledge of supporting Azure ML, Databricks, and other related SaaS tools.

Soft Skills:

  • Problem-Solving: Ability to troubleshoot and debug complex distributed systems independently.
  • Communication: Strong written and verbal communication skills for collaborating with development and operations teams, and the ability to write documentation such as runbooks.

Specific Experience:

  • Incident Management: Experience with incident response, root cause analysis, and post-incident reviews.
  • Scalability and Performance: Understanding of scalability, availability, and performance monitoring for large-scale systems.
  • Automation: Experience in automating repetitive tasks and workflows.

Preferred Qualifications:

  • Experience with specific cloud platforms (AWS, Azure).
  • Certifications related to cloud engineering or DevOps.
  • Experience with microservices architecture, including supporting AI/ML solutions.
  • Experience with large-scale system management and configuration.


Immediate joiners preferred!


Location - Remote


Please share CVs at

Site Reliability Engineer

Coimbatore, Tamil Nadu mindcurv

Posted today

Job Description

About Mindcurv

We help our customers rethink their digital business, experiences, and technology to navigate the new digital reality.

We do this by designing sustainable and accountable solutions for humans living in a digital world.

Mindcurv holistically covers the market’s need to digitalise business processes and customer experiences and take advantage of the cloud following DevOps and agile principles.

We cater to the following six solution lines:

  • Strategy and Advisory
  • Creative Services and Digital Products
  • Client Engagement Platforms
  • Digital Experience and Solutions
  • Data Services
  • Cloud Platforms and Managed Services

We are made up of a team of experts from various domains that define, create, and improve digital experiences.

They engage with people to achieve solutions that enable businesses to grow and scale sustainably.

Within Digital Platforms & Commerce we design and fully craft tailored solutions for our customers enabling them to get the most out of their business.

We design and build a solid foundation in commerce, marketplace, responsive design, DXP and order management to name a few.

    Your role

    As a Site Reliability Engineer (SRE), you’ll bring together your software engineering expertise and systems knowledge to ensure our systems are scalable, reliable, and efficient. You’ll be instrumental in automating operations, solving complex infrastructure challenges, and driving continuous improvement to deliver seamless and resilient services.

    Your responsibilities will include:

    • Design, build, and maintain scalable infrastructure and systems.
    • Automate operational tasks to improve efficiency and reliability.
    • Implement application monitoring and continuously improve application performance and stability.
    • Develop and implement disaster recovery and incident management strategies.
    • Collaborate with developers to improve application architecture and deployment.
    • Optimize system availability, latency, and performance metrics.
    • Manage CI/CD pipelines for seamless software delivery.
    • Perform root cause analysis and lead detailed post-mortems.
    • Consult with software development teams to implement reliability best practices.
    • Write and maintain infrastructure and operational documentation.
    • Take operational responsibility for a number of distributed applications, including on-call shifts.

    Who you are

    • Strong experience in software engineering and systems architecture.
    • Multiple years of experience programming in languages such as Python, Go, or Java.
    • Expertise with cloud platforms (AWS, Azure, GCP) and tools.
    • Hands-on experience with infrastructure as code (Terraform, Ansible, etc.).
    • Familiarity with Linux/Unix systems and networking fundamentals.
    • Familiarity with containerization and orchestration tools like Docker and Kubernetes.
    • Proven ability to monitor, debug, and optimize distributed systems.
    • Experience managing CI/CD pipelines and automation frameworks.
    • Strong problem-solving skills and attention to detail.
    • Excellent communication and collaboration skills for cross-functional teamwork.
    • Ability to analyze and improve complex systems for reliability and scalability.
    • Self-motivated with a passion for continuous learning and improvement.

    Why work at Mindcurv?

    • We are an Advanced Consulting Partner for AWS and a Gold Partner for Microsoft Azure.
    • We train our engineers in the skills required to develop platforms in the cloud.
    • We invest in the certifications mandated for your role through approved programs such as those from AWS and Microsoft.
    • You get access to learning and training portals that help you stay up to date with the latest technologies.
    • We have virtual rooms where specialists from various regions support active discussion of technologies.
    • Finally, we offer a great place to work and collaborate.

    What do we offer you?

    Perks, like drinks and snacks, pizza sessions, gaming nights and sensational parties, a pension scheme and a great salary, we’ve got them all.

    And if you crave an intellectual challenge, Mindcurv has you covered.

    Interesting projects involving the latest, hyper innovative tech.

    An agile, entrepreneurial environment, with lots of freedom and no politics.

    Work-life balance, a culture of transparency and a management team with their ears to the ground.

    We typically are the trailblazers and innovators.

    A hybrid way of working in which you can work from home.

    You just come into one of our offices in Essen, Cologne, Düsseldorf, Munich, Frankfurt a.M., Hamburg, Jena, Utrecht, Madrid, Cochin, Coimbatore or Trivandrum when it adds value to be on-site.

    Looking for a glance behind the scenes?

    Our high performers

    You know who really thrives with us?

    Self-starters, team-players and continuous learners, with an uncanny ability to handle ambiguity.

    We’ll equip you with everything you need to succeed, help you explore the length and breadth of your domain and provide you with constant growth opportunities, to enrich your career.

    Ready for change?

    Are you ready for the next step in your career?

    For a role in which you can be fully yourself and bring out the best in yourself, your colleagues and your clients?

    Don’t wait any longer and apply for this job right now.

    Senior Site Reliability Engineer

    Coimbatore, Tamil Nadu BQE Software

    Posted today

    Job Description

    We are seeking a Senior Site Reliability Engineer to lead reliability efforts across our application stack, focusing on high availability, performance, and scalability. This role will own the health and uptime of our mission-critical application, cloud infrastructure, database system, and monitoring infrastructure.


    About Us

    At BQE, our mission is to transform the operational landscape of professional services firms, empowering them to achieve more and serve their customers better. These firms play a crucial role in building infrastructure that significantly impacts global progress. BQE CORE serves as the operational backbone for these firms, providing an all-in-one SaaS solution. Our platform enables them to efficiently manage projects, improve budget tracking and profitability, and streamline processes through automation. With a robust customer base, we are on a trajectory of continuous growth, constantly innovating to meet the evolving needs of our customers and the industries they influence.


    Why Join Us

    • Work with a modern tech stack in a high-impact reliability role.
    • Be a key part of our CloudOps and App Reliability strategy.
    • A collaborative and supportive engineering culture.


    Responsibilities:

    • Ensure application uptime, performance, and scalability.
    • Own incident management, including on-call rotations, root cause analysis, and incident reviews.
    • Manage and monitor MS SQL Server clusters and high-availability configurations.
    • Set up and improve monitoring, alerting, and observability using New Relic, Logz.io, CloudWatch, and other tools.
    • Proactively identify system bottlenecks and improve system reliability and automation.
    • Define and improve SLOs/SLAs across services.
    • Drive disaster recovery testing and availability simulations.
    • Collaborate with CloudOps and DevOps for infrastructure automation and enhancements.
    • Work with Jira and JSM to manage operational tasks, incidents, and changes.


    Qualifications & Experience:

    • Bachelor’s degree in Computer Science, Engineering, or a related field (or equivalent experience).
    • 5-8 years of experience in Site Reliability Engineering, CloudOps, DevOps or related roles.


    Must-Have Skills:

    • Certifications in AWS, Microsoft, Windows, SQL Server, or SRE disciplines.
    • Exposure to New Relic APM and IaC automation is a plus.
    • Experience working in a 24x7 on-call rotation.
    • Strong knowledge of the Windows OS ecosystem, IIS, and MS SQL Server administration, clustering, performance tuning, and failover.
    • Deep experience with monitoring/logging tools like New Relic, Logz.io, and AWS CloudWatch.
    • Experience with AWS (EC2, ASG, CloudWatch, CloudTrail, VPC) and infrastructure management.
    • Good understanding of networking, DNS, load balancing, and security principles.
    • Proficient in scripting languages such as PowerShell and Python.
    • Strong understanding of incident response, change management, and postmortem culture.
    • Experience using Jira and Jira Service Management for operational workflows.
    • Ability to work independently and drive technical initiatives.

    Junior Site Reliability Engineer

    Coimbatore, Tamil Nadu JoVE

    Posted today

    Job Description

    JoVE is the world-leading producer and provider of video solutions with the mission to improve scientific research and education. Millions of scientists, educators and students use JoVE for their research, teaching and learning. Our institutional clients comprise over 1,000 universities, colleges, and biopharma companies, including such leaders as Harvard, MIT, Yale, and Stanford. As a rapidly growing company, with offices in the USA, UK, Australia, and India servicing clients in over 60 countries, we are seeking talented and ambitious individuals to join our company.



    The Role


    We are looking for a Junior Site Reliability Engineer who will be part of our centralized Site Reliability Team. You will play an integral role in leading the deployment of highly scalable systems, optimization, documentation, and support of the infrastructure components of JoVE’s software products hosted on AWS. Cloud Infrastructure and Operations are critical in enabling JoVE to provide users with our technology offerings.



    Responsibilities:


    • Design, build, test, and deploy cloud-native applications and microservices using IaC tools like Terraform and Crossplane.
    • Own availability, latency, performance, efficiency, monitoring/observability, emergency response, and capacity planning; set and maintain SLOs, SLIs, and error budgets; and create dashboards.
    • Plan for automation to reduce toil and increase development velocity.
    • Perform application-specific production support, incident management, change management, problem management, RCAs, and service restoration as needed.
    • Actively look for opportunities to improve the availability and performance of the system by applying the learnings from monitoring and observation.
    • Collaborate with software development teams in the release management process to shape the future roadmap and establish strong operational readiness across teams.
    • Spearhead implementation of reliability and observability tools (like Groundcover, Prometheus, Grafana, etc.).
    • Support the Infrastructure squad’s on-call practice and participate in 24x7 on-call rotations.


    Requirements:


    • 2+ years of professional experience as a Software Engineer and Site Reliability Engineer (SRE).
    • Extensive in-depth experience with cloud-based provisioning, monitoring, troubleshooting, and related SRE and DevOps technologies, in addition to networking knowledge.
    • MUST have working experience with AWS infrastructure.
    • MUST understand AWS VPC, subnets, Network ACLs, Security Groups, IAM roles, and EKS.
    • MUST have experience using Crossplane.
    • MUST have working knowledge of GitOps, FluxCD, or ArgoCD.
    • Experience configuring Kubernetes RBAC Authorization, Ingress controller, ServiceAccount, and AWS role annotations.
    • Basic experience with monitoring and observability systems such as Datadog, Prometheus, Grafana, Kibana, and CloudWatch.
    • Ability to triage and resolve incidents and lead incident investigations.
    • Experience working in a 24/7 on-call, highly transactional, or streaming production environment.
    • Experience with Kubernetes Operators is a plus.


    Why Join JoVE?

    • When working with JoVE, you can expect compensation packages competitively placed within the local market.
    • You will make a direct impact in accelerating science research and in improving student learning in science education.
    • Opportunity to work with global teams and in an environment that promotes innovation and collaboration.
    • Our strong promote-from-within culture draws a clear path to advance your career with us.

    Site Reliability Engineer (Junior)

    Coimbatore, Tamil Nadu CES

    Posted today

    Job Description

    We’re looking for a highly skilled Site Reliability Engineer to help us build, manage, and scale modern infrastructure systems for high-availability applications. If you're passionate about automation, cloud platforms, and solving tough operational challenges, we would love to hear from you.


    Key Skills and Competencies

    • 3+ years of extensive experience with Infrastructure as Code (IaC) and Desired State Configuration (DSC) tools like Terraform, CDK, and Chef
    • Experience in packaging, deploying, and managing containerized workloads on Docker and Kubernetes
    • Expertise in managing AWS infrastructure at scale – EC2, S3, ELB, Lambda, Route 53, ECS, SQS, CloudWatch
    • Prior experience working in DevOps or SRE environments
    • Strong automation/scripting skills using PowerShell, Ruby, Go, Python, and Bash
    • Hands-on with monitoring and reporting tools – ELK Stack, Dynatrace, New Relic, Nagios
    • Experience with IIS management, performance monitoring, and troubleshooting
    • Background in web farm management for high-traffic SaaS applications
    • Strong problem-solving and root-cause analysis skills
    • Experience working with .NET application architectures – caching, content delivery, high availability, load balancing
    • Familiarity with CI/CD pipelines and tools – TeamCity, Octopus Deploy, GitHub, Jenkins, Codefresh, etc.


    Responsibilities:

    • Drive initiatives to improve platform scalability and operational efficiency
    • Lead standardization efforts across engineering and infrastructure teams
    • Identify opportunities to improve and automate deployments, visibility, and management
    • Apply cloud security best practices to ensure infrastructure safety
    • Provide full-stack diagnostics and resolve complex infrastructure issues
    • Track performance metrics and make data-backed improvement decisions
    • Proactively suggest infrastructure or process changes for system reliability
    • Ensure disaster recovery readiness and implement high availability systems
    • Build support workflows and assist with incident response
    • Own and improve the customer experience through system reliability and uptime


    Personal Attributes:

    • Passionate about learning and applying new technologies
    • A strong collaborator who believes in team success
    • Excellent communicator – verbal, written, and virtual
    • High integrity and commitment to ethical standards
    • Self-motivated, driven, and detail-oriented
    • Able to work independently on short-term projects