Site Reliability Engineer

Hyderabad, Andhra Pradesh ServiceNow, Inc.

Posted today

Job Description

It all started in sunny San Diego, California in 2004 when a visionary engineer, Fred Luddy, saw the potential to transform how we work. Fast forward to today - ServiceNow stands as a global market leader, bringing innovative AI-enhanced technology to over 8,100 customers, including 85% of the Fortune 500®. Our intelligent cloud-based platform seamlessly connects people, systems, and processes to empower organizations to find smarter, faster, and better ways to work. But this is just the beginning of our journey. Join us as we pursue our purpose to make the world work better for everyone.
**What you get to do in this role:**
+ Drive immediate relief and provide a sustainable resolution to issues within the ServiceNow platform.
+ Use knowledge and experience in software development, application support, systems engineering and networking to proactively prevent issues from reoccurring.
+ Drive internal stakeholders and partner teams to improve the reliability, scalability and performance of the infrastructure through improved system design.
+ Drive and contribute to a culture of intolerance to manual activity, which results in an automation environment delivering repeatable and scalable response to system issues.
**To be successful in this role you have:**
+ Experience in leveraging or critically thinking about how to integrate AI into work processes, decision-making, or problem-solving. This may include using AI-powered tools, automating workflows, analyzing AI-driven insights, or exploring AI's potential impact on the function or industry.
+ 3-5 years of experience with Linux systems.
+ Coding in any development or scripting language such as JavaScript, Python, C++, or Java.
+ Networking skills and IP addressing.
+ MySQL database administration.
+ Monitoring of performance/availability in systems, applications and networks.
+ Uncompromising attention to detail.
+ Ability to work in shifts that cover one weekend day.
**Work Personas**
We approach our distributed world of work with flexibility and trust. Work personas (flexible, remote, or required in office) are categories that are assigned to ServiceNow employees depending on the nature of their work and their assigned work location. To determine eligibility for a work persona, ServiceNow may confirm the distance between your primary residence and the closest ServiceNow office using a third-party service.
**Equal Opportunity Employer**
ServiceNow is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, creed, religion, sex, sexual orientation, national origin or nationality, ancestry, age, disability, gender identity or expression, marital status, veteran status, or any other category protected by law. In addition, all qualified applicants with arrest or conviction records will be considered for employment in accordance with legal requirements.
**Accommodations**
We strive to create an accessible and inclusive experience for all candidates. If you require a reasonable accommodation to complete any part of the application process, or are unable to use this online application and need an alternative method to apply, please contact ServiceNow for assistance.
**Export Control Regulations**
For positions requiring access to controlled technology subject to export control regulations, including the U.S. Export Administration Regulations (EAR), ServiceNow may be required to obtain export control approval from government authorities for certain individuals. All employment is contingent upon ServiceNow obtaining any export license or other approval that may be required by relevant export control authorities.
From Fortune. ©2025 Fortune Media IP Limited. All rights reserved. Used under license.

Site Reliability Engineer

Hyderabad, Andhra Pradesh Amgen

Posted 1 day ago

Job Description

**Join Amgen's Mission of Serving Patients**
At Amgen, if you feel like you're part of something bigger, it's because you are. Our shared mission, to serve patients living with serious illnesses, drives all that we do.
Since 1980, we've helped pioneer the world of biotech in our fight against the world's toughest diseases. With our focus on four therapeutic areas (Oncology, Inflammation, General Medicine, and Rare Disease), we reach millions of patients each year. As a member of the Amgen team, you'll help make a lasting impact on the lives of patients as we research, manufacture, and deliver innovative medicines to help people live longer, fuller, happier lives.
Our award-winning culture is collaborative, innovative, and science based. If you have a passion for challenges and the opportunities that lie within them, you'll thrive as part of the Amgen team. Join us and transform the lives of patients while transforming your career.
Site Reliability Engineer
**What you will do**
Let's do this. Let's change the world. In this vital role you will be responsible for the reliability, stability, performance, scalability, and security of platforms that support Amgen's digital products and engineering teams. This hands-on role focuses on supporting cloud-based infrastructure, automating operations, maintaining observability, and improving platform reliability through code.
You'll work closely with senior engineers and cross-functional teams to support CI/CD workflows, container platforms, incident response, and enterprise tooling, all while adopting modern SRE principles and practices.
This role is ideal for engineers who have foundational site reliability experience and are looking to expand their skills in a cloud-native, enterprise-scale environment.
**Roles & Responsibilities:**
**Infrastructure & Platform Support**
+ Provision and manage cloud infrastructure using Infrastructure as Code (IaC)
+ Support container orchestration platforms, ensuring availability, access control, and resource management
+ Assist in configuring and maintaining CI/CD pipelines and environments
**Monitoring & Incident Response**
+ Set up and maintain observability tools to track system health and performance
+ Participate in alert tuning, incident resolution, and root cause analysis
+ Support integration of observability platforms with incident response workflows
**Automation & Platform Operations**
+ Automate routine platform tasks such as provisioning, patching, and configuration
+ Write scripts to improve platform reliability, reduce manual work, and enforce compliance
+ Participate in platform upgrades, maintenance windows, and service validation efforts
**AI Enablement & Intelligence**
+ Support the adoption of AI-assisted operational tools for log analysis, anomaly detection, and predictive alerts
+ Collaborate with senior engineers to evaluate AI/ML-based observability and automation platforms
+ Assist in integrating AI-driven insights into dashboards, alerts, or incident workflows
+ Stay current with emerging AI trends in infrastructure and site reliability, and contribute to tool evaluations and pilots
**Collaboration & Enablement**
+ Work with development, QA, and security teams to ensure reliable and secure deployments
+ Document operational procedures, playbooks, and system runbooks
+ Learn and support enterprise collaboration platforms and internal tooling
+ Participate in Agile and SAFe delivery processes (including sprint planning, stand-ups, retrospectives, and PI planning) to ensure security and platform reliability are embedded across development cycles.
**What we expect of you**
We are all different, yet we all use our unique contributions to serve patients. The professional we seek has these qualifications.
**Basic Qualifications:**
+ Master's or Bachelor's degree and 5 to 9 years of experience in Computer Science, IT, or a related field
+ 4 years of hands-on related experience in site reliability, DevOps, or platform engineering roles
+ Hands-on experience with cloud platforms, preferably AWS
+ Familiarity with Kubernetes or container orchestration technologies
+ Exposure to CI/CD practices and pipeline automation
+ Experience troubleshooting Linux systems, processes, and services
**Preferred Qualifications:**
**Must-Have Skills:**
+ Practical experience with **cloud platforms** (e.g., AWS, Azure, or GCP), including compute, networking, IAM, and storage services
+ Familiarity with **container orchestration platforms** (e.g., Kubernetes, Docker), including basic workload deployment and troubleshooting
+ Experience using **Infrastructure as Code (IaC)** tools such as **Terraform** or **CloudFormation**
+ Working knowledge of **Linux administration**, including system services, package management, and file system structures
+ Hands-on exposure to **CI/CD platforms** (e.g., GitLab CI, Jenkins, GitHub Actions) and pipeline troubleshooting
+ Proficiency in **scripting or automation languages** like **Python**, **Bash**, or **Go**
+ Exposure to **observability tooling** (e.g., **Dynatrace**, **Prometheus**, or **Grafana**) for monitoring and alerting
+ Familiarity with **incident management practices** and tools (e.g., runbooks, escalation workflows, basic alert tuning)
+ Version control skills using **Git** and understanding of branching strategies
+ Experience supporting or integrating **enterprise collaboration platforms** (e.g., Jira, Confluence, ServiceNow)
+ Interest and basic understanding of **AI/ML tools** used in infrastructure and operations (e.g., anomaly detection, intelligent alerting, log analysis)
**Good-to-Have Skills:**
+ Experience using Infrastructure as Code (IaC) tools like Terraform or CloudFormation
+ Familiarity with IT incident response workflows and ticketing platforms
+ Knowledge of secrets management, configuration management tools (e.g., Ansible), or logging frameworks
+ Exposure to **AI-assisted tooling** (e.g., AIOps platforms, AI-enhanced alerting, anomaly detection)
**Professional Certifications (Preferred)**
+ Cloud DevOps Certification (AWS/Azure/GCP)
+ Certified Kubernetes Administrator (CKA) or Security Specialist (CKS)
+ CI/CD Platform Certification
+ ITIL Foundation or equivalent service management certification
**Soft Skills:**
+ Strong analytical and troubleshooting skills
+ Collaborative and proactive mindset
+ Effective communication and documentation practices
+ Curiosity and willingness to adopt new tools and methods, including AI integrations
+ Ability to manage time and prioritize tasks in dynamic environments
**Shift Information:** This position is an onsite role and may require working during later hours to align with business hours. Candidates must be willing and able to work outside of standard hours as required to meet business needs.
**What you can expect of us**
As we work to develop treatments that take care of others, we also work to care for your professional and personal growth and well-being. From our competitive benefits to our collaborative culture, we'll support your journey every step of the way.
In addition to the base salary, Amgen offers competitive and comprehensive Total Rewards Plans that are aligned with local industry standards.
**Apply now and make a lasting impact with the Amgen team.**
**careers.amgen.com**
As an organization dedicated to improving the quality of life for people around the world, Amgen fosters an inclusive environment of diverse, ethical, committed and highly accomplished people who respect each other and live the Amgen values to continue advancing science to serve patients. Together, we compete in the fight against serious disease.
Amgen is an Equal Opportunity employer and will consider all qualified applicants for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, protected veteran status, disability status, or any other basis protected by applicable law.
We will ensure that individuals with disabilities are provided reasonable accommodation to participate in the job application or interview process, to perform essential job functions, and to receive other benefits and privileges of employment. Please contact us to request accommodation.

Site Reliability Engineer

Hyderabad, Andhra Pradesh Sonata Software

Posted 3 days ago

Job Description

Hello Connections


Greetings of the day!


We have immediate openings for SRE


Role - Site Reliability Engineer

Experience - 7 to 12 years

Work Location - Hyderabad

Notice Period - Immediate


Interested candidates can share your CVs to -

Site Reliability Engineer

Hyderabad, Andhra Pradesh Tata Consultancy Services

Posted 3 days ago

Job Description

We are currently seeking an SRE Engineer for a position in Hyderabad.


**Job ID: **


**Apply Here:** (TCS iBegin )


**Job Description:**


  • Proven experience as a DevOps/SRE Engineer
  • Expertise in managing and optimizing GCP or Azure cloud-native services and AI/ML integration.
  • Experience or knowledge of Container technology such as Docker, Buildah and Kubernetes (GKE, AKS)
  • Must have 2+ scripting and programming experience (Python, Bash, etc.)
  • Proficiency in infrastructure-as-code tools, particularly Terraform and ArgoCD
  • Familiarity with observability tools such as Prometheus, Grafana, OpenTelemetry
  • Solid understanding of CI/CD concepts

Site Reliability Engineer

Hyderabad, Andhra Pradesh NationsBenefits India

Posted 3 days ago

Job Description

Job Title: Site Reliability Engineer (SRE) | Fintech | Kubernetes | Datadog | 24/7 Support

Department: Site Reliability Engineering

Location: Hyderabad, India

Employment Type: Full-Time

Notice period: 0-15 Days


We’re hiring a Site Reliability Engineer to join our SRE team focused on maintaining the performance, reliability, and availability of our fintech platforms.


Key Responsibilities:

  • Triage and resolve production incidents; respond to alerts from Datadog
  • Monitor and troubleshoot Kubernetes workloads and cloud environments
  • Develop automation tools using C#, Java, Python, PowerShell, or Bash
  • Support and improve CI/CD pipelines and service uptime
  • Participate in 24/7 on-call rotations, including weekends and holidays
  • Ensure compliance with PCI DSS, ISO 27001, and other fintech standards

Requirements:

  • 3+ years in SRE, DevOps, or related roles
  • Hands-on with monitoring tools, Kubernetes, and scripting/programming
  • Strong incident management and root cause analysis skills
  • Experience in high-transaction or regulated environments
  • Excellent communication and cross-team collaboration skills


Nice to Have: Cloud experience (AWS, Azure, GCP), Helm, CI/CD tools, ITIL/agile exposure

Site Reliability Engineer

Hyderabad, Andhra Pradesh Talentiser

Posted 3 days ago

Job Description

YOUR IMPACT: Reliability, Automation, and Observability

As a hybrid Site Reliability Engineer/DevOps Engineer, you'll be a key driver in ensuring the stability, performance, and scalability of our mission-critical SaaS platform. You'll apply engineering principles to operational challenges, constantly striving to eliminate toil through automation.


Operational Excellence & Reliability

● Provide day-to-day management of system alerts, check system health, and escalate issues as necessary to maintain high availability.

● Actively participate in a 24x7 on-call rotation for critical SaaS platform incidents, and be available in case of emergencies.

● Lead the incident response process, ensuring fast and effective mitigation and resolution of production issues.

● Perform thorough Root Cause Analysis (RCA) and lead blameless post-mortems to identify systemic weaknesses and create a corrective action plan to prevent recurrence.

● Collaborate with engineering teams to set and enforce error budgets (derived from SLOs, or Service Level Objectives), ensuring a healthy balance between development speed and system stability.


Platform Automation & Infrastructure Development

● Automate routine operational tasks to reduce manual effort and "toil" and increase overall team efficiency.

● Design, deploy, and maintain cloud infrastructure using Infrastructure as Code (IaC), specifically leveraging Terraform and Helm for deployment to EKS/K8s clusters.

● Improve existing infrastructure health by developing and implementing checks and scripts to proactively correct known issues and self-heal the platform.

● Maintain, develop, and evolve our Continuous Integration/Continuous Delivery (CI/CD) deployment code and pipelines.

● Learn and maintain existing infrastructure running under Docker and Docker Swarm while driving migration strategies toward EKS/K8s.

● Implement and integrate new technologies and services into our Cloud Infrastructure to enhance platform capabilities and resilience.


Monitoring & Observability

● Design and implement comprehensive Observability strategies across all three pillars: Metrics, Logs, and Traces.

● Proactively create and refine robust monitoring and alerting configurations within the EKS/K8s ecosystem.

● Utilize and maintain our Observability platform, Datadog, to gather performance data, create complex synthetic tests, and visualize system health via dashboards.

● Leverage existing monitoring solutions such as Grafana and Prometheus while planning and executing the migration or integration of data into a unified platform.

● Document all issues, remediation steps, system architecture, and runbooks to facilitate knowledge transfer and rapid incident response.

● Collaborate closely with Support, Customer Success, Migration, and Professional Services teams to provide the highest level of SaaS service and minimize customer impact during changes.

● Apply a real customer focus when planning deployments/updates, always considering the impact on the end-user before making changes.


YOUR EXPERIENCE: Essential Skills and Qualifications

● Hands-on AWS Cloud Engineer experience, with expert working knowledge of the AWS Cloud ecosystem, including a good understanding of AWS IAM roles and policies.

● Proficiency with container orchestration technologies: EKS/Kubernetes (K8s).

● Demonstrable experience with Infrastructure as Code (IaC) tools, specifically Terraform and Helm.

● Working experience with Docker and maintaining systems using Docker Swarm.

● Expertise in setting up and managing logging and monitoring solutions. Direct experience with Datadog is highly preferred, with experience in setting up APM, infrastructure monitoring, and custom dashboards.

● Experience with existing monitoring solutions such as Grafana and Prometheus is required.

● Proficient in a Linux environment and strong skills in Bash and/or Python scripting for automation and troubleshooting.

● A strong understanding of web technologies, including REST APIs, Systems Architecture, Design, and Databases.

● Experience in Product/Application Support for high-availability SaaS-based products.

● Experience in designing, implementing, and operating in a DevSecOps environment.

● Excellent oral and written communication skills, with the ability to clearly explain complex technical issues and RCAs to both technical and customer-facing audiences.

Site Reliability Engineer

Hyderabad, Andhra Pradesh Sonata Software

Posted 3 days ago

Job Description

Role: Site Reliability Engineer (SRE) III – Data Engineering

Location: Hyderabad

Employment Type: Full Time

Experience: 7–12 years in site reliability, cloud-based data infrastructure, data pipeline observability, automation, and high-availability engineering within EdTech platforms (2U)

Primary Skills (Must-Have): AWS, CI/CD, Jenkins, IAAC, Terraform, Kubernetes

Secondary Skills (Good-to-Have): AWS systems; Dataiku data, Platform updates and patching

Tools & Platforms

- Data Warehousing & Processing: Snowflake, Redshift, Apache Airflow, dbt

- CI/CD & Deployment: Jenkins, GitHub Actions, AWS CodePipeline, Terraform

- Cloud & Event Processing: AWS Lambda, API Gateway, SNS/SQS, Kafka, Step Functions

- Monitoring & Logging: DataDog, AWS CloudWatch, Prometheus, Splunk

- Incident Management: PagerDuty, Opsgenie, AWS Health Dashboard

- Collaboration & Code Review: GitHub, Jira, Confluence

Key Responsibilities

Data Pipeline Reliability & Observability:

- Maintain and optimize highly available, fault-tolerant infrastructure for data pipelines, ETL jobs, and real-time data processing

- Implement end-to-end monitoring of Airflow DAGs, Snowflake queries, and AWS-based data workflows

- Automate data pipeline health checks, error handling, and auto-remediation strategies

Infrastructure & Cloud Automation:

- Deploy and manage AWS-based data infrastructure using Terraform and CloudFormation

- Optimize Kubernetes (EKS) clusters for processing large-scale datasets and real-time analytics

- Ensure high availability and cost-efficient scaling for Redshift, Snowflake, and data storage solutions

Performance, Monitoring & Incident Response:

- Implement real-time monitoring, logging, and alerting using DataDog, AWS CloudWatch, and Prometheus

- Define and track SLOs, SLIs, and error budgets to improve data reliability and uptime

- Conduct Root Cause Analysis (RCA), security audits, and post-mortems for incidents

Security & Compliance:

- Ensure GDPR, CCPA, and SOC 2 compliance for data storage, access controls, and retention policies

- Implement AWS security best practices (IAM, KMS, Shield, WAF) to secure data access and encryption

- Secure API gateways, authentication mechanisms, and data lake permissions to prevent unauthorized access

Collaboration & Leadership:

- Work closely with data engineers, analytics teams, and DevOps engineers to enhance data platform reliability

- Participate in incident response drills, disaster recovery planning, and security compliance reviews

- Advocate for best practices in automation, cost optimization, and cloud-native data solutions

Site Reliability Engineer

Hyderabad, Andhra Pradesh Sonata Software

Posted 3 days ago

Job Description

Role: Site Reliability Engineer

Location: Hyderabad

Notice Period: Immediate to 20 Days

Employment Type: Full Time

Experience

  • 7–12 years in site reliability, cloud-based data infrastructure, data pipeline observability, automation, and high-availability engineering within EdTech platforms (2U)

Primary Skills (Must-Have)

  • AWS, CI/CD, Jenkins, IAAC, Terraform, Kubernetes

Secondary Skills (Good-to-Have)

  • AWS systems; Dataiku data, Platform updates and patching

Tools & Platforms

  • Data Warehousing & Processing: Snowflake, Redshift, Apache Airflow, dbt
  • CI/CD & Deployment: Jenkins, GitHub Actions, AWS CodePipeline, Terraform
  • Cloud & Event Processing: AWS Lambda, API Gateway, SNS/SQS, Kafka, Step Functions
  • Monitoring & Logging: DataDog, AWS CloudWatch, Prometheus, Splunk
  • Incident Management: PagerDuty, Opsgenie, AWS Health Dashboard
  • Collaboration & Code Review: GitHub, Jira, Confluence

Key Responsibilities

Data Pipeline Reliability & Observability:

- Maintain and optimize highly available, fault-tolerant infrastructure for data pipelines, ETL jobs, and real-time data processing

- Implement end-to-end monitoring of Airflow DAGs, Snowflake queries, and AWS-based data workflows

- Automate data pipeline health checks, error handling, and auto-remediation strategies


Infrastructure & Cloud Automation:

- Deploy and manage AWS-based data infrastructure using Terraform and CloudFormation

- Optimize Kubernetes (EKS) clusters for processing large-scale datasets and real-time analytics

- Ensure high availability and cost-efficient scaling for Redshift, Snowflake, and data storage solutions


Performance, Monitoring & Incident Response:

- Implement real-time monitoring, logging, and alerting using DataDog, AWS CloudWatch, and Prometheus

- Define and track SLOs, SLIs, and error budgets to improve data reliability and uptime

- Conduct Root Cause Analysis (RCA), security audits, and post-mortems for incidents


Security & Compliance:

- Ensure GDPR, CCPA, and SOC 2 compliance for data storage, access controls, and retention policies

- Implement AWS security best practices (IAM, KMS, Shield, WAF) to secure data access and encryption

- Secure API gateways, authentication mechanisms, and data lake permissions to prevent unauthorized access


Collaboration & Leadership:

- Work closely with data engineers, analytics teams, and DevOps engineers to enhance data platform reliability

- Participate in incident response drills, disaster recovery planning, and security compliance reviews

- Advocate for best practices in automation, cost optimization, and cloud-native data solutions

Site Reliability Engineer

Hyderabad, Andhra Pradesh SID Global Solutions

Posted 3 days ago

Job Description

Job Role: Site Reliability Engineer (SRE) – GCP

Experience: 3+ years

Location: Hyderabad


About SIDGS:

SIDGS is a premium global systems integrator and global implementation partner of Google, providing digital solutions and services to Fortune 500 companies. Our digital solutions span the following domains: User Experience, CMS, API Management, Microservices, DevOps, Cloud, Service Mesh, Artificial Intelligence, and RPA.

We create innovative solutions in the Digital, API Management, Cloud, and DevOps space in partnership with Google. We understand that every business has a unique set of challenges and opportunities, and we leverage our unique industry insights, honed through decades of combined experience in the technology sector, to deliver the products, solutions, and services necessary to achieve the best customer satisfaction and deliver a positive impact to communities.

Location: Hyderabad (Work from Office only)

Job Type: Full Time

  • The Site Reliability Engineer (SRE) Level 1 will be responsible for maintaining and improving the reliability, availability, and performance of the systems.
  • We need someone who can join within 0-30 days only.
  • We are looking for someone who is passionate about learning and developing their skills in system reliability, automation, and incident response. You will work closely with senior SREs, DevOps teams, and other stakeholders to ensure the services meet the highest standards of reliability and performance.

Key Responsibilities:

  • Monitor system performance and availability across GCP and Anthos environments.
  • Respond to incidents, perform root cause analysis, and implement fixes.
  • Escalate issues to senior team members as needed.
  • Assist in developing and maintaining automation scripts and tools to improve efficiency.
  • Collaborate with senior SREs to identify and implement improvements in system performance and reliability.
  • Document processes, incident reports, and troubleshooting guides.
  • Continuously learn and apply new skills and technologies relevant to the SRE role.
  • Participate in training sessions and workshops to enhance knowledge.
  • Basic knowledge of monitoring tools

Skills:

  • 1-2 years of relevant experience in 24x7 support of enterprise-level applications
  • Strong problem-solving skills and attention to detail.
  • Excellent communication and teamwork abilities.
  • Willingness to learn and adapt in a fast-paced environment
  • Knowledge of CI/CD pipelines and tools (e.g., Jenkins, GitLab CI).
  • Basic understanding of Linux/Unix systems and networking concepts.
  • Familiarity with Kubernetes and container orchestration.
  • Nice to have: experience in an Apigee environment
  • Preferred Qualifications: Graduate in Computers, Engineering or similar educational qualification

Site Reliability Engineer

Hyderabad, Andhra Pradesh ValueMomentum

Posted 3 days ago

Job Description

About the Role

We are seeking an experienced Site Reliability / Azure DevOps Engineer with Dynatrace Experience to join our engineering team and contribute to scalable CI/CD practices, infrastructure automation, and cloud operations. The ideal candidate will have deep expertise in Azure DevOps, Infrastructure as Code (IaC), Azure services, and modern DevOps practices.

Key Responsibilities

  • Design and maintain CI/CD pipelines using Azure DevOps.
  • Implement Infrastructure as Code using tools like ARM, Bicep, or Terraform.
  • Automate deployment and configuration of cloud resources on Azure.
  • Must have experience in Dynatrace.
  • Collaborate with development, QA, and cloud teams to streamline DevOps workflows.
  • Monitor, secure, and optimize infrastructure and pipelines for performance and cost-efficiency.
  • Troubleshoot build/deployment failures and perform root cause analysis.
  • Set up dashboards, alerts, and reporting for infrastructure health and deployment metrics.

Required Skills

  • Strong hands-on experience with Azure DevOps (Repos, Pipelines, Artifacts, Boards) and Dynatrace.
  • Proficiency in PowerShell and/or Bash scripting.
  • Experience with Infrastructure as Code using ARM templates, Bicep, or Terraform.
  • Familiarity with containerization (Docker, Kubernetes) and Azure Kubernetes Service (AKS).
  • Experience integrating with GitHub Actions, SonarQube, Nexus/Artifactory is a plus.
  • Experience working with Azure services like App Services, Key Vault, Storage, Monitor, etc.
  • Exposure to Agile/Scrum methodology and DevSecOps principles.

Preferred Qualifications

  • Microsoft Azure certifications (e.g., AZ-400, AZ-104).
  • Experience with hybrid and multi-cloud environments.
  • Familiarity with security best practices in CI/CD pipelines.