10,951 Google Engineer jobs in India

Site Reliability Engineer

Bangalore, Karnataka IBM

Posted 1 day ago

Job Description

**Introduction**
IBM's CIO Network Engineering Team has an exciting opportunity for a Networking Site Reliability Engineer for our corporate network. This role joins a collaborative, cross-organization environment where each team member takes a cooperative "whatever-it-takes" approach. Networking Site Reliability Engineers will focus on Availability, Performance, Automation, Efficiency & Change Management. This is a hands-on role that works with SMEs across all networking product disciplines to drive stability and reliability concepts into every segment of our network, with the goal of maintaining the very best IBMer user experience. This role requires development skills to technically contribute to the quality of our network deployments and inspire both team members and product teams to embrace SRE concepts and disciplines to create a strong and successful culture. Strong candidates will have established experience in software development, networking, and problem solving. Ideal candidates will have worked in both a software development and network engineering capacity.
**Your role and responsibilities**
- Involvement in every facet of product segment support - from the earliest stages of influencing product architecture, design and development to deployment, troubleshooting, and performance analysis - to ensure a reliable quality product in production.
- Ability to collaborate and communicate clearly on status and progress.
- Design and build tools and automation to manage a rapidly growing number of networking devices and services.
- Take initiative to do what must be done in order to keep critical network product segments operating.
- When required, perform general OS updates/patches, networking/server/database configuration changes, installs and automation.
- Participate in periodic on-call rotation in a 24x7, follow-the-sun environment.
**Required technical and professional expertise**
· Overall 10+ years of experience.
· Minimum 8 years of network troubleshooting / design experience.
· Proven knowledge and understanding of various network tools (NetIM, ThousandEyes, SolarWinds, etc.).
· Cisco / Juniper professional-level certification or equivalent knowledge.
· Proven Network Engineering and Operations skills with experience in leading Network and Security platforms, Network Routing, Switching, and Security Firewall and Proxy technologies.
· Ability to deep dive into, troubleshoot, and resolve network issues reported for various platforms (Cisco, Juniper, Palo Alto, etc.).
· Basic understanding of scripting / programming using Python, etc.
· 2-3 years of demonstrated experience with Python, BASH or other scripting languages.
· Demonstrated and proven understanding of enterprise network architecture principles and practices.
· Experience using and coding to API frameworks.
· Experience with code repository and revision control systems (e.g., GitHub).
· Exposure to identity integration technologies such as LDAP, SAML/SSO.
· Proven track record in enterprise network service management.
**Preferred technical and professional experience**
* Network Certifications (e.g. CCNA, CCNP, CCIE etc.)
* Familiarity with full software development life cycle (SDLC): Analysis, Design, Coding, Testing, Deploying, Training, Maintaining and Operational Support.
* Previous experience working in Agile concepts and methodologies.
* Ability to develop tools and features and functions based on client interviews and documented requirements.
IBM is committed to creating a diverse environment and is proud to be an equal-opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, gender, gender identity or expression, sexual orientation, national origin, caste, genetics, pregnancy, disability, neurodivergence, age, veteran status, or other characteristics. IBM is also committed to compliance with all fair employment practices regarding citizenship and immigration status.

Site Reliability Engineer

Bangalore, Karnataka WESCO

Posted 1 day ago

Job Description

The Site Reliability Engineer will be responsible for ensuring the availability, reliability, and performance of our customer-facing software applications. This role combines planning, engineering, monitoring, incident response, and administration to create highly scalable and fault-tolerant systems.
**Responsibilities:**
+ Ensure the high availability and reliability of the production environment by monitoring system health and performance
+ Provide primary operational support for large-scale distributed software applications
+ Facilitate incident resolution via triage, communication, engagement, escalation, and documentation
+ Partner with platform administration (both internal and external) to define and achieve stability and scalability objectives
+ Collaborate with technical and quality teams to improve services by identifying areas of risk and helping to define and proactively implement solutions
+ Drive continual improvement in system performance by setting service level objectives in collaboration with a performance center of practice and/or product development teams
+ Participate in system design, capacity planning, and platform management
+ Analyze and publish metrics from operating systems and applications to assist in performance tuning and fault finding
+ Pursue opportunities for automation and process improvements
**Qualifications:**
+ Bachelor's degree (or demonstrable equivalent work experience) in information technology
+ Experience providing first-level incident response and troubleshooting with technical teams to resolve end-user issues
+ Proficiency with enterprise system monitoring software (examples: New Relic, Nagios, SolarWinds, Dynatrace, Datadog, Azure Monitor, Splunk)
+ Experience with cloud-based infrastructure, databases, and applications
+ Experience with performance tuning and fault finding in large-scale distributed systems.
+ Experience with designing, implementing, and managing performance testing practices, including specific tools and frameworks
+ Knowledge of disaster recovery planning and execution.
+ Ability to effectively work in a highly matrixed organization
+ Excellent verbal and written communication skills.
+ Strong understanding of coding, automation, and engineering principles to build resilient, self-healing systems
+ Familiarity with DevOps practices and tools
+ Jira (or equivalent work management)
+ Confluence (or equivalent knowledge management)
At Wesco, we build, connect, power and protect the world. As a leading provider of business-to-business distribution, logistics services and supply chain solutions, we create a world that you can depend on.
Our Company's greatest asset is our people. Wesco is committed to fostering a workplace where every individual is respected, valued, and empowered to succeed. We promote a culture that is grounded in teamwork and respect. With a workforce of over 20,000 people worldwide, we embrace the unique perspectives each person brings. Through comprehensive benefits and active community engagement, we create an environment where every team member has the opportunity to thrive.
Learn more about Working at Wesco and apply online today!
Founded in 1922 and headquartered in Pittsburgh, Wesco is a publicly traded (NYSE: WCC) FORTUNE 500® company.
_Wesco International, Inc., including its subsidiaries and affiliates ("Wesco") provides equal employment opportunities to all employees and applicants for employment. Employment decisions are made without regard to race, religion, color, national or ethnic origin, sex, sexual orientation, gender identity or expression, age, disability, or other characteristics protected by law. US applicants only, we are an Equal Opportunity Employer._
_Los Angeles Unincorporated County Candidates Only: Qualified applicants with arrest or conviction records will be considered for employment in accordance with the Los Angeles County Fair Chance Ordinance and the California Fair Chance Act._

Site Reliability Engineer

Hyderabad, Telangana ServiceNow, Inc.

Posted 2 days ago

Job Description

It all started in sunny San Diego, California in 2004 when a visionary engineer, Fred Luddy, saw the potential to transform how we work. Fast forward to today - ServiceNow stands as a global market leader, bringing innovative AI-enhanced technology to over 8,100 customers, including 85% of the Fortune 500®. Our intelligent cloud-based platform seamlessly connects people, systems, and processes to empower organizations to find smarter, faster, and better ways to work. But this is just the beginning of our journey. Join us as we pursue our purpose to make the world work better for everyone.
**What you get to do in this role:**
+ Drive immediate relief and provide a sustainable resolution to issues within the ServiceNow platform.
+ Use knowledge and experience in software development, application support, systems engineering and networking to proactively prevent issues from reoccurring.
+ Drive internal stakeholders and partner teams to improve the reliability, scalability and performance of the infrastructure through improved system design.
+ Drive and contribute to a culture of intolerance to manual activity, which results in an automation environment delivering repeatable and scalable response to system issues.
**To be successful in this role you have:**
+ Experience in leveraging or critically thinking about how to integrate AI into work processes, decision-making, or problem-solving. This may include using AI-powered tools, automating workflows, analyzing AI-driven insights, or exploring AI's potential impact on the function or industry.
+ 3-5 years of experience with Linux systems.
+ Coding in any development/scripting language, such as JavaScript, Python, C++, or Java.
+ Networking skills and IP addressing.
+ MySQL database administration.
+ Monitoring of performance/availability in systems, applications and networks.
+ Uncompromising attention to detail.
+ Ability to work in shifts that cover one weekend day.
**Work Personas**
We approach our distributed world of work with flexibility and trust. Work personas (flexible, remote, or required in office) are categories that are assigned to ServiceNow employees depending on the nature of their work and their assigned work location. To determine eligibility for a work persona, ServiceNow may confirm the distance between your primary residence and the closest ServiceNow office using a third-party service.
**Equal Opportunity Employer**
ServiceNow is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, creed, religion, sex, sexual orientation, national origin or nationality, ancestry, age, disability, gender identity or expression, marital status, veteran status, or any other category protected by law. In addition, all qualified applicants with arrest or conviction records will be considered for employment in accordance with legal requirements.
**Accommodations**
We strive to create an accessible and inclusive experience for all candidates. If you require a reasonable accommodation to complete any part of the application process, or are unable to use this online application and need an alternative method to apply, please contact for assistance.
**Export Control Regulations**
For positions requiring access to controlled technology subject to export control regulations, including the U.S. Export Administration Regulations (EAR), ServiceNow may be required to obtain export control approval from government authorities for certain individuals. All employment is contingent upon ServiceNow obtaining any export license or other approval that may be required by relevant export control authorities.
From Fortune. ©2025 Fortune Media IP Limited. All rights reserved. Used under license.

Site Reliability Engineer

Akamai Technologies, Inc.

Posted 1 day ago

Job Description

**Do you have a passion for cutting edge technologies and tackling system problems?**
**Are you a self-starting professional who thrives in a dynamic environment?**
**Join the Akamai SIA Mobile SRE team**
As a Site Reliability Engineer, you'll be responsible for the operational stability and performance of critical systems and services. As part of a global team of SREs, you'll work together to provide best-in-class services, supporting our mission-critical platform for our customers.
**Partner with the best**
The SRE will focus on supporting and maintaining our development, pre-production, and production environments. You should be familiar with designing and implementing solutions at mid to large enterprise scale, have experience monitoring large-scale production environments, and be skilled in troubleshooting Cisco, Juniper, Windows, and Linux systems. You will be working closely with our development teams to ensure the reliability, scalability, and performance of our applications and services.
As a Site Reliability Engineer, you will be responsible for:
+ Deploying, configuring, and operating network devices.
+ Monitoring production environments and responding to alerts (mandatory).
+ Identifying and implementing platform improvements through automation and new technologies.
+ Troubleshooting and supporting customer and platform-related issues.
+ Participating in an on-call rotation with the SRE team.
**Do what you love**
To be successful in this role you will:
+ Have a minimum of 2 years of relevant experience, including experience working with IPSec Site-to-Site VPNs (Juniper/Cisco).
+ Have hands-on expertise with Routing, Switching, Datacentre, and Service Provider technologies (BGP, ISIS/OSPF, EVPN, VXLAN, LDP, RSVP, MPLS).
+ Have proficiency in scripting and automation (Python/Ansible).
+ Have familiarity with monitoring tools such as OpenNMS, Nagios, Grafana and knowledge of firewalls (Cisco ASA/Juniper SRX).
+ Have basic troubleshooting skills for Linux and Windows environments.
+ Have mobility/telco experience (good to have but not mandatory).
**Work in a way that works for you**
FlexBase, Akamai's Global Flexible Working Program, is based on the principles that are helping us create the best workplace in the world. When our colleagues said that flexible working was important to them, we listened. We also know flexible working is important to many of the incredible people considering joining Akamai. FlexBase gives 95% of employees the choice to work from their home, their office, or both (in the country advertised). This permanent workplace flexibility program is consistent and fair globally, to help us find incredible talent, virtually anywhere. We are happy to discuss working options for this role and encourage you to speak with your recruiter in more detail when you apply.
Learn what makes Akamai a great place to work.
Connect with us on social and see what life at Akamai is like!
**We power and protect life online, by solving the toughest challenges, together.**
At Akamai, we're curious, innovative, collaborative and tenacious. We celebrate diversity of thought and we hold an unwavering belief that we can make a meaningful difference. Our teams use their global perspectives to put customers at the forefront of everything they do, so if you are people-centric, you'll thrive here.
**Working for you**
At Akamai, we will provide you with opportunities to grow, flourish, and achieve great things. Our benefit options are designed to meet your individual needs for today and in the future. We provide benefits surrounding all aspects of your life:
+ Your health
+ Your finances
+ Your family
+ Your time at work
+ Your time pursuing other endeavors
Our benefit plan options are designed to meet your individual needs and budget, both today and in the future.
**About us**
Akamai powers and protects life online. Leading companies worldwide choose Akamai to build, deliver, and secure their digital experiences helping billions of people live, work, and play every day. With the world's most distributed compute platform from cloud to edge we make it easy for customers to develop and run applications, while we keep experiences closer to users and threats farther away.
**Join us**
Are you seeking an opportunity to make a real difference in a company with a global reach and exciting services and clients? Come join us and grow with a team of people who will energize and inspire you!

Site Reliability Engineer

Pune, Maharashtra Ensono

Posted 2 days ago

Job Description

Site Reliability Engineer - Pune, India - JR
**About Ensono**
Ensono is an expert technology adviser and managed service provider. As a relentless ally, we accelerate clients' digital transformation to achieve business outcomes that stand to last. Our dedicated team helps organizations optimize today's systems across any hybrid environment with services such as consulting, mainframe and application modernization, public cloud migration and cloud-native development. With certified experts in AWS, Azure and Google Cloud and recognized as Microsoft Datacenter Transformation Partner of the Year, Ensono has 3500+ associates globally and is headquartered in greater Chicago.
We care about your success, offering comprehensive strategic and managed services for mission-critical applications. Our Advisory and Consulting services can help upfront with an application strategy or find the right places for your applications - whether it's public, multi or hybrid cloud, or mainframe. And because we span across all mission-critical platforms, we can meet you wherever you are in your digital transformation journey, with 24/7 support when you need it. We are your relentless ally, flexing with you when challenges emerge so you don't feel stuck in place. With cross-platform certifications and decades of experience, our technology experts have become an extension of your team so you're continuously innovating - doing more with less while remaining secure. And that's just the beginning.
**Technical Skills:**
SRE
+ Commercial experience and proficiency with industry standard:
1. IaC tooling (Terraform preferably, or ARM/Bicep and CloudFormation)
2. Core CI/CD tooling (Azure DevOps, GitHub Actions or GitLab - experience with Harness is beneficial)
3. Monitoring tooling (Datadog, Splunk, New Relic, Azure Monitor)
+ Commercial experience in at least one core technology (.NET, Java, JavaScript), including:
1. Troubleshooting issues and identifying systemic failings indicated by incidents/failures
2. Implementing fixes
3. Proposing solutions for reducing toil
+ Providing leadership in the incident resolution process, including creating and maintaining documentation, and providing key input to post-mortem analysis
+ Improving Service Request and Change Management processes, both technically and through stakeholder management
+ Participating in the security management process and proactively mitigating risks (vulnerabilities in code, infrastructure, and dependencies)
+ Leading client-facing meetings and discussions around the SRE process, and identifying areas for increasing the SRE footprint
+ Engaging with suppliers and 3rd parties for support, requests and opportunities
+ SRE Foundation certificate (DevOps Institute) and an Azure 'Associate'-level certification highly beneficial or required during probationary period.
Experience: 4+ yrs

Site Reliability Engineer

Bangalore, Karnataka Iron Mountain

Posted 2 days ago

Job Description

At Iron Mountain we know that work, when done well, makes a positive impact for our customers, our employees, and our planet. That's why we need smart, committed people to join us. Whether you're looking to start your career or make a change, talk to us and see how you can elevate the power of your work at Iron Mountain.
We provide expert, sustainable solutions in records and information management, digital transformation services, data centers, asset lifecycle management, and fine art storage, handling, and logistics. We proudly partner every day with our 225,000 customers around the world to preserve their invaluable artifacts, extract more from their inventory, and protect their data privacy in innovative and socially responsible ways.
Are you curious about being part of our growth story while evolving your skills in a culture that will welcome your unique contributions? If so, let's start the conversation.
Category: Information Technology
Iron Mountain is a global leader in storage and information management services trusted by more than 225,000 organizations in 60 countries. We safeguard billions of our customers' assets, including critical business information, highly sensitive data, and invaluable cultural and historic artifacts.
Iron Mountain helps lower cost and risk, comply with regulations, recover from disaster, and enable digital and sustainable solutions, whether in information management, digital transformation, secure storage and destruction, data center operations, cloud services, or art storage and logistics. Please see our Values and Code of Ethics for a look at our principles and aspirations in elevating the power of our work together.
If you have a physical or mental disability that requires special accommodations, please let us know. See the Supplement to learn more about Equal Employment Opportunity.
Iron Mountain is committed to a policy of equal employment opportunity. We recruit and hire applicants without regard to race, color, religion, sex (including pregnancy), national origin, disability, age, sexual orientation, veteran status, genetic information, gender identity, gender expression, or any other factor prohibited by law.
**Requisition:** J

Site Reliability Engineer

Gurgaon, Haryana S&P Global

Posted 2 days ago

Job Description

**About the Role:**
**Grade Level (for internal use):**
10
Department overview
S&P Global provides innovative products and services that enhance transparency, reduce risk, and improve operational efficiency. Our customers include banks, hedge funds, asset managers, central banks, regulators, auditors, fund administrators and insurance companies. We develop large scale technology platforms and enterprise software to produce global financial data with focus on analysis and regulatory requirements.
Position Summary:
We are seeking a proactive and innovative Site Reliability Engineer to join our growing team. In this role, you will be a key player in ensuring the reliability, scalability, and performance of our critical systems. You will move beyond traditional monitoring to implement advanced observability, leverage AIOps for predictive insights, and use Chaos Engineering to proactively uncover system weaknesses. This is an opportunity to help shape a modern SRE culture, automate away toil, and empower our development teams to build more resilient applications from the ground up.
Key Responsibilities
1. Observability & Proactive System Health
+ Design, build, and maintain a comprehensive observability platform using tools like Splunk and OpenTelemetry to provide deep insights into system health and performance.
+ Leverage AIOps principles and platforms to enhance anomaly detection, automate event correlation, and enable predictive alerting, reducing mean time to detection (MTTD).
+ Develop and manage robust alerting strategies and SLO-based dashboards to ensure critical issues are addressed before they impact customers.
+ Drive a data-driven culture by providing engineering teams with the visibility they need to understand the impact of their code in production.
2. Reliability & Resilience Engineering
+ Design, implement, and conduct Chaos Engineering experiments to proactively identify and remediate system weaknesses, architectural flaws, and potential cascading failures.
+ Partner with software engineering teams throughout the application lifecycle to architect for high availability, disaster recovery, and fault tolerance.
+ Define, measure, and evangelize Service Level Indicators (SLIs) and Service Level Objectives (SLOs), and manage the associated error budgets to balance reliability with feature velocity.
+ Analyze and lead blameless post-mortems for incidents, ensuring that root causes are addressed and preventative measures are implemented to avoid recurrence.
3. Performance & Efficiency Optimization
+ Analyze performance metrics and distributed traces to identify and resolve latency bottlenecks across our infrastructure and applications.
+ Implement cost optimization (FinOps) strategies by identifying and eliminating resource waste, optimizing cloud service usage, and promoting efficient architecture patterns.
+ Work with development teams to conduct performance testing and ensure new features do not introduce performance regressions.
4. Automation & Platform Engineering
+ Identify and aggressively automate manual operational tasks (toil) by developing scripts, tools, and self-healing systems.
+ Enhance and maintain our Infrastructure as Code (IaC) modules, promoting reusable patterns and best practices with Terraform.
+ Improve and secure CI/CD pipelines (e.g., GitHub Actions, Azure DevOps) to enable safe, automated, and rapid deployment and rollback procedures.
Requirements and Qualifications
Core Technical Skills
+ Experience: 4+ years in a Site Reliability, DevOps, or Cloud Engineering role, with demonstrable experience in a large-scale production environment.
+ Cloud Proficiency: Deep experience with AWS services (EKS, ECS, EC2, S3, RDS, Lambda) and managing production workloads in the cloud.
+ Observability: Proficient in application observability, monitoring, and logging. Hands-on experience with tools like Splunk, OpenTelemetry, Prometheus, Grafana, or Datadog is essential.
+ Infrastructure as Code (IaC): Strong experience with Terraform for provisioning and managing cloud infrastructure.
+ Containerization: Solid understanding of containerization technology, particularly with managed services like EKS or ECS.
+ CI/CD: Experience building and maintaining CI/CD pipelines using tools like GitHub Actions, Azure DevOps, or Jenkins.
+ Scripting & Automation: Strong scripting skills in languages like Python, Bash, or PowerShell for automation and tooling. Familiarity with a higher-level language such as C# (.NET) is a plus.
+ Modern Practices: Experience with or a demonstrated understanding of AIOps concepts and Chaos Engineering principles and tools (e.g., Gremlin, AWS Fault Injection Simulator).
Professional Attributes
+ SRE Mindset: A true understanding of Site Reliability Engineering principles, including SLOs, error budgets, and the value of eliminating toil.
+ Problem-Solving: Excellent troubleshooting and problem-solving skills, with a methodical approach to resolving complex technical issues under pressure.
+ Collaboration: Ability to work effectively with development teams, product managers, and other stakeholders, communicating complex technical ideas clearly.
+ Ownership & Drive: A strong sense of ownership, urgency, and a passion for building and maintaining highly available, performant, and reliable systems.
+ Agile Experience: Comfortable working in an agile environment and contributing to team sprints and planning.
+ On-Call: Willingness to participate in a scheduled on-call rotation.
Education & Certifications
+ Bachelor's degree in Computer Science, Information Technology, or a related field, or equivalent practical experience.
+ AWS certification (e.g., AWS Certified Solutions Architect, DevOps Engineer) is highly preferred.
**About S&P Global Market Intelligence**
At S&P Global Market Intelligence, a division of S&P Global, we understand the importance of accurate, deep and insightful information. Our team of experts delivers unrivaled insights and leading data and technology solutions, partnering with customers to expand their perspective, operate with confidence, and make decisions with conviction.
**What's In It For You?**
**Our Purpose:**
Progress is not a self-starter. It requires a catalyst to be set in motion. Information, imagination, people, technology: the right combination can unlock possibility and change the world.
Our world is in transition and getting more complex by the day. We push past expected observations and seek out new levels of understanding so that we can help companies, governments and individuals make an impact on tomorrow. At S&P Global we transform data into Essential Intelligence®, pinpointing risks and opening possibilities. We Accelerate Progress.
**Our People:**
We're more than 35,000 strong worldwide, so we're able to understand nuances while having a broad perspective. Our team is driven by curiosity and a shared belief that Essential Intelligence can help build a more prosperous future for us all.
From finding new ways to measure sustainability to analyzing energy transition across the supply chain to building workflow solutions that make it easy to tap into insight and apply it, we are changing the way people see things and empowering them to make an impact on the world we live in. We're committed to a more equitable future and to helping our customers find new, sustainable ways of doing business. We're constantly seeking new solutions that have progress in mind. Join us and help create the critical insights that truly make a difference.
**Our Values:**
**Integrity, Discovery, Partnership**
At S&P Global, we focus on Powering Global Markets. Throughout our history, the world's leading organizations have relied on us for the Essential Intelligence they need to make confident decisions about the road ahead. We start with a foundation of **integrity** in all we do, bring a spirit of **discovery** to our work, and collaborate in close **partnership** with each other and our customers to achieve shared goals.
**Benefits:**
We take care of you, so you can take care of business. We care about our people. That's why we provide everything you, and your career, need to thrive at S&P Global.
Our benefits include:
+ Health & Wellness: Health care coverage designed for the mind and body.
+ Flexible Downtime: Generous time off helps keep you energized for your time on.
+ Continuous Learning: Access a wealth of resources to grow your career and learn valuable new skills.
+ Invest in Your Future: Secure your financial future through competitive pay, retirement planning, a continuing education program with a company-matched student loan contribution, and financial wellness programs.
+ Family Friendly Perks: It's not just about you. S&P Global has perks for your partners and little ones, too, with some best-in-class benefits for families.
+ Beyond the Basics: From retail discounts to referral incentive awards, small perks can make a big difference.
**Hiring and Opportunity at S&P Global:**
At S&P Global, we are committed to fostering a connected and engaged workplace where all individuals have access to opportunities based on their skills, experience, and contributions. Our hiring practices emphasize fairness, transparency, and merit, ensuring that we attract and retain top talent. By valuing different perspectives and promoting a culture of respect and collaboration, we drive innovation and power global markets.
**Recruitment Fraud Alert:**
If you receive an email from a spglobalind.com domain or any other regionally based domains, it is a scam and should be reported. S&P Global never requires any candidate to pay money for job applications, interviews, offer letters, "pre-employment training" or for equipment/delivery of equipment. Stay informed and protect yourself from recruitment fraud by reviewing our guidelines, fraudulent domains, and how to report suspicious activity.
---
**Equal Opportunity Employer**
S&P Global is an equal opportunity employer and all qualified candidates will receive consideration for employment without regard to race/ethnicity, color, religion, sex, sexual orientation, gender identity, national origin, age, disability, marital status, military veteran status, unemployment status, or any other status protected by law. Only electronic job submissions will be considered for employment.
If you need an accommodation during the application process due to a disability, please send an email to:   and your request will be forwarded to the appropriate person. 
**US Candidates Only:** The EEO is the Law Poster   describes discrimination protections under federal law. Pay Transparency Nondiscrimination Provision - - Middle Professional Tier I (EEO Job Group)
**Job ID:**
**Posted On:**
**Location:** Gurgaon, Haryana, India

Site Reliability Engineer

Noida, Uttar Pradesh S&P Global

Posted 2 days ago

Job Description

**About the Role:**
**Grade Level (for internal use):**
10
Department overview
S&P Global provides innovative products and services that enhance transparency, reduce risk, and improve operational efficiency. Our customers include banks, hedge funds, asset managers, central banks, regulators, auditors, fund administrators and insurance companies. We develop large scale technology platforms and enterprise software to produce global financial data with focus on analysis and regulatory requirements.
Position Summary:
We are seeking a proactive and innovative Site Reliability Engineer to join our growing team. In this role, you will be a key player in ensuring the reliability, scalability, and performance of our critical systems. You will move beyond traditional monitoring to implement advanced observability, leverage AIOps for predictive insights, and use Chaos Engineering to proactively uncover system weaknesses. This is an opportunity to help shape a modern SRE culture, automate away toil, and empower our development teams to build more resilient applications from the ground up.
Key Responsibilities
1. Observability & Proactive System Health
+ Design, build, and maintain a comprehensive observability platform using tools like Splunk and OpenTelemetry to provide deep insights into system health and performance.
+ Leverage AIOps principles and platforms to enhance anomaly detection, automate event correlation, and enable predictive alerting, reducing mean time to detection (MTTD).
+ Develop and manage robust alerting strategies and SLO-based dashboards to ensure critical issues are addressed before they impact customers.
+ Drive a data-driven culture by providing engineering teams with the visibility they need to understand the impact of their code in production.
2. Reliability & Resilience Engineering
+ Design, implement, and conduct Chaos Engineering experiments to proactively identify and remediate system weaknesses, architectural flaws, and potential cascading failures.
+ Partner with software engineering teams throughout the application lifecycle to architect for high availability, disaster recovery, and fault tolerance.
+ Define, measure, and evangelize Service Level Indicators (SLIs) and Service Level Objectives (SLOs), and manage the associated error budgets to balance reliability with feature velocity.
+ Analyze and lead blameless post-mortems for incidents, ensuring that root causes are addressed and preventative measures are implemented to avoid recurrence.
3. Performance & Efficiency Optimization
+ Analyze performance metrics and distributed traces to identify and resolve latency bottlenecks across our infrastructure and applications.
+ Implement cost optimization (FinOps) strategies by identifying and eliminating resource waste, optimizing cloud service usage, and promoting efficient architecture patterns.
+ Work with development teams to conduct performance testing and ensure new features do not introduce performance regressions.
4. Automation & Platform Engineering
+ Identify and aggressively automate manual operational tasks (toil) by developing scripts, tools, and self-healing systems.
+ Enhance and maintain our Infrastructure as Code (IaC) modules, promoting reusable patterns and best practices with Terraform.
+ Improve and secure CI/CD pipelines (e.g., GitHub Actions, Azure DevOps) to enable safe, automated, and rapid deployment and rollback procedures.
Requirements and Qualifications
Core Technical Skills
+ Experience: 4+ years in a Site Reliability, DevOps, or Cloud Engineering role, with demonstrable experience in a large-scale production environment.
+ Cloud Proficiency: Deep experience with AWS services (EKS, ECS, EC2, S3, RDS, Lambda) and managing production workloads in the cloud.
+ Observability: Proficient in application observability, monitoring, and logging. Hands-on experience with tools like Splunk, OpenTelemetry, Prometheus, Grafana, or Datadog is essential.
+ Infrastructure as Code (IaC): Strong experience with Terraform for provisioning and managing cloud infrastructure.
+ Containerization: Solid understanding of containerization technology, particularly with managed services like EKS or ECS.
+ CI/CD: Experience building and maintaining CI/CD pipelines using tools like GitHub Actions, Azure DevOps, or Jenkins.
+ Scripting & Automation: Strong scripting skills in languages like Python, Bash, or PowerShell for automation and tooling. Familiarity with a higher-level language such as C# (.NET) is a plus.
+ Modern Practices: Experience with or a demonstrated understanding of AIOps concepts and Chaos Engineering principles and tools (e.g., Gremlin, AWS Fault Injection Simulator).
Professional Attributes
+ SRE Mindset: A true understanding of Site Reliability Engineering principles, including SLOs, error budgets, and the value of eliminating toil.
+ Problem-Solving: Excellent troubleshooting and problem-solving skills, with a methodical approach to resolving complex technical issues under pressure.
+ Collaboration: Ability to work effectively with development teams, product managers, and other stakeholders, communicating complex technical ideas clearly.
+ Ownership & Drive: A strong sense of ownership, urgency, and a passion for building and maintaining highly available, performant, and reliable systems.
+ Agile Experience: Comfortable working in an agile environment and contributing to team sprints and planning.
+ On-Call: Willingness to participate in a scheduled on-call rotation.
Education & Certifications
+ Bachelor's degree in Computer Science, Information Technology, or a related field, or equivalent practical experience.
+ AWS certification (e.g., AWS Certified Solutions Architect, DevOps Engineer) is highly preferred.
**About S&P Global Market Intelligence**
At S&P Global Market Intelligence, a division of S&P Global, we understand the importance of accurate, deep and insightful information. Our team of experts delivers unrivaled insights and leading data and technology solutions, partnering with customers to expand their perspective, operate with confidence, and make decisions with conviction.
For more information, visit .
**What's In It For You?**
**Our Purpose:**
Progress is not a self-starter. It requires a catalyst to be set in motion. Information, imagination, people, technology: the right combination can unlock possibility and change the world.
Our world is in transition and getting more complex by the day. We push past expected observations and seek out new levels of understanding so that we can help companies, governments and individuals make an impact on tomorrow. At S&P Global we transform data into Essential Intelligence®, pinpointing risks and opening possibilities. We Accelerate Progress.
**Our People:**
We're more than 35,000 strong worldwide, so we're able to understand nuances while having a broad perspective. Our team is driven by curiosity and a shared belief that Essential Intelligence can help build a more prosperous future for us all.

Site Reliability Engineer

Hyderabad, Andhra Pradesh Amgen

Posted 2 days ago

Job Description

**Join Amgen's Mission of Serving Patients**
At Amgen, if you feel like you're part of something bigger, it's because you are. Our shared mission, to serve patients living with serious illnesses, drives all that we do.
Since 1980, we've helped pioneer the world of biotech in our fight against the world's toughest diseases. With our focus on four therapeutic areas (Oncology, Inflammation, General Medicine, and Rare Disease), we reach millions of patients each year. As a member of the Amgen team, you'll help make a lasting impact on the lives of patients as we research, manufacture, and deliver innovative medicines to help people live longer, fuller, happier lives.
Our award-winning culture is collaborative, innovative, and science based. If you have a passion for challenges and the opportunities that lie within them, you'll thrive as part of the Amgen team. Join us and transform the lives of patients while transforming your career.
Site Reliability Engineer
**What you will do**
Let's do this. Let's change the world. In this vital role you will be responsible for the reliability, stability, performance, scalability, and security of platforms that support Amgen's digital products and engineering teams. This hands-on role focuses on supporting cloud-based infrastructure, automating operations, maintaining observability, and improving platform reliability through code.
You'll work closely with senior engineers and cross-functional teams to support CI/CD workflows, container platforms, incident response, and enterprise tooling, all while adopting modern SRE principles and practices.
This role is ideal for engineers who have foundational site reliability experience and are looking to expand their skills in a cloud-native, enterprise-scale environment.
**Roles & Responsibilities:**
**Infrastructure & Platform Support**
+ Provision and manage cloud infrastructure using Infrastructure as Code (IaC)
+ Support container orchestration platforms, ensuring availability, access control, and resource management
+ Assist in configuring and maintaining CI/CD pipelines and environments
**Monitoring & Incident Response**
+ Set up and maintain observability tools to track system health and performance
+ Participate in alert tuning, incident resolution, and root cause analysis
+ Support integration of observability platforms with incident response workflows
**Automation & Platform Operations**
+ Automate routine platform tasks such as provisioning, patching, and configuration
+ Write scripts to improve platform reliability, reduce manual work, and enforce compliance
+ Participate in platform upgrades, maintenance windows, and service validation efforts
**AI Enablement & Intelligence**
+ Support the adoption of AI-assisted operational tools for log analysis, anomaly detection, and predictive alerts
+ Collaborate with senior engineers to evaluate AI/ML-based observability and automation platforms
+ Assist in integrating AI-driven insights into dashboards, alerts, or incident workflows
+ Stay current with emerging AI trends in infrastructure and site reliability, and contribute to tool evaluations and pilots
**Collaboration & Enablement**
+ Work with development, QA, and security teams to ensure reliable and secure deployments
+ Document operational procedures, playbooks, and system runbooks
+ Learn and support enterprise collaboration platforms and internal tooling
+ Participate in Agile and SAFe delivery processes (including sprint planning, stand-ups, retrospectives, and PI planning) to ensure security and platform reliability are embedded across development cycles.
**What we expect of you**
We are all different, yet we all use our unique contributions to serve patients. The professional we seek has these qualifications.
**Basic Qualifications:**
+ Master's or Bachelor's degree and 5 to 9 years of experience in Computer Science, IT, or a related field
+ 4 years of hands-on related experience in site reliability, DevOps, or platform engineering roles
+ Hands-on experience with cloud platforms, preferably AWS
+ Familiarity with Kubernetes or container orchestration technologies
+ Exposure to CI/CD practices and pipeline automation
+ Experience troubleshooting Linux systems, processes, and services
**Preferred Qualifications:**
**Must-Have Skills:**
+ Practical experience with **cloud platforms** (e.g., AWS, Azure, or GCP), including compute, networking, IAM, and storage services
+ Familiarity with **container orchestration platforms** (e.g., Kubernetes, Docker), including basic workload deployment and troubleshooting
+ Experience using **Infrastructure as Code (IaC)** tools such as **Terraform** or **CloudFormation**
+ Working knowledge of **Linux administration**, including system services, package management, and file system structures
+ Hands-on exposure to **CI/CD platforms** (e.g., GitLab CI, Jenkins, GitHub Actions) and pipeline troubleshooting
+ Proficiency in **scripting or automation languages** like **Python**, **Bash**, or **Go**
+ Exposure to **observability tooling** (e.g., **Dynatrace**, **Prometheus**, or **Grafana**) for monitoring and alerting
+ Familiarity with **incident management practices** and tools (e.g., runbooks, escalation workflows, basic alert tuning)
+ Version control skills using **Git** and understanding of branching strategies
+ Experience supporting or integrating **enterprise collaboration platforms** (e.g., Jira, Confluence, ServiceNow)
+ Interest and basic understanding of **AI/ML tools** used in infrastructure and operations (e.g., anomaly detection, intelligent alerting, log analysis)
**Good-to-Have Skills:**
+ Experience using Infrastructure as Code (IaC) tools like Terraform or CloudFormation
+ Familiarity with IT incident response workflows and ticketing platforms
+ Knowledge of secrets management, configuration management tools (e.g., Ansible), or logging frameworks
+ Exposure to **AI-assisted tooling** (e.g., AIOps platforms, AI-enhanced alerting, anomaly detection)
**Professional Certifications (Preferred)**
+ Cloud DevOps Certification (AWS/Azure/GCP)
+ Certified Kubernetes Administrator (CKA) or Security Specialist (CKS)
+ CI/CD Platform Certification
+ ITIL Foundation or equivalent service management certification
**Soft Skills:**
+ Strong analytical and troubleshooting skills
+ Collaborative and proactive mindset
+ Effective communication and documentation practices
+ Curiosity and willingness to adopt new tools and methods, including AI integrations
+ Ability to manage time and prioritize tasks in dynamic environments
**Shift Information:** This position is an onsite role and may require working during later hours to align with business hours. Candidates must be willing and able to work outside of standard hours as required to meet business needs.
**What you can expect of us**
As we work to develop treatments that take care of others, we also work to care for your professional and personal growth and well-being. From our competitive benefits to our collaborative culture, we'll support your journey every step of the way.
In addition to the base salary, Amgen offers competitive and comprehensive Total Rewards Plans that are aligned with local industry standards.
**Apply now and make a lasting impact with the Amgen team.**
**careers.amgen.com**
As an organization dedicated to improving the quality of life for people around the world, Amgen fosters an inclusive environment of diverse, ethical, committed and highly accomplished people who respect each other and live the Amgen values to continue advancing science to serve patients. Together, we compete in the fight against serious disease.
Amgen is an Equal Opportunity employer and will consider all qualified applicants for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, protected veteran status, disability status, or any other basis protected by applicable law.
We will ensure that individuals with disabilities are provided reasonable accommodation to participate in the job application or interview process, to perform essential job functions, and to receive other benefits and privileges of employment. Please contact us to request accommodation.

Site Reliability Engineer

Chennai, Tamil Nadu UPS

Posted 2 days ago

Job Description

**Before applying for a job, select your preferred language from the options available at the top right of this page.**
Discover your next opportunity at an organization that ranks among the world's 500 largest companies. Explore innovative opportunities, discover our rewarding culture, and work with talented teams that push you to grow every day. We know what it takes to lead UPS into the future: passionate people with a unique combination of skills. If you have the qualities, motivation, autonomy, or leadership to lead teams, there are roles suited to your aspirations and skills, today and tomorrow.
**Job Description:**
**Job Summary:**
We are seeking a skilled and proactive **Site Reliability Engineer (SRE)** with 5-8 years of experience and deep expertise in **Google Cloud Platform (GCP)**. The ideal candidate will be responsible for the reliability, availability, and performance of cloud-based applications and infrastructure. You will collaborate with development, operations, and security teams to build and maintain scalable, secure, and highly available systems.
**Key Responsibilities:**
+ Design, develop, and maintain **reliable, scalable, and highly available systems** on GCP.
+ Build and manage **CI/CD pipelines**, infrastructure as code (IaC), and monitoring solutions.
+ Proactively monitor and manage **system performance, uptime, and capacity** using observability tools.
+ Troubleshoot and resolve **infrastructure and application-level issues** in real time.
+ Implement and maintain **disaster recovery**, **failover mechanisms**, and **backup strategies**.
+ Automate repetitive tasks and processes to improve **efficiency and reduce toil**.
+ Participate in **on-call rotations**, incident management, and root cause analysis (RCA).
+ Ensure compliance with **security standards, privacy regulations, and governance policies**.
+ Collaborate with cross-functional teams to support **DevOps and SRE best practices**.
+ Drive improvements in **SLAs, SLOs, and error budgets** through data-driven insights.
**Required Qualifications:**
+ 5-8 years of relevant experience as an SRE, DevOps Engineer, or Cloud Infrastructure Engineer.
+ Strong hands-on experience with **Google Cloud Platform (GCP)** - Compute Engine, GKE, Cloud Functions, Cloud Storage, IAM, BigQuery, etc.
+ Proficiency in **Infrastructure as Code** tools like **Terraform**, **Deployment Manager**, or **CloudFormation**.
+ Experience with **Kubernetes**, **Docker**, and container orchestration.
+ Proficiency in scripting languages like **Python**, **Shell**, or **Go**.
+ Deep understanding of **monitoring and logging tools** such as **Prometheus**, **Grafana**, **Stackdriver**, or **Datadog**.
+ Knowledge of **CI/CD tools** such as Jenkins, GitLab CI, or Cloud Build.
+ Experience with **incident response**, **postmortem analysis**, and **site reliability principles**.
+ Strong problem-solving and communication skills.
**Preferred Qualifications:**
+ GCP certifications (e.g., **Professional Cloud DevOps Engineer**, **Cloud Architect**).
+ Exposure to **multi-cloud environments** or hybrid cloud infrastructure.
+ Familiarity with **Agile** and **ITIL** frameworks.
+ Experience working in regulated environments with compliance standards (e.g., ISO, SOC2).
**Contract type:**
Permanent
_At UPS, equal opportunity, fair treatment, and an inclusive work environment are key values to which we are committed._