1,502 Reliability Engineering jobs in India

Manager, Site Reliability Engineering

Hyderabad, Andhra Pradesh ServiceNow, Inc.

Posted 2 days ago

Job Description

It all started in sunny San Diego, California in 2004 when a visionary engineer, Fred Luddy, saw the potential to transform how we work. Fast forward to today - ServiceNow stands as a global market leader, bringing innovative AI-enhanced technology to over 8,100 customers, including 85% of the Fortune 500®. Our intelligent cloud-based platform seamlessly connects people, systems, and processes to empower organizations to find smarter, faster, and better ways to work. But this is just the beginning of our journey. Join us as we pursue our purpose to make the world work better for everyone.
**What you get to do in this role:**
As a Manager of the SRE team your responsibilities will be:
+ Team management, career development, project prioritization and performance review.
+ Drive a culture of intolerance to manual activities that promotes automation efforts.
+ Drive initiatives with partner teams to improve the reliability of the infrastructure.
+ Act as crisis manager, orchestrating actions toward sustainable solutions.
+ Analysis and evaluation of existing processes to drive continuous improvement and efficiencies.
+ Provide training and support to partner teams that interface with SRE.
+ Onboarding of new hires to enable their success in their roles.
+ Onboarding of new technologies, systems and automations into the team.
**To be successful in this role you have:**
+ Experience in leveraging or critically thinking about how to integrate AI into work processes, decision-making, or problem-solving. This may include using AI-powered tools, automating workflows, analyzing AI-driven insights, or exploring AI's potential impact on the function or industry.
+ 10+ years of overall experience.
+ Hands-on technical skills in Linux, databases, systems and coding.
+ Experience in team management.
+ Design and implementation of monitoring solutions for large and scalable environments.
+ Experience with cloud operations, follow-the-sun and geographically distributed teams.
+ Experience working in software, platform and infrastructure delivered as a service.
+ Knowledge of principles and methods involving ITIL v3.
+ Outstanding interpersonal skills and strong communication skills, both written and verbal.
+ Uncompromising attention to detail.
We also have pluses. These are not a 'must', but please highlight them on your resume if you have:
+ RHCE, CCNA, ITIL or other industry certifications
**Work Personas**
We approach our distributed world of work with flexibility and trust. Work personas (flexible, remote, or required in office) are categories that are assigned to ServiceNow employees depending on the nature of their work and their assigned work location. To determine eligibility for a work persona, ServiceNow may confirm the distance between your primary residence and the closest ServiceNow office using a third-party service.
**Equal Opportunity Employer**
ServiceNow is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, creed, religion, sex, sexual orientation, national origin or nationality, ancestry, age, disability, gender identity or expression, marital status, veteran status, or any other category protected by law. In addition, all qualified applicants with arrest or conviction records will be considered for employment in accordance with legal requirements.
**Accommodations**
We strive to create an accessible and inclusive experience for all candidates. If you require a reasonable accommodation to complete any part of the application process, or are unable to use this online application and need an alternative method to apply, please contact ServiceNow for assistance.
**Export Control Regulations**
For positions requiring access to controlled technology subject to export control regulations, including the U.S. Export Administration Regulations (EAR), ServiceNow may be required to obtain export control approval from government authorities for certain individuals. All employment is contingent upon ServiceNow obtaining any export license or other approval that may be required by relevant export control authorities.
From Fortune. ©2025 Fortune Media IP Limited. All rights reserved. Used under license.

Site Reliability Engineering Lead

Mumbai, Maharashtra RELX INC

Posted 2 days ago

Job Description

Would you like to be part of a team that delivers high-quality software to our customers?
Are you a visible champion with a 'can do' attitude and enthusiasm that inspires others?
About the Business
LexisNexis Risk Solutions is the essential partner in the assessment of risk. Within our Business Services vertical, we offer a multitude of solutions focused on helping businesses of all sizes drive higher revenue growth, maximize operational efficiencies, and improve customer experience. Our solutions help our customers solve difficult problems in the areas of Anti-Money Laundering/Counter Terrorist Financing, Identity Authentication & Verification, Fraud and Credit Risk mitigation and Customer Data Management.
About the Team
This Team performs complex research, design, and software development assignments within a software functional area or product line, and provides direct input to project plans, schedules, and methodology in the development of cross-functional software products.
About the Role
We are seeking a highly skilled and experienced Lead Site Reliability Engineer (SRE) to drive reliability, scalability, and performance across our cloud infrastructure, with a strong emphasis on cloud security and compliance. This role blends reliability engineering with security best practices to ensure our cloud infrastructure is not only scalable and resilient but also secure and compliant.
Responsibilities:
+ Build and maintain secure, scalable, and resilient systems using principles from software development, IT operations, and cybersecurity.
+ Implement and manage continuous security monitoring and threat detection to protect systems and data.
+ Collaborate with development teams to integrate security and observability tools into CI/CD pipelines, automating security checks.
+ Develop and maintain tools, scripts, and pipelines for infrastructure management, code deployment, and system monitoring.
+ Address vulnerabilities in code libraries and infrastructure (e.g., OS packages) through patching and remediation.
+ Review and remediate findings from cloud security tools (e.g., Wiz, Azure/AWS Security Recommendations).
+ Provide centralized support for common cloud configuration patterns while addressing application-specific issues.
+ Lead efforts to remediate vulnerabilities across platforms and cloud providers, working with centralized teams for efficiency.
+ Partner with application teams to resolve specific security findings and improve overall system resilience.
Requirements:
+ Experience (typically 10+ years) in DevOps, Site Reliability Engineering (SRE), or Cloud Engineering.
+ Strong understanding of DevOps practices and infrastructure-as-code tools (e.g., Terraform).
+ Proficiency in scripting or programming languages (e.g., Python, Bash, Go).
+ Hands-on experience with cloud platforms such as Azure, AWS, or Google Cloud Platform (GCP).
+ Familiarity with cloud security, vulnerability management, and patching practices.
+ Experience with threat modeling and vulnerability assessments for code and infrastructure.
+ Ability to interpret and act on cloud security recommendations and configuration findings.
We are committed to providing a fair and accessible hiring process. If you have a disability or other need that requires accommodation or adjustment, please let us know by completing our Applicant Request Support Form.
Criminals may pose as recruiters asking for money or personal information. We never request money or banking details from job applicants.
Please read our Candidate Privacy Policy.
We are an equal opportunity employer: qualified applicants are considered for and treated during employment without regard to race, color, creed, religion, sex, national origin, citizenship status, disability status, protected veteran status, age, marital status, sexual orientation, gender identity, genetic information, or any other characteristic protected by law.
USA Job Seekers:
EEO Know Your Rights.
RELX is a global provider of information-based analytics and decision tools for professional and business customers, enabling them to make better decisions, get better results and be more productive.
Our purpose is to benefit society by developing products that help researchers advance scientific knowledge; doctors and nurses improve the lives of patients; lawyers promote the rule of law and achieve justice and fair results for their clients; businesses and governments prevent fraud; consumers access financial services and get fair prices on insurance; and customers learn about markets and complete transactions.
Our purpose guides our actions beyond the products that we develop. It defines us as a company. Every day across RELX our employees are inspired to undertake initiatives that make unique contributions to society and the communities in which we operate.

Site Reliability Engineering Lead

Chennai, Tamil Nadu RELX INC

Posted 2 days ago

Job Description

Would you like to be part of a team that delivers high-quality software to our customers?
Are you a visible champion with a 'can do' attitude and enthusiasm that inspires others?
About the Business
LexisNexis Risk Solutions is the essential partner in the assessment of risk. Within our Business Services vertical, we offer a multitude of solutions focused on helping businesses of all sizes drive higher revenue growth, maximize operational efficiencies, and improve customer experience. Our solutions help our customers solve difficult problems in the areas of Anti-Money Laundering/Counter Terrorist Financing, Identity Authentication & Verification, Fraud and Credit Risk mitigation and Customer Data Management.
About the Team
This Team performs complex research, design, and software development assignments within a software functional area or product line, and provides direct input to project plans, schedules, and methodology in the development of cross-functional software products.
About the Role
We are seeking a highly skilled and experienced Lead Site Reliability Engineer (SRE) to drive reliability, scalability, and performance across our cloud infrastructure, with a strong emphasis on cloud security and compliance. This role blends reliability engineering with security best practices to ensure our cloud infrastructure is not only scalable and resilient but also secure and compliant.
Responsibilities:
+ Build and maintain secure, scalable, and resilient systems using principles from software development, IT operations, and cybersecurity.
+ Implement and manage continuous security monitoring and threat detection to protect systems and data.
+ Collaborate with development teams to integrate security and observability tools into CI/CD pipelines, automating security checks.
+ Develop and maintain tools, scripts, and pipelines for infrastructure management, code deployment, and system monitoring.
+ Address vulnerabilities in code libraries and infrastructure (e.g., OS packages) through patching and remediation.
+ Review and remediate findings from cloud security tools (e.g., Wiz, Azure/AWS Security Recommendations).
+ Provide centralized support for common cloud configuration patterns while addressing application-specific issues.
+ Lead efforts to remediate vulnerabilities across platforms and cloud providers, working with centralized teams for efficiency.
+ Partner with application teams to resolve specific security findings and improve overall system resilience.
Requirements:
+ Experience (typically 10+ years) in DevOps, Site Reliability Engineering (SRE), or Cloud Engineering.
+ Strong understanding of DevOps practices and infrastructure-as-code tools (e.g., Terraform).
+ Proficiency in scripting or programming languages (e.g., Python, Bash, Go).
+ Hands-on experience with cloud platforms such as Azure, AWS, or Google Cloud Platform (GCP).
+ Familiarity with cloud security, vulnerability management, and patching practices.
+ Experience with threat modeling and vulnerability assessments for code and infrastructure.
+ Ability to interpret and act on cloud security recommendations and configuration findings.
We are committed to providing a fair and accessible hiring process. If you have a disability or other need that requires accommodation or adjustment, please let us know by completing our Applicant Request Support Form.
Criminals may pose as recruiters asking for money or personal information. We never request money or banking details from job applicants.
Please read our Candidate Privacy Policy.
We are an equal opportunity employer: qualified applicants are considered for and treated during employment without regard to race, color, creed, religion, sex, national origin, citizenship status, disability status, protected veteran status, age, marital status, sexual orientation, gender identity, genetic information, or any other characteristic protected by law.
USA Job Seekers:
EEO Know Your Rights.
RELX is a global provider of information-based analytics and decision tools for professional and business customers, enabling them to make better decisions, get better results and be more productive.
Our purpose is to benefit society by developing products that help researchers advance scientific knowledge; doctors and nurses improve the lives of patients; lawyers promote the rule of law and achieve justice and fair results for their clients; businesses and governments prevent fraud; consumers access financial services and get fair prices on insurance; and customers learn about markets and complete transactions.
Our purpose guides our actions beyond the products that we develop. It defines us as a company. Every day across RELX our employees are inspired to undertake initiatives that make unique contributions to society and the communities in which we operate.

Site Reliability Engineering Manager

Bengaluru, Karnataka Tata Consultancy Services

Posted 5 days ago

Job Description

Role: Manager, Site Reliability Engineering

Required Technical Skill Set: Manager, Site Reliability Engineering

Desired Experience Range: 12 - 18 yrs

Notice Period: Immediate to 90 days only

Location of Requirement: Bangalore

We are currently planning to do a Virtual Interview


Job Description:

As the Manager of Site Reliability Engineering on the Infrastructure Reliability team, you will be responsible for building and leading a high-performing team dedicated to ensuring our infrastructure is reliable, scalable, and efficient. Your primary focus will be on people management, strategic planning, and technical leadership. You will mentor and guide your team members, fostering their professional growth and creating a culture of ownership and operational excellence. You will define the team's vision and roadmap, aligning it with the company's broader goals, and work with cross-functional partners to prioritize and execute projects. You will oversee the development of SRE solutions across our globally distributed environments and empower your team to improve service resiliency, automate processes, and conduct effective incident response and capacity planning to guarantee the highest level of uptime and Quality of Service (QoS) for our internal customers.


Responsibilities and Duties of the Role:

  • Lead, mentor, and grow a team of software and infrastructure automation engineers.
  • Develop and execute the roadmap for the Infrastructure Reliability Engineering team.
  • Collaborate with engineering and operations teams to identify and prioritize reliability improvements.
  • Drive the design and implementation of tools and automation for infrastructure testing and self-healing.
  • Establish and monitor key performance indicators (KPIs) for infrastructure reliability.


Basic Qualifications

  • Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent experience.
  • 12+ years of experience in a software engineering or infrastructure role.
  • 5+ years of experience in a leadership or management role.
  • Lead a team of Infrastructure Reliability Engineers on projects for users and be directly responsible for uptime.
  • Own end-to-end availability and performance of key services and build automation to prevent problem recurrence. Automate response to all non-exceptional service conditions.
  • Set the standard for excellence by mentoring team members and establishing trust through superior technical delivery.
  • Proficiency in Kubernetes administration and modern CI/CD techniques and Infrastructure as Code (IaC).
  • Deep understanding of Linux operating systems and TCP/IP fundamentals.
  • Experience with monitoring, metrics gathering, APM, container management, and log collection tools.
  • Creative problem solver with excellent debugging skills and great documentation abilities.
  • Strong understanding of networking, storage, security, and compute technologies.

Preferred Qualifications

  • Experience building and leading a Site Reliability Engineering (SRE) or Infrastructure Reliability team.
  • Expertise with complex system architectures and infrastructures.
  • Proficiency in one or more programming languages (e.g., Python, Go, Java).
  • Passion for automation, scalability, and building reliable systems from the ground up.

Site Reliability Engineering Manager

Bengaluru, Karnataka Synechron

Posted 5 days ago

Job Description

Good day,

We have an immediate opportunity for a Senior Site Reliability Engineer.

Job Role: Senior Site Reliability Engineer

Job Location: Synechron (Bengaluru/Pune)

Experience: 10 to 15 years

Notice: Immediate joiner

About Company:

At Synechron, we believe in the power of digital to transform businesses for the better. Our global consulting firm combines creativity and innovative technology to deliver industry-leading digital solutions. Synechron’s progressive technologies and optimization strategies span end-to-end Artificial Intelligence, Consulting, Digital, Cloud & DevOps, Data, and Software Engineering, servicing an array of noteworthy financial services and technology firms. Through research and development initiatives in our FinLabs we develop solutions for modernization, from Artificial Intelligence and Blockchain to Data Science models, Digital Underwriting, mobile-first applications and more. Over the last 20+ years, our company has been honoured with multiple employer awards, recognizing our commitment to our talented teams. With top clients to boast about, Synechron has a global workforce of 13,950+, and has 52 offices in 20 countries within key global markets. For more information on the company, please visit our website or LinkedIn community.


Diversity, Equity, and Inclusion

Diversity & Inclusion are fundamental to our culture, and Synechron is proud to be an equal opportunity workplace and an affirmative-action employer. Our Diversity, Equity, and Inclusion (DEI) initiative ‘Same Difference’ is committed to fostering an inclusive culture – promoting equality, diversity and an environment that is respectful to all. We strongly believe that a diverse workforce helps build stronger, successful businesses as a global company. We encourage applicants from across diverse backgrounds, race, ethnicities, religion, age, marital status, gender, sexual orientations, or disabilities to apply. We empower our global workforce by offering flexible workplace arrangements, mentoring, internal mobility, learning and development programs, and more.

All employment decisions at Synechron are based on business needs, job requirements and individual qualifications, without regard to the applicant’s gender, gender identity, sexual orientation, race, ethnicity, disabled or veteran status, or any other characteristic protected by law.


JOB DESCRIPTION


Experience & qualifications:


10 to 15 years of IT experience, with hands-on experience in CI/CD pipelines, DevOps tools, and site reliability practices.

Excellent database knowledge, with indexing, performance tuning, and query optimization skills.

Excellent skills in Linux, networking, and firewall configuration.

Experience with production monitoring systems.

Experience with DevOps tools such as Docker, Git, and Kubernetes.

Experience in DevOps practices: infrastructure as code, containerization, and automation of infrastructure provisioning.

Experience in studying application workload efficiency and suggesting improvements.

Strong admin-level experience with application monitoring tools such as Splunk.

Strong configuration experience with job scheduler tools such as Control-M or AutoSys.

Experience collaborating with multiple teams to understand and improve application performance.

Preferably from the payments and cards domain.

Very good understanding of version control systems such as GitHub.

Excellent knowledge of ITIL version 4, with a thorough understanding of the problem management life cycle, root cause analysis and permanent resolution, and the change management and incident management processes.


Job roles & responsibilities

  • Automate software build and deployment processes.
  • Build CI/CD pipelines tailored to the needs and requirements of the environment.
  • Take accountability for permanent resolution of recurring issues within defined SLAs.
  • Provide technical solutions to complex issues, working with solution architects.
  • Identify system bottlenecks in the operating system, database, and network, and work with the infrastructure teams on permanent resolution.
  • Work closely with infrastructure managers and system architects to improve platform stability and robustness.
  • Understand the existing change implementation process for upstream systems and propose improvements.
  • Improve system reliability using key parameters such as latency, traffic, errors, and saturation.
  • Understand end-user issues and liaise with partners and stakeholders (internal and external) to coordinate permanent resolution of business user pain points.
  • Study existing manual support processes and propose automated solutions that enhance productivity and reduce manual effort.
  • Review DR exercise processes and provide recommendations to reduce outage events during the exercises.
  • Build monitoring systems that alert on symptoms rather than on outages.
  • Review the knowledge base (standard operating procedures, wiki articles) for content quality, and guide the team in keeping it updated with live cases.
  • Study and implement measures to enhance system availability.

To expedite the application process, I would appreciate it if you could provide the following information at your earliest convenience:

  1. Tentative Date to Join (if selected):
  2. Current Location:
  3. Preferred Location:
  4. Current Salary:
  5. Expected Salary:
  6. Reason for Change:
  7. Total Experience:
  8. Relevant Experience:
  9. Share official email confirmation of Notice Period or Last Working Day:
  10. Primary Skills (Hands-on):
  11. Secondary Skills:

Please send your updated resume by email or reach out to me via WhatsApp.

Senior Director - Product Reliability Engineering

Bengaluru, Karnataka Ohmium

Posted 5 days ago

Job Description

Key Responsibilities

  • Develop and lead the end-to-end product reliability strategy across design, testing, manufacturing, and field performance for electrolyzer systems.
  • Collaborate deeply with design engineering, R&D, and systems engineering to embed reliability into early design stages (FMEA, DfR, HALT, etc.).
  • Partner with operations and quality teams to monitor manufacturing consistency, identify systemic issues, and implement corrective/preventive actions (CAPA).
  • Work closely with field service and customer success to collect, analyze, and respond to real-world performance data and customer feedback.
  • Own and drive reliability testing programs, including accelerated life testing, environmental testing, stress testing, and statistical reliability validation.
  • Lead root cause analysis (RCA) on field and test failures, ensuring closed-loop learning feeds back into future designs.
  • Build and scale a high-performing reliability engineering team with strong analytical and experimental capabilities.
  • Present reliability metrics and insights to executive leadership, supporting product roadmap and risk management decisions.
  • Ensure compliance with relevant industry standards, safety regulations, and certification requirements.


Desired Qualifications:

  • Bachelor’s degree in Mechanical, Electrical, Chemical Engineering or related discipline
  • 15+ years of experience in product reliability, systems engineering, or product development, ideally in the automotive industry.
  • Demonstrated success leading cross-functional reliability programs for complex, electro-mechanical products.
  • Deep knowledge of reliability engineering tools and methodologies (FMEA, Weibull analysis, HALT/HASS, DFMEA, etc.).
  • Background in automotive reliability or industrial energy equipment is a strong plus.
  • Proven ability to bridge engineering and operational teams to drive measurable product improvements.
  • Strong data analysis skills; proficiency with reliability modeling tools and statistical software.
  • Exceptional communication, leadership, and organizational skills

Junior Site Reliability Engineering Apprentice

530001 Visakhapatnam, Andhra Pradesh ₹15000 Monthly WhatJobs

Posted 3 days ago

Job Description

apprenticeship
Embark on a transformative career journey with our exclusive apprenticeship program in Site Reliability Engineering, based in the vibrant city of Visakhapatnam, Andhra Pradesh, IN. This is an unparalleled opportunity for motivated individuals with a passion for technology to gain hands-on experience in critical infrastructure management and high-availability systems. Our client is seeking aspiring engineers who are eager to learn, contribute, and grow within a dynamic and supportive team environment.
As a Junior SRE Apprentice, you will be exposed to the fundamental principles of cloud computing, network operations, and system automation. Your responsibilities will include assisting senior engineers in monitoring system performance, troubleshooting operational issues, and contributing to the development of automation scripts. You will learn to manage and maintain large-scale distributed systems, ensuring their reliability, scalability, and efficiency. The role involves working with cutting-edge technologies in areas such as containerization (Docker, Kubernetes), cloud platforms (AWS, Azure, GCP), and infrastructure-as-code tools (Terraform, Ansible). Key learning areas will encompass CI/CD pipelines, incident response, performance tuning, and security best practices. This apprenticeship is designed to provide a comprehensive understanding of the SRE discipline, preparing you for a successful career in this high-demand field.
The ideal candidate will possess strong analytical and problem-solving skills, a basic understanding of Linux environments, and a genuine enthusiasm for continuous learning. While prior experience is not essential, a foundational knowledge of programming or scripting languages will be advantageous. You will be expected to collaborate effectively with development and operations teams, participate actively in team meetings, and complete assigned training modules and projects.
This role offers a unique blend of on-the-job training and structured learning, providing a clear pathway for career progression. Join us in Visakhapatnam, Andhra Pradesh, IN, and take the first step towards becoming a highly skilled Site Reliability Engineer.

Software Engineer I - Reliability Engineering, ITC

Karnataka, Karnataka Nike

Posted 2 days ago

Job Description

**Site Reliability Engineer I**
India Technology Center
**WHO YOU'LL WORK WITH**
As a Software Engineer specializing in Resilience Engineering, you will play a critical role in ensuring the maximum availability, observability, reliability, security, and performance of Nike's digital experiences. This position requires a proactive approach to maintaining robust, consumer-facing systems that support millions of users worldwide.
In this role, you will focus on in-depth problem analysis, identify infrastructure and code-level defects, establish observability processes for key performance indicators (KPIs), and collaborate closely with product delivery teams to design sustainable solutions to production challenges. Your expertise will be vital to enhancing Nike's commitment to a seamless and resilient digital experience.
**WHO WE ARE LOOKING FOR**
Nike is seeking talented and driven full stack developers with expertise in cloud infrastructure and services. The ideal candidate will possess:
+ A Bachelor's degree in Computer Science, Information Systems, or a related field
+ Alternatively, 1-3 years of relevant professional experience in lieu of a degree will be considered
+ Proven experience in designing and developing applications using Java, Node.js, or similar languages
+ Familiarity with front-end frameworks (e.g., React, Angular) is advantageous
+ Experience with modern programming languages such as Scala, Python, or Golang is preferred
+ A solid understanding of DNS, networking, virtualization, and Linux operating systems
+ Demonstrated expertise in building and managing scalable, cloud-based microservices, ideally on AWS
+ Experience with Docker or serverless architectures
+ Proficiency in at least one NoSQL database (e.g., DynamoDB, Cassandra)
+ Strong understanding of RESTful APIs
+ Familiarity with service management, agile, and observability tools such as ServiceNow, Jira, Jenkins, Splunk, New Relic, and SignalFX
**WHAT YOU'LL WORK ON**
+ Observing, diagnosing, and quickly resolving production issues with precision to minimize service interruptions
+ Developing and implementing real-time monitoring solutions that deliver essential insights into system health and key performance indicators
+ Communicating technical issues and their business impacts clearly, ensuring alignment across teams and effective response strategies
+ Reporting high-value metrics and insights to leadership, demonstrating the impact of site reliability on consumer experience and overall business objectives
+ Managing IT service processes such as Incident, Problem, Change, and Knowledge Management to maintain service quality and reliability
+ Collaborating closely with both business and technical teams to analyze system performance, troubleshoot consumer-reported issues, and proactively optimize system efficiency
+ Leading initiatives to enhance application reliability for high-demand consumer web and mobile platforms, ensuring consistent performance
+ Leveraging negotiation and influence to foster alignment and drive collaborative solutions across multiple teams
+ Promoting a culture of growth by coaching, mentoring, and sharing knowledge, supporting continuous improvement and resilience across the team
Join us in delivering resilient, high-performance digital solutions that will empower millions of consumers around the world. Your skills and insights will be pivotal in driving Nike's digital transformation.
NIKE, Inc. is committed to employing a diverse workforce. Qualified applicants will receive consideration without regard to race, color, religion, sex, national origin, age, sexual orientation, gender identity, gender expression, protected veteran status, or disability. NIKE is committed to working with and providing reasonable accommodation to individuals with disabilities. If, because of a medical condition or disability, you need a reasonable accommodation for any part of the employment process, please call and let us know the nature of your request, your location and your contact information.

Systems Engineer III, Site Reliability Engineering

Bengaluru, Karnataka Google

Posted 2 days ago

Job Description

**Mid**
Experience driving progress, solving problems, and mentoring more junior team members; deeper expertise and applied knowledge within relevant area.
**Minimum qualifications:**
+ Bachelor's degree in Computer Science, a related field, or equivalent practical experience.
+ 2 years of experience working with systems administration (e.g., filesystems, inodes, system calls) or networking (e.g., TCP/IP, routing, network topologies and hardware, SDN).
+ 2 years of experience with data structures/algorithms and software development in one or more programming languages (e.g., Python, C++, Java).
**Preferred qualifications:**
+ Master's degree in Computer Science or Engineering.
+ Experience in Linux system administration, networking fundamentals or system design.
**About the job**
Site Reliability Engineering (SRE) combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. SRE ensures that Google's services, both our internally critical and our externally visible systems, have reliability, uptime appropriate to users' needs, and a fast rate of improvement. Additionally, SREs keep an ever-watchful eye on our systems' capacity and performance.
Much of our software development focuses on optimizing existing systems, building infrastructure and eliminating work through automation. On the SRE team, you'll have the opportunity to manage the complex challenges of scale which are unique to Google, while using your expertise in coding, algorithms, complexity analysis and large-scale system design.
SRE's culture of intellectual curiosity, problem solving and openness is key to its success. Our organization brings together people with a wide variety of backgrounds, experiences and perspectives. We encourage them to collaborate, think big and take risks in a blame-free environment. We promote self-direction to work on meaningful projects, while we also strive to create an environment that provides the support and mentorship needed to learn and grow.
**To learn more:** check out our books on Site Reliability Engineering or read a career profile about why a Software Engineer chose to join SRE.
Behind everything our users see online is the architecture built by the Technical Infrastructure team to keep it running. From developing and maintaining our data centers to building the next generation of Google platforms, we make Google's product portfolio possible. We're proud to be our engineers' engineers and love voiding warranties by taking things apart so we can rebuild them. We keep our networks up and running, ensuring our users have the best and fastest experience possible.
**Responsibilities**
+ Collaborate with our partner teams to identify risks inside our corporate systems and potential solutions to simplify or reduce risk to the productivity or business processes.
+ Ensure the availability of Core Enterprise Network Services, including DNS, DHCP, and RADIUS, supporting Googlers in offices worldwide.
+ Manage low-level infrastructure issues, encompassing networking, system administration, and system design.
+ Ensure our services deliver reliable network connectivity to offices worldwide. Defend our Service Level Objectives (SLOs), participate in a sustainable tier-one on-call rotation, and support the culture.
+ Apply new technologies such as AI to solve traditional system engineering issues in exciting new ways.
Information collected and processed as part of your Google Careers profile, and any job applications you choose to submit, is subject to Google's Applicant and Candidate Privacy Policy.
Google is proud to be an equal opportunity and affirmative action employer. We are committed to building a workforce that is representative of the users we serve, creating a culture of belonging, and providing an equal employment opportunity regardless of race, creed, color, religion, gender, sexual orientation, gender identity/expression, national origin, disability, age, genetic information, veteran status, marital status, pregnancy or related condition (including breastfeeding), expecting or parents-to-be, criminal histories consistent with legal requirements, or any other basis protected by law. See also Google's EEO Policy, Know your rights: workplace discrimination is illegal, Belonging at Google, and How we hire.
If you have a need that requires accommodation, please let us know by completing our Accommodations for Applicants form.
Google is a global company and, in order to facilitate efficient collaboration and communication globally, English proficiency is a requirement for all roles unless stated otherwise in the job posting.
To all recruitment agencies: Google does not accept agency resumes. Please do not forward resumes to our jobs alias, Google employees, or any other organization location. Google is not responsible for any fees related to unsolicited resumes.
Google is proud to be an equal opportunity workplace and is an affirmative action employer. We are committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity or Veteran status. We also consider qualified applicants regardless of criminal histories, consistent with legal requirements.

Graduate-level Tutor (Safety & Reliability Engineering)

Skyline Academic

Posted today

Job Description

Skyline Academic supports university students with expert, one-to-one tutoring and assignment coaching, especially in engineering disciplines. We’re expanding our Safety & Reliability offering and seeking an experienced tutor to mentor graduate-level students.


The role: You’ll deliver 1:1 online tutorials, assignment/project feedback, and exam preparation for graduate students. You’ll clarify complex concepts, demonstrate methods, and coach students to apply rigorous, industry-aligned practices.


Courses

  • Offshore Structural Integrity
  • Fire And Explosion Engineering
  • Safety & Risk Management


Top responsibilities

  • Guide students on research methods, data analysis, and academic writing.
  • Review coursework/dissertations; provide structured, actionable feedback.
  • Maintain clear session notes and student progress plans.


Minimum & preferred qualifications

  • Minimum: MSc in Safety/Reliability/Systems/Industrial/Mechanical Engineering (or closely related).


Key skills & competencies

  • Reliability statistics (Weibull, Monte Carlo, Bayesian), RAM analysis.
  • Techniques: FTA, FMEA, HAZOP, RCM; risk frameworks (ISO 31000; IEC 61508/61511).
  • Software: MATLAB, Python/R, ReliaSoft/Isograph, @Risk, Minitab.
  • Strong pedagogy: clear explanations, scaffolding, formative assessment; excellent written feedback.


What we offer

  • Remote, contract work with flexible hours (UK & Middle East time zones).
  • Competitive pay; session-based compensation.
  • Supportive team, curated materials, and professional development.

