Infrastructure Specialist-System Administration

Hyderabad, Andhra Pradesh IBM

Posted 4 days ago

Job Description

**Introduction**
A career in IBM Consulting is rooted in long-term relationships and close collaboration with clients across the globe. You'll work with visionaries across multiple industries to improve the hybrid cloud and AI journey for the most innovative and valuable companies in the world. Your ability to accelerate impact and make meaningful change for your clients is enabled by our strategic partner ecosystem and our robust technology platforms across the IBM portfolio.
**Your role and responsibilities**
As a Software Developer, you'll participate in many aspects of the software development lifecycle, such as design, code implementation, testing, and support. You will create software that enables your clients' hybrid-cloud and AI journeys.
Your primary responsibilities include:
* Envision, design, and build the software infrastructure that keeps the solutions running.
* Challenge ideas, identify problems, and create efficient solutions.
* Develop flexible, maintainable, and scalable application components.
* Collaborate with development peers and lead the way in staying up to date with tools and technology trends.
**Required technical and professional expertise**
* Minimum 4 years of experience.
* We are seeking a skilled Telecom Operations Specialist / Support Engineer to join our team.
* The ideal candidate will have a strong background in Linux shell scripting and SQL database querying, and hands-on experience with OSS/BSS systems in the telecommunications sector.
* Familiarity with Cramer tools, defect triage, and incident management processes is key to ensuring optimal performance and efficient resolution of network and service-related issues.
* Exposure to COTS products used in telecom operations is highly desirable.
**Preferred technical and professional experience**
* Manage and monitor system resources and databases.
* Develop and maintain shell scripts for automating repetitive tasks.
* Streamline system operations in a Linux environment.
IBM is committed to creating a diverse environment and is proud to be an equal-opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, gender, gender identity or expression, sexual orientation, national origin, caste, genetics, pregnancy, disability, neurodivergence, age, veteran status, or other characteristics. IBM is also committed to compliance with all fair employment practices regarding citizenship and immigration status.

L3 Server Engineer – Major Incident Management

Hyderabad, Andhra Pradesh Nextbridge IT Solutions

Posted today

Job Description

Company Description

Nextbridge IT Solutions is a US-based IT solution firm specializing in connecting exceptional talent with organizations driving transformation in infrastructure, cloud, and emerging technologies. We partner closely with clients to understand their technical needs and organizational goals, delivering tailored solutions through highly skilled professionals. Our culture values forward-thinking, accountability, and agility, encouraging continuous growth and supporting long-term success. Join us to shape the future together.

Role Description

This is a remote contract role for an L3 Server Engineer – Major Incident Management. The L3 Server Engineer will be responsible for managing and resolving major incidents, providing expert troubleshooting, and ensuring uptime and performance of infrastructure. Duties include handling operating systems, supporting databases, and overseeing overall IT infrastructure. The role also requires effective communication and collaboration with other IT professionals and stakeholders to ensure swift resolution of incidents.

Key Responsibilities

  • Serve as the primary technical escalation point for all server-related Major Incidents (MIM) and P1 events.
  • Lead technical triage on bridge calls and in war rooms, coordinating efforts between L2 support, application teams, vendors, and other cross-functional stakeholders.
  • Perform advanced, real-time troubleshooting to diagnose and resolve complex issues across Windows Server, Linux, and VMware virtualization platforms.
  • Drive the restoration of critical infrastructure services with a focus on minimizing business impact.
  • Author and deliver comprehensive Root Cause Analysis (RCA) and detailed post-incident reports.
  • Partner with the Problem Management team to identify trends, implement proactive solutions, and prevent incident recurrence.
  • Mentor and provide technical guidance to L1/L2 support teams to improve overall incident response capabilities.
  • Participate in a 24x7 on-call rotation to provide critical support when needed.

Qualifications

  • 5–8 years of hands-on experience in enterprise server administration and high-severity incident response.
  • Expert-level knowledge of Windows Server (2016/2019/2022) and Linux (RHEL, Ubuntu).
  • Deep expertise with virtualization technologies, specifically VMware ESXi/vSphere in a large-scale environment.
  • Solid understanding of core infrastructure concepts: TCP/IP networking, SAN/NAS storage, and enterprise backup/recovery solutions.
  • Hands-on experience with enterprise monitoring platforms (e.g., SolarWinds, Datadog, Nagios).
  • Proficiency with an ITSM tool, preferably ServiceNow, for incident lifecycle management.
  • Demonstrated ability to remain calm, focused, and organized during high-pressure situations.
  • ITIL v3/v4 Foundation certification is required.

Preferred

  • Advanced certifications such as MCSE, VCP, RHCE.
  • ITIL Intermediate/Expert or related certifications.
  • Experience with public cloud platforms (Azure, AWS) and hybrid cloud environments.
  • Scripting and automation skills (PowerShell, Bash) for diagnostics and reporting.

Key Competencies

  • Crisis Management: Able to lead effectively in high-pressure, time-critical situations.
  • Collaboration: Works seamlessly with vendors, internal teams, and stakeholders to achieve common goals.
  • Analytical Mindset: Possesses superior troubleshooting and Root Cause Analysis (RCA) capabilities.
  • Communication: Delivers clear, concise, and timely updates during incidents, tailored for both technical and business audiences.
  • Proactive Mindset: Focuses on prevention and continuous service improvement, not just reactive resolution.

Remote Work Environment

  • This is a fully remote position. A company-provided Virtual Desktop (VDI) will be used for all work.
  • Candidates are expected to provide their own reliable computer (laptop or desktop) and at least one monitor capable of accessing the VDI.
  • A dedicated and quiet workspace is essential to maintain a professional environment during critical incident bridge calls.

L3 Server Engineer – Major Incident Management

Secunderabad, Andhra Pradesh Nextbridge IT Solutions

Posted 4 days ago

Job Description

Company Description

Nextbridge IT Solutions is a US-based IT solution firm specializing in connecting exceptional talent with organizations driving transformation in infrastructure, cloud, and emerging technologies. We partner closely with clients to understand their technical needs and organizational goals, delivering tailored solutions through highly skilled professionals. Our culture values forward-thinking, accountability, and agility, encouraging continuous growth and supporting long-term success. Join us to shape the future together.


Role Description

This is a remote contract role for an L3 Server Engineer – Major Incident Management. The L3 Server Engineer will be responsible for managing and resolving major incidents, providing expert troubleshooting, and ensuring uptime and performance of infrastructure. Duties include handling operating systems, supporting databases, and overseeing overall IT infrastructure. The role also requires effective communication and collaboration with other IT professionals and stakeholders to ensure swift resolution of incidents.


Key Responsibilities

  • Serve as the primary technical escalation point for all server-related Major Incidents (MIM) and P1 events.
  • Lead technical triage on bridge calls and in war rooms, coordinating efforts between L2 support, application teams, vendors, and other cross-functional stakeholders.
  • Perform advanced, real-time troubleshooting to diagnose and resolve complex issues across Windows Server, Linux, and VMware virtualization platforms.
  • Drive the restoration of critical infrastructure services with a focus on minimizing business impact.
  • Author and deliver comprehensive Root Cause Analysis (RCA) and detailed post-incident reports.
  • Partner with the Problem Management team to identify trends, implement proactive solutions, and prevent incident recurrence.
  • Mentor and provide technical guidance to L1/L2 support teams to improve overall incident response capabilities.
  • Participate in a 24x7 on-call rotation to provide critical support when needed.


Qualifications

  • 5–8 years of hands-on experience in enterprise server administration and high-severity incident response.
  • Expert-level knowledge of Windows Server (2016/2019/2022) and Linux (RHEL, Ubuntu).
  • Deep expertise with virtualization technologies, specifically VMware ESXi/vSphere in a large-scale environment.
  • Solid understanding of core infrastructure concepts: TCP/IP networking, SAN/NAS storage, and enterprise backup/recovery solutions.
  • Hands-on experience with enterprise monitoring platforms (e.g., SolarWinds, Datadog, Nagios).
  • Proficiency with an ITSM tool, preferably ServiceNow, for incident lifecycle management.
  • Demonstrated ability to remain calm, focused, and organized during high-pressure situations.
  • ITIL v3/v4 Foundation certification is required.

Preferred

  • Advanced certifications such as MCSE, VCP, RHCE.
  • ITIL Intermediate/Expert or related certifications.
  • Experience with public cloud platforms (Azure, AWS) and hybrid cloud environments.
  • Scripting and automation skills (PowerShell, Bash) for diagnostics and reporting.


Key Competencies

  • Crisis Management: Able to lead effectively in high-pressure, time-critical situations.
  • Collaboration: Works seamlessly with vendors, internal teams, and stakeholders to achieve common goals.
  • Analytical Mindset: Possesses superior troubleshooting and Root Cause Analysis (RCA) capabilities.
  • Communication: Delivers clear, concise, and timely updates during incidents, tailored for both technical and business audiences.
  • Proactive Mindset: Focuses on prevention and continuous service improvement, not just reactive resolution.


Remote Work Environment

  • This is a fully remote position. A company-provided Virtual Desktop (VDI) will be used for all work.
  • Candidates are expected to provide their own reliable computer (laptop or desktop) and at least one monitor capable of accessing the VDI.
  • A dedicated and quiet workspace is essential to maintain a professional environment during critical incident bridge calls.

Sr Specialist Systems Administration - Cloud and Linux System Administrator

Hyderabad, Andhra Pradesh AT&T

Posted 1 day ago

Job Viewed

Tap Again To Close

Job Description

**Job Description:**
Title: Sr Specialist Systems Administration
Role: Cloud and Linux System Administrator
Experience Level: 8+ Years
Job Summary/ Description
Network Cloud Operations is responsible for the operations, administration, and maintenance of server infrastructure in the AT&T AIC (AT&T Integrated Cloud) and NC (Network Cloud) environments. The job involves software updates and deployment of the Network Cloud environment, firmware updates, and working on tickets and changes in an environment built on Linux, OpenStack, and Kubernetes, in 24x7 operations.
Roles and Responsibilities:
Performs work focusing on software, maintenance, and operations of systems used by AT&T or its clients to conduct business.
Performs feasibility assessments, creates requirements, manages projects, and integrates and tests technical solutions for software.
Primary / Mandatory skills:
This position requires a broad-spectrum skill set across Linux, KVM, OpenStack, Kubernetes, containers, cloud infrastructure platforms, Windows, and monitoring of the server infrastructure. The candidate should have a minimum of 6-10 years of working experience in a production environment.
Should have excellent troubleshooting and analytical skills in systems administration of Linux, cloud, OpenStack, and Kubernetes environments.
Should be well versed in, and have worked on, the operations and maintenance of OpenStack modules such as Keystone, Nova, Neutron, Swift, Cinder, Heat, Glance, Horizon, and Fuel.
Should be well versed in concepts such as high availability, DRS, fault tolerance, scalability, reliability, and shared resources, and in handling Tier 1/Tier 2 tickets in the AT&T AIC/NC server environment.
Creating, reviewing, approving, and implementing changes in the AIC/NC server environment.
Restoration of applications and services in AIC/NC infrastructure from a hardware, OpenStack, and OS perspective in outage and service-degradation scenarios.
Experience with Python and shell scripting.
Upgrade and deployment of AIC/NC server infrastructure (Linux, OpenStack, and Kubernetes).
Knowledge and understanding of CI/CD pipelines in Jenkins and GitHub.
Coordination with OS vendors, development teams, and production support teams for root cause analysis, configuration changes, etc., and with hardware vendors (Dell, HP) for replacement of faulty hardware.
Should have a very high level of customer focus, to provide an effortless customer experience. Should display a positive attitude when working with peers, customers, clients, vendors, etc.
The candidate should be ready to work in 24x7 rotational shift operations.
Should have strong analytical, logical, and problem-solving skills.
Should be able to work with diverse teams across regions and cultures.
Certification in systems administration of Linux, OpenStack, and Kubernetes is preferable.
The candidate should have a high level of coordination and communication skills (verbal and written).
Education Qualification: Engineering/Graduation/Equivalent in Computer Science, Electronics and Communications.
**Weekly Hours:**
40
**Time Type:**
Regular
**Location:**
Hyderabad, India
It is the policy of AT&T to provide equal employment opportunity (EEO) to all persons regardless of age, color, national origin, citizenship status, physical or mental disability, race, religion, creed, gender, sex, sexual orientation, gender identity and/or expression, genetic information, marital status, status with regard to public assistance, veteran status, or any other characteristic protected by federal, state or local law. In addition, AT&T will provide reasonable accommodations for qualified individuals with disabilities. AT&T is a fair chance employer and does not initiate a background check until an offer is made.
AT&T will consider for employment qualified applicants in a manner consistent with the requirements of federal, state and local laws.
We expect employees to be honest, trustworthy, and operate with integrity. Discrimination and all unlawful harassment (including sexual harassment) in employment is not tolerated. We encourage success based on our individual merits and abilities without regard to race, color, religion, national origin, gender, sexual orientation, gender identity, age, disability, marital status, citizenship status, military status, protected veteran status or employment status.