AI & Cloud Infrastructure Lead

Chennai, Tamil Nadu Oracle

Posted today

Job Description

**Required Skills and Qualifications:**
+ **Experience:**
  + Proven experience as an Oracle DBA with a deep understanding of Oracle Database management (versions 11g, 12c, and 19c) in both on-premises and cloud environments.
  + Strong experience working with **Oracle Cloud Infrastructure (OCI)** and cloud-based database services.
+ **Skills:**
  + Expert knowledge of database tuning, backup, recovery, security, and troubleshooting.
  + Hands-on experience with Oracle Enterprise Manager (OEM), RMAN, and SQL*Plus.
  + Proficiency in cloud architecture and services related to Oracle Cloud.
  + Familiarity with cloud-native tools such as **Oracle Autonomous Database** and **Oracle Exadata Cloud Service**.
  + Strong scripting skills (Shell, Python, etc.) for automation and management.
  + In-depth understanding of database replication, clustering, and high-availability configurations.
  + Experience with performance monitoring and tuning using tools such as **AWR**, **ADDM**, and **ASH**.
+ **Certifications:**
  + Oracle Certified Professional (OCP) is highly preferred.
  + Oracle Cloud certification is a plus (OCI Architect or Oracle Autonomous Database certification).
+ **Soft Skills:**
  + Excellent problem-solving skills and attention to detail.
  + Strong communication and collaboration skills.
  + Ability to work independently and in a team environment.
  + Ability to manage multiple priorities and meet deadlines.
**Preferred Qualifications:**
+ Academics: B.Tech/B.E./MCA/M.Sc
+ Experience in **Oracle RAC (Real Application Clusters)** and **Data Guard** configuration and management.
+ Experience with migration strategies from on-premises Oracle environments to Oracle Cloud.
+ Experience in Oracle Cloud networking and security best practices.
**Key Responsibilities:**
+ **Database Administration:** Install, configure, manage, and maintain Oracle databases (Oracle 11g/12c/19c) both on-premises and in Oracle Cloud.
+ **Performance Tuning:** Perform database performance tuning, troubleshooting, and optimization for both traditional on-premises and cloud environments.
+ **Backup and Recovery:** Implement and manage backup and disaster recovery solutions for Oracle databases, ensuring data integrity and availability.
+ **Cloud Management:** Utilize Oracle Cloud Infrastructure (OCI) to deploy, monitor, and manage databases, implementing best practices for high availability, scalability, and security.
+ **Security & Compliance:** Manage database security configurations, enforce policies, and monitor for vulnerabilities. Ensure databases meet industry compliance standards.
+ **Capacity Planning:** Monitor database capacity, performance, and system utilization. Plan for scaling databases in the cloud to meet business needs.
+ **Automation & Scripting:** Automate routine DBA tasks such as backups, patches, and database monitoring using tools like shell scripting, Python, and Oracle Enterprise Manager.
+ **Troubleshooting:** Provide database troubleshooting and resolution for issues related to database connectivity, performance, and security.
+ **Collaboration:** Work with development teams to optimize SQL queries and database schemas for better application performance.
+ **Upgrades & Patching:** Plan and execute upgrades and patches to Oracle database systems, ensuring minimal downtime and data consistency.
+ **Documentation & Reporting:** Maintain up-to-date documentation of database environments, configurations, and procedures. Provide regular reports on database performance and security.
Career Level - IC3
**About Us**
As a world leader in cloud solutions, Oracle uses tomorrow's technology to tackle today's challenges. We've partnered with industry leaders in almost every sector, and we continue to thrive after 40+ years of change by operating with integrity.
We know that true innovation starts when everyone is empowered to contribute. That's why we're committed to growing an inclusive workforce that promotes opportunities for all.
Oracle careers open the door to global opportunities where work-life balance flourishes. We offer competitive benefits based on parity and consistency and support our people with flexible medical, life insurance, and retirement options. We also encourage employees to give back to their communities through our volunteer programs.
We're committed to including people with disabilities at all stages of the employment process. If you require accessibility assistance or accommodation for a disability at any point, let us know by emailing or by calling in the United States.
Oracle is an Equal Employment Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability and protected veterans' status, or any other characteristic protected by law. Oracle will consider for employment qualified applicants with arrest and conviction records pursuant to applicable law.

Google Cloud Infrastructure Support Engineer

Chennai, Tamil Nadu UPS

Posted 1 day ago

Job Description

Explore your next opportunity at a Fortune Global 500 organization. Envision innovative possibilities, experience our rewarding culture, and work with talented teams that help you become better every day. We know what it takes to lead UPS into tomorrow: people with a unique combination of skill and passion. If you have the qualities and drive to lead yourself or teams, there are roles ready to cultivate your skills and take you to the next level.
**Job Description:**
The Google Cloud Infrastructure Support Engineer will be responsible for ensuring the reliability, performance, and security of our Google Cloud Platform (GCP) infrastructure. You will work closely with cross-functional teams to troubleshoot issues, optimize infrastructure, and implement best practices for cloud architecture. The role calls for experience with Terraform for deploying and managing infrastructure templates, and for administering BigQuery environments, including managing datasets and access controls and optimizing query performance. You should be familiar with Vertex AI for monitoring and managing machine learning model deployments, and with Google Kubernetes Engine (GKE) and its integration with the cloud ecosystem. You should also understand cloud security best practices and have experience implementing security measures. The role also calls for knowledge of setting up and managing data clean rooms within BigQuery; of the Analytics Hub platform and how it integrates with data clean rooms to facilitate sensitive data-sharing use cases; and of Dataplex and how it integrates with other Google Cloud services such as BigQuery, Dataproc Metastore, and Data Catalog.
**Key Responsibilities:**
+ Provide technical support for our Google Cloud Platform infrastructure, including compute, storage, networking, and security services.
+ Monitor system performance and proactively identify and resolve issues to ensure maximum uptime and reliability.
+ Collaborate with cross-functional teams to design, implement, and optimize cloud infrastructure solutions.
+ Automate repetitive tasks and develop scripts to streamline operations and improve efficiency.
+ Document infrastructure configurations, processes, and procedures.
**Qualifications:**
Required:
+ Strong understanding of GCP services, including Compute Engine, Kubernetes Engine, Cloud Storage, VPC networking, and IAM.
+ Experience with BigQuery and Vertex AI.
+ Proficiency in scripting languages such as Python, Bash, or PowerShell.
+ Experience with infrastructure as code tools such as Terraform or Google Deployment Manager.
+ Strong communication and collaboration skills.
+ Bachelor's Degree in Computer Science or related discipline, or the equivalent in education and work experience
Preferred:
+ Google Cloud certification (e.g., Google Cloud Certified - Professional Cloud Architect, Google Cloud Certified - Professional Cloud DevOps Engineer)
**Employee Type:**
Permanent
UPS is committed to providing a workplace free of discrimination, harassment, and retaliation.

Oracle Cloud Infrastructure Architect @ Chennai (preferred)/Mumbai/Pune - NOT looking for Admin folks

Chennai, Tamil Nadu Live Connections

Posted today

Job Description

Role:

The OCI Architect will lead the design, implementation, and management of robust cloud solutions on Oracle Cloud Infrastructure (OCI). This role demands deep technical expertise, strategic thinking, and hands-on experience in provisioning, migration, and optimization of OCI environments.

The architect will:

+ Architect and implement scalable, secure, and cost-efficient solutions on OCI.
+ Lead environment provisioning, cloud migration, and deployment activities.
+ Design and develop OCI infrastructure covering compute, networking, storage, and security.
+ Automate provisioning and configuration using Infrastructure-as-Code tools.
+ Enforce OCI security best practices and ensure compliance with regulatory standards.

Ask:

+ 10-15 years of experience only. Should be an SME on OCI core services such as Compute, Storage, Networking, and IAM.
+ Strong understanding of OCI security and compliance frameworks.
+ Proven experience designing OCI architectures for enterprise workloads.
+ Proficiency in Terraform or OCI Resource Manager for IaC.
+ Hands-on experience with DevOps practices and CI/CD pipelines.
+ Familiarity with cloud observability tools for monitoring and performance tuning.


GCP Infrastructure Engineer - Google Cloud, Terraform, Python, Bash, GKE, CI/CD

Chennai, Tamil Nadu UPS

Posted 1 day ago

Job Description

Explore your next opportunity at a Fortune Global 500 organization. Envision innovative possibilities, experience our rewarding culture, and work with talented teams that help you become better every day. We know what it takes to lead UPS into tomorrow: people with a unique combination of skill and passion. If you have the qualities and drive to lead yourself or teams, there are roles ready to cultivate your skills and take you to the next level.
**Job Description:**
Job Summary:
We are seeking a highly skilled GCP Infrastructure Engineer to design, build, and manage the cloud infrastructure that powers Generative AI (GenAI) applications at scale. In this role, you will use Google Cloud Platform (GCP) Vertex AI, IBM Watsonx, and containerization technologies such as Docker and Kubernetes (GKE) to deliver secure, scalable, and high-performance AI solutions. You will own the end-to-end infrastructure lifecycle, from design and provisioning to automation, monitoring, and optimization, while enabling data scientists and ML engineers to deploy and operate GenAI workloads seamlessly.
Key Responsibilities:
Cloud Infrastructure & Platform Engineering
+ Design, provision, and maintain scalable, secure, and cost-efficient infrastructure for GenAI applications on GCP.
+ Deploy and manage containerized workloads using Docker and Kubernetes (GKE).
+ Configure and optimize Vertex AI and IBM Watsonx platforms for training, fine-tuning, and serving LLMs and other generative models.
+ Implement high-performance GPU/TPU clusters to support distributed training and large-scale inference.
+ Ensure business continuity through backup, disaster recovery, and multi-region deployments.
Automation & Reliability
+ Develop and maintain Infrastructure as Code (IaC) templates with Terraform or Cloud Deployment Manager.
+ Adopt GitOps practices (Flux) for infrastructure lifecycle management.
+ Build and optimize CI/CD pipelines for data pipelines, model workflows, and GenAI applications.
+ Apply SRE principles (SLIs, SLOs, SLAs) to guarantee platform reliability and uptime.
Security, Governance & Compliance
+ Embed DevSecOps best practices across the infrastructure lifecycle, including policy-as-code, vulnerability scanning, and secrets management.
+ Enforce identity and access management (IAM), network segmentation, and data encryption in compliance with standards (HIPAA, SOX, GDPR, FedRAMP).
+ Collaborate with enterprise security and compliance teams to implement governance frameworks for GenAI platforms.
Monitoring, Observability & Cost Optimization
+ Implement observability stacks (Prometheus, Grafana, Cloud Monitoring, Datadog) for both infra health and ML-specific metrics (model drift, data anomalies).
+ Define KPIs to monitor system health, performance, and adoption across AI workloads.
+ Optimize cloud cost efficiency for GPU/TPU-intensive workloads using autoscaling, preemptible instances, and utilization monitoring.
Collaboration & Enablement
+ Partner with data scientists, ML engineers, and software teams to streamline GenAI application development and deployment.
+ Provide onboarding, documentation, and reusable templates to enable faster adoption of AI infrastructure.
+ Stay current with the latest advancements in GenAI, cloud-native infrastructure, and container orchestration.
Required Education
Bachelor's or Master's degree in Computer Science, Software Engineering, or a related field.
Required Experience
+ **5+ years** of experience in cloud infrastructure engineering, **DevOps**, or platform engineering.
+ Experience with GenAI use cases (chatbots, content generation, code assistants, etc.).
+ Strong hands-on expertise with **Google Cloud Platform (GCP)**, especially **Vertex AI**.
+ Experience with **IBM Watsonx** for AI application deployment and management.
+ Proven skills in **Docker**, **Kubernetes (GKE)**, and container orchestration at scale.
+ Proficiency in **Python**, **Bash**, or other relevant scripting languages.
+ Strong understanding of cloud networking, IAM, and security best practices.
+ Experience with CI/CD tools (GitHub Actions, GitLab CI, Jenkins) and IaC tools (Terraform, Pulumi, Ansible, Deployment Manager).
+ Familiarity with data pipelines and integration tools (Dataflow, Apache Beam, Pub/Sub, Kafka).
+ Excellent problem-solving, debugging, and communication skills.
Preferred Experience
+ Experience in MLOps practices for model deployment, monitoring, and retraining.
+ Exposure to multi-cloud or hybrid cloud environments (GCP, AWS, Azure, on-prem).
+ Hands-on experience with feature stores (Vertex AI Feature Store, Feast) and ML observability tools (EvidentlyAI, Fiddler).
+ Knowledge of distributed training frameworks (Horovod, DeepSpeed, PyTorch Distributed).
+ Contributions to open-source projects in infrastructure, MLOps, or GenAI.
+ Experience managing infrastructure in regulated industries.
Preferred Certifications:
+ Google Cloud Certified - Professional Cloud Architect
+ Google Cloud Certified - Machine Learning Engineer
+ Certified Kubernetes Administrator (CKA) or Certified Kubernetes Application Developer (CKAD)
+ IBM Certified Watsonx Generative AI Engineer - Associate
+ IBM Certified Solution Architect - Cloud Pak for Data
+ Other relevant certifications in AI, Machine Learning, or Cloud-Native technologies.
**Employee Type:**
Permanent
_At UPS, equal opportunity, fair treatment, and an inclusive work environment are key values to which we are committed._


GCP Infrastructure Engineer - Google Cloud, Terraform, Python, Bash, GKE, CI/CD

Chennai, Tamil Nadu UPS

Posted 1 day ago

Job Description

Explore your next opportunity at a Fortune Global 500 organization. Envision innovative possibilities, experience our rewarding culture, and work with talented teams that help you become better every day. We know what it takes to lead UPS into tomorrow: people with a unique combination of skill and passion. If you have the qualities and drive to lead yourself or teams, there are roles ready to cultivate your skills and take you to the next level.
**Job Description:**
**Job Summary:**
We are seeking a highly skilled GCP Infrastructure Engineer to design, build, and manage the cloud infrastructure that powers Generative AI (GenAI) applications at scale. In this role, you will leverage Google Cloud Platform (GCP) Vertex AI, IBM Watsonx, and containerization technologies such as Docker and Kubernetes (GKE) to deliver secure, scalable, and high-performance AI solutions. You will own the end-to-end infrastructure lifecycle - from design and provisioning to automation, monitoring, and optimization - while enabling data scientists and ML engineers to seamlessly deploy and operate GenAI workloads.
**Key Responsibilities:**
**Cloud Infrastructure & Platform Engineering**
+ Design, provision, and maintain scalable, secure, and cost-efficient infrastructure for GenAI applications on GCP.
+ Deploy and manage containerized workloads using Docker and Kubernetes (GKE).
+ Configure and optimize Vertex AI and IBM Watsonx platforms for training, fine-tuning, and serving LLMs and other generative models.
+ Implement high-performance GPU/TPU clusters to support distributed training and large-scale inference.
+ Ensure business continuity through backup, disaster recovery, and multi-region deployments.
**Automation & Reliability**
+ Develop and maintain Infrastructure as Code (IaC) templates with Terraform or Cloud Deployment Manager.
+ Adopt GitOps practices (Flux) for infrastructure lifecycle management.
+ Build and optimize CI/CD pipelines for data pipelines, model workflows, and GenAI applications.
+ Apply SRE principles (SLIs, SLOs, SLAs) to guarantee platform reliability and uptime.
**Security, Governance & Compliance**
+ Embed DevSecOps best practices across the infrastructure lifecycle, including policy-as-code, vulnerability scanning, and secrets management.
+ Enforce identity and access management (IAM), network segmentation, and data encryption in compliance with standards (HIPAA, SOX, GDPR, FedRAMP).
+ Collaborate with enterprise security and compliance teams to implement governance frameworks for GenAI platforms.
**Monitoring, Observability & Cost Optimization**
+ Implement observability stacks (Prometheus, Grafana, Cloud Monitoring, Datadog) for both infra health and ML-specific metrics (model drift, data anomalies).
+ Define KPIs to monitor system health, performance, and adoption across AI workloads.
+ Optimize cloud cost efficiency for GPU/TPU-intensive workloads using autoscaling, preemptible instances, and utilization monitoring.
**Collaboration & Enablement**
+ Partner with data scientists, ML engineers, and software teams to streamline GenAI application development and deployment.
+ Provide onboarding, documentation, and reusable templates to enable faster adoption of AI infrastructure.
+ Stay current with the latest advancements in GenAI, cloud-native infrastructure, and container orchestration.
**Required Education**
Bachelor's or master's degree in Computer Science, Software Engineering, or a related field.
**Required Experience**
+ **5+ years** of experience in cloud infrastructure engineering, **DevOps,** or platform engineering.
+ Experience with GenAI use cases (chatbots, content generation, code assistants, etc.).
+ Strong hands-on expertise with **Google Cloud Platform (GCP),** especially **Vertex AI.**
+ Experience with **IBM Watsonx** for AI application deployment and management.
+ Proven skills in **Docker, Kubernetes (GKE),** and container orchestration at scale.
+ Proficiency in **Python, Bash,** or other relevant scripting languages.
+ Strong understanding of cloud networking, IAM, and security best practices.
+ Experience with CI/CD tools (GitHub Actions, GitLab CI, Jenkins) and IaC tools (Terraform, Pulumi, Ansible, Deployment Manager).
+ Familiarity with data pipelines and integration tools (Dataflow, Apache Beam, Pub/Sub, Kafka).
+ Excellent problem-solving, debugging, and communication skills.
**Preferred Experience**
+ Experience in MLOps practices for model deployment, monitoring, and retraining.
+ Exposure to multi-cloud or hybrid cloud environments (GCP, AWS, Azure, on-prem).
+ Hands-on experience with feature stores (Vertex AI Feature Store, Feast) and ML observability tools (EvidentlyAI, Fiddler).
+ Knowledge of distributed training frameworks (Horovod, DeepSpeed, PyTorch Distributed).
+ Contributions to open-source projects in infrastructure, MLOps, or GenAI.
+ Experience managing infrastructure in regulated industries.
**Preferred Certifications:**
+ Google Cloud Certified - Professional Cloud Architect
+ Google Cloud Certified - Machine Learning Engineer
+ Certified Kubernetes Administrator (CKA) or Certified Kubernetes Application Developer (CKAD)
+ IBM Certified Watsonx Generative AI Engineer - Associate
+ IBM Certified Solution Architect - Cloud Pak for Data
+ Other relevant certifications in AI, Machine Learning, or Cloud-Native technologies.
**Employee Type:**
Permanent
UPS is committed to providing a workplace free of discrimination, harassment, and retaliation.

GCP Infrastructure Engineer - Google Cloud, Terraform, Python, Bash, GKE, CI/CD

Chennai, Tamil Nadu UPS

Posted 1 day ago

Job Description

**Before you apply to a job, select your language preference from the options available at the top right of this page.**
Discover your next opportunity at an organization that ranks among the 500 largest companies in the world. Envision innovative opportunities, experience our rewarding culture, and work with talented teams that push you to grow every day. We know what it takes to lead UPS into the future: passionate people with a unique combination of skills. If you have the qualities, motivation, autonomy, or leadership to lead teams, there are roles suited to your aspirations and to your skills, today and tomorrow.
**Job Description:**
**Job Summary:**
We are seeking a highly skilled GCP Infrastructure Engineer to design, build, and manage the cloud infrastructure that powers Generative AI (GenAI) applications at scale. In this role, you will leverage Google Cloud Platform (GCP) Vertex AI, IBM Watsonx, and containerization technologies such as Docker and Kubernetes (GKE) to deliver secure, scalable, and high-performance AI solutions. You will own the end-to-end infrastructure lifecycle - from design and provisioning to automation, monitoring, and optimization - while enabling data scientists and ML engineers to seamlessly deploy and operate GenAI workloads.
**Key Responsibilities:**
**Cloud Infrastructure & Platform Engineering**
+ Design, provision, and maintain scalable, secure, and cost-efficient infrastructure for GenAI applications on GCP.
+ Deploy and manage containerized workloads using Docker and Kubernetes (GKE).
+ Configure and optimize Vertex AI and IBM Watsonx platforms for training, fine-tuning, and serving LLMs and other generative models.
+ Implement high-performance GPU/TPU clusters to support distributed training and large-scale inference.
+ Ensure business continuity through backup, disaster recovery, and multi-region deployments.
**Automation & Reliability**
+ Develop and maintain Infrastructure as Code (IaC) templates with Terraform or Cloud Deployment Manager.
+ Adopt GitOps practices (Flux) for infrastructure lifecycle management.
+ Build and optimize CI/CD pipelines for data pipelines, model workflows, and GenAI applications.
+ Apply SRE principles (SLIs, SLOs, SLAs) to guarantee platform reliability and uptime.
**Security, Governance & Compliance**
+ Embed DevSecOps best practices across the infrastructure lifecycle, including policy-as-code, vulnerability scanning, and secrets management.
+ Enforce identity and access management (IAM), network segmentation, and data encryption in compliance with standards (HIPAA, SOX, GDPR, FedRAMP).
+ Collaborate with enterprise security and compliance teams to implement governance frameworks for GenAI platforms.
**Monitoring, Observability & Cost Optimization**
+ Implement observability stacks (Prometheus, Grafana, Cloud Monitoring, Datadog) for both infra health and ML-specific metrics (model drift, data anomalies).
+ Define KPIs to monitor system health, performance, and adoption across AI workloads.
+ Optimize cloud cost efficiency for GPU/TPU-intensive workloads using autoscaling, preemptible instances, and utilization monitoring.
**Collaboration & Enablement**
+ Partner with data scientists, ML engineers, and software teams to streamline GenAI application development and deployment.
+ Provide onboarding, documentation, and reusable templates to enable faster adoption of AI infrastructure.
+ Stay current with the latest advancements in GenAI, cloud-native infrastructure, and container orchestration.
**Required Qualifications**
**Education**
Bachelor's or master's degree in Computer Science, Software Engineering, or a related field.
**Experience**
+ **5+ years** of experience in cloud infrastructure engineering, **DevOps,** or platform engineering.
+ Experience with GenAI use cases (chatbots, content generation, code assistants, etc.).
+ Strong hands-on expertise with **Google Cloud Platform (GCP),** especially **Vertex AI.**
+ Experience with **IBM Watsonx** for AI application deployment and management.
+ Proven skills in **Docker, Kubernetes (GKE),** and container orchestration at scale.
+ Proficiency in **Python, Bash,** or other relevant scripting languages.
+ Strong understanding of cloud networking, IAM, and security best practices.
+ Experience with CI/CD tools (GitHub Actions, GitLab CI, Jenkins) and IaC tools (Terraform, Pulumi, Ansible, Deployment Manager).
+ Familiarity with data pipelines and integration tools (Dataflow, Apache Beam, Pub/Sub, Kafka).
+ Excellent problem-solving, debugging, and communication skills.
**Preferred Experience**
+ Experience in MLOps practices for model deployment, monitoring, and retraining.
+ Exposure to multi-cloud or hybrid cloud environments (GCP, AWS, Azure, on-prem).
+ Hands-on experience with feature stores (Vertex AI Feature Store, Feast) and ML observability tools (EvidentlyAI, Fiddler).
+ Knowledge of distributed training frameworks (Horovod, DeepSpeed, PyTorch Distributed).
+ Contributions to open-source projects in infrastructure, MLOps, or GenAI.
+ Experience managing infrastructure in regulated industries.
**Preferred Certifications:**
+ Google Cloud Certified - Professional Cloud Architect
+ Google Cloud Certified - Machine Learning Engineer
+ Certified Kubernetes Administrator (CKA) or Certified Kubernetes Application Developer (CKAD)
+ IBM Certified Watsonx Generative AI Engineer - Associate
+ IBM Certified Solution Architect - Cloud Pak for Data
+ Other relevant certifications in AI, Machine Learning, or Cloud-Native technologies.
**Contract Type:**
Permanent
_At UPS, equal opportunity, fair treatment, and an inclusive work environment are key values to which we are committed._

