29 Terraform jobs in Chennai

Terraform

Chennai, Tamil Nadu Colan Infotech Private Limited

Posted 6 days ago

Job Viewed

Tap Again To Close

Job Description

full-time

Overall Responsibilities:

Design and implement Infrastructure as Code (IaC) solutions using Terraform.

Develop and maintain reusable Terraform modules for cloud infrastructure automation.

Collaborate with cloud architects, DevOps, and security teams to optimize cloud deployments.

Ensure scalability, security, and compliance of infrastructure solutions.

Improve cloud infrastructure reliability through automated provisioning and monitoring.

Document best practices, standards, and Terraform coding guidelines.

Provide mentorship and guidance to junior Terraform developers and cloud engineers.

Technical Responsibilities:

Write and manage Terraform scripts for provisioning AWS, Azure, and GCP infrastructure.

Optimize Terraform configurations for high availability, cost efficiency, and security.

Integrate Terraform with CI/CD pipelines using GitHub Actions, Jenkins, and AWS CodePipeline.

Manage state files and remote backends using Terraform Cloud or S3 with DynamoDB locking.

Implement Role-Based Access Control (RBAC) and security best practices in IaCdeployments.

Troubleshoot and debug Terraform-related issues in production and staging environments.

Automate infrastructure testing using tools like Terratestand Checkov.

Contribute to infrastructure governance, enforcing policies via Sentinel or Open Policy Agent (OPA). Tools & Technologies(as applicable):Terraform, AWS, Azure, GCP, Terraform Cloud, AWS CloudFormation, Kubernetes, Docker, Ansible, Jenkins, GitHub Actions, AWS CodePipeline, Prometheus, Datadog, Sentinel, Open Policy Agent (OPA), Terratest, Checkov

Preferred Experience:

Extensive experience in designing, developing, and managing Terraform-based infrastructure.

Strong knowledge of cloud platforms (AWS, Azure, GCP) and their best practices.

Experience integrating Terraform with DevOps pipelines and automation workflows.

Hands-on expertise in Terraform security, governance, and policy enforcement.

Familiarity with containerization and orchestration using Kubernetes and Docker.

Experience in cloud migration projects and hybrid cloud architectures. 

This advertiser has chosen not to accept applicants from your region.

Golang+Terraform

Chennai, Tamil Nadu Tata Consultancy Services

Posted 3 days ago

Job Viewed

Tap Again To Close

Job Description

Must-Have


  • Proficiency in GO Language
  • Proficiency in Terraform/HCL (Hashi Corp Language)
  • Proficiency in SDLC concepts (bitBucket, Jira, Jenkins, etc.)
  • Fundamental knowledge of IT Service Management/ITIL processes and tools
  • REST API connectivity programming concepts
  • Appreciation for clean and well documented code
  • Use of source control products and methodologies
  • 10+ years of total IT experience
  • 8+ years of experience in middleware integration
  • 4+ years of experience designing and deploying integration services solutions using MuleSoft
  • Advanced knowledge of SOA Design patterns for building middleware systems ground up using Message Routing, Content Enrichment, Message Filtering, Message Transformation, Guaranteed delivery, Message sequencing, Batch message processing, error handling and reconciliation mechanisms.
  • Experienced in engineering/upgrading complex solutions in line with demanding client requirements
  • Experienced in J2EE, Java Servlets, JMS, EJB, Design Patterns, Web services and building integration projects
  • Strong troubleshooting skills
  • Excellent communication skills; candidate must be able to concisely communicate complex technical topics
  • Proficient in the Agile, Scrum, and iterative SDLC practices

Good-to-Have

  • Build integration flows, Terraform modules and/or APIs using MuleSoft Anypoint Studio 3.x and 4.x, Java, JMS, Structured Query Language (SQL)
  • Fundamental knowledge of IT Service Management/ITIL processes and tools
  • REST API connectivity programming concepts
  • Appreciation for clean and well documented code
  • Use of source control products and methodologies

Exp Range: 5 TO 10

Location: Chennai/Hyderabad/Bangalore/Mumbai/Indore

This advertiser has chosen not to accept applicants from your region.

DevOps Engineer - Kubernetes / OpenShift, Terraform and Azure / AWS DevOps

Chennai, Tamil Nadu Cognizant

Posted today

Job Viewed

Tap Again To Close

Job Description

Primary: CI/CD, Containerization, Kubernetes (OR) OpenShift, Terraform with preferably Azure DevOps if not AWS DevOps. Secondary: Datadog
**Job Summary**
We are seeking an experienced Infra Dev Specialist with 4 to 8 years of experience to join our team. The ideal candidate will have expertise in Docker Ansible Kubernetes linux CICD Azure Devops Terraform and Shellscript. This role is hybrid with no travel required and involves working closely with our development and operations teams to ensure seamless infrastructure management and deployment.
**Responsibilities**
+ Develop and maintain scalable infrastructure solutions using Docker Ansible Kubernetes linux CICD Azure Devops Terraform and shell script to support business applications.
+ Implement and manage Docker Containers to ensure efficient application deployment and operation.
+ Collaborate with cross-functional teams to design and optimize Kubernetes Work Models for improved performance.
+ Monitor and troubleshoot infrastructure issues to minimize downtime and ensure system reliability.
+ Automate deployment processes to enhance efficiency and reduce manual intervention.
+ Provide technical guidance and support to development teams to ensure best practices in containerization.
+ Conduct regular system audits to identify potential improvements and implement necessary changes.
+ Document infrastructure configurations and processes to maintain a comprehensive knowledge base.
+ Stay updated with the latest industry trends and technologies to continuously improve infrastructure solutions.
+ Ensure compliance with security policies and procedures to protect company data and assets.
+ Participate in capacity planning to ensure infrastructure scalability and performance.
+ Optimize resource utilization to reduce costs and improve system efficiency.
+ Contribute to the companys purpose by ensuring robust and reliable infrastructure that supports business growth.
**Qualifications**
+ Possess strong experience in Kubernetes Services and Docker Container management.
+ Demonstrate proficiency in Kubernetes Work Model design and implementation.
+ Exhibit excellent problem-solving skills with a focus on infrastructure optimization.
+ Have a solid understanding of automation tools and techniques for deployment processes.
+ Show ability to work collaboratively in a hybrid work environment.
+ Display strong communication skills to effectively interact with cross-functional teams.
+ Maintain a proactive approach to learning and adapting to new technologies.
**Certifications Required**
Certified Kubernetes Administrator (CKA)
Cognizant is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to sex, gender identity, sexual orientation, race, color, religion, national origin, disability, protected Veteran status, age, or any other characteristic protected by law.
This advertiser has chosen not to accept applicants from your region.

GCP Infrastructure Engineer - Google Cloud, Terraform, Python, Bash, GKE, CI/CD

Chennai, Tamil Nadu UPS

Posted 1 day ago

Job Viewed

Tap Again To Close

Job Description

**Avant de postuler à un emploi, sélectionnez votre langue de préférence parmi les options disponibles en haut à droite de cette page.**
Découvrez votre prochaine opportunité au sein d'une organisation qui compte parmi les 500 plus importantes entreprises mondiales. Envisagez des opportunités innovantes, découvrez notre culture enrichissante et travaillez avec des équipes talentueuses qui vous poussent à vous développer chaque jour. Nous savons ce qu'il faut faire pour diriger UPS vers l'avenir : des personnes passionnées dotées d'une combinaison unique de compétences. Si vous avez les qualités, de la motivation, de l'autonomie ou le leadership pour diriger des équipes, il existe des postes adaptés à vos aspirations et à vos compétences d'aujourd'hui et de demain.
**Fiche de poste :**
Job Summary:
We are seeking a highly skilled GCP Infrastructure Engineer to design, build, and manage the cloud infrastructure that powers Generative AI (GenAI) applications at scale. In this role, you will leverage Google Cloud Platform (GCP) Vertex AI, IBM Watsonx, and containerization technologies such as Docker and Kubernetes (GKE) to deliver secure, scalable, and high-performance AI solutions. You will own the end-to-end infrastructure lifecycle - from design and provisioning to automation, monitoring, and optimization - while enabling data scientists and ML engineers to seamlessly deploy and operate GenAI workloads.
Key Responsibilities:
Cloud Infrastructure & Platform Engineering
+ Design, provision, and maintain scalable, secure, and cost-efficient infrastructure for GenAI applications on GCP.
+ Deploy and manage containerized workloads using Docker and Kubernetes (GKE).
+ Configure and optimize Vertex AI and IBM Watsonx platforms for training, fine-tuning, and serving LLMs and other generative models.
+ Implement high-performance GPU/TPU clusters to support distributed training and large-scale inference.
+ Ensure business continuity through backup, disaster recovery, and multi-region deployments.
Automation & Reliability
+ Develop and maintain Infrastructure as Code (IaC) templates with Terraform, or Cloud Deployment Manager.
+ Adopt GitOps practices (Flux) for infrastructure lifecycle management.
+ Build and optimize CI/CD pipelines for data pipelines, model workflows, and GenAI applications.
+ Apply SRE principles (SLIs, SLOs, SLAs) to guarantee platform reliability and uptime.
Security, Governance & Compliance
+ Embed DevSecOps best practices across the infrastructure lifecycle, including policy-as-code, vulnerability scanning, and secrets management.
+ Enforce identity and access management (IAM), network segmentation, and data encryption in compliance with standards (HIPAA, SOX, GDPR, FedRAMP).
+ Collaborate with enterprise security and compliance teams to implement governance frameworks for GenAI platforms.
Monitoring, Observability & Cost Optimization
+ Implement observability stacks (Prometheus, Grafana, Cloud Monitoring, Datadog) for both infra health and ML-specific metrics (model drift, data anomalies).
+ Define KPIs to monitor system health, performance, and adoption across AI workloads.
+ Optimize cloud cost efficiency for GPU/TPU-intensive workloads using autoscaling, preemptible instances, and utilization monitoring.
Collaboration & Enablement
+ Partner with data scientists, ML engineers, and software teams to streamline GenAI application development and deployment.
+ Provide onboarding, documentation, and reusable templates to enable faster adoption of AI infrastructure.
+ Stay current with the latest advancements in GenAI, cloud-native infrastructure, and container orchestration.
Required Education
Bachelor's or master's degree in computer science, Software Engineering, or a related field.
Required Experience
+ **5+ years** of experience in cloud infrastructure engineering, **DevOps,** or platform engineering.
+ Experience with GenAI use cases (chatbots, content generation, code assistants, etc.).
+ Strong hands-on expertise with **Google Cloud Platform (GCP),** especially **Vertex** **AI.**
+ Experience with **IBM Watsonx for AI application** deployment and management.
+ Proven skills in **Docker, Kubernetes (GKE),** and container orchestration at scale.
+ Proficiency in **Python, Bash,** or other relevant scripting languages.
+ Strong understanding of cloud networking, IAM, and security best practices.
+ Experience with CI/CD tools (GitHub Actions, GitLab CI, Jenkins) and IaC tools (Terraform, Pulumi, Ansible, Deployment Manager).
+ Familiarity with data pipelines and integration tools (Dataflow, Apache Beam, Pub/Sub, Kafka).
+ Excellent problem-solving, debugging, and communication skills.
Preferred Experience
+ Experience in MLOps practices for model deployment, monitoring, and retraining.
+ Exposure to multi-cloud or hybrid cloud environments (GCP, AWS, Azure, on-prem).
+ Hands-on experience with feature stores (Vertex AI Feature Store, Feast) and ML observability tools (EvidentlyAI, Fiddler).
+ Knowledge of distributed training frameworks (Horovod, DeepSpeed, PyTorch Distributed).
+ Contributions to open-source projects in infrastructure, MLOps, or GenAI.
+ Experience managing infrastructure in regulated industries.
Preferred Certifications:
+ Google Cloud Certified - Professional Cloud Architect
+ Google Cloud Certified - Machine Learning Engineer
+ Certified Kubernetes Administrator (CKA) or Certified Kubernetes Application Developer (CKAD)
+ IBM Certified Watsonx Generative AI Engineer - Associate
+ IBM Certified Solution Architect - Cloud Pak for Data
+ Other relevant certifications in AI, Machine Learning, or Cloud-Native technologies.
**Type de contrat:**
en CDI
_Chez UPS, égalité des chances, traitement équitable et environnement de travail inclusif sont des valeurs clefs auxquelles nous sommes attachés._
This advertiser has chosen not to accept applicants from your region.

GCP Infrastructure Engineer - Google Cloud, Terraform, Python, Bash, GKE, CI/CD

Chennai, Tamil Nadu UPS

Posted 1 day ago

Job Viewed

Tap Again To Close

Job Description

**Before you apply to a job, select your language preference from the options available at the top right of this page.**
Explore your next opportunity at a Fortune Global 500 organization. Envision innovative possibilities, experience our rewarding culture, and work with talented teams that help you become better every day. We know what it takes to lead UPS into tomorrow-people with a unique combination of skill + passion. If you have the qualities and drive to lead yourself or teams, there are roles ready to cultivate your skills and take you to the next level.
**Job Description:**
Job Summary:
We are seeking a highly skilled GCP Infrastructure Engineer to design, build, and manage the cloud infrastructure that powers Generative AI (GenAI) applications at scale. In this role, you will leverage Google Cloud Platform (GCP) Vertex AI, IBM Watsonx, and containerization technologies such as Docker and Kubernetes (GKE) to deliver secure, scalable, and high-performance AI solutions. You will own the end-to-end infrastructure lifecycle - from design and provisioning to automation, monitoring, and optimization - while enabling data scientists and ML engineers to seamlessly deploy and operate GenAI workloads.
Key Responsibilities:
Cloud Infrastructure & Platform Engineering
+ Design, provision, and maintain scalable, secure, and cost-efficient infrastructure for GenAI applications on GCP.
+ Deploy and manage containerized workloads using Docker and Kubernetes (GKE).
+ Configure and optimize Vertex AI and IBM Watsonx platforms for training, fine-tuning, and serving LLMs and other generative models.
+ Implement high-performance GPU/TPU clusters to support distributed training and large-scale inference.
+ Ensure business continuity through backup, disaster recovery, and multi-region deployments.
Automation & Reliability
+ Develop and maintain Infrastructure as Code (IaC) templates with Terraform, or Cloud Deployment Manager.
+ Adopt GitOps practices (Flux) for infrastructure lifecycle management.
+ Build and optimize CI/CD pipelines for data pipelines, model workflows, and GenAI applications.
+ Apply SRE principles (SLIs, SLOs, SLAs) to guarantee platform reliability and uptime.
Security, Governance & Compliance
+ Embed DevSecOps best practices across the infrastructure lifecycle, including policy-as-code, vulnerability scanning, and secrets management.
+ Enforce identity and access management (IAM), network segmentation, and data encryption in compliance with standards (HIPAA, SOX, GDPR, FedRAMP).
+ Collaborate with enterprise security and compliance teams to implement governance frameworks for GenAI platforms.
Monitoring, Observability & Cost Optimization
+ Implement observability stacks (Prometheus, Grafana, Cloud Monitoring, Datadog) for both infra health and ML-specific metrics (model drift, data anomalies).
+ Define KPIs to monitor system health, performance, and adoption across AI workloads.
+ Optimize cloud cost efficiency for GPU/TPU-intensive workloads using autoscaling, preemptible instances, and utilization monitoring.
Collaboration & Enablement
+ Partner with data scientists, ML engineers, and software teams to streamline GenAI application development and deployment.
+ Provide onboarding, documentation, and reusable templates to enable faster adoption of AI infrastructure.
+ Stay current with the latest advancements in GenAI, cloud-native infrastructure, and container orchestration.
Required Education
Bachelor's or master's degree in computer science, Software Engineering, or a related field.
Required Experience
+ **5+ years** of experience in cloud infrastructure engineering, **DevOps,** or platform engineering.
+ Experience with GenAI use cases (chatbots, content generation, code assistants, etc.).
+ Strong hands-on expertise with **Google Cloud Platform (GCP),** especially **Vertex** **AI.**
+ Experience with **IBM Watsonx for AI application** deployment and management.
+ Proven skills in **Docker, Kubernetes (GKE),** and container orchestration at scale.
+ Proficiency in **Python, Bash,** or other relevant scripting languages.
+ Strong understanding of cloud networking, IAM, and security best practices.
+ Experience with CI/CD tools (GitHub Actions, GitLab CI, Jenkins) and IaC tools (Terraform, Pulumi, Ansible, Deployment Manager).
+ Familiarity with data pipelines and integration tools (Dataflow, Apache Beam, Pub/Sub, Kafka).
+ Excellent problem-solving, debugging, and communication skills.
Preferred Experience
+ Experience in MLOps practices for model deployment, monitoring, and retraining.
+ Exposure to multi-cloud or hybrid cloud environments (GCP, AWS, Azure, on-prem).
+ Hands-on experience with feature stores (Vertex AI Feature Store, Feast) and ML observability tools (EvidentlyAI, Fiddler).
+ Knowledge of distributed training frameworks (Horovod, DeepSpeed, PyTorch Distributed).
+ Contributions to open-source projects in infrastructure, MLOps, or GenAI.
+ Experience managing infrastructure in regulated industries.
Preferred Certifications:
+ Google Cloud Certified - Professional Cloud Architect
+ Google Cloud Certified - Machine Learning Engineer
+ Certified Kubernetes Administrator (CKA) or Certified Kubernetes Application Developer (CKAD)
+ IBM Certified Watsonx Generative AI Engineer - Associate
+ IBM Certified Solution Architect - Cloud Pak for Data
+ Other relevant certifications in AI, Machine Learning, or Cloud-Native technologies.
**Employee Type:**
Permanent
UPS is committed to providing a workplace free of discrimination, harassment, and retaliation.
This advertiser has chosen not to accept applicants from your region.

GCP Infrastructure Engineer - Google Cloud, Terraform, Python, Bash, GKE, CI/CD

Chennai, Tamil Nadu UPS

Posted 1 day ago

Job Viewed

Tap Again To Close

Job Description

**Avant de postuler à un emploi, sélectionnez votre langue de préférence parmi les options disponibles en haut à droite de cette page.**
Découvrez votre prochaine opportunité au sein d'une organisation qui compte parmi les 500 plus importantes entreprises mondiales. Envisagez des opportunités innovantes, découvrez notre culture enrichissante et travaillez avec des équipes talentueuses qui vous poussent à vous développer chaque jour. Nous savons ce qu'il faut faire pour diriger UPS vers l'avenir : des personnes passionnées dotées d'une combinaison unique de compétences. Si vous avez les qualités, de la motivation, de l'autonomie ou le leadership pour diriger des équipes, il existe des postes adaptés à vos aspirations et à vos compétences d'aujourd'hui et de demain.
**Fiche de poste :**
Job Summary:
We are seeking a highly skilled GCP Infrastructure Engineer to design, build, and manage the cloud infrastructure that powers Generative AI (GenAI) applications at scale. In this role, you will leverage Google Cloud Platform (GCP) Vertex AI, IBM Watsonx, and containerization technologies such as Docker and Kubernetes (GKE) to deliver secure, scalable, and high-performance AI solutions. You will own the end-to-end infrastructure lifecycle - from design and provisioning to automation, monitoring, and optimization - while enabling data scientists and ML engineers to seamlessly deploy and operate GenAI workloads.
Key Responsibilities:
Cloud Infrastructure & Platform Engineering
+ Design, provision, and maintain scalable, secure, and cost-efficient infrastructure for GenAI applications on GCP.
+ Deploy and manage containerized workloads using Docker and Kubernetes (GKE).
+ Configure and optimize Vertex AI and IBM Watsonx platforms for training, fine-tuning, and serving LLMs and other generative models.
+ Implement high-performance GPU/TPU clusters to support distributed training and large-scale inference.
+ Ensure business continuity through backup, disaster recovery, and multi-region deployments.
Automation & Reliability
+ Develop and maintain Infrastructure as Code (IaC) templates with Terraform, or Cloud Deployment Manager.
+ Adopt GitOps practices (Flux) for infrastructure lifecycle management.
+ Build and optimize CI/CD pipelines for data pipelines, model workflows, and GenAI applications.
+ Apply SRE principles (SLIs, SLOs, SLAs) to guarantee platform reliability and uptime.
Security, Governance & Compliance
+ Embed DevSecOps best practices across the infrastructure lifecycle, including policy-as-code, vulnerability scanning, and secrets management.
+ Enforce identity and access management (IAM), network segmentation, and data encryption in compliance with standards (HIPAA, SOX, GDPR, FedRAMP).
+ Collaborate with enterprise security and compliance teams to implement governance frameworks for GenAI platforms.
Monitoring, Observability & Cost Optimization
+ Implement observability stacks (Prometheus, Grafana, Cloud Monitoring, Datadog) for both infra health and ML-specific metrics (model drift, data anomalies).
+ Define KPIs to monitor system health, performance, and adoption across AI workloads.
+ Optimize cloud cost efficiency for GPU/TPU-intensive workloads using autoscaling, preemptible instances, and utilization monitoring.
Collaboration & Enablement
+ Partner with data scientists, ML engineers, and software teams to streamline GenAI application development and deployment.
+ Provide onboarding, documentation, and reusable templates to enable faster adoption of AI infrastructure.
+ Stay current with the latest advancements in GenAI, cloud-native infrastructure, and container orchestration.
Required Education
Bachelor's or master's degree in computer science, Software Engineering, or a related field.
Required Experience
+ **5+ years** of experience in cloud infrastructure engineering, **DevOps,** or platform engineering.
+ Experience with GenAI use cases (chatbots, content generation, code assistants, etc.).
+ Strong hands-on expertise with **Google Cloud Platform (GCP),** especially **Vertex** **AI.**
+ Experience with **IBM Watsonx for AI application** deployment and management.
+ Proven skills in **Docker, Kubernetes (GKE),** and container orchestration at scale.
+ Proficiency in **Python, Bash,** or other relevant scripting languages.
+ Strong understanding of cloud networking, IAM, and security best practices.
+ Experience with CI/CD tools (GitHub Actions, GitLab CI, Jenkins) and IaC tools (Terraform, Pulumi, Ansible, Deployment Manager).
+ Familiarity with data pipelines and integration tools (Dataflow, Apache Beam, Pub/Sub, Kafka).
+ Excellent problem-solving, debugging, and communication skills.
Preferred Experience
+ Experience in MLOps practices for model deployment, monitoring, and retraining.
+ Exposure to multi-cloud or hybrid cloud environments (GCP, AWS, Azure, on-prem).
+ Hands-on experience with feature stores (Vertex AI Feature Store, Feast) and ML observability tools (EvidentlyAI, Fiddler).
+ Knowledge of distributed training frameworks (Horovod, DeepSpeed, PyTorch Distributed).
+ Contributions to open-source projects in infrastructure, MLOps, or GenAI.
+ Experience managing infrastructure in regulated industries.
Preferred Certifications:
+ Google Cloud Certified - Professional Cloud Architect
+ Google Cloud Certified - Machine Learning Engineer
+ Certified Kubernetes Administrator (CKA) or Certified Kubernetes Application Developer (CKAD)
+ IBM Certified Watsonx Generative AI Engineer - Associate
+ IBM Certified Solution Architect - Cloud Pak for Data
+ Other relevant certifications in AI, Machine Learning, or Cloud-Native technologies.
**Type de contrat:**
en CDI
_Chez UPS, égalité des chances, traitement équitable et environnement de travail inclusif sont des valeurs clefs auxquelles nous sommes attachés._
This advertiser has chosen not to accept applicants from your region.

GCP Infrastructure Engineer - Google Cloud, Terraform, Python, Bash, GKE, CI/CD

Chennai, Tamil Nadu UPS

Posted 1 day ago

Job Viewed

Tap Again To Close

Job Description

**Before you apply to a job, select your language preference from the options available at the top right of this page.**
Explore your next opportunity at a Fortune Global 500 organization. Envision innovative possibilities, experience our rewarding culture, and work with talented teams that help you become better every day. We know what it takes to lead UPS into tomorrow-people with a unique combination of skill + passion. If you have the qualities and drive to lead yourself or teams, there are roles ready to cultivate your skills and take you to the next level.
**Job Description:**
Job Summary:
We are seeking a highly skilled GCP Infrastructure Engineer to design, build, and manage the cloud infrastructure that powers Generative AI (GenAI) applications at scale. In this role, you will leverage Google Cloud Platform (GCP) Vertex AI, IBM Watsonx, and containerization technologies such as Docker and Kubernetes (GKE) to deliver secure, scalable, and high-performance AI solutions. You will own the end-to-end infrastructure lifecycle - from design and provisioning to automation, monitoring, and optimization - while enabling data scientists and ML engineers to seamlessly deploy and operate GenAI workloads.
Key Responsibilities:
Cloud Infrastructure & Platform Engineering
+ Design, provision, and maintain scalable, secure, and cost-efficient infrastructure for GenAI applications on GCP.
+ Deploy and manage containerized workloads using Docker and Kubernetes (GKE).
+ Configure and optimize Vertex AI and IBM Watsonx platforms for training, fine-tuning, and serving LLMs and other generative models.
+ Implement high-performance GPU/TPU clusters to support distributed training and large-scale inference.
+ Ensure business continuity through backup, disaster recovery, and multi-region deployments.
Automation & Reliability
+ Develop and maintain Infrastructure as Code (IaC) templates with Terraform, or Cloud Deployment Manager.
+ Adopt GitOps practices (Flux) for infrastructure lifecycle management.
+ Build and optimize CI/CD pipelines for data pipelines, model workflows, and GenAI applications.
+ Apply SRE principles (SLIs, SLOs, SLAs) to guarantee platform reliability and uptime.
Security, Governance & Compliance
+ Embed DevSecOps best practices across the infrastructure lifecycle, including policy-as-code, vulnerability scanning, and secrets management.
+ Enforce identity and access management (IAM), network segmentation, and data encryption in compliance with standards (HIPAA, SOX, GDPR, FedRAMP).
+ Collaborate with enterprise security and compliance teams to implement governance frameworks for GenAI platforms.
Monitoring, Observability & Cost Optimization
+ Implement observability stacks (Prometheus, Grafana, Cloud Monitoring, Datadog) for both infra health and ML-specific metrics (model drift, data anomalies).
+ Define KPIs to monitor system health, performance, and adoption across AI workloads.
+ Optimize cloud cost efficiency for GPU/TPU-intensive workloads using autoscaling, preemptible instances, and utilization monitoring.
Collaboration & Enablement
+ Partner with data scientists, ML engineers, and software teams to streamline GenAI application development and deployment.
+ Provide onboarding, documentation, and reusable templates to enable faster adoption of AI infrastructure.
+ Stay current with the latest advancements in GenAI, cloud-native infrastructure, and container orchestration.
Required Education
Bachelor's or master's degree in computer science, Software Engineering, or a related field.
Required Experience
+ **5+ years** of experience in cloud infrastructure engineering, **DevOps,** or platform engineering.
+ Experience with GenAI use cases (chatbots, content generation, code assistants, etc.).
+ Strong hands-on expertise with **Google Cloud Platform (GCP),** especially **Vertex** **AI.**
+ Experience with **IBM Watsonx for AI application** deployment and management.
+ Proven skills in **Docker, Kubernetes (GKE),** and container orchestration at scale.
+ Proficiency in **Python, Bash,** or other relevant scripting languages.
+ Strong understanding of cloud networking, IAM, and security best practices.
+ Experience with CI/CD tools (GitHub Actions, GitLab CI, Jenkins) and IaC tools (Terraform, Pulumi, Ansible, Deployment Manager).
+ Familiarity with data pipelines and integration tools (Dataflow, Apache Beam, Pub/Sub, Kafka).
+ Excellent problem-solving, debugging, and communication skills.
Preferred Experience
+ Experience in MLOps practices for model deployment, monitoring, and retraining.
+ Exposure to multi-cloud or hybrid cloud environments (GCP, AWS, Azure, on-prem).
+ Hands-on experience with feature stores (Vertex AI Feature Store, Feast) and ML observability tools (EvidentlyAI, Fiddler).
+ Knowledge of distributed training frameworks (Horovod, DeepSpeed, PyTorch Distributed).
+ Contributions to open-source projects in infrastructure, MLOps, or GenAI.
+ Experience managing infrastructure in regulated industries.
Preferred Certifications:
+ Google Cloud Certified - Professional Cloud Architect
+ Google Cloud Certified - Machine Learning Engineer
+ Certified Kubernetes Administrator (CKA) or Certified Kubernetes Application Developer (CKAD)
+ IBM Certified Watsonx Generative AI Engineer - Associate
+ IBM Certified Solution Architect - Cloud Pak for Data
+ Other relevant certifications in AI, Machine Learning, or Cloud-Native technologies.
**Employee Type:**
Permanent
UPS is committed to providing a workplace free of discrimination, harassment, and retaliation.
This advertiser has chosen not to accept applicants from your region.
Be The First To Know

About the latest Terraform Jobs in Chennai !

GCP Infrastructure Engineer - Google Cloud, Terraform, Python, Bash, GKE, CI/CD

Chennai, Tamil Nadu UPS

Posted 1 day ago

Job Viewed

Tap Again To Close

Job Description

**Avant de postuler à un emploi, sélectionnez votre langue de préférence parmi les options disponibles en haut à droite de cette page.**
Découvrez votre prochaine opportunité au sein d'une organisation qui compte parmi les 500 plus importantes entreprises mondiales. Envisagez des opportunités innovantes, découvrez notre culture enrichissante et travaillez avec des équipes talentueuses qui vous poussent à vous développer chaque jour. Nous savons ce qu'il faut faire pour diriger UPS vers l'avenir : des personnes passionnées dotées d'une combinaison unique de compétences. Si vous avez les qualités, de la motivation, de l'autonomie ou le leadership pour diriger des équipes, il existe des postes adaptés à vos aspirations et à vos compétences d'aujourd'hui et de demain.
**Fiche de poste :**
**Job Summary:**
We are seeking a highly skilled GCP Infrastructure Engineer to design, build, and manage the cloud infrastructure that powers Generative AI (GenAI) applications at scale. In this role, you will leverage Google Cloud Platform (GCP) Vertex AI, IBM Watsonx, and containerization technologies such as Docker and Kubernetes (GKE) to deliver secure, scalable, and high-performance AI solutions. You will own the end-to-end infrastructure lifecycle - from design and provisioning to automation, monitoring, and optimization - while enabling data scientists and ML engineers to seamlessly deploy and operate GenAI workloads.
**Key Responsibilities:**
**Cloud Infrastructure & Platform Engineering**
+ Design, provision, and maintain scalable, secure, and cost-efficient infrastructure for GenAI applications on GCP.
+ Deploy and manage containerized workloads using Docker and Kubernetes (GKE).
+ Configure and optimize Vertex AI and IBM Watsonx platforms for training, fine-tuning, and serving LLMs and other generative models.
+ Implement high-performance GPU/TPU clusters to support distributed training and large-scale inference.
+ Ensure business continuity through backup, disaster recovery, and multi-region deployments.
**Automation & Reliability**
+ Develop and maintain Infrastructure as Code (IaC) templates with Terraform, or Cloud Deployment Manager.
+ Adopt GitOps practices (Flux) for infrastructure lifecycle management.
+ Build and optimize CI/CD pipelines for data pipelines, model workflows, and GenAI applications.
+ Apply SRE principles (SLIs, SLOs, SLAs) to guarantee platform reliability and uptime.
**Security, Governance & Compliance**
+ Embed DevSecOps best practices across the infrastructure lifecycle, including policy-as-code, vulnerability scanning, and secrets management.
+ Enforce identity and access management (IAM), network segmentation, and data encryption in compliance with standards (HIPAA, SOX, GDPR, FedRAMP).
+ Collaborate with enterprise security and compliance teams to implement governance frameworks for GenAI platforms.
**Monitoring, Observability & Cost Optimization**
+ Implement observability stacks (Prometheus, Grafana, Cloud Monitoring, Datadog) for both infra health and ML-specific metrics (model drift, data anomalies).
+ Define KPIs to monitor system health, performance, and adoption across AI workloads.
+ Optimize cloud cost efficiency for GPU/TPU-intensive workloads using autoscaling, preemptible instances, and utilization monitoring.
**Collaboration & Enablement**
+ Partner with data scientists, ML engineers, and software teams to streamline GenAI application development and deployment.
+ Provide onboarding, documentation, and reusable templates to enable faster adoption of AI infrastructure.
+ Stay current with the latest advancements in GenAI, cloud-native infrastructure, and container orchestration.
**Required Qualifications**
**Education**
Bachelor's or master's degree in computer science, Software Engineering, or a related field.
**Experience**
+ **5+ years** of experience in cloud infrastructure engineering, **DevOps,** or platform engineering.
+ Experience with GenAI use cases (chatbots, content generation, code assistants, etc.).
+ Strong hands-on expertise with **Google Cloud Platform (GCP),** especially **Vertex** **AI.**
+ Experience with **IBM Watsonx for AI application** deployment and management.
+ Proven skills in **Docker, Kubernetes (GKE),** and container orchestration at scale.
+ Proficiency in **Python, Bash,** or other relevant scripting languages.
+ Strong understanding of cloud networking, IAM, and security best practices.
+ Experience with CI/CD tools (GitHub Actions, GitLab CI, Jenkins) and IaC tools (Terraform, Pulumi, Ansible, Deployment Manager).
+ Familiarity with data pipelines and integration tools (Dataflow, Apache Beam, Pub/Sub, Kafka).
+ Excellent problem-solving, debugging, and communication skills.
**Preferred Experience**
+ Experience in MLOps practices for model deployment, monitoring, and retraining.
+ Exposure to multi-cloud or hybrid cloud environments (GCP, AWS, Azure, on-prem).
+ Hands-on experience with feature stores (Vertex AI Feature Store, Feast) and ML observability tools (EvidentlyAI, Fiddler).
+ Knowledge of distributed training frameworks (Horovod, DeepSpeed, PyTorch Distributed).
+ Contributions to open-source projects in infrastructure, MLOps, or GenAI.
+ Experience managing infrastructure in regulated industries.
**Preferred Certifications:**
+ Google Cloud Certified - Professional Cloud Architect
+ Google Cloud Certified - Machine Learning Engineer
+ Certified Kubernetes Administrator (CKA) or Certified Kubernetes Application Developer (CKAD)
+ IBM Certified Watsonx Generative AI Engineer - Associate
+ IBM Certified Solution Architect - Cloud Pak for Data
+ Other relevant certifications in AI, Machine Learning, or Cloud-Native technologies.
**Type de contrat:**
en CDI
_Chez UPS, égalité des chances, traitement équitable et environnement de travail inclusif sont des valeurs clefs auxquelles nous sommes attachés._
This advertiser has chosen not to accept applicants from your region.

GCP Infrastructure Engineer - Google Cloud, Terraform, Python, Bash, GKE, CI/CD

Chennai, Tamil Nadu UPS

Posted 1 day ago

Job Viewed

Tap Again To Close

Job Description

**Before you apply to a job, select your language preference from the options available at the top right of this page.**
Explore your next opportunity at a Fortune Global 500 organization. Envision innovative possibilities, experience our rewarding culture, and work with talented teams that help you become better every day. We know what it takes to lead UPS into tomorrow-people with a unique combination of skill + passion. If you have the qualities and drive to lead yourself or teams, there are roles ready to cultivate your skills and take you to the next level.
**Job Description:**
**Job Summary:**
We are seeking a highly skilled GCP Infrastructure Engineer to design, build, and manage the cloud infrastructure that powers Generative AI (GenAI) applications at scale. In this role, you will leverage Google Cloud Platform (GCP) Vertex AI, IBM Watsonx, and containerization technologies such as Docker and Kubernetes (GKE) to deliver secure, scalable, and high-performance AI solutions. You will own the end-to-end infrastructure lifecycle - from design and provisioning to automation, monitoring, and optimization - while enabling data scientists and ML engineers to seamlessly deploy and operate GenAI workloads.
**Key Responsibilities:**
**Cloud Infrastructure & Platform Engineering**
+ Design, provision, and maintain scalable, secure, and cost-efficient infrastructure for GenAI applications on GCP.
+ Deploy and manage containerized workloads using Docker and Kubernetes (GKE).
+ Configure and optimize Vertex AI and IBM Watsonx platforms for training, fine-tuning, and serving LLMs and other generative models.
+ Implement high-performance GPU/TPU clusters to support distributed training and large-scale inference.
+ Ensure business continuity through backup, disaster recovery, and multi-region deployments.
**Automation & Reliability**
+ Develop and maintain Infrastructure as Code (IaC) templates with Terraform, or Cloud Deployment Manager.
+ Adopt GitOps practices (Flux) for infrastructure lifecycle management.
+ Build and optimize CI/CD pipelines for data pipelines, model workflows, and GenAI applications.
+ Apply SRE principles (SLIs, SLOs, SLAs) to guarantee platform reliability and uptime.
**Security, Governance & Compliance**
+ Embed DevSecOps best practices across the infrastructure lifecycle, including policy-as-code, vulnerability scanning, and secrets management.
+ Enforce identity and access management (IAM), network segmentation, and data encryption in compliance with standards (HIPAA, SOX, GDPR, FedRAMP).
+ Collaborate with enterprise security and compliance teams to implement governance frameworks for GenAI platforms.
**Monitoring, Observability & Cost Optimization**
+ Implement observability stacks (Prometheus, Grafana, Cloud Monitoring, Datadog) for both infra health and ML-specific metrics (model drift, data anomalies).
+ Define KPIs to monitor system health, performance, and adoption across AI workloads.
+ Optimize cloud cost efficiency for GPU/TPU-intensive workloads using autoscaling, preemptible instances, and utilization monitoring.
**Collaboration & Enablement**
+ Partner with data scientists, ML engineers, and software teams to streamline GenAI application development and deployment.
+ Provide onboarding, documentation, and reusable templates to enable faster adoption of AI infrastructure.
+ Stay current with the latest advancements in GenAI, cloud-native infrastructure, and container orchestration.
**Required Qualifications**
**Education**
Bachelor's or master's degree in computer science, Software Engineering, or a related field.
**Experience**
+ **5+ years** of experience in cloud infrastructure engineering, **DevOps,** or platform engineering.
+ Experience with GenAI use cases (chatbots, content generation, code assistants, etc.).
+ Strong hands-on expertise with **Google Cloud Platform (GCP),** especially **Vertex** **AI.**
+ Experience with **IBM Watsonx for AI application** deployment and management.
+ Proven skills in **Docker, Kubernetes (GKE),** and container orchestration at scale.
+ Proficiency in **Python, Bash,** or other relevant scripting languages.
+ Strong understanding of cloud networking, IAM, and security best practices.
+ Experience with CI/CD tools (GitHub Actions, GitLab CI, Jenkins) and IaC tools (Terraform, Pulumi, Ansible, Deployment Manager).
+ Familiarity with data pipelines and integration tools (Dataflow, Apache Beam, Pub/Sub, Kafka).
+ Excellent problem-solving, debugging, and communication skills.
**Preferred Experience**
+ Experience in MLOps practices for model deployment, monitoring, and retraining.
+ Exposure to multi-cloud or hybrid cloud environments (GCP, AWS, Azure, on-prem).
+ Hands-on experience with feature stores (Vertex AI Feature Store, Feast) and ML observability tools (EvidentlyAI, Fiddler).
+ Knowledge of distributed training frameworks (Horovod, DeepSpeed, PyTorch Distributed).
+ Contributions to open-source projects in infrastructure, MLOps, or GenAI.
+ Experience managing infrastructure in regulated industries.
**Preferred Certifications:**
+ Google Cloud Certified - Professional Cloud Architect
+ Google Cloud Certified - Machine Learning Engineer
+ Certified Kubernetes Administrator (CKA) or Certified Kubernetes Application Developer (CKAD)
+ IBM Certified Watsonx Generative AI Engineer - Associate
+ IBM Certified Solution Architect - Cloud Pak for Data
+ Other relevant certifications in AI, Machine Learning, or Cloud-Native technologies.
**Employee Type:**
Permanent
UPS is committed to providing a workplace free of discrimination, harassment, and retaliation.
This advertiser has chosen not to accept applicants from your region.

GCP Infrastructure Engineer - Google Cloud, Terraform, Python, Bash, GKE, CI/CD

Chennai, Tamil Nadu UPS

Posted 1 day ago

Job Viewed

Tap Again To Close

Job Description

**Avant de postuler à un emploi, sélectionnez votre langue de préférence parmi les options disponibles en haut à droite de cette page.**
Découvrez votre prochaine opportunité au sein d'une organisation qui compte parmi les 500 plus importantes entreprises mondiales. Envisagez des opportunités innovantes, découvrez notre culture enrichissante et travaillez avec des équipes talentueuses qui vous poussent à vous développer chaque jour. Nous savons ce qu'il faut faire pour diriger UPS vers l'avenir : des personnes passionnées dotées d'une combinaison unique de compétences. Si vous avez les qualités, de la motivation, de l'autonomie ou le leadership pour diriger des équipes, il existe des postes adaptés à vos aspirations et à vos compétences d'aujourd'hui et de demain.
**Fiche de poste :**
Job Summary:
We are seeking a highly skilled GCP Infrastructure Engineer to design, build, and manage the cloud infrastructure that powers Generative AI (GenAI) applications at scale. In this role, you will leverage Google Cloud Platform (GCP) Vertex AI, IBM Watsonx, and containerization technologies such as Docker and Kubernetes (GKE) to deliver secure, scalable, and high-performance AI solutions. You will own the end-to-end infrastructure lifecycle - from design and provisioning to automation, monitoring, and optimization - while enabling data scientists and ML engineers to seamlessly deploy and operate GenAI workloads.
Key Responsibilities:
Cloud Infrastructure & Platform Engineering
+ Design, provision, and maintain scalable, secure, and cost-efficient infrastructure for GenAI applications on GCP.
+ Deploy and manage containerized workloads using Docker and Kubernetes (GKE).
+ Configure and optimize Vertex AI and IBM Watsonx platforms for training, fine-tuning, and serving LLMs and other generative models.
+ Implement high-performance GPU/TPU clusters to support distributed training and large-scale inference.
+ Ensure business continuity through backup, disaster recovery, and multi-region deployments.
Automation & Reliability
+ Develop and maintain Infrastructure as Code (IaC) templates with Terraform, or Cloud Deployment Manager.
+ Adopt GitOps practices (Flux) for infrastructure lifecycle management.
+ Build and optimize CI/CD pipelines for data pipelines, model workflows, and GenAI applications.
+ Apply SRE principles (SLIs, SLOs, SLAs) to guarantee platform reliability and uptime.
Security, Governance & Compliance
+ Embed DevSecOps best practices across the infrastructure lifecycle, including policy-as-code, vulnerability scanning, and secrets management.
+ Enforce identity and access management (IAM), network segmentation, and data encryption in compliance with standards (HIPAA, SOX, GDPR, FedRAMP).
+ Collaborate with enterprise security and compliance teams to implement governance frameworks for GenAI platforms.
Monitoring, Observability & Cost Optimization
+ Implement observability stacks (Prometheus, Grafana, Cloud Monitoring, Datadog) for both infra health and ML-specific metrics (model drift, data anomalies).
+ Define KPIs to monitor system health, performance, and adoption across AI workloads.
+ Optimize cloud cost efficiency for GPU/TPU-intensive workloads using autoscaling, preemptible instances, and utilization monitoring.
Collaboration & Enablement
+ Partner with data scientists, ML engineers, and software teams to streamline GenAI application development and deployment.
+ Provide onboarding, documentation, and reusable templates to enable faster adoption of AI infrastructure.
+ Stay current with the latest advancements in GenAI, cloud-native infrastructure, and container orchestration.
Required Education
Bachelor's or master's degree in computer science, Software Engineering, or a related field.
Required Experience
+ **8+ years** of experience in cloud infrastructure engineering, **DevOps,** or platform engineering.
+ Experience with GenAI use cases (chatbots, content generation, code assistants, etc.).
+ Strong hands-on expertise with **Google Cloud Platform (GCP),** especially **Vertex** **AI.**
+ Experience with **IBM Watsonx for AI application** deployment and management.
+ Proven skills in **Docker, Kubernetes (GKE),** and container orchestration at scale.
+ Proficiency in **Python, Bash,** or other relevant scripting languages.
+ Strong understanding of cloud networking, IAM, and security best practices.
+ Experience with CI/CD tools (GitHub Actions, GitLab CI, Jenkins) and IaC tools (Terraform, Pulumi, Ansible, Deployment Manager).
+ Familiarity with data pipelines and integration tools (Dataflow, Apache Beam, Pub/Sub, Kafka).
+ Excellent problem-solving, debugging, and communication skills.
Preferred Experience
+ Experience in MLOps practices for model deployment, monitoring, and retraining.
+ Exposure to multi-cloud or hybrid cloud environments (GCP, AWS, Azure, on-prem).
+ Hands-on experience with feature stores (Vertex AI Feature Store, Feast) and ML observability tools (EvidentlyAI, Fiddler).
+ Knowledge of distributed training frameworks (Horovod, DeepSpeed, PyTorch Distributed).
+ Contributions to open-source projects in infrastructure, MLOps, or GenAI.
+ Experience managing infrastructure in regulated industries.
Preferred Certifications:
+ Google Cloud Certified - Professional Cloud Architect
+ Google Cloud Certified - Machine Learning Engineer
+ Certified Kubernetes Administrator (CKA) or Certified Kubernetes Application Developer (CKAD)
+ IBM Certified Watsonx Generative AI Engineer - Associate
+ IBM Certified Solution Architect - Cloud Pak for Data
+ Other relevant certifications in AI, Machine Learning, or Cloud-Native technologies.
**Type de contrat:**
en CDI
_Chez UPS, égalité des chances, traitement équitable et environnement de travail inclusif sont des valeurs clefs auxquelles nous sommes attachés._
This advertiser has chosen not to accept applicants from your region.
 

Nearby Locations

Other Jobs Near Me

Industry

  1. request_quote Accounting
  2. work Administrative
  3. eco Agriculture Forestry
  4. smart_toy AI & Emerging Technologies
  5. school Apprenticeships & Trainee
  6. apartment Architecture
  7. palette Arts & Entertainment
  8. directions_car Automotive
  9. flight_takeoff Aviation
  10. account_balance Banking & Finance
  11. local_florist Beauty & Wellness
  12. restaurant Catering
  13. volunteer_activism Charity & Voluntary
  14. science Chemical Engineering
  15. child_friendly Childcare
  16. foundation Civil Engineering
  17. clean_hands Cleaning & Sanitation
  18. diversity_3 Community & Social Care
  19. construction Construction
  20. brush Creative & Digital
  21. currency_bitcoin Crypto & Blockchain
  22. support_agent Customer Service & Helpdesk
  23. medical_services Dental
  24. medical_services Driving & Transport
  25. medical_services E Commerce & Social Media
  26. school Education & Teaching
  27. electrical_services Electrical Engineering
  28. bolt Energy
  29. local_mall Fmcg
  30. gavel Government & Non Profit
  31. emoji_events Graduate
  32. health_and_safety Healthcare
  33. beach_access Hospitality & Tourism
  34. groups Human Resources
  35. precision_manufacturing Industrial Engineering
  36. security Information Security
  37. handyman Installation & Maintenance
  38. policy Insurance
  39. code IT & Software
  40. gavel Legal
  41. sports_soccer Leisure & Sports
  42. inventory_2 Logistics & Warehousing
  43. supervisor_account Management
  44. supervisor_account Management Consultancy
  45. supervisor_account Manufacturing & Production
  46. campaign Marketing
  47. build Mechanical Engineering
  48. perm_media Media & PR
  49. local_hospital Medical
  50. local_hospital Military & Public Safety
  51. local_hospital Mining
  52. medical_services Nursing
  53. local_gas_station Oil & Gas
  54. biotech Pharmaceutical
  55. checklist_rtl Project Management
  56. shopping_bag Purchasing
  57. home_work Real Estate
  58. person_search Recruitment Consultancy
  59. store Retail
  60. point_of_sale Sales
  61. science Scientific Research & Development
  62. wifi Telecoms
  63. psychology Therapy
  64. pets Veterinary
View All Terraform Jobs View All Jobs in Chennai