11 Bash jobs in India
Bash Engineer
Posted today
Job Description
About Company:
Our client is a Palo Alto–based AI infrastructure and talent platform founded in 2018. It helps companies connect with remote software developers using AI-powered vetting and matching technology. Originally branded as the “Intelligent Talent Cloud”, it enabled companies to “spin up their engineering dream team in the cloud” by sourcing and managing vetted global talent.
In recent years, they have evolved to support AI infrastructure and AGI workflows, offering services in model training, fine-tuning, and deployment—powered by their internal AI platform, ALAN, and backed by a vast talent network. They reported $300 million in revenue and reached profitability. Their growth is driven by demand for annotated training data from AI labs, including major clients like OpenAI, Google, Anthropic, and Meta.
Job Title: Bash Engineer
Location: Pan India
Experience: 3+ yrs.
Employment Type: Contract to hire
Work Mode: Remote
Notice Period: Immediate joiners
What does day-to-day look like:
- Develop, maintain, and optimize Bash scripts for automation, deployment, monitoring, and system orchestration.
- Automate system-level operations — environment setup, build configuration, log collection, and service management.
- Collaborate with cross-functional teams to integrate Bash automation.
- Develop high-quality software solutions and comprehensive test suites, ensuring code robustness, correctness, and coverage across edge cases.
- Craft clear, unambiguous technical specifications and problem statements, balancing creativity and engineering precision.
- Apply structured metadata and documentation to development tasks, capturing taxonomy, difficulty, domain relevance, and reliability metrics.
- Participate in peer reviews and quality assurance processes to uphold rigorous engineering standards and system consistency.
- Deliver work in a maintainable, modular, and scalable format ready for production integration or downstream application use.
Requirements:
- 3+ years of experience in writing Bash scripts for automation, process orchestration, and system-level tasks.
- Excellent troubleshooting and debugging skills in complex multi-system environments.
- Demonstrated ability to write clean, modular, and reusable scripts following best practices — including parameterization, error handling, logging, and exit codes.
- Deep experience integrating Bash scripts into CI/CD pipelines (e.g., Jenkins, GitHub Actions, GitLab CI/CD, CircleCI, or Azure DevOps).
- Strong familiarity with containerization and virtualization tools (Docker, Podman, or Kubernetes CLI) for environment automation and build consistency.
- Understanding of secure scripting practices, including safe handling of credentials, environment variables, and sensitive data.
- Excellent problem-solving, debugging, and performance optimization skills in multi-environment setups.
- Strong written and verbal communication skills, with the ability to produce clear documentation and explain automation workflows.
- Experience with test automation, benchmark creation, or complex systems evaluation is a strong plus.
- Familiarity with modern software data formats (e.g., JSON, YAML) and version-controlled codebases.
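For candidates gauging what the scripting practices named in the requirements (parameterization, error handling, logging, and exit codes) might look like in practice, here is a minimal sketch. It is not taken from the employer's codebase: the function names `log` and `collect_logs`, the `EXIT_USAGE` value, and the log-collection task itself are hypothetical and chosen only for illustration.

```shell
#!/usr/bin/env bash
# Illustrative sketch only: all names and exit codes below are assumptions.
set -euo pipefail   # stop on errors, unset variables, and failed pipeline stages

EXIT_USAGE=64       # assumed convention for "wrong arguments"

# log LEVEL MESSAGE...: write a timestamped line to stderr so stdout stays clean.
log() {
  local level="$1"; shift
  printf '%s [%s] %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$level" "$*" >&2
}

# collect_logs SRC_DIR DEST_DIR: copy *.log files and print how many were copied.
collect_logs() {
  if [[ $# -ne 2 ]]; then
    log ERROR "usage: collect_logs SRC_DIR DEST_DIR"
    return "$EXIT_USAGE"
  fi
  local src="$1" dest="$2" count=0 file
  if [[ ! -d "$src" ]]; then
    log ERROR "source directory not found: $src"
    return 1
  fi
  mkdir -p -- "$dest"
  for file in "$src"/*.log; do
    [[ -e "$file" ]] || continue   # the glob matched nothing
    cp -- "$file" "$dest"/
    count=$((count + 1))
  done
  log INFO "copied $count file(s) from $src to $dest"
  printf '%s\n' "$count"
}
```

A caller could run `collect_logs /var/log/myapp ./artifacts` (paths are placeholders) and branch on the return status. Keeping diagnostics on stderr and the result on stdout lets the function be used inside command substitution in a CI step.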
Bash Engineer
Posted today
Job Description
About Company:
Our client is a Palo Alto–based AI infrastructure and talent platform founded in 2018. It helps companies connect with remote software developers using AI-powered vetting and matching technology. Originally branded as the “Intelligent Talent Cloud”, it enabled companies to “spin up their engineering dream team in the cloud” by sourcing and managing vetted global talent.
In recent years, they have evolved to support AI infrastructure and AGI workflows, offering services in model training, fine-tuning, and deployment—powered by their internal AI platform, ALAN, and backed by a vast talent network. They reported $300 million in revenue and reached profitability. Their growth is driven by demand for annotated training data from AI labs, including major clients like OpenAI, Google, Anthropic, and Meta.
Job Title: Bash Engineer
Location: Pan India
Experience: 3+ yrs.
Employment Type: Contract to hire
Work Mode: Remote
Notice Period: Immediate joiners
Requirements:
- 3+ years of experience in writing Bash scripts for automation, process orchestration, and system-level tasks.
- Excellent troubleshooting and debugging skills in complex multi-system environments.
- Demonstrated ability to write clean, modular, and reusable scripts following best practices — including parameterisation, error handling, logging, and exit codes.
- Deep experience integrating Bash scripts into CI/CD pipelines (e.g., Jenkins, GitHub Actions, GitLab CI/CD, CircleCI, or Azure DevOps).
- Strong familiarity with containerisation and virtualisation tools (Docker, Podman, or Kubernetes CLI) for environment automation and build consistency.
- Understanding of secure scripting practices, including safe handling of credentials, environment variables, and sensitive data.
- Excellent problem-solving, debugging, and performance optimisation skills in multi-environment setups.
- Strong written and verbal communication skills, with the ability to produce clear documentation and explain automation workflows.
- Experience with test automation, benchmark creation, or complex systems evaluation is a strong plus.
- Familiarity with modern software data formats (e.g., JSON, YAML) and version-controlled codebases.
GCP Infrastructure Engineer - Google Cloud, Terraform, Python, Bash, GKE, CI/CD
Posted 2 days ago
Job Description
Discover your next opportunity at an organization that ranks among the world's 500 largest companies. Envision innovative opportunities, experience our rewarding culture, and work with talented teams that push you to grow every day. We know what it takes to lead UPS into the future: passionate people with a unique combination of skills. If you have the qualities, motivation, autonomy, or leadership to lead teams, there are roles suited to your aspirations and your skills, today and tomorrow.
**Job Description:**
Job Summary:
We are seeking a highly skilled GCP Infrastructure Engineer to design, build, and manage the cloud infrastructure that powers Generative AI (GenAI) applications at scale. In this role, you will leverage Google Cloud Platform (GCP) Vertex AI, IBM Watsonx, and containerization technologies such as Docker and Kubernetes (GKE) to deliver secure, scalable, and high-performance AI solutions. You will own the end-to-end infrastructure lifecycle - from design and provisioning to automation, monitoring, and optimization - while enabling data scientists and ML engineers to seamlessly deploy and operate GenAI workloads.
Key Responsibilities:
Cloud Infrastructure & Platform Engineering
+ Design, provision, and maintain scalable, secure, and cost-efficient infrastructure for GenAI applications on GCP.
+ Deploy and manage containerized workloads using Docker and Kubernetes (GKE).
+ Configure and optimize Vertex AI and IBM Watsonx platforms for training, fine-tuning, and serving LLMs and other generative models.
+ Implement high-performance GPU/TPU clusters to support distributed training and large-scale inference.
+ Ensure business continuity through backup, disaster recovery, and multi-region deployments.
Automation & Reliability
+ Develop and maintain Infrastructure as Code (IaC) templates with Terraform or Cloud Deployment Manager.
+ Adopt GitOps practices (Flux) for infrastructure lifecycle management.
+ Build and optimize CI/CD pipelines for data pipelines, model workflows, and GenAI applications.
+ Apply SRE principles (SLIs, SLOs, SLAs) to guarantee platform reliability and uptime.
Security, Governance & Compliance
+ Embed DevSecOps best practices across the infrastructure lifecycle, including policy-as-code, vulnerability scanning, and secrets management.
+ Enforce identity and access management (IAM), network segmentation, and data encryption in compliance with standards (HIPAA, SOX, GDPR, FedRAMP).
+ Collaborate with enterprise security and compliance teams to implement governance frameworks for GenAI platforms.
Monitoring, Observability & Cost Optimization
+ Implement observability stacks (Prometheus, Grafana, Cloud Monitoring, Datadog) for both infra health and ML-specific metrics (model drift, data anomalies).
+ Define KPIs to monitor system health, performance, and adoption across AI workloads.
+ Optimize cloud cost efficiency for GPU/TPU-intensive workloads using autoscaling, preemptible instances, and utilization monitoring.
Collaboration & Enablement
+ Partner with data scientists, ML engineers, and software teams to streamline GenAI application development and deployment.
+ Provide onboarding, documentation, and reusable templates to enable faster adoption of AI infrastructure.
+ Stay current with the latest advancements in GenAI, cloud-native infrastructure, and container orchestration.
Required Education
Bachelor's or master's degree in Computer Science, Software Engineering, or a related field.
Required Experience
+ **5+ years** of experience in cloud infrastructure engineering, **DevOps,** or platform engineering.
+ Experience with GenAI use cases (chatbots, content generation, code assistants, etc.).
+ Strong hands-on expertise with **Google Cloud Platform (GCP),** especially **Vertex AI.**
+ Experience with **IBM Watsonx** for AI application deployment and management.
+ Proven skills in **Docker, Kubernetes (GKE),** and container orchestration at scale.
+ Proficiency in **Python, Bash,** or other relevant scripting languages.
+ Strong understanding of cloud networking, IAM, and security best practices.
+ Experience with CI/CD tools (GitHub Actions, GitLab CI, Jenkins) and IaC tools (Terraform, Pulumi, Ansible, Deployment Manager).
+ Familiarity with data pipelines and integration tools (Dataflow, Apache Beam, Pub/Sub, Kafka).
+ Excellent problem-solving, debugging, and communication skills.
Preferred Experience
+ Experience in MLOps practices for model deployment, monitoring, and retraining.
+ Exposure to multi-cloud or hybrid cloud environments (GCP, AWS, Azure, on-prem).
+ Hands-on experience with feature stores (Vertex AI Feature Store, Feast) and ML observability tools (EvidentlyAI, Fiddler).
+ Knowledge of distributed training frameworks (Horovod, DeepSpeed, PyTorch Distributed).
+ Contributions to open-source projects in infrastructure, MLOps, or GenAI.
+ Experience managing infrastructure in regulated industries.
Preferred Certifications:
+ Google Cloud Certified - Professional Cloud Architect
+ Google Cloud Certified - Machine Learning Engineer
+ Certified Kubernetes Administrator (CKA) or Certified Kubernetes Application Developer (CKAD)
+ IBM Certified Watsonx Generative AI Engineer - Associate
+ IBM Certified Solution Architect - Cloud Pak for Data
+ Other relevant certifications in AI, Machine Learning, or Cloud-Native technologies.
**Contract Type:**
Permanent (CDI)
_At UPS, equal opportunity, fair treatment, and an inclusive work environment are key values to which we are committed._
GCP Infrastructure Engineer - Google Cloud, Terraform, Python, Bash, GKE, CI/CD
Posted 2 days ago
Job Description
Explore your next opportunity at a Fortune Global 500 organization. Envision innovative possibilities, experience our rewarding culture, and work with talented teams that help you become better every day. We know what it takes to lead UPS into tomorrow: people with a unique combination of skill + passion. If you have the qualities and drive to lead yourself or teams, there are roles ready to cultivate your skills and take you to the next level.
**Job Description:**
Job Summary:
We are seeking a highly skilled GCP Infrastructure Engineer to design, build, and manage the cloud infrastructure that powers Generative AI (GenAI) applications at scale. In this role, you will leverage Google Cloud Platform (GCP) Vertex AI, IBM Watsonx, and containerization technologies such as Docker and Kubernetes (GKE) to deliver secure, scalable, and high-performance AI solutions. You will own the end-to-end infrastructure lifecycle - from design and provisioning to automation, monitoring, and optimization - while enabling data scientists and ML engineers to seamlessly deploy and operate GenAI workloads.
Key Responsibilities:
Cloud Infrastructure & Platform Engineering
+ Design, provision, and maintain scalable, secure, and cost-efficient infrastructure for GenAI applications on GCP.
+ Deploy and manage containerized workloads using Docker and Kubernetes (GKE).
+ Configure and optimize Vertex AI and IBM Watsonx platforms for training, fine-tuning, and serving LLMs and other generative models.
+ Implement high-performance GPU/TPU clusters to support distributed training and large-scale inference.
+ Ensure business continuity through backup, disaster recovery, and multi-region deployments.
Automation & Reliability
+ Develop and maintain Infrastructure as Code (IaC) templates with Terraform or Cloud Deployment Manager.
+ Adopt GitOps practices (Flux) for infrastructure lifecycle management.
+ Build and optimize CI/CD pipelines for data pipelines, model workflows, and GenAI applications.
+ Apply SRE principles (SLIs, SLOs, SLAs) to guarantee platform reliability and uptime.
Security, Governance & Compliance
+ Embed DevSecOps best practices across the infrastructure lifecycle, including policy-as-code, vulnerability scanning, and secrets management.
+ Enforce identity and access management (IAM), network segmentation, and data encryption in compliance with standards (HIPAA, SOX, GDPR, FedRAMP).
+ Collaborate with enterprise security and compliance teams to implement governance frameworks for GenAI platforms.
Monitoring, Observability & Cost Optimization
+ Implement observability stacks (Prometheus, Grafana, Cloud Monitoring, Datadog) for both infra health and ML-specific metrics (model drift, data anomalies).
+ Define KPIs to monitor system health, performance, and adoption across AI workloads.
+ Optimize cloud cost efficiency for GPU/TPU-intensive workloads using autoscaling, preemptible instances, and utilization monitoring.
Collaboration & Enablement
+ Partner with data scientists, ML engineers, and software teams to streamline GenAI application development and deployment.
+ Provide onboarding, documentation, and reusable templates to enable faster adoption of AI infrastructure.
+ Stay current with the latest advancements in GenAI, cloud-native infrastructure, and container orchestration.
Required Education
Bachelor's or master's degree in Computer Science, Software Engineering, or a related field.
Required Experience
+ **5+ years** of experience in cloud infrastructure engineering, **DevOps,** or platform engineering.
+ Experience with GenAI use cases (chatbots, content generation, code assistants, etc.).
+ Strong hands-on expertise with **Google Cloud Platform (GCP),** especially **Vertex AI.**
+ Experience with **IBM Watsonx** for AI application deployment and management.
+ Proven skills in **Docker, Kubernetes (GKE),** and container orchestration at scale.
+ Proficiency in **Python, Bash,** or other relevant scripting languages.
+ Strong understanding of cloud networking, IAM, and security best practices.
+ Experience with CI/CD tools (GitHub Actions, GitLab CI, Jenkins) and IaC tools (Terraform, Pulumi, Ansible, Deployment Manager).
+ Familiarity with data pipelines and integration tools (Dataflow, Apache Beam, Pub/Sub, Kafka).
+ Excellent problem-solving, debugging, and communication skills.
Preferred Experience
+ Experience in MLOps practices for model deployment, monitoring, and retraining.
+ Exposure to multi-cloud or hybrid cloud environments (GCP, AWS, Azure, on-prem).
+ Hands-on experience with feature stores (Vertex AI Feature Store, Feast) and ML observability tools (EvidentlyAI, Fiddler).
+ Knowledge of distributed training frameworks (Horovod, DeepSpeed, PyTorch Distributed).
+ Contributions to open-source projects in infrastructure, MLOps, or GenAI.
+ Experience managing infrastructure in regulated industries.
Preferred Certifications:
+ Google Cloud Certified - Professional Cloud Architect
+ Google Cloud Certified - Machine Learning Engineer
+ Certified Kubernetes Administrator (CKA) or Certified Kubernetes Application Developer (CKAD)
+ IBM Certified Watsonx Generative AI Engineer - Associate
+ IBM Certified Solution Architect - Cloud Pak for Data
+ Other relevant certifications in AI, Machine Learning, or Cloud-Native technologies.
**Employee Type:**
Permanent
UPS is committed to providing a workplace free of discrimination, harassment, and retaliation.
GCP Infrastructure Engineer - Google Cloud, Terraform, Python, Bash, GKE, CI/CD
Posted 2 days ago
Job Description
Discover your next opportunity at an organization that ranks among the world's 500 largest companies. Envision innovative opportunities, experience our rewarding culture, and work with talented teams that push you to grow every day. We know what it takes to lead UPS into the future: passionate people with a unique combination of skills. If you have the qualities, motivation, autonomy, or leadership to lead teams, there are roles suited to your aspirations and your skills, today and tomorrow.
**Job Description:**
**Job Summary:**
We are seeking a highly skilled GCP Infrastructure Engineer to design, build, and manage the cloud infrastructure that powers Generative AI (GenAI) applications at scale. In this role, you will leverage Google Cloud Platform (GCP) Vertex AI, IBM Watsonx, and containerization technologies such as Docker and Kubernetes (GKE) to deliver secure, scalable, and high-performance AI solutions. You will own the end-to-end infrastructure lifecycle - from design and provisioning to automation, monitoring, and optimization - while enabling data scientists and ML engineers to seamlessly deploy and operate GenAI workloads.
**Key Responsibilities:**
**Cloud Infrastructure & Platform Engineering**
+ Design, provision, and maintain scalable, secure, and cost-efficient infrastructure for GenAI applications on GCP.
+ Deploy and manage containerized workloads using Docker and Kubernetes (GKE).
+ Configure and optimize Vertex AI and IBM Watsonx platforms for training, fine-tuning, and serving LLMs and other generative models.
+ Implement high-performance GPU/TPU clusters to support distributed training and large-scale inference.
+ Ensure business continuity through backup, disaster recovery, and multi-region deployments.
**Automation & Reliability**
+ Develop and maintain Infrastructure as Code (IaC) templates with Terraform or Cloud Deployment Manager.
+ Adopt GitOps practices (Flux) for infrastructure lifecycle management.
+ Build and optimize CI/CD pipelines for data pipelines, model workflows, and GenAI applications.
+ Apply SRE principles (SLIs, SLOs, SLAs) to guarantee platform reliability and uptime.
**Security, Governance & Compliance**
+ Embed DevSecOps best practices across the infrastructure lifecycle, including policy-as-code, vulnerability scanning, and secrets management.
+ Enforce identity and access management (IAM), network segmentation, and data encryption in compliance with standards (HIPAA, SOX, GDPR, FedRAMP).
+ Collaborate with enterprise security and compliance teams to implement governance frameworks for GenAI platforms.
**Monitoring, Observability & Cost Optimization**
+ Implement observability stacks (Prometheus, Grafana, Cloud Monitoring, Datadog) for both infra health and ML-specific metrics (model drift, data anomalies).
+ Define KPIs to monitor system health, performance, and adoption across AI workloads.
+ Optimize cloud cost efficiency for GPU/TPU-intensive workloads using autoscaling, preemptible instances, and utilization monitoring.
**Collaboration & Enablement**
+ Partner with data scientists, ML engineers, and software teams to streamline GenAI application development and deployment.
+ Provide onboarding, documentation, and reusable templates to enable faster adoption of AI infrastructure.
+ Stay current with the latest advancements in GenAI, cloud-native infrastructure, and container orchestration.
**Required Qualifications**
**Education**
Bachelor's or master's degree in computer science, software engineering, or a related field.
**Experience**
+ **5+ years** of experience in cloud infrastructure engineering, **DevOps,** or platform engineering.
+ Experience with GenAI use cases (chatbots, content generation, code assistants, etc.).
+ Strong hands-on expertise with **Google Cloud Platform (GCP),** especially **Vertex AI.**
+ Experience with **IBM Watsonx for AI application** deployment and management.
+ Proven skills in **Docker, Kubernetes (GKE),** and container orchestration at scale.
+ Proficiency in **Python, Bash,** or other relevant scripting languages.
+ Strong understanding of cloud networking, IAM, and security best practices.
+ Experience with CI/CD tools (GitHub Actions, GitLab CI, Jenkins) and IaC tools (Terraform, Pulumi, Ansible, Deployment Manager).
+ Familiarity with data pipelines and integration tools (Dataflow, Apache Beam, Pub/Sub, Kafka).
+ Excellent problem-solving, debugging, and communication skills.
**Preferred Experience**
+ Experience in MLOps practices for model deployment, monitoring, and retraining.
+ Exposure to multi-cloud or hybrid cloud environments (GCP, AWS, Azure, on-prem).
+ Hands-on experience with feature stores (Vertex AI Feature Store, Feast) and ML observability tools (EvidentlyAI, Fiddler).
+ Knowledge of distributed training frameworks (Horovod, DeepSpeed, PyTorch Distributed).
+ Contributions to open-source projects in infrastructure, MLOps, or GenAI.
+ Experience managing infrastructure in regulated industries.
**Preferred Certifications:**
+ Google Cloud Certified - Professional Cloud Architect
+ Google Cloud Certified - Machine Learning Engineer
+ Certified Kubernetes Administrator (CKA) or Certified Kubernetes Application Developer (CKAD)
+ IBM Certified Watsonx Generative AI Engineer - Associate
+ IBM Certified Solution Architect - Cloud Pak for Data
+ Other relevant certifications in AI, Machine Learning, or Cloud-Native technologies.
**Contract Type:**
Permanent contract (CDI)
_At UPS, equal opportunity, fair treatment, and an inclusive work environment are key values to which we are committed._
GCP Infrastructure Engineer - Google Cloud, Terraform, Python, Bash, GKE, CI/CD
Posted 2 days ago
Job Description
Explore your next opportunity at a Fortune Global 500 organization. Envision innovative possibilities, experience our rewarding culture, and work with talented teams that help you become better every day. We know what it takes to lead UPS into tomorrow: people with a unique combination of skill + passion. If you have the qualities and drive to lead yourself or teams, there are roles ready to cultivate your skills and take you to the next level.
**Job Description:**
**Job Summary:**
We are seeking a highly skilled GCP Infrastructure Engineer to design, build, and manage the cloud infrastructure that powers Generative AI (GenAI) applications at scale. In this role, you will leverage Google Cloud Platform (GCP) Vertex AI, IBM Watsonx, and containerization technologies such as Docker and Kubernetes (GKE) to deliver secure, scalable, and high-performance AI solutions. You will own the end-to-end infrastructure lifecycle - from design and provisioning to automation, monitoring, and optimization - while enabling data scientists and ML engineers to seamlessly deploy and operate GenAI workloads.
**Key Responsibilities:**
**Cloud Infrastructure & Platform Engineering**
+ Design, provision, and maintain scalable, secure, and cost-efficient infrastructure for GenAI applications on GCP.
+ Deploy and manage containerized workloads using Docker and Kubernetes (GKE).
+ Configure and optimize Vertex AI and IBM Watsonx platforms for training, fine-tuning, and serving LLMs and other generative models.
+ Implement high-performance GPU/TPU clusters to support distributed training and large-scale inference.
+ Ensure business continuity through backup, disaster recovery, and multi-region deployments.
**Automation & Reliability**
+ Develop and maintain Infrastructure as Code (IaC) templates with Terraform or Cloud Deployment Manager.
+ Adopt GitOps practices (Flux) for infrastructure lifecycle management.
+ Build and optimize CI/CD pipelines for data pipelines, model workflows, and GenAI applications.
+ Apply SRE principles (SLIs, SLOs, SLAs) to guarantee platform reliability and uptime.
**Security, Governance & Compliance**
+ Embed DevSecOps best practices across the infrastructure lifecycle, including policy-as-code, vulnerability scanning, and secrets management.
+ Enforce identity and access management (IAM), network segmentation, and data encryption in compliance with standards (HIPAA, SOX, GDPR, FedRAMP).
+ Collaborate with enterprise security and compliance teams to implement governance frameworks for GenAI platforms.
**Monitoring, Observability & Cost Optimization**
+ Implement observability stacks (Prometheus, Grafana, Cloud Monitoring, Datadog) for both infra health and ML-specific metrics (model drift, data anomalies).
+ Define KPIs to monitor system health, performance, and adoption across AI workloads.
+ Optimize cloud cost efficiency for GPU/TPU-intensive workloads using autoscaling, preemptible instances, and utilization monitoring.
**Collaboration & Enablement**
+ Partner with data scientists, ML engineers, and software teams to streamline GenAI application development and deployment.
+ Provide onboarding, documentation, and reusable templates to enable faster adoption of AI infrastructure.
+ Stay current with the latest advancements in GenAI, cloud-native infrastructure, and container orchestration.
**Required Qualifications**
**Education**
Bachelor's or master's degree in computer science, software engineering, or a related field.
**Experience**
+ **5+ years** of experience in cloud infrastructure engineering, **DevOps,** or platform engineering.
+ Experience with GenAI use cases (chatbots, content generation, code assistants, etc.).
+ Strong hands-on expertise with **Google Cloud Platform (GCP),** especially **Vertex AI.**
+ Experience with **IBM Watsonx for AI application** deployment and management.
+ Proven skills in **Docker, Kubernetes (GKE),** and container orchestration at scale.
+ Proficiency in **Python, Bash,** or other relevant scripting languages.
+ Strong understanding of cloud networking, IAM, and security best practices.
+ Experience with CI/CD tools (GitHub Actions, GitLab CI, Jenkins) and IaC tools (Terraform, Pulumi, Ansible, Deployment Manager).
+ Familiarity with data pipelines and integration tools (Dataflow, Apache Beam, Pub/Sub, Kafka).
+ Excellent problem-solving, debugging, and communication skills.
**Preferred Experience**
+ Experience in MLOps practices for model deployment, monitoring, and retraining.
+ Exposure to multi-cloud or hybrid cloud environments (GCP, AWS, Azure, on-prem).
+ Hands-on experience with feature stores (Vertex AI Feature Store, Feast) and ML observability tools (EvidentlyAI, Fiddler).
+ Knowledge of distributed training frameworks (Horovod, DeepSpeed, PyTorch Distributed).
+ Contributions to open-source projects in infrastructure, MLOps, or GenAI.
+ Experience managing infrastructure in regulated industries.
**Preferred Certifications:**
+ Google Cloud Certified - Professional Cloud Architect
+ Google Cloud Certified - Machine Learning Engineer
+ Certified Kubernetes Administrator (CKA) or Certified Kubernetes Application Developer (CKAD)
+ IBM Certified Watsonx Generative AI Engineer - Associate
+ IBM Certified Solution Architect - Cloud Pak for Data
+ Other relevant certifications in AI, Machine Learning, or Cloud-Native technologies.
**Employee Type:**
Permanent
UPS is committed to providing a workplace free of discrimination, harassment, and retaliation.
GCP Infrastructure Engineer - Google Cloud, Terraform, Python, Bash, GKE, CI/CD
Posted 2 days ago
Job Description
Discover your next opportunity at an organization that ranks among the 500 largest companies in the world. Envision innovative opportunities, experience our rewarding culture, and work with talented teams that push you to grow every day. We know what it takes to lead UPS into the future: passionate people with a unique combination of skills. If you have the qualities, motivation, autonomy, or leadership to lead teams, there are roles suited to your aspirations and to your skills, today and tomorrow.
**Job Description:**
Job Summary:
We are seeking a highly skilled GCP Infrastructure Engineer to design, build, and manage the cloud infrastructure that powers Generative AI (GenAI) applications at scale. In this role, you will leverage Google Cloud Platform (GCP) Vertex AI, IBM Watsonx, and containerization technologies such as Docker and Kubernetes (GKE) to deliver secure, scalable, and high-performance AI solutions. You will own the end-to-end infrastructure lifecycle - from design and provisioning to automation, monitoring, and optimization - while enabling data scientists and ML engineers to seamlessly deploy and operate GenAI workloads.
Key Responsibilities:
Cloud Infrastructure & Platform Engineering
+ Design, provision, and maintain scalable, secure, and cost-efficient infrastructure for GenAI applications on GCP.
+ Deploy and manage containerized workloads using Docker and Kubernetes (GKE).
+ Configure and optimize Vertex AI and IBM Watsonx platforms for training, fine-tuning, and serving LLMs and other generative models.
+ Implement high-performance GPU/TPU clusters to support distributed training and large-scale inference.
+ Ensure business continuity through backup, disaster recovery, and multi-region deployments.
Automation & Reliability
+ Develop and maintain Infrastructure as Code (IaC) templates with Terraform or Cloud Deployment Manager.
+ Adopt GitOps practices (Flux) for infrastructure lifecycle management.
+ Build and optimize CI/CD pipelines for data pipelines, model workflows, and GenAI applications.
+ Apply SRE principles (SLIs, SLOs, SLAs) to guarantee platform reliability and uptime.
Security, Governance & Compliance
+ Embed DevSecOps best practices across the infrastructure lifecycle, including policy-as-code, vulnerability scanning, and secrets management.
+ Enforce identity and access management (IAM), network segmentation, and data encryption in compliance with standards (HIPAA, SOX, GDPR, FedRAMP).
+ Collaborate with enterprise security and compliance teams to implement governance frameworks for GenAI platforms.
Monitoring, Observability & Cost Optimization
+ Implement observability stacks (Prometheus, Grafana, Cloud Monitoring, Datadog) for both infra health and ML-specific metrics (model drift, data anomalies).
+ Define KPIs to monitor system health, performance, and adoption across AI workloads.
+ Optimize cloud cost efficiency for GPU/TPU-intensive workloads using autoscaling, preemptible instances, and utilization monitoring.
Collaboration & Enablement
+ Partner with data scientists, ML engineers, and software teams to streamline GenAI application development and deployment.
+ Provide onboarding, documentation, and reusable templates to enable faster adoption of AI infrastructure.
+ Stay current with the latest advancements in GenAI, cloud-native infrastructure, and container orchestration.
Required Education
Bachelor's or master's degree in computer science, software engineering, or a related field.
Required Experience
+ **8+ years** of experience in cloud infrastructure engineering, **DevOps,** or platform engineering.
+ Experience with GenAI use cases (chatbots, content generation, code assistants, etc.).
+ Strong hands-on expertise with **Google Cloud Platform (GCP),** especially **Vertex AI.**
+ Experience with **IBM Watsonx for AI application** deployment and management.
+ Proven skills in **Docker, Kubernetes (GKE),** and container orchestration at scale.
+ Proficiency in **Python, Bash,** or other relevant scripting languages.
+ Strong understanding of cloud networking, IAM, and security best practices.
+ Experience with CI/CD tools (GitHub Actions, GitLab CI, Jenkins) and IaC tools (Terraform, Pulumi, Ansible, Deployment Manager).
+ Familiarity with data pipelines and integration tools (Dataflow, Apache Beam, Pub/Sub, Kafka).
+ Excellent problem-solving, debugging, and communication skills.
Preferred Experience
+ Experience in MLOps practices for model deployment, monitoring, and retraining.
+ Exposure to multi-cloud or hybrid cloud environments (GCP, AWS, Azure, on-prem).
+ Hands-on experience with feature stores (Vertex AI Feature Store, Feast) and ML observability tools (EvidentlyAI, Fiddler).
+ Knowledge of distributed training frameworks (Horovod, DeepSpeed, PyTorch Distributed).
+ Contributions to open-source projects in infrastructure, MLOps, or GenAI.
+ Experience managing infrastructure in regulated industries.
Preferred Certifications:
+ Google Cloud Certified - Professional Cloud Architect
+ Google Cloud Certified - Machine Learning Engineer
+ Certified Kubernetes Administrator (CKA) or Certified Kubernetes Application Developer (CKAD)
+ IBM Certified Watsonx Generative AI Engineer - Associate
+ IBM Certified Solution Architect - Cloud Pak for Data
+ Other relevant certifications in AI, Machine Learning, or Cloud-Native technologies.
**Contract Type:**
Permanent contract (CDI)
_At UPS, equal opportunity, fair treatment, and an inclusive work environment are key values to which we are committed._
GCP Infrastructure Engineer - Google Cloud, Terraform, Python, Bash, GKE, CI/CD
Posted 2 days ago
Job Description
Explore your next opportunity at a Fortune Global 500 organization. Envision innovative possibilities, experience our rewarding culture, and work with talented teams that help you become better every day. We know what it takes to lead UPS into tomorrow: people with a unique combination of skill + passion. If you have the qualities and drive to lead yourself or teams, there are roles ready to cultivate your skills and take you to the next level.
**Job Description:**
Job Summary:
We are seeking a highly skilled GCP Infrastructure Engineer to design, build, and manage the cloud infrastructure that powers Generative AI (GenAI) applications at scale. In this role, you will leverage Google Cloud Platform (GCP) Vertex AI, IBM Watsonx, and containerization technologies such as Docker and Kubernetes (GKE) to deliver secure, scalable, and high-performance AI solutions. You will own the end-to-end infrastructure lifecycle - from design and provisioning to automation, monitoring, and optimization - while enabling data scientists and ML engineers to seamlessly deploy and operate GenAI workloads.
Key Responsibilities:
Cloud Infrastructure & Platform Engineering
+ Design, provision, and maintain scalable, secure, and cost-efficient infrastructure for GenAI applications on GCP.
+ Deploy and manage containerized workloads using Docker and Kubernetes (GKE).
+ Configure and optimize Vertex AI and IBM Watsonx platforms for training, fine-tuning, and serving LLMs and other generative models.
+ Implement high-performance GPU/TPU clusters to support distributed training and large-scale inference.
+ Ensure business continuity through backup, disaster recovery, and multi-region deployments.
Automation & Reliability
+ Develop and maintain Infrastructure as Code (IaC) templates with Terraform or Cloud Deployment Manager.
+ Adopt GitOps practices (Flux) for infrastructure lifecycle management.
+ Build and optimize CI/CD pipelines for data pipelines, model workflows, and GenAI applications.
+ Apply SRE principles (SLIs, SLOs, SLAs) to guarantee platform reliability and uptime.
Security, Governance & Compliance
+ Embed DevSecOps best practices across the infrastructure lifecycle, including policy-as-code, vulnerability scanning, and secrets management.
+ Enforce identity and access management (IAM), network segmentation, and data encryption in compliance with standards (HIPAA, SOX, GDPR, FedRAMP).
+ Collaborate with enterprise security and compliance teams to implement governance frameworks for GenAI platforms.
Monitoring, Observability & Cost Optimization
+ Implement observability stacks (Prometheus, Grafana, Cloud Monitoring, Datadog) for both infra health and ML-specific metrics (model drift, data anomalies).
+ Define KPIs to monitor system health, performance, and adoption across AI workloads.
+ Optimize cloud cost efficiency for GPU/TPU-intensive workloads using autoscaling, preemptible instances, and utilization monitoring.
Collaboration & Enablement
+ Partner with data scientists, ML engineers, and software teams to streamline GenAI application development and deployment.
+ Provide onboarding, documentation, and reusable templates to enable faster adoption of AI infrastructure.
+ Stay current with the latest advancements in GenAI, cloud-native infrastructure, and container orchestration.
Required Education
Bachelor's or master's degree in computer science, software engineering, or a related field.
Required Experience
+ **8+ years** of experience in cloud infrastructure engineering, **DevOps,** or platform engineering.
+ Experience with GenAI use cases (chatbots, content generation, code assistants, etc.).
+ Strong hands-on expertise with **Google Cloud Platform (GCP),** especially **Vertex AI.**
+ Experience with **IBM Watsonx for AI application** deployment and management.
+ Proven skills in **Docker, Kubernetes (GKE),** and container orchestration at scale.
+ Proficiency in **Python, Bash,** or other relevant scripting languages.
+ Strong understanding of cloud networking, IAM, and security best practices.
+ Experience with CI/CD tools (GitHub Actions, GitLab CI, Jenkins) and IaC tools (Terraform, Pulumi, Ansible, Deployment Manager).
+ Familiarity with data pipelines and integration tools (Dataflow, Apache Beam, Pub/Sub, Kafka).
+ Excellent problem-solving, debugging, and communication skills.
Preferred Experience
+ Experience in MLOps practices for model deployment, monitoring, and retraining.
+ Exposure to multi-cloud or hybrid cloud environments (GCP, AWS, Azure, on-prem).
+ Hands-on experience with feature stores (Vertex AI Feature Store, Feast) and ML observability tools (EvidentlyAI, Fiddler).
+ Knowledge of distributed training frameworks (Horovod, DeepSpeed, PyTorch Distributed).
+ Contributions to open-source projects in infrastructure, MLOps, or GenAI.
+ Experience managing infrastructure in regulated industries.
Preferred Certifications:
+ Google Cloud Certified - Professional Cloud Architect
+ Google Cloud Certified - Machine Learning Engineer
+ Certified Kubernetes Administrator (CKA) or Certified Kubernetes Application Developer (CKAD)
+ IBM Certified Watsonx Generative AI Engineer - Associate
+ IBM Certified Solution Architect - Cloud Pak for Data
+ Other relevant certifications in AI, Machine Learning, or Cloud-Native technologies.
**Employee Type:**
Permanent
UPS is committed to providing a workplace free of discrimination, harassment, and retaliation.