Overview:
Develop and manage scalable AI infrastructure strategies for cloud and on-prem environments to support enterprise AI workloads.
Key Responsibilities:
Architect infrastructure for AI training, inference, and deployment.
Design cost-efficient, secure, high-performance AI environments.
Implement MLOps pipelines and Infrastructure as Code (IaC).
Manage container orchestration with Kubernetes and maintain real-time monitoring.
Ensure seamless integration with enterprise IT systems.
Drive governance, compliance, and performance tuning.
Experience & Skills:
8+ years of experience in AI infrastructure or cloud computing.
Hands-on experience with Terraform, Ansible, Kubernetes, and Docker.
Deep experience with cloud platforms (AWS, Azure, GCP).
Proven track record of performance and cost optimization for AI systems.