AI DevOps Engineer
Wall Street Consulting Services LLC
Warren Township · New Jersey · United States
Contract
Job description
Job Title: AI DevOps Engineer
Location: Warren, NJ
Duration: Long term
Experience Required: 10+ Years
Industry: Insurance / Financial Services
Job Summary
We are seeking a highly skilled AI DevOps Engineer to support and enhance AI/ML platform operations, cloud infrastructure automation, CI/CD pipelines, and MLOps practices for MSIG. The ideal candidate will have strong expertise in DevOps, cloud platforms, containerization, infrastructure automation, and AI/ML deployment pipelines. This role will collaborate closely with Data Scientists, ML Engineers, Software Developers, and Infrastructure teams to operationalize scalable AI solutions.
Key Responsibilities
AI/ML Platform & MLOps
Design, implement, and maintain scalable AI/ML infrastructure and MLOps pipelines.
Automate model deployment, retraining, monitoring, and versioning processes.
Manage end-to-end ML lifecycle including model packaging, deployment, and production support.
Integrate ML workflows with CI/CD pipelines for seamless deployment.
Support model governance, monitoring, drift detection, and rollback mechanisms.
DevOps & Cloud Engineering
Build and manage CI/CD pipelines using Jenkins, GitHub Actions, GitLab CI/CD, or Azure DevOps.
Automate infrastructure provisioning using Terraform, CloudFormation, or ARM templates.
Manage containerized applications with Docker, and operate Kubernetes clusters on OpenShift, EKS, AKS, or GKE.
Implement Infrastructure as Code (IaC) and configuration management best practices.
Ensure high availability, scalability, and reliability of AI applications.
Cloud & Infrastructure
Work with cloud platforms such as AWS, Azure, or GCP.
Configure and maintain cloud-native AI services and compute resources.
Implement monitoring, logging, and alerting using tools such as Prometheus, Grafana, ELK, Datadog, or CloudWatch.
Optimize infrastructure performance and cloud costs.
Security & Compliance
Implement DevSecOps best practices for AI environments.
Ensure compliance with enterprise security standards and regulatory requirements.
Manage IAM, secrets management, vulnerability scanning, and container security.
Collaboration & Support
Collaborate with AI/ML teams to productionize machine learning models.
Troubleshoot deployment and infrastructure issues across environments.
Participate in architecture discussions and operational planning.
Provide production support and incident resolution.
Required Skills
Technical Skills
Strong experience with DevOps and MLOps practices.
Expertise in:
Docker
Kubernetes/OpenShift
Jenkins / GitHub Actions / GitLab CI
Terraform / IaC tools
Linux Administration
Python or Shell scripting
Experience with AI/ML deployment frameworks:
MLflow
Kubeflow
SageMaker
Vertex AI
Azure ML
Cloud experience in AWS, Azure, or GCP.
Experience with monitoring/logging tools:
Prometheus
Grafana
ELK Stack
Splunk
Knowledge of networking, security, and cloud architecture.
AI/ML Knowledge
Understanding of machine learning workflows and model lifecycle.
Experience deploying AI/ML models into production environments.
Familiarity with LLMOps / Generative AI deployment is a plus.
Exposure to vector databases, GPU workloads, and AI inferencing platforms preferred.
Preferred Qualifications
Experience in the Insurance or Financial Services domain.
Knowledge of Data Engineering pipelines and streaming platforms like Kafka.
Experience with GPU infrastructure and AI acceleration platforms.
Familiarity with Responsible AI and AI governance frameworks.
Relevant certifications in AWS/Azure/GCP or Kubernetes preferred.