At IMO Health, we combine strengths in software development, artificial intelligence, and clinical expertise to create AI-driven solutions that enhance access to reliable health information, support clinical decision-making, and improve patient outcomes.

We are seeking a Staff AI / MLOps Engineer to join our Software Engineering organization, owning the end-to-end machine learning lifecycle for production AI systems. This role is responsible for designing, building, deploying, operating, and evolving AI-powered systems that are scalable, reliable, observable, and maintainable in real-world clinical environments.

This is a technical leadership role focused on operational excellence and architectural rigor. The ideal candidate is a hands-on engineer with deep experience across software engineering, MLOps, DevOps, cloud infrastructure, and data systems, capable of owning ML systems from initial design through long-term production operation, monitoring, retraining, and retirement. You will partner closely with data scientists, product teams, and platform engineers to ensure AI models successfully transition from research to durable, production-grade systems.

WHAT YOU’LL DO:

Own the full ML lifecycle, including data ingestion, training, validation, deployment, monitoring, retraining, and retirement.

Transition AI/ML prototypes into scalable, production-ready systems with CI/CD pipelines, automation, and observability.

Lead system design and architecture discussions, providing guidance on ML systems, MLOps, and AI infrastructure.

Develop and maintain AI-driven applications and inference services, optimizing for performance, scalability, reliability, and cost.

Integrate LLMs, generative AI, and NLP solutions into IMO Health products, focusing on unstructured clinical data.

Implement monitoring, alerting, logging, and dashboards to ensure model quality, detect drift, and maintain operational SLAs.

Build, maintain, and optimize CI/CD pipelines, automation scripts, and Infrastructure-as-Code for production ML systems.

Apply containerization (Docker, Kubernetes) and cloud infrastructure best practices to manage production environments.

Mentor and guide engineers, enforce technical standards, and drive the reduction of technical debt.

Conduct root cause analysis of production defects and implement durable fixes.

Advocate for non-functional requirements (availability, scalability, reliability, maintainability) and design systems accordingly.

Collaborate cross-functionally with Product, Data Science, Architecture, and Engineering teams to align AI solutions with business goals.

WHAT YOU’LL NEED:

8+ years of professional experience in software engineering, AI/ML engineering, or related roles, building and operating production-grade systems.

Bachelor’s or Master’s degree in Computer Science, Engineering, or a related technical field (or equivalent experience).

Strong foundation in computer science fundamentals (data structures, algorithms, design patterns, operating systems, networking).

Expert-level coding skills in Python or Java, with a strong emphasis on production-quality software engineering practices.

Hands-on experience owning ML systems in production, including deployment, monitoring, retraining, and optimization.

Experience designing and operating CI/CD pipelines, automation, and observability for ML systems.

Deep experience with cloud platforms (AWS or Azure), containerization, and Infrastructure-as-Code.

Experience with MLOps tools and workflows (e.g., MLflow, SageMaker, Kubeflow).

Experience integrating and deploying LLMs, generative AI, and agentic systems in production environments.

Working knowledge of NLP concepts (tokenization, embeddings, classification, sequence modeling); healthcare exposure is a plus.

Experience with Elasticsearch and vector databases for embedding-based search and retrieval.

Proven ability to translate business needs into scalable, reliable technical solutions, balancing technical debt and delivery velocity.

Strong system design skills for high-performance, distributed, and scalable systems.

Excellent communication and collaboration skills across cross-functional, distributed teams.

Self-starter who can operate autonomously and own complex systems end to end.

NICE TO HAVE:

Experience with clinical or healthcare AI applications.

Familiarity with Hugging Face, PyTorch, TensorFlow, or other modern ML frameworks.

AWS Associate-level certification (Machine Learning Engineer or Solutions Architect).

Staff AI / MLOps Engineer - Clinical AI
