Senior MLOps Engineer

Teleo is a robotics startup disrupting a trillion-dollar industry. Teleo converts heavy construction equipment, such as loaders, dozers, excavators, and trucks, into autonomous robots. This technology allows a single operator to efficiently control multiple machines simultaneously, delivering substantial benefits to our customers while significantly enhancing operator safety and comfort.

Teleo was founded by Vinay Shet and Rom Clément, experienced technology executives who led the development of Lyft’s Self Driving Car and Google Street View. Teleo is backed by Y Combinator, Up Partners, F-Prime Capital, and a host of industry luminaries. Teleo’s product is already deployed on several continents and generating revenue.

Teleo is poised for rapid growth. This presents a unique opportunity to be part of a team that is creating a product with a profound impact on our customers, working on cutting-edge 100,000-pound autonomous robots, engineering intricate systems at the intersection of hardware, software, and AI, and joining the early stages of an exciting startup journey.

About the Role
Own the reliability, scalability, and velocity of model training and deployment for autonomy systems. Turn experimental models into dependable production services.

Core Responsibilities
  • Design and operate end-to-end ML infrastructure: training, evaluation, deployment, monitoring
  • Build CI/CD for ML (model versioning, promotion, rollback, canarying)
  • Own model observability: drift detection, performance regression, data health
  • Optimize GPU utilization across training and inference (on-prem + cloud)
  • Support edge deployment (Jetson / Orin / x86 + GPU)
  • Work closely with perception and autonomy teams to reduce friction from research to production

Required Qualifications
  • 2+ years in MLOps / Infra / ML Platform
  • Deep experience with PyTorch, CUDA-aware workflows
  • Strong Linux + systems fundamentals
  • Proven experience deploying models at scale (not just notebooks)

Preferred Qualifications
  • Training orchestration: Ray, Slurm, Kubernetes, Airflow
  • Model lifecycle: Weights & Biases, MLflow, custom registries
  • Containers: Docker, multi-arch builds
  • Inference optimization: TensorRT, ONNX, Triton
  • Monitoring: metrics, logs, alerts for ML systems

Bonus Points
  • Experience with autonomy or robotics
  • Edge deployment constraints (latency, power, thermal)
  • Data versioning tools (DVC, LakeFS)

Teleo is an equal opportunity employer and we value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status. All qualified people are encouraged to apply.

Salary
$200,000 – $250,000 per year