Staff ML Engineer, Autonomy Vision Foundation Models

TLDR

Lead technical efforts to develop large-scale perception systems and vision foundation models, working alongside a diverse team of ML researchers and engineers in autonomous driving.

Woven by Toyota is enabling Toyota’s once-in-a-century transformation into a mobility company. Inspired by a legacy of innovating for the benefit of others, our mission is to challenge the current state of mobility through human-centric innovation — expanding what “mobility” means and how it serves society. Our work centers on four pillars: AD/ADAS, our autonomous driving and advanced driver assist technologies; Arene, our software development platform for software-defined vehicles; Woven City, a test course for mobility; and Cloud & AI, the digital infrastructure powering our collaborative foundation. Business-critical functions empower these teams to execute, and together, we’re working toward one bold goal: a world with zero accidents and enhanced well-being for all.

THE TEAM
At Woven by Toyota, we work on some of the most challenging problems in autonomy — from 3D geometric computer vision and perception to prediction, motion planning, and safe deployment of ML systems in real vehicles. Our AD/ADAS organization develops production-grade autonomy and active safety technologies that operate in complex, uncertain real-world environments and ship at global scale. You’ll collaborate daily with ML researchers, software engineers, robotics engineers, and hardware teams to design, build, and deploy perception and world-understanding systems that directly influence how vehicles see, reason about, and interact with the world.

WHO ARE WE LOOKING FOR?
We are seeking a Senior / Staff Machine Learning Engineer to help lead the development and deployment of vision foundation models and large-scale perception systems for our autonomy stack. This is a high-impact individual contributor role for an engineer who combines strong ML modeling skills with system-level thinking and a pragmatic approach to scale.
You will own critical pieces of the perception ML lifecycle, from data and training paradigms to validation and deployment, and help evolve our infrastructure and processes as our ambitions and scale grow. You’ll be expected to make thoughtful technical trade-offs, design for efficiency, and drive progress under real-world infrastructure, safety, and production constraints.

RESPONSIBILITIES
  • Lead the technical roadmap and execution for perception foundation models, vision‑language architectures, and world‑understanding components used across ADAS and autonomy applications.

  • Design and train large vision and multi‑modal models, including pre‑training, fine‑tuning, distillation, and adaptation for autonomy‑specific use cases.

  • Architect and implement scalable and efficient ML training pipelines, making pragmatic choices around distributed training, data sharding, and resource utilization as infrastructure evolves.

  • Discover and leverage heterogeneous in‑house data at Toyota scale to improve model robustness, long‑tail performance, and safety.

  • Build and maintain data‑driven evaluation and validation strategies, including scenario‑based testing and coverage‑driven metrics for perception systems.

  • Translate cutting‑edge research into production‑ready solutions that meet latency, robustness, and safety requirements.

  • Collaborate closely with planning, controls, data, simulation, and vehicle teams to deliver end‑to‑end autonomy capabilities.

  • Mentor and influence other engineers through technical leadership, code reviews, and design discussions, leading by example as a senior IC.

  • Contribute to a culture of high‑quality software engineering, including testing, CI, simulation, and in‑vehicle validation.

MINIMUM QUALIFICATIONS
  • M.S. or Ph.D. in Computer Science, Electrical Engineering, Robotics, or a related field — or equivalent industry experience.
  • 8+ years of experience building ML systems using frameworks such as PyTorch, JAX, or similar.
  • Strong background in vision or multi-modal perception models, including modern architectures (e.g., ViTs, VLMs, or related approaches).
  • Experience designing and training large‑scale ML models, including fine‑tuning and adapting pretrained models for real‑world applications.
  • Practical experience with distributed or scalable training workflows, such as data parallelism, model sharding, or similar techniques, and an understanding of how these evolve with scale.
  • Solid understanding of data pipelines, including data curation, sampling, sharding, and preprocessing for large datasets.
  • Strong programming skills in Python and working proficiency in C++.
  • Experience delivering production‑quality ML systems, including testing, monitoring, and performance optimization.
  • Ability to communicate technical concepts clearly and collaborate effectively across disciplines.

NICE TO HAVES
  • Hands‑on experience building perception stacks for autonomous or ADAS systems.
  • Exposure to foundation models, world models, or large vision‑language architectures.
  • Experience with multi‑modal learning, sensor fusion, or temporal modeling.
  • Familiarity with distributed training at larger scales (e.g., multi-node training, advanced parallelism, or performance debugging).
  • Publications or contributions to major conferences (CVPR, ICCV, ECCV, ICRA, NeurIPS, etc.).
  • Experience with runtime optimization, CUDA, or performance tuning on Linux‑based systems.
  • Experience deploying ML models to edge or automotive‑grade compute platforms.

WHY JOIN WOVEN
At Woven, you’ll have unusual ownership and influence for a senior IC:
  • Work on ML systems that directly impact real vehicles and real users, not just benchmarks.
  • Shape how foundation models are applied in safety‑critical autonomy systems.
  • Collaborate with world‑class engineers across software, hardware, and vehicle platforms.
  • Help define how our perception and ML infrastructure evolves as scale and ambition grow.
  • Build impactful ML systems under real‑world constraints.

The base pay for this position ranges from $161,000 to $264,500 per year.

Your base salary is one part of your total compensation. We offer a base salary, short-term and long-term incentives, and a comprehensive benefits package. The total compensation offered to an employee will depend on the individual's skills, experience, qualifications, location, and level.

WHAT WE OFFER
We are committed to creating a modern work environment that supports our employees and their loved ones. We offer a range of programs to allow you to do your most meaningful work and to help you shape the future of mobility.
・Excellent health, wellness, dental, and vision coverage
・A rewarding 401k program
・Flexible vacation policy
・Family planning and care benefits

OUR COMMITMENT
・We are an equal opportunity employer and value diversity.
・Any information we receive from you will be used only in the hiring and onboarding process. Please see our privacy notice for more details.

