Senior Software Engineer - Compute

Who we are

Aurora’s mission is to deliver the benefits of self-driving technology safely, quickly, and broadly.


The Aurora Driver will create a new era in mobility and logistics, one that will bring a safer, more efficient, and more accessible future to everyone.

 

At Aurora, you will tackle massively complex problems alongside other passionate, intelligent individuals, growing as an expert while expanding your knowledge. For the latest news from Aurora, visit aurora.tech or follow us on LinkedIn.

 

Aurora hires talented people with diverse backgrounds who are ready to help build a transportation ecosystem that will make our roads safer, get crucial goods where they need to go, and make mobility more efficient and accessible for all. 

Developing the Aurora Driver software for self-driving trucks requires a massive, continuous computational effort. Every day, Aurora's engineers initiate and manage millions of compute tasks, systematically processing and analyzing petabytes of critical data. This workload spans several essential domains, including raw and derived data processing pipelines, large-scale simulations that test and validate the software in countless scenarios, and the intensive machine learning training jobs that are the core of our autonomous system's intelligence.

At the heart of orchestrating this compute infrastructure is the Compute team. This team is dedicated to building and maintaining the foundational technology that solves the fundamental challenges of resource scheduling, task isolation, and distributed state consistency across our massive batch compute fleet. At our scale, traditional off-the-shelf orchestrators break. The Compute team builds BatchAPI, the custom engine that manages the lifecycle of millions of tasks. BatchAPI is built on top of Kubernetes (K8s) primitives but implements its own custom scheduler.

We deal with the 'unsolved' problems of distributed computing: maximizing hardware utilization while ensuring that a failure in one node doesn't cascade across the entire cluster. This engine is engineered to handle massive scale, ensuring reliability, efficiency, and rapid turnaround for our engineers.

Furthermore, the Compute team empowers engineers across the company to effectively harness this compute power. The team develops and maintains the Batch Workflows Python SDK, a framework that provides an intuitive, high-level interface for programmatically defining, constructing, deploying, monitoring, and managing complex computational workloads. The SDK abstracts away the complexities of the underlying infrastructure, enabling engineers to focus purely on the logic and goals of their data processing, simulation, or training tasks, thus accelerating the entire development cycle for the Aurora Driver.

In this role you will

  • Design, implement, and maintain core components of the high-performance, large-scale distributed batch compute engine (BatchAPI). Architect and optimize the scheduler, resource allocator, and execution engine of BatchAPI to handle bursty, heterogeneous workloads with minimal overhead.
  • Design low-latency APIs and resilient communication protocols that bridge our Python SDK with the Golang-based core engine.
  • Develop high-level workflow abstractions, enabling engineers across the company to programmatically define, deploy, and manage complex data processing, simulation, and ML training pipelines. 
  • Solve complex problems in distributed locking, throttling, and fair-share scheduling to ensure multi-tenant stability.
  • Drive continuous improvements in the performance, scalability, and resilience of the entire compute infrastructure, implementing robust monitoring and alerting systems to maintain operational excellence for critical workflows.
  • Collaborate closely with infrastructure and product engineering teams (e.g., Autonomy, Data, Simulation, Machine Learning) to gather requirements, provide expert consultation, and integrate compute workflows with key company systems.

Required qualifications

  • 5+ years of professional software engineering experience.
  • Deep expertise in Golang (for core systems) and Python (for SDK/API layering).
  • Strong understanding of distributed systems fundamentals (e.g., CAP theorem, consensus algorithms, or gossip protocols).
  • Experience with performance profiling and tuning (e.g., memory management, I/O bottlenecks, or network latency optimization).
  • Specialized knowledge of container orchestration systems like Kubernetes.
  • Proven track record of driving continuous performance, scalability, and resilience improvements in production environments managing critical data.
  • Familiarity with cloud provider compute and data services (e.g., AWS EKS, S3, RDS).

Desirable qualifications

  • Experience working with computational workloads specific to the autonomous vehicle, robotics, or large-scale machine learning domains (e.g., data processing for perception, simulation, or model training).
  • Demonstrated ability in creating and refining user-facing tools, including adeptness at incorporating user feedback, managing expectations, and effectively prioritizing development based on user needs.
  • Web UI development experience (TypeScript, React).

The base salary range for this position is $162,000 - $260,000 per year. Aurora’s pay ranges are determined by role, level, and location. Within the range, the successful candidate’s starting base pay will be determined based on factors including job-related skills, experience, qualifications, relevant education or training, and market conditions. These ranges may be modified in the future. The successful candidate will also be eligible for an annual bonus, equity compensation, and benefits.

Working at Aurora

At Aurora, we bring together extraordinarily talented and experienced people united by the strength of our values. We operate with integrity, set outrageous goals, and build a culture where we win together — all without any jerks.

We believe in-person work increases collaboration, empathy and our ability to lead effectively. As a result, we operate in a hybrid work environment where Aurorans are in office at least 3 days per week.

Our Careers page provides insight into what it is like to work at Aurora, and you can find all the latest updates in our Newsroom.

Our commitment to safety

At the core of everything we do is our commitment to safety. Building best-in-class self-driving technology will take time, and we believe that each employee at Aurora has a role in contributing to safety, every step of the way. Aurora expects commitment to our safety policies from every employee, and seeks candidates who take active responsibility, contribute to building an atmosphere of trust, and invest in the organization’s long-term success by prioritizing working safely, no matter what.

Our commitment to inclusion

Aurora considers candidates without regard to their race, color, religion, national origin, age, sex, gender, gender identity, gender expression, sexual orientation, marital status, pregnancy status, parent or caregiver status, ancestry, political affiliation, veteran and/or military status, physical or mental disability, or any other status protected by federal or state law. Aurora considers qualified applicants with criminal histories, consistent with applicable federal, state, and local law. We are also committed to providing reasonable accommodations for qualified individuals with disabilities and disabled veterans in our job application procedures. If you need assistance or an accommodation due to a disability, you may contact us at [email protected]

For California applicants, information collected and processed as part of your application and any job applications you choose to submit is subject to Aurora’s California Employment Privacy Policy.
