(Senior) Software Engineer, Infrastructure (Kubernetes Platform)

Founded in 2016 in Silicon Valley, Pony.ai has quickly become a global leader in autonomous mobility and a pioneer in extending autonomous mobility technologies and services across a rapidly expanding footprint of sites around the world. Operating Robotaxi, Robotruck and Personally Owned Vehicles (POV) business units, Pony.ai is an industry leader in the commercialization of autonomous driving and is committed to developing the safest autonomous driving capabilities on a global scale. Pony.ai’s leading position has been recognized, with CNBC ranking Pony.ai #10 on its CNBC Disruptor list of the 50 most innovative and disruptive tech companies of 2022. In June 2023, Pony.ai was recognized on the XPRIZE and Bessemer Venture Partners inaugural “XB100” 2023 list of the world’s top 100 private deep tech companies, ranking #12 globally. As of August 2023, Pony.ai has accumulated nearly 21 million miles of autonomous driving globally. Pony.ai went public on NASDAQ in November 2024.

Responsibilities

As a (Senior) Kubernetes Engineer, you will:

  • Design, operate, and optimize Kubernetes clusters across hybrid cloud environments (public cloud and on-prem datacenter).
  • Support diverse workloads including large-scale model training and low-latency inference services.
  • Develop, maintain, and extend Kubernetes platform features (operators, CRDs, APIs) to automate and productize internal use cases.
  • Own cluster lifecycle management including upgrades, patching, configuration, and governance.
  • Define and enforce best practices for service deployments, security policies, and operational guidelines.
  • Contribute to observability and SRE practices to ensure reliability at scale (SLOs, incident reviews, metrics-driven improvements).
  • Collaborate with storage, compute, and networking teams (CNI, ingress, service discovery) to enhance automation, availability, and performance.
  • Provide technical mentorship, documentation, and on-call support for cluster-related incidents.

Requirements

  • Bachelor’s degree in Computer Science, Engineering, or related field, or equivalent experience.
  • 3+ years of hands-on experience managing Kubernetes clusters in production (EKS/GKE/AKS and/or bare-metal).
  • Strong Linux systems background and distributed systems fundamentals (scheduling, reliability, scaling).
  • Proven experience with hybrid cloud environments (AWS, GCP, Azure, and on-prem).
  • Expertise in containerization (Docker) and Infrastructure-as-Code tools (Terraform, Helm, Ansible, or similar).
  • Experience developing and maintaining Kubernetes platform features (operators, CRDs, APIs).
  • Solid knowledge of Kubernetes networking (CNI, ingress, service discovery), storage, and compute integrations.
  • Strong understanding of security best practices (RBAC, network policies, secrets).
  • Effective communication skills and ability to work cross-functionally in a fast-paced environment.

Preferred Experience

  • Programming skills in Go and/or Python for operator development, platform automation, and tooling.
  • Experience with observability and SRE practices (Prometheus, Grafana, ELK, Datadog; SLOs, incident response, postmortems).
  • Familiarity with workloads common to AI/ML systems (training, inference).

Compensation and Benefits

Base Salary Range: $120,000 - $240,000 Annually

Compensation may vary outside of this range depending on many factors, including the candidate’s qualifications, skills, competencies, experience, and location. Base pay is one part of the Total Compensation and this role may be eligible for bonuses/incentives and restricted stock units.

We also provide the following benefits to eligible employees:

  • Health Care Plan (Medical, Dental & Vision)
  • Retirement Plan (Traditional and Roth 401k)
  • Life Insurance (Basic, Voluntary & AD&D)
  • Paid Time Off (Vacation & Public Holidays)
  • Family Leave (Maternity, Paternity)
  • Short Term & Long Term Disability
  • Free Food & Snacks

PONY.AI

Our mission is to revolutionize the future of transportation by building the safest and most reliable technology for autonomous vehicles. Armed with the latest breakthroughs in artificial intelligence, we aim to deliver our technology at a global scale. We believe our work has the potential to transform lives and industries for the better.

CULTURE

When it comes to our technology, quality and reliability are hallmark attributes; we don’t believe in taking shortcuts. Our emphasis on craftsmanship enables us to deliver an autonomous driving solution that is highly sophisticated and best-in-class.

When it comes to our people, teamwork, robust mentorship, and collaboration are several key pillars of our culture. We ensure every member of our team receives the support they need while tackling some of the biggest tech challenges that exist today. Here, our employees grow with the company. We truly believe that growing a successful company means growing a successful team.

A GLOBAL PERSPECTIVE

We are deeply passionate about reaching a global audience, starting with our two home countries: China and the United States. With offices and development teams in Silicon Valley, Beijing, and Guangzhou, we are well on our way towards achieving that goal.
