Our mission at Tensorwave Cloud is to build seamless, secure, reliable, and resilient AI infrastructure at scale, eliminating barriers and challenging the status quo to empower builders and support AI innovation.

About the role

We are seeking a Site Reliability Engineer with a strong background in software engineering to build and maintain highly scalable, secure, and resilient infrastructure.

You’ll play a critical role in designing low-level systems, automating infrastructure with modern tooling, and ensuring platform reliability.

This role is ideal for someone who’s comfortable working at the intersection of systems programming and DevOps: writing code in Go, JavaScript, Rust, C, or Zig while also managing infrastructure with NixOS, Kubernetes, and Terraform.

Responsibilities

Design, build, and maintain infrastructure systems using Linux and NixOS
Manage infrastructure-as-code with Terraform to provision and scale resources
Architect and operate Kubernetes clusters with a focus on performance, security, and automation
Write high-performance tooling and internal utilities in Go or Rust
Develop and maintain CI/CD pipelines for infrastructure and code deployments
Monitor system performance, resolve issues, and improve reliability through observability tooling
Collaborate closely with engineering teams to support deployment strategies and development workflows

Required Experience

Bachelor of Science in Computer Science, Computer Engineering, or a related technical field, or equivalent practical experience
5+ years in DevOps, Site Reliability, or Infrastructure Engineering roles
Proficiency in one or more low-level languages (Rust or Go)
Deep experience with Linux systems and configuration management
Hands-on experience with Terraform, Kubernetes, and containerized environments
Strong understanding of systems programming, performance tuning, and operating system internals
Familiarity with CI/CD practices and infrastructure monitoring/alerting tools

What We Bring

Mission-driven company
Competitive Salary
Stock Options
100% paid Medical, Dental, and Vision insurance
Flexible PTO
Paid Holidays
401(k)
Parental Leave
Flexible Spending Account
Short Term Disability Insurance
Life and Voluntary Supplemental Insurance
Mental Health Benefits through Spring Health

We’re looking for resilient, adaptable people to join our team: people who believe in the mission and think at massive scale. The solutions that worked on a handful of devices will not work at exascale. Be prepared to be pushed daily, to learn a lot, and literally build the future.

Tensorwave is an equal opportunity employer, committed to fostering an inclusive and supportive workplace. All qualified applicants and candidates will receive consideration for employment without regard to race, color, religion, sex, disability, age, national origin, or veteran status.
