Principal Site Reliability Engineer

Lead infrastructure strategy for a cutting-edge AI-driven SaaS platform, directly influencing product stability and growth with cloud technologies and automation tools.

RESPONSIBILITIES

  • Define and lead infrastructure and reliability strategy across the platform
  • Design scalable, resilient systems in collaboration with engineering teams
  • Optimize build, testing, and deployment processes for speed and stability
  • Establish and uphold best practices for CI/CD, monitoring, and observability
  • Lead incident response and drive continuous improvement post‑incident
  • Automate workflows to reduce operational toil and risk
  • Mentor engineers and foster a culture of operational excellence
  • Make strategic build‑vs‑buy decisions balancing speed, quality, and sustainability

REQUIREMENTS

  • At least 8 years of experience in Site Reliability Engineering or DevOps roles, including 2+ years in a Principal or Lead position
  • Proven experience in infrastructure modernization and scaling initiatives for high‑growth environments
  • Strong proficiency in Python
  • Deep expertise in cloud platforms and container orchestration tools such as AWS ECS and EKS
  • Solid experience in CI/CD pipeline design and optimization using tools like GitHub Actions and Buildkite
  • Proficiency in infrastructure‑as‑code tools such as Terraform
  • Strong knowledge of monitoring, observability, and performance optimization practices
  • Upper-Intermediate level of spoken and written English

WOULD BE A PLUS

  • Experience with monorepos (Turborepo, pnpm)
  • Familiarity with modern TypeScript tools (swc, biome, oxc)
  • Knowledge of NestJS, Next.js, and testing frameworks (Jest, Vitest)

PERSONAL PROFILE

  • Excellent leadership, communication, and decision‑making abilities
  • Ability to work independently and make pragmatic build‑vs‑buy decisions in fast‑paced environments

Build a stunning career with Sigma Software! Find your dream job, send your CV, and become one of us!
