Senior Site Reliability Engineer

This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Senior Site Reliability Engineer in the United States.

This role offers a hands-on opportunity to ensure the reliability, security, and scalability of enterprise SaaS systems. You will lead the design, automation, and monitoring of cloud-native platforms while maintaining exceptional uptime and performance. The position requires a strong background in Kubernetes, cloud infrastructure, and security compliance, with a focus on forward-looking system architecture as well as operational support. You will collaborate closely across engineering, product, and operations teams to optimize services, resolve incidents, and implement automation that improves efficiency. This is a high-impact role for professionals passionate about building resilient, secure, and scalable platforms in a dynamic environment.

Accountabilities:

  • Lead design reviews and implement secure, high-availability SaaS systems targeting 99.99% uptime.
  • Design, automate, test, and monitor cloud-native technologies to support a service platform.
  • Spend the majority of your time on system design and building, while also supporting operational and maintenance activities.
  • Investigate and resolve customer and operational issues proactively.
  • Automate the measurement of operational SLAs and SLOs; document SOPs and runbooks.
  • Participate in on-call rotations and incident response.
  • Collaborate across teams to design and operationalize systems adhering to security standards such as FedRAMP, SOC2, and ISO.
  • Contribute to special projects and continuous improvement initiatives.

Requirements

  • 8–12 years of experience in site reliability engineering, cloud operations, or equivalent roles.
  • Proven experience managing complex Kubernetes environments in production.
  • Expertise with cloud automation tools such as CloudFormation, Terraform, and the AWS CLI or CDK.
  • Proficiency in scripting languages such as Python, Bash, or Perl.
  • Solid Linux administration skills and knowledge of networking and security fundamentals.
  • Demonstrated experience operating production SaaS environments under security standards (FedRAMP, SOC2, ISO, PCI).
  • Strong problem-solving skills and a solid grasp of algorithms and data structures.
  • Excellent collaboration and communication skills.
  • Ability to work under tight deadlines and participate in on-call rotations.
  • Bachelor’s or Master’s degree in Computer Science, Information Systems, or a related field.
  • Preferred: Experience building automation tools and frameworks, and contributing to cloud security compliance audits.

Benefits

  • Fully remote and flexible work environment.
  • Comprehensive medical, dental, and vision plans.
  • 401(k) plan with employer match.
  • Flexible Paid Time Off (FTO) and Volunteer Time Off (VTO).
  • 5-year service milestone sabbatical.
  • Paid parental leave.
  • Generous employee referral bonus program.
  • Pet insurance.
  • Virtual company-wide events and wellness programs.
  • Opportunities to learn and develop with industry-leading experts.

Jobgether Hiring Process:
Jobgether is a Talent Matching Platform that partners with companies worldwide to efficiently connect top talent with the right opportunities through AI-driven job matching.
When you apply, your profile goes through our AI-powered screening process designed to identify top talent efficiently and fairly.
🔍 Our AI evaluates your CV and LinkedIn profile thoroughly, analyzing your skills, experience, and achievements.
📊 It compares your profile to the job’s core requirements and past success factors to determine your match score.
🎯 Based on this analysis, we automatically shortlist the 3 candidates with the highest match to the role.
🧠 When necessary, our human team may perform an additional manual review to ensure no strong profile is missed.
The process is transparent, skills-based, and free of bias — focusing solely on your fit for the role.
Once the shortlist is completed, we share it directly with the company that owns the job opening. The final decision and next steps (such as interviews or additional assessments) are then made by their internal hiring team.

Thank you for your interest!
