Arista Networks is hiring a

Site Reliability Engineer - Remote from Romania or Hungary

Norway, United States
Full-Time
Remote

Who You'll Work With

Arista Networks is looking for Site Reliability Engineers to play an active, high-impact role in the early rollout of both internal and customer-facing services. You will make key architecture decisions and design and implement best practices that advance the Software Defined Networking revolution in the cloud. The Site Reliability Engineering (SRE) role combines software and systems engineering to build and run high-performance, massively distributed, robust systems. The role is key to optimizing our system capacity and performance at all times.

SRE roles at Arista are generally in one of two areas:

  • Internal Tools: Designing and operating our internal systems, including CI/CD pipelines, source repos, and other internal tools.
  • External SaaS: An active role with a high impact on a cloud-based public SaaS across all Arista teams.

Both roles have the freedom to push the envelope on quality and availability while designing, choosing, and building their own best practices and tools to make that happen.

What You'll Do

  • Engage in and improve the whole lifecycle of services, from inception and design through deployment, operation, and refinement.
  • Support services before they go live through activities such as system design consulting, developing software platforms and frameworks, capacity planning, and launch reviews.
  • Maintain services once they are live by measuring and monitoring availability, latency, and overall system health.
  • Scale systems sustainably through mechanisms like automation; evolve systems by pushing for changes that improve reliability and velocity.
  • Practice sustainable incident response and blameless postmortems.

Qualifications

  • Bachelor's degree in Computer Science, a related technical field involving software/systems engineering, or equivalent practical experience.
  • Experience programming in the following languages: Go and Python.
  • Experience operating a cloud-based SaaS.
  • Expertise in designing, analyzing, and troubleshooting large-scale distributed systems.
  • Experience with Jenkins, Docker, and Kubernetes (K8s).
  • Ability to debug, optimize code, and automate routine tasks.
  • Understanding of Unix/Linux operating systems.