Site Reliability Engineer

The Wormhole Foundation
Our mission is to empower passionate people in the research and development of blockchain interoperability technologies. We support teams building secure, open-source, and decentralized products within the Wormhole ecosystem.

The Role: Site Reliability Engineer

The Wormhole Foundation is seeking an experienced Site Reliability Engineer (SRE) to improve the reliability, security, and operational excellence of Wormhole’s production infrastructure. This role focuses on uptime, observability, deployment workflows, and incident response across critical blockchain and networking services. The SRE will work closely with engineering, DevOps, and validator partners to ensure Wormhole services operate at a minimum of 99.99% uptime, excluding scheduled maintenance windows.

What you'll be doing:

  • Act as first responder and incident commander during production incidents
  • Lead incident triage, root cause analysis, and retrospective documentation
  • Build detailed incident timelines and preventative runbooks
  • Respond to incidents involving performance issues, CCQ failures or degraded throughput, observability pipeline outages, and core Wormhole products
  • Deliver remediation recommendations and implement approved fixes
  • Improve reliability and uptime across all Wormhole services
  • Strengthen observability, monitoring, and alerting systems
  • Harden infrastructure for security and operational resiliency
  • Enhance deployment workflows and reduce operational friction
  • Lead incident response, analysis, and continuous improvement
  • Support operational tooling used by engineering, DevOps, and validator partners

Who you are:

  • Relevant tertiary qualifications in computer science or a closely related field (bachelor's/master's) and/or at least five years of relevant work experience
  • Established experience as incident commander working with multiple stakeholders in a global team
  • Familiarity with metrics and log analysis tools (e.g., Grafana), incident response tools (e.g., PagerDuty), GitHub administration and related tools
  • Deep understanding of reliability engineering, observability, and incident response for distributed systems
  • Ability to write and debug code in any of the following: Go, Rust, Java
  • Strong experience operating Grafana, Datadog, or Splunk, and/or Kubernetes, in production environments
  • Experience securing distributed systems and public-facing infrastructure
  • Ability to operate independently, document clearly, and lead during incidents
  • Solid understanding of cloud computing environments (AWS and GCP preferred) and willingness to keep up to date with their changing offerings
  • Excellent and proactive written and verbal communication
  • The ideal candidate will be based in the ET or GMT time zone or be able to work those hours

If you don’t meet all of these criteria, we’d still love to hear from you if you think you’d be a great fit for this role!

The Wormhole Foundation is dedicated to supporting open-source, decentralized technologies that securely and seamlessly connect Web3. We are stewards of Wormhole, the world’s first generalized messaging protocol.

Why Work With Us:

  • Impactful work. Your efforts contribute to something meaningful. You're constantly challenged to think innovatively, problem-solve creatively, and develop new skills.
  • Huge opportunity. We keep a lean team. You’ll learn a lot professionally and personally. You get to grow as the Wormhole ecosystem grows.
  • An industry-shaping team. Our team is driven by the mission to empower passionate people in the research and development of blockchain interoperability technologies. Our global team are Web3 visionaries building the future.
  • Culture. We’re a global team working closely together to achieve high-impact goals. We are united in our belief that we will completely change the future of blockchain interoperability technologies. We take pride in our culture of innovation, collaboration, fun, and passion. We can’t wait to see what you add!
  • Compensation. You’ll receive a competitive salary and incentive package.

Want to join us? Don’t forget to learn more about Wormhole after you apply!
