Site Reliability Engineer (Dovecot) (m/f/d)

TL;DR

Contribute to the development of Kubernetes-native Dovecot infrastructure while applying SRE principles to ensure reliability and scalability for millions of users worldwide.

Our mission:

Are you a motivated Site Reliability Engineer who wants to make an impact by building and operating reliable, scalable, and Kubernetes-native infrastructure in production?
At Open-Xchange, we develop open-source communication and collaboration software used by public sector organizations, telcos, and hosting providers to keep data secure and under their control. Our products cover email, collaboration, DNS security, and related services, and are used by millions of people worldwide.

Discover what Open-Xchange offers you:

  • We work in a non-hierarchical organization that empowers everyone to take responsibility, stay focused, and contribute directly to the company’s success and customer value.
  • We value expertise and continuous professional development. Our professionals work closely with experienced colleagues, take ownership of their work, and continuously expand their skills.
  • Openness and diversity are core to OX. Different perspectives, backgrounds, and ways of thinking strengthen our collaboration and help us see the bigger picture.

Your team:

You'll be part of our cross-functional Dovecot Cloud team, whose primary focus is developing and delivering Dovecot Pro software for cloud-based deployments, with a strong SRE and Kubernetes-native approach at the core of how we operate. The team is also the main contributor to several Dovecot components, such as object storage, the cluster controller, and the Kubernetes deliverables. We are a team of 8 whose experience spans C and Python coding, operations, product delivery, and product management.

Your new job:

  • Design, development, and maintenance of Dovecot infrastructure using Ansible, Terraform, and Helm, with a key contribution to the transition toward a fully Kubernetes-native infrastructure.
  • Application of SRE principles to ensure reliability, scalability, and performance of Dovecot in production, including definition of SLOs, reduction of operational toil, and development of self-healing systems.
  • Improvement of CI/CD pipelines and streamlining of Kubernetes-native deployments, including Helm chart development and lifecycle management.
  • Monitoring, troubleshooting, and optimization of system performance, including regular maintenance, updates, and incident response for production environments.
  • Close collaboration with development teams to enhance software reliability and system design, combined with continuous knowledge sharing and staying up to date with cloud-native technologies.
  • Participation in an on-call rotation to support production systems.

This is the tech stack you'll be working with:

  • System: Linux / Docker / Kubernetes
  • Deployment: Helm / Ansible / Terraform / Kubespray
  • CI: GitLab CI
  • Programming: Bash or Python
  • Specific to our product: Dovecot, Cassandra, Redis, Prometheus, Object Storage

Your background:

  • While experience is our primary consideration, a strong willingness to learn and high motivation can make a candidate stand out.
  • Strong hands-on experience with Kubernetes in production environments, including bare-metal or self-managed cluster operations, workload design, and Helm chart development.
  • Comfort working with and administering Linux systems, including networking concepts such as routing, firewalling, DNS, and network performance troubleshooting.
  • Proficiency in scripting to automate daily tasks.
  • Solid experience writing Infrastructure as Code (Terraform, Ansible) and managing Kubernetes resources in a production environment.
  • Strong verbal and written communication skills in English.
  • Intrinsic motivation to take ownership and a passion for open-source technologies, thriving in a low-hierarchy environment.
  • Comfortable working in a remote-first environment with multicultural teams across different time zones.

This is what you get at Open-Xchange:

  • The flexibility to work 100% remotely ensures a work environment that suits you best.
  • Flexible working hours that allow you to successfully combine your home and family responsibilities with work. 
  • Getting together in-person for workshops and fun team events. 
  • Time off to volunteer – and mental health support when you need it.
  • We provide financial relief through corporate benefits and a subsidy for ergonomic chairs and desks.
  • We can discuss further location-related benefits together in an initial talk.

Join the team:

Join us in our fight for an open internet and deliver added value! Click “Apply now” to submit your application.

Your contact person:

Reach out to Justin ([email protected]) from the People Team. He will be able to discuss current opportunities and tell you more about our exciting vision and mission @OX.
