Site Reliability Engineer (SRE) - Application Support

AI overview

Solve complex production issues as part of the SRE team, improving application reliability while directly impacting customer satisfaction through hands-on troubleshooting.

About:

Step forward into the future of technology with ZILO™.

We’re here to redefine what’s possible in technology. While we’re trusted by the global Transfer Agency sector, our technology is truly flexible and designed to transform any business at scale. We’ve created a unified platform that adapts to diverse needs, offering the scalability and reliability legacy systems simply can’t match.

At ZILO™, our DNA is built on Character, Creativity, and Craftsmanship. We face every challenge with integrity, explore new ideas with a curious mind, and set a high standard in every detail.

We are a team of dedicated professionals where everyone, regardless of their role, drives our progress and creates real impact. If you’re ready to shape the future, let’s talk.

The Role:

We’re looking for a Site Reliability Engineer to join our SRE team — someone who thrives on solving complex production issues, understands how applications behave in the real world, and takes pride in keeping systems reliable and performant.

This is not a platform engineering role. You won’t just be spinning up Kubernetes clusters or building infrastructure — you’ll be deeply involved in understanding our applications, what they do and how they operate, troubleshooting real-world issues, and working directly on improvements that impact our customers every day.

What You’ll Do

  • Incident Response & Troubleshooting: Investigate and resolve incidents raised by clients, diving into logs, metrics, and application code to identify root causes.
  • Application Debugging: Work across our core stack — Java, Golang, and Python — to trace and fix issues affecting reliability or performance.
  • Data Fixes: Perform data investigation and fixes using Postgres.
  • Operational Excellence: Patch and maintain Kubernetes clusters and other production systems.
  • SRE Roadmap: Contribute to the continuous improvement of our observability, reliability, and automation initiatives.

This role is hybrid and will require regular weekly attendance at our London office.

Requirements

  • Solid experience with application debugging in at least one of: Java, Golang, or Python.
  • A good grasp of PostgreSQL — enough to run queries, analyse data, and perform safe fixes.
  • Familiarity with Kubernetes and modern cloud platforms (AWS, GCP, or Azure).
  • Understanding of incident management and observability tools (Grafana, Prometheus, etc.).
  • A mindset focused on reliability, quality, and ownership.

Benefits

  • Enhanced leave - 38 days inclusive of 8 UK Public Holidays  
  • Private Health Care including family cover  
  • Life Assurance – 5x salary  
  • Flexible working: work from home and/or in our London office
  • Employee Assistance Program  
  • Company Pension (Salary Sacrifice options available)
  • Access to training and development  
  • Buy and Sell holiday scheme 
  • The opportunity for “work from anywhere/global mobility”

ZILO™ is focused on transforming global transfer agency to create sustainable value for firms and the customers they serve. To achieve this, we started with a clean technology slate, a design-driven approach, and a commitment to put people first. ZILO's technology enables firms to replace legacy technology and end-of-life systems, many of which were developed 30+ years ago, and slash costs, risk, and user friction along the way.

Single global solution

This digital transformation journey requires strong partnerships with our customers to modernise and expand their product and service propositions by unifying the full breadth of transfer agency into a single global solution.

Our mission

Our founders, leadership, engineering, and product teams are highly experienced, with successful track records of pioneering innovation-driven businesses, products, and services. Our collective goal is to be the market-leading solution in global transfer agency. We'd also like to spread some joy along the way.

Our Team

Our team of technology, operations, digital, and product experts, with decades of combined experience at leading firms in all regions, has unified the full breadth of global transfer agency into a single global solution.
