DevOps / Infrastructure Engineer

About the Role

We're looking for a hands-on DevOps / Infrastructure Engineer who lives and breathes observability. This isn't a role where you'll be drawing architecture diagrams from the sidelines; you'll be deep in the trenches, building and operating the systems that keep our platform running smoothly. If you get a kick out of hunting down a tricky performance issue at 2 a.m. or feel genuine satisfaction when a well-crafted dashboard lights up with meaningful metrics, we want to talk to you.

You'll own our monitoring, logging, and tracing stack end-to-end—from instrumenting applications to building alerting strategies that actually work. You're someone who believes that if it's not observable, it's not in production.

What You'll Do

  • Design, build, and maintain our observability platform—metrics, logs, traces, and everything in between
  • Get hands-on with infrastructure: deploy services, troubleshoot incidents, and fix things when they break (because they will)
  • Instrument applications and services to capture meaningful telemetry data that drives real insights
  • Build dashboards and alerting systems that teams actually use—not just noise generators
  • Dive into production issues, correlate data across systems, and lead root cause analysis
  • Champion observability best practices across engineering teams and help developers instrument their own code
  • Automate everything you can: infrastructure provisioning, deployment pipelines, and operational runbooks
  • Work closely with SRE and development teams to improve system reliability and performance
  • Evaluate and integrate new observability tools and technologies as the landscape evolves

What We're Looking For

  • 3+ years of experience in DevOps, Infrastructure, or SRE roles—with real production battle scars
  • Deep hands-on experience with observability tools: Prometheus, Grafana, Datadog, New Relic, Splunk, ELK stack, Jaeger, or similar
  • Strong proficiency with cloud platforms (AWS, GCP, or Azure) and infrastructure-as-code (Terraform, Pulumi, CloudFormation)
  • Solid scripting and automation skills (Python, Bash, Go, or similar)
  • Experience with containerisation and orchestration (Docker, Kubernetes)
  • Understanding of distributed systems, microservices architectures, and the unique observability challenges they present
  • Familiarity with CI/CD pipelines and GitOps workflows
  • Excellent troubleshooting skills—you're the person who doesn't give up until you've found the root cause

Nice to Have

  • Experience with OpenTelemetry and vendor-agnostic instrumentation strategies
  • Background in building custom exporters, collectors, or integrations
  • Familiarity with chaos engineering and resilience testing practices
  • Experience with FinOps and cloud cost optimisation
  • Contributions to open-source observability projects

The Kind of Person You Are

  • You're not afraid to roll up your sleeves and get stuck in—no task is beneath you
  • You thrive in fast-paced environments and stay calm when things go sideways
  • You take ownership and see problems through to resolution
  • You're curious by nature and constantly looking for ways to improve systems
  • You communicate clearly and can explain complex technical concepts to different audiences
  • You're pragmatic—you know when to build the perfect solution and when "good enough" ships

What We Offer

  • Competitive salary and equity package
  • Flexible working arrangements
  • Learning and development budget
  • Modern tech stack and the autonomy to make a real impact
  • A team that values doing things properly over just doing things quickly

If this sounds like you, we'd love to hear from you. Send us your CV and tell us about a time you tracked down a gnarly production issue—bonus points if it involved creative use of observability data.
