DevOps / Infrastructure Engineer

United Kingdom

Full-Time

Remote

TL;DR

Take ownership of operating and maintaining a robust alerting and logging infrastructure, while collaborating closely with development teams to improve system reliability.

About the Role

We're looking for a hands-on DevOps / Infrastructure Engineer who lives and breathes observability. This isn't a role where you'll be drawing architecture diagrams from the sidelines—you'll be deep in the trenches, building and operating the systems that keep our platform running smoothly. If you get a kick out of hunting down a tricky performance issue at 2am or feel genuine satisfaction when a well-crafted dashboard lights up with meaningful metrics, we want to talk to you.

You'll own our monitoring, logging, and tracing stack end-to-end—from instrumenting applications to building alerting strategies that actually work. You're someone who believes that if it's not observable, it's not in production.

What You'll Do

Design, build, and maintain our observability platform—metrics, logs, traces, and everything in between
Get hands-on with infrastructure: deploy services, troubleshoot incidents, and fix things when they break (because they will)
Instrument applications and services to capture meaningful telemetry data that drives real insights
Build dashboards and alerting systems that teams actually use—not just noise generators
Dive into production issues, correlate data across systems, and lead root cause analysis
Champion observability best practices across engineering teams and help developers instrument their own code
Automate everything you can: infrastructure provisioning, deployment pipelines, and operational runbooks
Work closely with SRE and development teams to improve system reliability and performance
Evaluate and integrate new observability tools and technologies as the landscape evolves

What We're Looking For

3+ years of experience in DevOps, Infrastructure, or SRE roles—with real production battle scars
Deep hands-on experience with observability tools: Prometheus, Grafana, Datadog, New Relic, Splunk, ELK stack, Jaeger, or similar
Strong proficiency with cloud platforms (AWS, GCP, or Azure) and infrastructure-as-code (Terraform, Pulumi, CloudFormation)
Solid scripting and automation skills (Python, Bash, Go, or similar)
Experience with containerisation and orchestration (Docker, Kubernetes)
Understanding of distributed systems, microservices architectures, and the unique observability challenges they present
Familiarity with CI/CD pipelines and GitOps workflows
Excellent troubleshooting skills—you're the person who doesn't give up until you've found the root cause

Nice to Have

Experience with OpenTelemetry and vendor-agnostic instrumentation strategies
Background in building custom exporters, collectors, or integrations
Familiarity with chaos engineering and resilience testing practices
Experience with FinOps and cloud cost optimisation
Contributions to open-source observability projects

The Kind of Person You Are

You're not afraid to roll up your sleeves and get stuck in—no task is beneath you
You thrive in fast-paced environments and stay calm when things go sideways
You take ownership and see problems through to resolution
You're curious by nature and constantly looking for ways to improve systems
You communicate clearly and can explain complex technical concepts to different audiences
You're pragmatic—you know when to build the perfect solution and when "good enough" ships

What We Offer

Competitive salary and equity package
Flexible working arrangements
Learning and development budget
Modern tech stack and the autonomy to make a real impact
A team that values doing things properly over just doing things quickly

If this sounds like you, we'd love to hear from you. Send us your CV and tell us about a time you tracked down a gnarly production issue—bonus points if it involved creative use of observability data.

Strive Gaming

Strive Gaming develops a robust iGaming platform designed for scalability, security, and high performance. Our focus is on providing reliable backend services that enhance the online gaming experience for our users. We're here to support gaming operators with powerful technology that drives engagement and growth.