Drive the evolution of a fitness industry platform by establishing core SRE practices, supporting AI deployments, and enhancing system reliability for over 60 countries.
The company and our mission:
Zartis is a digital solutions provider working across technology strategy, software engineering and product development.
We partner with firms across financial services, MedTech, media, logistics technology, renewable energy, EdTech, e-commerce, and more. Our engineering hubs in EMEA and LATAM are full of talented professionals delivering business success and digital improvement across application development, software architecture, CI/CD, business intelligence, QA automation, and new technology integrations.
We are looking for a Senior SRE with AI/ML platform experience to work on a project with a leading provider of software for the fitness industry.
The project:
Our teammates are talented people that come from a variety of backgrounds. We’re committed to building an inclusive culture based on trust and innovation.
You will be part of a distributed team focused on revolutionising the fitness industry with digital solutions in over 60 countries. The team is building a platform that saves time, increases retention, and ultimately, helps studio and gym owners become more successful.
As a Senior SRE, you will take ownership of core infrastructure responsibilities from day one. We are looking for someone with strong foundational expertise who can operate independently and drive reliability improvements.
What you will do:
- Take end-to-end ownership of the company’s core infrastructure, establishing SRE foundations across the organization, including incident response, uptime practices, and operational maturity.
- Design, build, and maintain AWS infrastructure using best-practice architecture and service selection.
- Drive reliability improvements through strong observability practices (metrics, logs, tracing, alerting).
- Lead capacity planning efforts, asking the right scaling questions and proactively preventing bottlenecks.
- Implement and enforce Infrastructure as Code standards, ensuring all infrastructure is fully managed through Terraform.
- Support deployment and operations of AI-enabled platform components (e.g., proxies, vector stores, inference services).
- Collaborate with designers, developers, product managers, and testers to transform ideas into unique, human experiences for fitness entrepreneurs.
What you will bring:
- More than 8 years of experience in Site Reliability Engineering or Infrastructure-focused DevOps roles.
- Deep understanding of distributed systems and how to operate them reliably in production.
- Expertise across core AWS services (EC2, Load Balancers, ECS, VPC, IAM, S3, Secrets Manager, and related foundational services).
- Strong experience with Infrastructure as Code with Terraform.
- Hands-on experience with AI/ML platform infrastructure (Bedrock, inference tooling, vector stores).
- Observability-first approach, with hands-on experience implementing monitoring and tracing systems.
- Strong production mindset: you think in terms of resilience, failure modes, and operational safety.
- Solid capacity planning skills and experience scaling platforms responsibly.
- Solid knowledge of SRE best practices and industry standards.
Nice to have:
- Familiarity with AI-focused AWS services (SageMaker, Managed Apache Airflow).
- Exposure to running open-source LLM inference workloads in production (e.g., Ollama, vLLM, llama.cpp).
- Desire to contribute to the wider community through collaboration, coaching, and mentoring of other technologists.
What we offer:
- 100% Remote.
- WFH allowance: A monthly payment as financial support for remote working.
- Training: You have time allocated during the week for tech training at Zartis. You can choose from a variety of options, such as online courses (for example, from Pluralsight and Educative.io), English classes, books, conferences, and events.
- Mentoring Program: You can become a mentor at Zartis, receive mentorship, or both.
- Zartis Wellbeing Hub (Kara Connect): A platform that provides sessions and webinars with a range of specialists, including mental health professionals, nutritionists, physiotherapists, and fitness coaches.
- Multicultural working environment: We organize tech events, webinars, parties, and online team-building games and contests.
Zartis provides bespoke software development teams, outsourcing, and consulting services to drive business success with a diverse team of over 250 engineers.