Senior Site Reliability Engineer - OP01988

🟢 Are you in Brazil or Argentina? Join us as we actively recruit in these locations, offering a comfortable remote environment. Submit your CV in English, and we'll get back to you!

We invite a Senior Site Reliability Engineer to join our dynamic team. In this hands-on role, you’ll focus on improving the stability, observability, and efficiency of our services. You’ll lead initiatives to enhance monitoring, automation, and reliability practices while collaborating with engineering teams to ensure our systems run smoothly and remain resilient.

🟩 What's in it for you:

  • Join a top S&P 500 company shaping the future of global payments and financial technology
  • Lead initiatives to improve stability, observability, and efficiency of critical services
  • Collaborate with engineering teams to solve complex problems and drive operational excellence

✅ Is that you?

  • 5+ years in site reliability, observability, or platform engineering
  • Experience building SRE or observability practices from scratch
  • Hands-on OpenTelemetry experience (SDKs and Collector)
  • Strong experience with PromQL/SPL and at least one APM platform (Datadog, Splunk APM, Google Cloud APM)
  • Experience designing SLOs and alerting strategies (burn rate, multi-window)
  • Familiarity with MuleSoft or API gateway observability
  • Awareness of security best practices (PII redaction, access control)
  • Experience building automation scripts for CI/CD tasks
  • Experience with logging frameworks (Logback, Serilog) and structured JSON logging
  • Collaboration, communication, and independent problem-solving skills
  • Upper-Intermediate+ English level

🧩 Key responsibilities and your contribution

In this role, you’ll own and lead efforts to ensure the reliability, observability, and operational efficiency of our services.

  • Define and enforce logging, tracing, and metrics standards across services
  • Implement and maintain centralized telemetry pipelines and APM integrations
  • Build reusable instrumentation libraries for core languages (Java, .NET, Node.js, Python)
  • Establish dashboards and SLO/error budget alerts
  • Ensure log/trace correlation and schema consistency
  • Implement PII/secret redaction, retention, and cost optimization
  • Collaborate with development teams to onboard services and ensure observability readiness
  • Develop runbook templates, documentation, and training materials for engineering teams
  • Audit alerts, reduce noise, and maintain alert quality standards
  • Support incident response through tooling improvement and post-incident telemetry analysis

🎾 What's working at Dev.Pro like?

Dev.Pro is a global company that's been building great software since 2011. Our team values fairness, high standards, openness, and inclusivity for everyone, no matter your background.
🌐 We are 99.9% remote — you can work from anywhere in the world
🌴 Get 30 paid days off per year to use however you like — vacations, holidays, or personal time
✔️ 5 paid sick days, up to 60 days of medical leave, and up to 6 paid days off per year for major family events like weddings, funerals, or the birth of a child
⚡️ Partially covered health insurance after the probation period, plus a wellness bonus for gym memberships, sports nutrition, and similar needs after 6 months
💵 We pay in U.S. dollars and cover all approved overtime
📓 Join English lessons and Dev.Pro University programs, and take part in fun online activities and team-building events

Our next steps:

✅ Submit a CV in English — ✅ Intro call with a Recruiter — ✅ Internal interview — ✅ Client interview — ✅ Offer

Interested? Find out more:

📋 How we work

💻 LinkedIn Page

📈 Our website

💻 Instagram Page

We are a US-based outsourcing software development company that has been delivering exceptional software experiences to our clients since 2011, helping technology companies become industry leaders. Over the past few years, we have been hiring specialists all over the world, while our main development centers were in Ukraine. Now we keep expanding and are growing our centers in other parts of the world. Dev.Pro is open to hiring specialists from other countries as well as Ukrainians who now live outside of Ukraine. We stand with Ukraine and keep supporting our people by offering a friendly remote environment while adhering to the values of democracy, human rights, and state sovereignty.

As a company of professionals, Dev.Pro offers challenging and interesting projects with world-leading clients, a modern technology stack, and career opportunities for both technical and non-technical specialists. We focus on what we value the most:

  • Personal and professional development: get access to training and attend English classes with native speakers
  • Openness and support: you can count on setup support and equipment
  • A culture of growth: discover opportunities for yourself with the help of our Career Development Department, including a personal career plan and personality analysis
