Observability Engineer (Datadog SME)

📌 Senior Observability Engineer – Datadog SME (LATAM)

We are looking for a Senior Observability Engineer with deep expertise in Datadog to join our Digital Ops team. This role is focused on owning and evolving the observability strategy for a large-scale, cloud-native environment supporting 150+ production services across multiple regions.

As a Datadog Subject Matter Expert, you will be responsible for designing, operating, and continuously improving observability capabilities, enabling engineering teams to build reliable, performant, and cost-efficient systems. You will work closely with DevOps, SRE, and development teams in an agile environment, acting as a technical reference for observability best practices.

🗓 Start date: ASAP
📆 Contract type: Full-Time, Remote, Contractor
🌐 Work hours and location: 8:00 AM to 4:00 PM MST


🛠️ What You’ll Be Doing

  • Own and lead the observability architecture and strategy across cloud-native services running in multiple environments and regions.
  • Act as the Datadog Subject Matter Expert, owning configuration, governance, and best practices.
  • Design, implement, and maintain Datadog dashboards, monitors, alerts, SLOs, and service health views.
  • Operate and optimize Datadog APM, Logs, Metrics, Synthetic Monitoring, and RUM.
  • Drive alert quality improvements, signal-to-noise reduction, and proactive detection of operational issues.
  • Lead Datadog cost management and usage optimization initiatives in collaboration with engineering and finance stakeholders.
  • Partner with development teams to embed observability into the SDLC and production readiness processes.
  • Define and document runbooks, operational procedures, and observability standards.
  • Over time, join a shared on-call rotation: triage and resolve production incidents, act as incident commander when needed, and lead post-incident reviews.
  • Continuously identify opportunities for automation and toil reduction across observability and operational workflows.
  • Set, track, and report on operational excellence metrics including reliability, performance, availability, security, and cost.

✅ What You Need to Succeed

Must-haves

  • 3+ years of deep, hands-on experience with Datadog as an observability platform in production environments.
  • 5+ years of experience in DevOps, SRE, or Cloud Engineering roles supporting customer-facing systems.
  • Strong practical experience with Datadog APM, Logs, Metrics, dashboards, monitors, alerts, and SLOs.
  • Hands-on experience with Azure, Kubernetes, Terraform, Docker, and GitOps-based workflows.
  • Proven experience operating 24x7 production environments, including incident response, root cause analysis, and post-mortems.
  • Solid understanding of cloud-native architectures, distributed systems, and modern observability principles.
  • Ability to work independently in a fully remote, distributed team, with strong communication and collaboration skills.

Nice to have

  • Experience with ArgoCD, Azure DevOps CI/CD pipelines, and infrastructure automation.
  • Exposure to Databricks, SQL-based systems, or data-intensive platforms.
  • Hands-on experience building or extending custom DevOps/SRE tooling to reduce operational toil.
  • Relevant certifications (e.g. Datadog, Azure, Cloud Architecture, ITIL).


🧭 Our Recruitment Process

Here’s what to expect from our candidate-friendly interview process:

  1. Initial Interview – 60 minutes with our Talent Acquisition Specialist
  2. Culture Fit – 30 minutes with our Team Engagement Manager
  3. Technical Assessment – Online Challenge/Multiple Choice Questionnaire
  4. Final Stage – 60 minutes with the Hiring Manager

🌟 Why Join Launchpad?

We believe that great work starts with great people. At Launchpad, we offer:

  • People-first culture
  • Excellent compensation
  • Hardware setup for working from home
  • Agile methodologies
  • Diverse and multicultural work environment
  • Training allowances
    …and more!

✨ Ready to make your mark? Apply now and be part of something exciting.
