At Sysdig, we believe cloud security isn't a compromise - it's a promise. From the start, our mission has been clear: to help organizations secure innovation in the cloud, the right way.
We created Falco, the open standard for cloud threat detection, and continue to lead the cloud security market with runtime insights, open innovation, and agentic AI. Creators of technology trusted by over 60% of the Fortune 500, Sysdig gives teams the real-time clarity to move fast and defend what matters most.
Culture matters here. We believe diversity fuels stronger ideas, and open dialogue drives sharper decisions. Recognized as a Best Place to Work and one of Deloitte's fastest-growing companies for the past 5 years, we're here to raise the standard for what cloud security and workplace culture should be.
If you have the passion to dig deeper, the desire to challenge convention, and the curiosity to build something better, Sysdig is the right place for you.
What you will do
Architectural Optimization: Design and implement Kubernetes scaling strategies (HPA, VPA, Karpenter) that align resource consumption with real-time demand.
The Reliability/Cost Trade-off: Act as the technical lead in determining where we can safely optimize (e.g., Spot instances for non-critical workloads) and where we must invest in over-provisioning to protect our SLOs.
Proactive Analysis: Regularly audit cloud environments to identify underutilized resources and ghost infrastructure, providing actionable data to leadership on potential savings.
Automation & Guardrails: Develop IaC modules and CI/CD policies that prevent "cost-drift" before it happens, ensuring developers have the resources they need without excess waste.
Cross-Functional Advocacy: Partner with Finance and Product teams to translate technical infrastructure metrics into business value and cost-per-feature insights.
What you will bring with you
3+ years of experience in DevOps, SRE, or Infrastructure Engineering roles.
Deep Kubernetes Expertise: Expert-level knowledge of K8s internals.
Cloud Fluency: Extensive experience managing large-scale environments in AWS, GCP, or Azure.
Infrastructure as Code (IaC): Mastery of Terraform to manage complex, multi-environment deployments.
System Design: Proven ability to explain the trade-offs between different compute types (e.g., On-Demand vs. Spot vs. Reserved) and their impact on system availability.
What we look for
Data-Driven Mindset: Strong proficiency in Prometheus, Grafana, and SQL to analyze performance metrics and correlate them with billing data.
When you join Sysdig, you can expect:
Extra days off to prioritize your well-being
Mental health support for you and your family through the Modern Health app
Great compensation package
We would love for you to join us! Please reach out even if your experience doesn't perfectly match the job description. We can always explore other options after starting the conversation. Your background and passion will set you apart, especially if your career path is different.
Some of our Hiring Managers are globally distributed, so an English version of your CV will be appreciated.
Sysdig values a diverse workplace and encourages women, people of color, LGBTQIA+ individuals, people with disabilities, members of ethnic minorities, foreign-born residents, and veterans to apply. Sysdig is an equal-opportunity employer. Sysdig does not discriminate on the basis of race, color, religion, sex, national origin, age, disability, genetic information, sexual orientation, gender identity, or any other legally protected status.