Cinder provides a cutting-edge platform to protect the internet. Cinder safeguards the interaction layer: the front end of products that users engage with, and sometimes abuse. Our AI agents, integrated workflow platform, and deep expertise power real-time integrity at the speed of innovation.
We support some of the most important and innovative companies in the world, including OpenAI, Midjourney, and ElevenLabs. Cinder is backed by Accel and Y Combinator.
We care deeply about being relentless, intentional, and empathetic. We hire gritty thinkers who set ego aside, aggressively solve customer problems, and get better every day. We’re building an enterprise-grade platform to make the internet safer and we need highly curious, hard-working self-starters.
Cinder is seeking an experienced Site Reliability Engineer to help maintain and evolve our robust infrastructure.
Drive our production infrastructure from Day 0 through Day 2 operations, continuously evolving it into a highly reliable platform.
Build proactive deployment observability and guardrails that surface risk early, prevent outages, and continuously raise the bar on release safety.
Architect and oversee compliance-by-design infrastructure, ensuring SOC 2, HIPAA, and GDPR requirements are embedded into systems.
Partner across data, AI, and product to unblock work, improve velocity, and raise operational standards.
Partner directly with our most strategic customers to diagnose issues, guide deployments, and shape infrastructure decisions that matter in production.
Contribute to on-call coverage while serving as the go-to escalation point for complex deployment issues.
You're fluent in cloud infrastructure, infrastructure as code (e.g., Terraform), container orchestration, and containerization technologies.
You're comfortable working with diverse technologies (e.g., AWS, PostgreSQL, OpenSearch, Kafka) and excited to learn new systems as our infrastructure evolves.
You bring a strong bias toward repeatable, secure deployment processes and operational excellence.
You take pride in building software that is fast, reliable, and scalable.
You love to stay on top of the latest trends in deployment technologies.
You're curious, systematic, and execution-oriented: you don't wait for perfect requirements and can navigate technical tradeoffs independently.
You have experience operating production infrastructure in environments with increasing scale, complexity, and reliability demands.
This is a pivotal moment to join Cinder. We are a fast-growing startup building critical AI software for some of the world's largest, most highly used, and most visible companies, so your impact will be immediate and substantial. We are committed to fostering a vibrant, collaborative culture with a wonderful office, frequent in-person time to connect and innovate, competitive salary, generous benefits, and a 401(k) match, ensuring you are supported as you tackle defining industry challenges.