Too much data, not enough insight?
We get it. At KNIME, we build software that helps people clean, combine, and understand their data quickly, efficiently, and without code.
And with our focus on Data Analytics & AI, we empower everyone to turn complex challenges into clear, actionable insights.
You can help make that happen.
We’re not just an open-source data analytics company; we’re a fast-growing, globally recognized pioneer at the intersection of data and AI, with users in every industry, an international team of 30+ nationalities, and a thriving open community.
Join us as a Site Reliability Engineer in Austin* and help us build and operate reliable, scalable cloud platforms for our next generation of products.
Reliability-driven: You care deeply about stable, scalable systems and design infrastructure with reliability in mind from day one.
Cloud-native engineer: You have hands-on experience building and operating cloud infrastructure, ideally in Azure; AWS experience is a plus.
Kubernetes-savvy: You’re comfortable running and supporting production Kubernetes clusters and understand common deployment patterns.
Automation-first: You use code and Infrastructure as Code tools to eliminate manual work and scale operations sustainably.
Systems thinker: You bring strong Linux and networking knowledge and understand how distributed systems behave in production.
Curious & improvement-oriented: You enjoy learning, challenging existing setups, and continuously improving how systems are built and operated.
Production ownership: Build, operate, and scale cloud-native SaaS platforms used by thousands of users.
Infrastructure as Code: Design and maintain reproducible infrastructure using tools like Terraform, Helm, or similar.
Reliability standards: Define and improve practices around availability, observability, monitoring, and performance.
Incident response: Troubleshoot production issues, perform root cause analysis, and implement long-term fixes.
Identity & security: Support authentication and authorization in multi-tenant SaaS environments.
Cross-team collaboration: Work closely with product and engineering teams to bring features into production reliably.
Impact: Ownership of critical infrastructure used by thousands of users worldwide.
Engineering depth: Modern cloud-native stack with real technical challenges.
Autonomy & trust: High responsibility with room to shape solutions.
Open-source mindset: User-first thinking and pragmatic engineering decisions.
Learning: Continuous growth alongside experienced platform engineers.
Flexibility: A work environment that supports sustainable ways of working.