Years of Experience - 8+
Role Summary
We are looking for a hands-on Senior DevOps Engineer to build, operate and scale production-grade streaming infrastructure. In this role, you will own the deployment, scalability, performance and observability of real-time data pipelines based on Apache Flink and MongoDB, using modern DevOps and IaC practices. You will enable data teams to leverage real-time analytics and AI-driven workflows while ensuring reliability, automation and maintainability across environments.
Roles & Responsibilities
- Deploy, manage and scale real-time streaming infrastructure using Apache Flink, ensuring reliable, low-latency streaming for analytics and AI-scoring workloads.
- Operate and maintain MongoDB clusters: handle replication, sharding, performance tuning, backups, and high-availability setup.
- Use Kubernetes (with StatefulSets where needed) and Helm to deploy and orchestrate containerized data-infrastructure workloads.
- Build and maintain CI/CD pipelines for data-pipeline and infrastructure changes to enable fast, reliable deployments.
- Define and manage infrastructure as code (IaC) using Terraform: build reusable modules, manage environments, and enable automated provisioning.
- Set up observability for streaming infrastructure: metrics, dashboards and alerting with Prometheus and Grafana, along with logging and incident response.
- Collaborate with data engineers, backend teams and product engineers to ensure the streaming and data infrastructure effectively supports product features such as real-time analytics and AI scoring.
- Document infrastructure, maintain runbooks, and build self-service modules/templates for data teams to use.