Lead DevOps Automation & Observability Engineer (Serbia)

Novi Sad, Serbia
full-time

AI overview

Design and optimize cloud-native infrastructures across AWS, Azure, and GCP while driving DevOps practices and supporting AI/ML workloads.

We are looking for an experienced DevOps / Cloud Platform Engineer to design, automate, and optimize modern cloud-native infrastructures across AWS, Azure, and GCP.
This role goes beyond implementation: you will take ownership of platform decisions, design end-to-end solutions, and actively influence how our cloud and DevOps practices evolve.

You will work with Kubernetes, Terraform, CI/CD, observability tools, and cloud services to build secure, scalable, and production-ready platforms. You will collaborate closely with development and cross-functional teams, act as a technical partner, and help drive reliability, performance, and delivery excellence, including support for next-generation AI/ML workloads.



What Will You Do

  • Design and own cloud-native architectures across AWS, Azure, and (optionally) GCP, from concept to production.
  • Build, maintain, and continuously improve scalable, secure, and highly available platforms using Kubernetes, Terraform, and CI/CD pipelines.
  • Make technical decisions independently, balancing reliability, security, cost, and delivery speed.
  • Automate cloud provisioning and deployments, improve platform reliability, and drive cost and performance optimization.
  • Integrate and evolve observability solutions (Datadog, Grafana, Prometheus, Splunk), define SLOs/SLIs, and lead troubleshooting and root-cause analysis.
  • Work closely with developers, QA, and other stakeholders to design solutions together, enable DevOps practices, and improve delivery workflows.
  • Support AI/ML workloads by designing infrastructure for training, inference, and MLOps pipelines (SageMaker, Azure ML, Vertex AI, etc.).
  • Take responsibility for platform outcomes, not just tasks, including stability, scalability, and long-term maintainability.
  • Maintain documentation, build self-service DevOps capabilities, mentor team members, and contribute to engineering best practices.


Who You Are

  • 5+ years of experience in DevOps, SRE, or cloud platform engineering, with a track record of owning systems in production.
  • Strong expertise in AWS or Azure cloud architectures, including networking, security, and cost considerations.
  • Hands-on experience with Kubernetes (EKS/AKS), Docker, Helm, and infrastructure-as-code (Terraform).
  • Solid understanding of Linux, distributed systems, and architecture design for scale and resilience.
  • Experience with CI/CD tools (Jenkins, GitHub Actions, Azure DevOps) and GitOps practices (ArgoCD).
  • Comfortable with observability tooling (Datadog, Splunk, Prometheus, Grafana) and leading incident analysis and improvements.
  • Experience with AI/ML platforms or ML-driven workloads is a strong plus.
  • Proactive, ownership-driven mindset: you take responsibility, propose improvements, and follow through.
  • Strong communication skills and the ability to work with technical and non-technical stakeholders.
  • You enjoy designing solutions, making decisions, and being accountable for their success in production.


We appreciate the interest of all applicants. Please note that only those whose qualifications align closely with the position requirements will be contacted for the next steps in the selection process.


All applications will be handled with confidentiality. 


IWConnect's Privacy Statement for Job Applicants
