Do you get excited about scaling machine learning systems in production?
Are you ready to work on cutting-edge Computer Vision technology?
Are you tired of working at a big company?
Would you like to work alongside top professionals in our industry?
If your answers are mostly yes, then you should keep reading. At Nomagic, we're on a mission to teach robots the real world.
We're now looking for a Senior DevOps Engineer to own and scale the infrastructure behind our Computer Vision ML Cloud Service.
Offer essentials:
Work on cloud-native ML infrastructure at scale
Salary: 26,000 - 30,000 gross per month on UoP (employment contract)
Equity for every employee
Relocation package
No late evening calls - the entire eng team is based in Europe :)
English-speaking environment
We work mainly remotely, but you should expect occasional office visits when a task requires it
Here is why we love this job ourselves, and hope you will enjoy it too:
We get to be creative
We're still pretty small, so everyone has a direct impact on the final result
Nothing is set in stone: we can easily change the technology we use if the requirements change
The CEO and part of the management are experienced infrastructure engineers who created the foundations of Google Cloud Platform
We combine world-class research with top-notch engineering and apply it to solve real problems
Some of the problems you may try to solve with us:
Scaling from a handful of B2B customers to 100+ in the future - designing multi-tenant infrastructure that scales
Building standardized, repeatable deployment patterns across multiple customer environments
Optimizing data pipelines for Computer Vision model training and deployment
Building robust monitoring and alerting systems for ML service health and performance across all environments
Automated CI/CD pipelines for ML model deployment with safe rollback strategies
Deploying and managing applications on Kubernetes clusters using Helm charts and ArgoCD for GitOps workflows
Infrastructure as Code using Terraform to provision and manage cloud resources consistently
Implementing comprehensive observability with Prometheus and Grafana across multiple tenants
Ensuring high availability, security, and cost optimization of cloud services
Occasional hardware integration tasks (nice to have, not primary focus)
What skills we'd like you to have:
3+ years of experience as a DevOps or Infrastructure Engineer or in a similar role
2+ years of experience in software development
Strong Kubernetes experience - managing production clusters, Helm charts, and GitOps (e.g. ArgoCD)
Infrastructure as Code expertise with Terraform
Monitoring and alerting - hands-on experience with Prometheus, Grafana, and building effective alerting strategies
Deep knowledge of Docker and container orchestration
Experience with CI/CD pipelines and automated deployments
Good understanding of networking and security in cloud environments
Strong proficiency in Python
Experience with one of the major cloud providers (preferably Google Cloud Platform)
Experience designing multi-tenant infrastructure is a big plus
Experience with ML infrastructure or data-intensive applications is a plus
Hardware experience (servers, edge devices) is nice to have but not required
Fluent communication in English
What should you expect once you apply?
30-minute call with a Recruiter
45-minute Hiring Manager interview
60-minute Technical/Coding Interview
Onsite - half a day of interviews and discussions at the office
See a short sneak peek of our product here: