[BZA] Senior ML Platform Engineer - ML Platforms & MLOps

TLDR

Design and scale machine learning platforms while ensuring seamless collaboration with data science and product teams to deliver impactful solutions in a fast-paced environment.

Project – the aim you’ll have
Our client is a leading e-commerce company specializing in fashion, shoes, accessories, and beauty, operating as an online fashion retail platform.
We are looking for an experienced Senior ML Platform Engineer to design, build, and scale machine learning platforms and MLOps tooling. You will work at the intersection of software engineering and machine learning, enabling teams to develop, deploy, and operate ML models reliably in production. You will join a team that values high engineering standards, automation, fast delivery of business value, and close collaboration with data science, product, and infrastructure teams.

Position – how you’ll contribute

  • Support and contribute hands-on to multiple ML platform POCs
  • Work closely with Applied Scientists, ML Engineers, and internal platform teams
  • Evaluate platform capabilities across:
      • GPU training and experimentation
      • Real-time and batch inference
      • Orchestration, monitoring, and operability
      • Multi-tenancy, isolation, and scalability
  • Assess integration points with existing in-house tooling
  • Perform performance and operability analysis
  • Contribute technical input to:
      • Build vs buy vs extend decisions
      • Target platform stack recommendations
      • OPEX and CAPEX justification for rollout

Expectations – the experience you need

  • 5+ years building and operating ML infrastructure or large-scale data/ML systems on cloud platforms
  • Experience supporting mission-critical systems serving multiple teams
  • Experience with containers (Docker) and orchestration (Kubernetes)
  • Experience with streaming and batch processing systems (e.g. Kafka/Kinesis, Spark/Flink)
  • Experience designing and operating systems with strict latency and throughput requirements (e.g. inference or retrieval paths under 10 ms)
  • Familiarity with caching, traffic shaping, and request management in production
  • Experience designing systems with SLOs, monitoring, and safe deployment practices
  • Experience with incident response, capacity planning, and post-incident reviews
  • Experience working with IAM, secrets management, and network boundaries
  • Ability to embed security, compliance, and governance into engineering workflows
  • Experience combining multiple platform components (open source and managed services) into a coherent, shared, multi-team, production-ready ML platform
  • Comfortable evaluating and integrating tools rather than relying on a single end-to-end solution
  • Evaluating build vs buy vs extend trade-offs
  • Clear articulation of technical trade-offs and recommendations
  • Ability to produce architecture designs, POC findings, and decision input
  • Effective collaboration with platform, infra, and ML teams

Additional skills – the edge you have

  • Experience with enterprise ML platforms (e.g. Databricks, Domino, ClearML)
  • Kubernetes-first ML systems:
      • Hands-on experience running ML workloads on Kubernetes (EKS preferred)
      • Multi-tenant environments, resource isolation, autoscaling
  • Experience running and optimising GPU-based training workloads in shared, multi-tenant environments (e.g. scheduling, utilisation, cost efficiency)
  • Feature platform or feature store experience:
      • Online/offline consistency, schema evolution
      • Familiarity with Hopsworks, Feast, or similar
  • Governance and compliance experience in regulated ML environments
  • Experience onboarding teams onto shared platforms
  • FinOps awareness (cost attribution and optimisation for ML workloads)
  • Developer experience / platform enablement mindset (golden paths, templates, onboarding flows)

Our offer – professional development, personal growth

  • Flexible employment and remote work
  • International projects with leading global clients 
  • International business trips  
  • Non-corporate atmosphere 
  • Language classes 
  • Internal & external training 
  • Private healthcare and insurance  
  • Multisport card 
  • Well-being initiatives 

Software Mind is a dynamic software engineering company that partners with businesses to drive their digital transformation. We specialize in providing skilled software engineers and autonomous teams that manage complete software life cycles, employing expertise in cloud, AI, and data science. Our unique approach focuses on creating impactful solutions for tech giants and startups alike, ensuring that our clients stay ahead in a rapidly evolving digital landscape.
