ML Software Engineer - Platform LLM Training and Inference

At Rayn, we don’t just work—we innovate, collaborate, and create solutions that leave a lasting impact. As part of our team, you’ll have the opportunity to shape the future of a purpose-driven organization that is redefining how technology can address societal & public health challenges.

We are looking for a software engineer to help design and build the core of the ILM platform. This role sits at the intersection of systems engineering, AI infrastructure, and developer tooling. You will work on runtime orchestration, GPU scheduling, and model lifecycle workflows that power training and inference across heterogeneous environments. This is a hands-on role suited for engineers who enjoy working close to the metal, care about performance, and want to build foundational infrastructure rather than application-layer features.


What you will bring to Rayn as an ML Software Engineer

  • Design and implement core services for the ILM orchestration platform
  • Build unified pipelines for fine-tuning and training methods, including LoRA, QLoRA, and RL-based approaches
  • Integrate and manage multiple LLM runtimes such as vLLM, TensorRT, and llama.cpp
  • Implement GPU-aware scheduling, resource allocation, and workload isolation
  • Optimize VRAM usage, KV-cache management, and inference throughput
  • Build internal APIs and developer tooling for model lifecycle management
  • Collaborate closely with hardware, platform, and research teams


Who We’re Looking For

  • Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field
  • Strong experience in JavaScript, Python, and C++, with 6+ years in software development, architecture, and team leadership roles
  • Solid understanding of Linux systems, containers, and runtime environments
  • Experience working with GPU-accelerated workloads
  • Familiarity with deep learning inference and training frameworks such as PyTorch
  • Familiarity with fine-tuning and training techniques for large language models
  • Practical experience deploying and optimizing AI infrastructure
  • Strong problem-solving skills and attention to performance and reliability
  • Experience with CUDA, llama.cpp, vLLM, LlamaFactory, TensorRT, or GPU profiling tools
  • Experience building distributed systems or schedulers
  • Experience in AI infrastructure, MLOps, or HPC environments