ML Software Engineer - Platform LLM Training and Inference

At Rayn, we don’t just work—we innovate, collaborate, and create solutions that leave a lasting impact. As part of our team, you’ll have the opportunity to shape the future of a purpose-driven organization that is redefining how technology can address societal & public health challenges.

We are looking for a software engineer to help design and build the core of the ILM platform. This role sits at the intersection of systems engineering, AI infrastructure, and developer tooling. You will work on runtime orchestration, GPU scheduling, and model lifecycle workflows that power training and inference across heterogeneous environments. This is a hands-on role suited for engineers who enjoy working close to the metal, care about performance, and want to build foundational infrastructure rather than application-layer features.


What you will bring to Rayn as an ML Software Engineer

  • Design and implement core services for the ILM orchestration platform
  • Build unified pipelines for fine-tuning and training methods, including LoRA, QLoRA, and RL-based approaches
  • Integrate and manage multiple LLM runtimes such as vLLM, TensorRT, and llama.cpp
  • Implement GPU-aware scheduling, resource allocation, and workload isolation
  • Optimize VRAM usage, KV-cache management, and inference throughput
  • Build internal APIs and developer tooling for model lifecycle management
  • Collaborate closely with hardware, platform, and research teams


Who We’re Looking For

  • Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field
  • Strong experience in JavaScript, Python, and C++, with 6+ years in software development, architecture, and team leadership roles
  • Solid understanding of Linux systems, containers, and runtime environments
  • Experience working with GPU-accelerated workloads
  • Familiarity with deep learning inference and training frameworks such as PyTorch
  • Familiarity with fine-tuning and training techniques for large language models
  • Practical experience deploying and optimizing AI infrastructure
  • Strong problem-solving skills and attention to performance and reliability
  • Experience with CUDA, llama.cpp, vLLM, LlamaFactory, TensorRT, or GPU profiling tools
  • Experience building distributed systems or schedulers
  • Experience in AI infrastructure, MLOps, or HPC environments