Mistral AI is hiring a GPU Programming Expert (San Francisco)

Palo Alto, United States
Full-Time
Mistral AI is hiring an expert in serving and training large language models at high speed on GPUs. The role is based in San Francisco.

The role will involve
-Writing low-level code to take full advantage of high-end GPUs (H100) and max out their capacity
-Rethinking various parts of the generative model architecture to make them more suitable for efficient inference
-Integrating low-level efficient code in a high-level MLOps framework

The successful candidate will have
-High technical competence in writing custom CUDA kernels and pushing GPUs to their limits, and deep expertise in the distributed computation infrastructure of current-generation GPU clusters
-A general understanding of the field of generative AI, and knowledge of or interest in fine-tuning and using language models for applications
