Train and fine-tune vision-language models while extending capabilities to video and optimizing inference for production in a focused AI safety team.
TLDR: You’ll train and fine-tune vision-language models, extend them to video, build alignment pipelines (GRPO, DPO, reward modeling), develop evaluation benchmarks, optimize inference for production, and work with MoE architectures.
White Circle is an AI Safety company building the safety, reliability, and optimization layer for AI systems. At the core of our platform are policies – simple natural-language rules that define what an AI model should and shouldn’t do. We automatically test, enforce, and continuously improve these policies at scale.
We’ve raised $11M from top funds, founders, and senior leaders at OpenAI, Anthropic, HuggingFace, Mistral, DeepMind, Datadog, Sentry, and others
We process over one hundred million API calls every month
We fine-tune and train our own LLMs so they run faster and cheaper than any open or proprietary model
We’re a small, highly focused team. If you want to work deeply on hard problems, see your work ship to production quickly, and influence how AI safety is actually built – you’re the one we need.
Train vision-language models from scratch and fine-tune existing architectures for image understanding
Extend VLM capabilities to video: design temporal modeling approaches and handle long contexts efficiently
Design evaluation benchmarks that matter: visual QA, spatial reasoning, video comprehension
Curate and maintain multimodal datasets — including synthetic data generation pipelines
Train and optimize MoE architectures for efficient multimodal inference
Deploy models to production: quantization, batching strategies, latency optimization
3+ years training and fine-tuning vision-language models (LLaVA, Qwen-VL, InternVL, or similar)
Deep experience with multimodal architectures — you understand how vision encoders, projectors, and LLMs fit together
Hands-on experience with RLHF/alignment for multimodal models (GRPO, DPO, reward modeling), not just for text
Experience with video understanding: temporal modeling, long-context processing, efficient attention mechanisms
Track record shipping VLMs to production: you've optimized inference, not just reported benchmark scores
Comfortable with large-scale dataset curation: image-text pairs, video-instruction data, synthetic data generation
Familiar with MoE architectures and their tradeoffs for multimodal workloads
Strong PyTorch skills and experience with distributed training (DeepSpeed, FSDP)
Salary of $100,000 to $250,000 + equity
20 days of paid vacation
Work from Paris (hybrid) + relocation package
Best medical insurance in France
All the hardware, tools, and services you need
Covered subscriptions for AI agents and IDEs
Team off-sites twice a year: we’ve recently been to the Alps and to Saint-Tropez
Intro call with one of our colleagues
Complete the take-home assignment
Show your best during the technical interview
Final call with our CEO and CTO
Please submit your application in English. It’s our company language, so you’ll be speaking a lot of it if you join.