SonarSource

Machine Learning Scientist (AI for Code) (f/m/d)

TLDR

Pioneer the next generation of Sonar's code analysis engine by applying cutting-edge AI and LLM techniques to help millions of developers write better, more secure code.

The impact you will have

At Sonar, we are seeking an innovative Machine Learning Scientist to join our Data & AI team and pioneer the next generation of our code analysis engine. You will be at the forefront of applying cutting-edge AI and Large Language Model (LLM) techniques to the complex domain of source code. Your work will directly shape our products, pushing the boundaries of static analysis to help millions of developers write better, more secure code. If you are driven to solve real-world problems by turning state-of-the-art research into practical, high-impact solutions, this is the role for you.

What you will do daily
  • Spearhead Research & Innovation: Stay on the cutting edge of ML, Deep Learning, and LLMs, specifically their application to the Software Development Lifecycle (SDLC), and identify novel opportunities to enhance our products.
  • Develop Advanced AI Models: Design, prototype, and validate novel ML models that identify and resolve complex bugs, vulnerabilities, and code smells, going beyond the capabilities of traditional static analysis.
  • Build LLM-Powered Features: Develop and implement advanced LLM-based solutions, including Retrieval-Augmented Generation (RAG) for contextual code analysis, fine-tuning models on proprietary codebases, and exploring agentic systems for automated code remediation.
  • Engineer Data Pipelines: Build and manage robust data pipelines to gather, process, and version massive code-centric datasets required for training and evaluating specialized models at scale.
  • Translate Prototypes to Products: Collaborate closely with engineering and product teams to integrate successful ML prototypes into Sonar's cutting-edge products, ensuring they meet the needs of our global user base.
  • Communicate and Evangelize: Clearly articulate and document complex technical concepts and research findings to both technical and non-technical stakeholders.

The experience that you need
  • An advanced academic background (Master’s or PhD) in Computer Science, Machine Learning, or a related quantitative field.
  • Strong industry experience in machine learning, with a solid understanding of modern software engineering practices and tools.
  • Solid programming skills in Python and hands-on experience with core ML/DL frameworks (e.g., PyTorch, TensorFlow, Hugging Face). Familiarity with Java is a plus.
  • Proven experience in applied Machine Learning, with a strong focus on Natural Language Processing (NLP) or, ideally, Programming Language Processing (PLP).
  • Hands-on experience with modern LLM architectures and techniques, such as fine-tuning strategies (e.g., LoRA, QLoRA), advanced prompt engineering, building and optimizing Retrieval-Augmented Generation (RAG) pipelines, and working with vector databases and semantic search.
  • Experience with large-scale data processing frameworks and cloud infrastructure (e.g., AWS).
  • Experience driving research projects from initial ideation to a demonstrable prototype with a high degree of autonomy.
  • Excellent communication skills in English and a talent for explaining complex scientific topics clearly and concisely.

SonarSource builds powerful code quality and security tools that help developers prevent issues in software production. Targeting development teams and organizations of all sizes, their solutions streamline workflows and enhance productivity, leveraging both human and AI-driven contributions. With support for over 30 programming languages and widespread adoption, SonarSource is committed to creating secure, reliable, and maintainable applications.

Founded: 2008
Employees: 51-200
Industry: Internet Software & Services
Total raised: $45M