Software Test Lead

Ankara, Türkiye

Overview

Lead a team in designing and implementing test strategies and evaluating AI model outputs while fostering a culture of technical quality and continuous improvement.

Key Responsibilities

 Software Testing & QA Leadership

  • Design, review, and lead the implementation of test plans, test cases, and test strategies for various software components (APIs, services, UI).
  • Oversee the development of test automation scripts using tools such as PyTest, Selenium, Playwright, or Postman (a brief illustrative sketch follows this list).
  • Maintain and optimize test automation pipelines, integrating with CI/CD tools (e.g., Jenkins, GitLab CI, Azure DevOps).
  • Lead functional, regression, smoke, and performance testing efforts to validate system readiness.
  • Ensure traceability from requirements to test cases and bug reports.
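
To give a concrete sense of the automation scripting referenced above, a minimal PyTest-style check might look like the sketch below. It is illustrative only: the base URL, the /health endpoint, and the expected response shape are hypothetical placeholders rather than systems named in this posting.

    # Illustrative sketch only: a minimal PyTest smoke test for an HTTP API.
    # BASE_URL and the /health endpoint are hypothetical placeholders.
    import requests

    BASE_URL = "https://api.example.com"

    def test_health_endpoint_returns_ok():
        """Smoke check: the service responds and reports a healthy status."""
        response = requests.get(f"{BASE_URL}/health", timeout=5)
        assert response.status_code == 200
        assert response.json().get("status") == "ok"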

 LLM Evaluation & Benchmarking

  • Lead a team responsible for the evaluation of Large Language Model (LLM) outputs.
  • Design capability-based evaluation benchmarks (e.g., summarization, reasoning, math, code generation).
  • Guide the development and execution of auto-evaluation scripts using LLM-as-a-judge, rule-based, and metric-based methods (see the sketch after this list).
  • Build and maintain evaluation pipelines to track model accuracy, hallucination rates, robustness, and related quality metrics.
  • Collaborate closely with AI Engineers and Data Scientists to align evaluations with development priorities.
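
As a reference point for the metric-based methods mentioned above, a single auto-evaluation step might resemble the sketch below. It assumes the open-source rouge-score Python package and a made-up reference/prediction pair; it illustrates the technique rather than describing the team's actual pipeline.

    # Illustrative sketch only: a metric-based evaluation step for summarization
    # outputs, assuming the open-source `rouge-score` package is installed.
    from rouge_score import rouge_scorer

    def evaluate_summary(reference: str, prediction: str) -> dict:
        """Score a model summary against a reference using ROUGE-1 and ROUGE-L."""
        scorer = rouge_scorer.RougeScorer(["rouge1", "rougeL"], use_stemmer=True)
        scores = scorer.score(reference, prediction)
        # Report only the F-measures, e.g. for aggregation in an evaluation dashboard.
        return {name: result.fmeasure for name, result in scores.items()}

    if __name__ == "__main__":
        # Hypothetical example pair; a real benchmark would iterate over a dataset.
        print(evaluate_summary(
            "The committee approved the budget after a short debate.",
            "The budget was approved by the committee following a brief debate.",
        ))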

Team Leadership & Technical Coaching

  • Mentor and support a team of QA engineers and model evaluators.
  • Allocate tasks, define sprint goals, and ensure timely and high-quality delivery of testing and evaluation artifacts.
  • Foster a culture of test-first thinking, technical quality, and continuous improvement.
  • Communicate evaluation insights and quality reports to product managers and stakeholders.

Required Qualifications

  • Bachelor's or Master's degree in Computer Science, Software Engineering, AI, or a related field.
  • At least 5 years of experience in software testing, including experience as a Senior QA Engineer or Test Lead.
  • Strong experience in test case writing, test scenario design, and test automation scripting.
  • Proficiency in scripting languages like Python, JavaScript, or Java for test automation.
  • Experience with tools such as PyTest, Selenium, JUnit, Playwright, Postman, etc.
  • Familiarity with LLMs (e.g., DeepSeek, Mistral, LLaMA) and AI evaluation metrics (BLEU, ROUGE, Accuracy, etc.).
  • Experience in building or maintaining benchmark datasets for AI evaluation.
  • Understanding of prompt engineering, response validation, and error case analysis.

Preferred Skills

  • Experience with LLM evaluation libraries/tools like OpenAI Evals, TruLens, LangChain Eval, or custom scripts.
  • Experience working with MLOps or AI pipelines and integrating tests within them.
  • Familiarity with dataset labeling platforms or human-in-the-loop evaluation systems.
  • Strong data analysis and reporting skills using Excel, Python (Pandas/Matplotlib), or dashboards.
  • Ability to define and customize evaluation logic per customer or business domain.

Over 12 years, with approximately 5,000 engineers and researchers, we have contributed to the growth of the ecosystem by training information and communication technology professionals, and we deliver global projects. Together, we are coding the future!
