Software Test Lead

Ankara, Türkiye

Overview

Lead the design and implementation of test strategies and automation for software components while mentoring QA engineers and evaluating Large Language Model outputs.

Key Responsibilities

 Software Testing & QA Leadership

  • Design, review, and lead the implementation of test plans, test cases, and test strategies for various software components (APIs, services, UI).
  • Oversee test automation script development using tools such as PyTest, Selenium, Playwright, or Postman (a minimal PyTest sketch follows this list).
  • Maintain and optimize test automation pipelines, integrating with CI/CD tools (e.g., Jenkins, GitLab CI, Azure DevOps).
  • Lead functional, regression, smoke, and performance testing efforts to validate system readiness.
  • Ensure traceability from requirements to test cases and bug reports.
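
For context, here is a minimal sketch of the kind of API smoke test this role oversees, written with PyTest and requests; the endpoint URL and response fields are hypothetical placeholders, not details from this posting.

    import pytest
    import requests

    BASE_URL = "https://api.example.com"  # hypothetical endpoint for illustration

    @pytest.fixture
    def session():
        # Share one HTTP session across tests to avoid repeated connection setup.
        with requests.Session() as s:
            yield s

    def test_health_endpoint_returns_ok(session):
        # Smoke test: the service responds and reports a healthy status.
        resp = session.get(f"{BASE_URL}/health", timeout=5)
        assert resp.status_code == 200
        assert resp.json().get("status") == "ok"

A CI/CD job (Jenkins, GitLab CI, or Azure DevOps) would typically run this suite with a plain "pytest" invocation and gate deployment on the result.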

 LLM Evaluation & Benchmarking

  • Lead a team responsible for the evaluation of Large Language Model (LLM) outputs.
  • Design capability-based evaluation benchmarks (e.g., summarization, reasoning, math, code generation).
  • Guide the development and execution of auto-evaluation scripts using LLM-as-a-judge, rule-based, and metric-based methods (see the sketch after this list).
  • Build and maintain evaluation pipelines to track model accuracy, hallucination rates, robustness, and related quality metrics.
  • Collaborate closely with AI Engineers and Data Scientists to align evaluations with development priorities.
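
As an illustration only, a minimal sketch of a rule-based plus metric-based auto-evaluation loop; the token-overlap F1 below is a rough stand-in for metrics such as ROUGE-1, and the sample data and function names are invented for this example.

    from collections import Counter

    def unigram_f1(prediction: str, reference: str) -> float:
        # Metric-based check: token-level F1 between model output and reference
        # (a crude stand-in for ROUGE-1; real pipelines would use a metrics library).
        pred = prediction.lower().split()
        ref = reference.lower().split()
        overlap = sum((Counter(pred) & Counter(ref)).values())
        if overlap == 0:
            return 0.0
        precision, recall = overlap / len(pred), overlap / len(ref)
        return 2 * precision * recall / (precision + recall)

    def evaluate(samples: list[dict], threshold: float = 0.5) -> dict:
        # Rule-based check (non-empty output) plus per-sample metric scores,
        # aggregated into the kind of summary an evaluation pipeline would track.
        scores = [unigram_f1(s["output"], s["reference"])
                  for s in samples if s["output"].strip()]
        return {
            "n_samples": len(samples),
            "non_empty_rate": len(scores) / len(samples),
            "mean_f1": sum(scores) / max(len(scores), 1),
            "pass_rate": sum(sc >= threshold for sc in scores) / max(len(scores), 1),
        }

    samples = [{"output": "Paris is the capital of France.",
                "reference": "The capital of France is Paris."}]
    print(evaluate(samples))  # aggregate quality summary for this tiny batch

An LLM-as-a-judge method would replace unigram_f1 with a call to a judge model scoring each output against a rubric; the aggregation step stays the same.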

 Team Leadership & Technical Coaching

  • Mentor and support a team of QA engineers and model evaluators.
  • Allocate tasks, define sprint goals, and ensure timely and high-quality delivery of testing and evaluation artifacts.
  • Foster a culture of test-first thinking, technical quality, and continuous improvement.
  • Communicate evaluation insights and quality reports to product managers and stakeholders.

Required Qualifications

  • Bachelor's or Master's degree in Computer Science, Software Engineering, AI, or a related field.
  • 5+ years of experience in software testing, including work as a Senior QA Engineer or Test Lead.
  • Strong experience in test case writing, test scenario design, and test automation scripting.
  • Proficiency in scripting languages like Python, JavaScript, or Java for test automation.
  • Experience with tools such as PyTest, Selenium, JUnit, Playwright, Postman, etc.
  • Familiarity with LLMs (e.g., DeepSeek, Mistral, LLaMA) and AI evaluation metrics (BLEU, ROUGE, accuracy, etc.).
  • Experience in building or maintaining benchmark datasets for AI evaluation.
  • Understanding of prompt engineering, response validation, and error case analysis.

Preferred Skills

  • Experience with LLM evaluation libraries/tools like OpenAI Evals, TruLens, LangChain Eval, or custom scripts.
  • Experience working with MLOps or AI pipelines and integrating tests within them.
  • Familiarity with dataset labeling platforms or human-in-the-loop evaluation systems.
  • Strong data analysis and reporting skills using Excel, Python (Pandas/Matplotlib), or dashboards (a brief Pandas sketch follows this list).
  • Ability to define and customize evaluation logic per customer or business domain.
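
As a small illustration of the Pandas-based reporting mentioned above, a sketch that pivots hypothetical per-sample evaluation scores into a model-by-task report; the models, tasks, and values are invented.

    import pandas as pd

    # Hypothetical per-sample evaluation results; column names are illustrative only.
    df = pd.DataFrame({
        "model": ["model-a", "model-a", "model-b", "model-b"],
        "task": ["summarization", "reasoning", "summarization", "reasoning"],
        "score": [0.82, 0.71, 0.78, 0.75],
    })

    # Pivot into a model-by-task quality table suitable for stakeholder reports.
    report = df.pivot_table(index="model", columns="task", values="score", aggfunc="mean")
    print(report.round(2))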

For 12 years, with nearly 5,000 engineers and researchers, we have been helping the ecosystem grow by training information and communication technology professionals, and we deliver global projects. Together, we are coding the future!
