AI Quality Engineer - 11100

Ensure the quality, safety, and reliability of generative AI solutions by designing comprehensive testing strategies and collaborating with various stakeholders in a Google Cloud environment.

Coupa makes margins multiply through its community-generated AI and industry-leading total spend management platform for businesses large and small. Coupa AI is informed by trillions of dollars of direct and indirect spend data across a global network of 10M+ buyers and suppliers. We empower you with the ability to predict, prescribe, and automate smarter, more profitable business decisions to improve operating margins.

Why join Coupa?

🔹 Pioneering Technology: At Coupa, we're at the forefront of innovation, leveraging the latest technology to empower our customers with greater efficiency and visibility in their spend.
🔹 Collaborative Culture: We value collaboration and teamwork, and our culture is driven by transparency, openness, and a shared commitment to excellence.
🔹 Global Impact: Join a company where your work has a global, measurable impact on our clients, the business, and each other.

Learn more on the Life at Coupa blog and hear from our employees about their experiences working at Coupa.

The Impact of an AI Quality Engineer at Coupa:

We are seeking a detail-oriented AI Quality Engineer to ensure the quality, reliability, and safety of our generative AI solutions built on Google Cloud's AI ecosystem. This role is a core part of our AI Delivery team, responsible for designing and executing comprehensive testing strategies for AI agents and solutions developed with Google ADK (Agent Development Kit), Vertex AI, and related Google Cloud services. This position combines QA expertise with AI-specific testing methodologies, ensuring that our agentic AI solutions meet quality standards, perform reliably in production, and adhere to responsible AI principles. The AI Quality Engineer will partner closely with AI Architects, Agentic AI Developers, and business stakeholders to validate solutions from concept through production deployment within the Google Cloud environment.

What You'll Do:
  • AI-Specific Test Strategy: Design and implement comprehensive testing strategies for AI agents built with Google ADK and Vertex AI, including functional testing, prompt validation, model behavior testing, and agent orchestration verification.
  • Automated Testing for AI Systems: Develop and maintain automated test suites for AI agents using Python, Vertex AI APIs, and testing frameworks. Create test harnesses for evaluating LLM outputs, agent responses, and multi-agent interactions.
  • Vertex AI Quality Assurance: Test and validate AI solutions across the Vertex AI platform including model deployments, endpoint configurations, vector search accuracy, RAG pipeline quality, and integration points with Google Cloud services.
  • Responsible AI Testing: Implement and execute tests for AI safety, bias detection, fairness metrics, toxicity screening, and compliance requirements using Vertex AI monitoring tools and responsible AI frameworks. Validate guardrails and safety mechanisms.
  • Performance & Reliability Testing: Conduct performance testing, load testing, and reliability validation for AI agents deployed on Google Cloud Platform. Monitor latency, throughput, cost optimization, and scalability under various conditions.
  • Integration Testing: Validate integrations between AI agents and enterprise systems including BigQuery, Cloud Functions, Cloud Run, Pub/Sub, and third-party applications. Ensure data flows, API connections, and orchestration logic function correctly.
  • Test Documentation & Reporting: Create detailed test plans, test cases, and quality metrics specific to AI systems. Document test results, defect reports, and quality trends. Provide clear feedback to development teams on agent behavior and edge cases.
  • Continuous Improvement: Collaborate with AI Architects and Developers to establish quality standards and testing best practices. Identify patterns in failures, suggest improvements to agent design, and contribute to CI/CD pipeline quality gates.
  • User Acceptance Support: Facilitate UAT processes with business stakeholders, helping them validate AI agent behavior against requirements and ensuring solutions meet real-world use cases.

What You Will Bring to Coupa:
  • Proven experience (3-5+ years) in software quality assurance, testing, or QA engineering with at least 1-2 years focused on AI/ML systems or data-intensive applications.
  • Strong understanding of AI/ML fundamentals, LLM behavior, prompt engineering, and the unique testing challenges of non-deterministic AI systems.
  • Experience testing applications on Google Cloud Platform, including familiarity with Vertex AI, BigQuery, Cloud Functions, or related GCP services.
  • Proficiency in Python for test automation and scripting. Experience with testing frameworks (pytest, unittest) and API testing tools.
  • Demonstrated ability to design test strategies for complex systems, including positive/negative testing, boundary testing, and edge case identification.
  • Strong analytical and problem-solving skills with attention to detail in identifying defects, inconsistencies, and quality issues in AI agent behavior.
  • Experience with test automation, CI/CD pipelines, and version control systems (Git).
  • Excellent communication skills with the ability to document technical findings and collaborate effectively with developers and stakeholders.
  • Bachelor's degree in Computer Science, Information Systems, or a related field, or equivalent practical experience.

Preferred:
  • Hands-on experience testing AI agents, LLM applications, or agentic systems using frameworks like Google ADK, LangChain, or similar tools.
  • Experience with Vertex AI-specific testing, including model evaluation metrics, A/B testing, model monitoring, and drift detection.
  • Knowledge of responsible AI testing methodologies including bias detection, fairness evaluation, explainability validation, and safety testing.
  • Familiarity with evaluation metrics for LLMs and RAG systems (BLEU, ROUGE, semantic similarity, retrieval accuracy, hallucination detection).
  • Experience with load testing tools (Locust, JMeter) and monitoring tools (Cloud Monitoring, Cloud Logging) in Google Cloud environments.
  • Understanding of prompt injection attacks, adversarial testing, and security testing for AI systems.
  • Experience with containerization (Docker, Cloud Run) and testing in containerized environments.

Coupa complies with relevant laws and regulations regarding equal opportunity and offers a welcoming and inclusive work environment. Decisions related to hiring, compensation, training, or evaluating performance are made fairly, and we provide equal employment opportunities to all qualified candidates and employees.

Please be advised that inquiries or resumes from recruiters will not be accepted.

By submitting your application, you acknowledge that you have read Coupa’s Privacy Policy and understand that Coupa receives/collects your application, including your personal data, for the purposes of managing Coupa's ongoing recruitment and placement activities, including for employment purposes in the event of a successful application and for notification of future job opportunities if you did not succeed the first time. You will find more details about how your application is processed, the purposes of processing, and how long we retain your application in our Privacy Policy.

Coupa Software is a global technology platform for Business Spend Management.
