Series

Testing AI products.

Evaluating AI features when there's no single correct answer, and using AI on the test side without fooling yourself. Outputs vary, so you test properties instead of equality. This series covers evaluating AI features, reviewing AI-written tests, and where AI genuinely helps a QA workflow versus where it's a trap.

Who it's for: QA engineers testing AI features · Automation engineers using AI

// overview

Testing AI breaks the playbook QA grew up on. There's no single correct output, so “does it equal the expected value?” stops working — and a lot of teams either freeze or wave the feature through. This series is about testing AI features anyway, by checking properties and boundaries instead of exact strings.
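A rough sketch of what "properties instead of exact strings" can look like in practice. Everything named here (`check_answer`, `REFUSAL_MARKERS`, `MAX_ANSWER_CHARS`, the refund example) is hypothetical and only for illustration. Plain substring checks are the simplest possible stand-in for a real grounding check.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class PropertyResult:
    name: str
    passed: bool


# Hypothetical refusal phrases and length limit; tune these to the product under test.
REFUSAL_MARKERS = ("i can't help with", "i cannot help with", "i'm not able to")
MAX_ANSWER_CHARS = 600


def check_answer(
    answer: str, grounded_facts: list[str], forbidden: list[str]
) -> list[PropertyResult]:
    """Check properties of one answer instead of comparing it to an expected string."""
    lowered = answer.lower()
    return [
        PropertyResult("non_empty", bool(answer.strip())),
        PropertyResult("within_length", len(answer) <= MAX_ANSWER_CHARS),
        # Every fact the answer must be grounded in appears somewhere in the text.
        PropertyResult(
            "mentions_grounded_facts",
            all(fact.lower() in lowered for fact in grounded_facts),
        ),
        # Nothing from the known-wrong or out-of-scope list appears.
        PropertyResult(
            "no_forbidden_content",
            not any(bad.lower() in lowered for bad in forbidden),
        ),
        PropertyResult(
            "not_a_refusal",
            not any(marker in lowered for marker in REFUSAL_MARKERS),
        ),
    ]


def failed(results: list[PropertyResult]) -> list[str]:
    """Names of the properties that did not hold, in check order."""
    return [r.name for r in results if not r.passed]
```

Two differently worded answers that both state the 30-day refund window pass the same checks. An answer that invents a lifetime warranty fails on named properties, which is a far more useful bug report than "string did not match".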

It covers both sides of AI in QA: evaluating AI products (hallucinations, refusals, scope, grounded facts), and using AI on the test side without fooling yourself — reviewing AI-written tests that pass for the wrong reasons, and knowing where AI genuinely saves time versus where it's a trap.

The throughline: AI changes what you assert and what you log, not whether you test. The judgement is still the job.
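On the "what you log" side, here is a minimal sketch of a repro record, assuming the fields the logging article in this series names: the full assembled prompt, retrieved context, model version, and parameters. The class name, field names, and the fingerprint idea are assumptions for illustration, not a prescribed format.

```python
from __future__ import annotations

import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass
class AiInteractionRecord:
    """One logged AI call: everything needed to replay it, plus what came back."""

    assembled_prompt: str          # the full prompt as sent, not just the user message
    retrieved_context: list[str]   # documents or chunks injected into the prompt
    model_version: str
    parameters: dict               # e.g. temperature, max tokens
    output: str

    def fingerprint(self) -> str:
        """Short stable hash of the inputs only, so runs with identical inputs group together."""
        inputs = {
            "assembled_prompt": self.assembled_prompt,
            "retrieved_context": self.retrieved_context,
            "model_version": self.model_version,
            "parameters": self.parameters,
        }
        canonical = json.dumps(inputs, sort_keys=True, ensure_ascii=False)
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]

    def to_json(self) -> str:
        """Serialise the record, fingerprint included, as one JSON line for a bug report."""
        payload = {**asdict(self), "fingerprint": self.fingerprint()}
        return json.dumps(payload, sort_keys=True, ensure_ascii=False)
```

Because the fingerprint ignores the output, two runs with the same inputs and different answers share it. That makes "same inputs, different behaviour" visible in the logs instead of buried in screenshots.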


// reading order

  1. Tutorials · 13 June 2026 · 9 min read

    How I evaluate an AI chatbot before release

    A practical evaluation pass for AI chat features: hallucinations, refusals, prompt injection, and the cases with no single right answer.

    ai-testing · llm · evaluation
  2. Deep dives · 13 June 2026 · 9 min read

    Prompt injection testing for QA engineers

    LLMs can't reliably separate instructions from data, so user input can hijack the model. Direct and indirect injection, what to check for, and how to report it QA-safe.

    ai-testing · security-testing · prompt-injection · llm
  3. Tutorials · 13 June 2026 · 8 min read

    What QA should log when testing AI features

    A screenshot isn't a repro when outputs vary. Capture the full assembled prompt, retrieved context, model version, and parameters so an AI bug is actually reproducible.

    ai-testing · observability · llm
  4. Tutorials · 13 June 2026 · 9 min read

    How to review AI-written Playwright tests

    AI writes plausible Playwright tests that pass for the wrong reasons. Here is the review checklist that catches them.

    ai-testing · playwright · review
  5. Tutorials · 30 December 2025 · 10 min read

    Using Claude and Copilot for test writing: a practical playbook

    The practical playbook for AI-assisted test writing in 2026. The prompts that work, the prompts that don't, and the human-in-the-loop checkpoints that keep AI from writing tests that pass for the wrong reasons.

    ai · claude · copilot · workflow
  6. Tutorials · 13 June 2026 · 9 min read

    The hallucination test cases I run on AI features

    Concrete test cases for AI hallucination — unanswerable questions, false premises, invented entities, citations — and how to judge answers with no 'correct' value.

    ai-testing · llm · hallucination · test-cases

Next series: Security testing for QA