How AI Is Changing QA — A Realistic View

"AI is going to replace QA" has been a vendor pitch for nearly a decade. The reality in 2026 is more interesting and more useful: AI has become a serious productivity multiplier for testers who already know what they are doing — and a poor substitute for those who don't. Understanding what has actually changed, and what hasn't, is the foundation for every other decision in this course.

What AI is genuinely good at today

These are the capabilities you can rely on right now, with appropriate review:

  • Generating test code from descriptions. Write a paragraph describing a flow and an LLM will produce a passable Playwright, Cypress, or Selenium test. With your existing patterns as context, the output is often 80% there.
  • Analysing logs and stack traces. Paste a 200-line stack trace and ask "what likely caused this" — the model groups the noise, names the probable culprit, and suggests where to look next.
  • Suggesting test cases. Given a feature description, AI enumerates edge cases — empty inputs, oversize inputs, Unicode, race conditions, expired tokens — faster than any single tester can think of them.
  • Self-healing selectors. Tools like Healenium watch a locator fail, find the most likely replacement element, and substitute it so the run can continue, reporting the healed locator so you can review it and update the test. This reduces the constant trickle of locator-rot maintenance.
  • Visual diffing. Visual AI services (Applitools, Percy) ignore meaningless pixel differences (anti-aliasing, sub-pixel rendering) and surface only the changes a human would care about.
  • Bug clustering. Group 200 incoming bug reports by likely root cause in seconds — work that used to consume a triage engineer for half a day.
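
To make the self-healing idea from the list above concrete, here is a minimal sketch of the general technique: remember what the element looked like on the last passing run, then pick the most similar element on the changed page. This is not Healenium's actual algorithm; the Element class, the similarity score, and the 0.6 threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Element:
    """Simplified snapshot of a DOM element (hypothetical structure)."""
    tag: str
    attrs: dict = field(default_factory=dict)
    text: str = ""


def similarity(old: Element, candidate: Element) -> float:
    """Share of the old element's recorded features the candidate still has (0 to 1)."""
    features = [("tag", old.tag), ("text", old.text)] + [
        (f"attr:{name}", value) for name, value in old.attrs.items()
    ]
    current = {"tag": candidate.tag, "text": candidate.text}
    current.update({f"attr:{name}": value for name, value in candidate.attrs.items()})
    matches = sum(1 for key, value in features if current.get(key) == value)
    return matches / len(features)


def heal(old: Element, page: list, threshold: float = 0.6) -> Optional[Element]:
    """Return the best-matching element, or None if nothing is similar enough."""
    if not page:
        return None
    best = max(page, key=lambda element: similarity(old, element))
    return best if similarity(old, best) >= threshold else None


# Snapshot of the "Sign in" button recorded on the last passing run.
known_good = Element(
    "button", {"id": "login-btn", "class": "btn primary", "type": "submit"}, "Sign in"
)

# The page after a refactor: the id changed, everything else stayed the same.
current_page = [
    Element("a", {"class": "link"}, "Forgot password?"),
    Element(
        "button", {"id": "signin-button", "class": "btn primary", "type": "submit"}, "Sign in"
    ),
    Element("button", {"id": "cancel", "class": "btn", "type": "button"}, "Cancel"),
]

healed = heal(known_good, current_page)
if healed is not None:
    # A real tool would report this so a human can confirm the new locator.
    print(f"Proposed replacement: #{healed.attrs['id']}")
```

The threshold is where the review burden lives: set it too low and the sketch will happily "heal" onto the wrong button, which is why healed locators deserve the same scrutiny as any other generated change.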

What AI is still bad at

Just as important as what AI can do is what it cannot:

  • Independently running an entire QA programme. It needs human direction at every meaningful decision.
  • Understanding business context. "Is this even worth testing?" is a judgement question that depends on customer impact, regulatory risk, and team priorities — none of which the model has direct access to.
  • Replacing exploratory testers. Humans notice things that don't fit a pattern: copy that reads weirdly, layouts that "feel off," interactions that violate platform conventions.
  • Maintaining stable production tests. AI-generated tests need the same review and refactoring as any other code — and arguably more, because they often work the first time and rot quietly afterwards.
  • Strategic decisions. What to test, what to skip, when to invest in automation, when to rip something out: these remain judgement calls.

QA work — before and after AI augmentation

Traditional QA

  • Manually authoring every test
  • Hours triaging CI failures
  • Boilerplate page objects, fixtures, data setup
  • Engineers spend most time on plumbing
  • Strategy work squeezed into spare moments

AI-augmented QA

  • AI scaffolds; humans review and refine
  • Triage tools cluster failures in seconds
  • Boilerplate generated from descriptions
  • Engineers spend more time on strategy and exploration
  • Risk modelling and design become the core craft
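
As a rough illustration of what "clustering failures" means mechanically, the sketch below groups CI failures by a normalised error signature. It is a rule-based stand-in, not how any particular AI triage tool works (those typically compare messages more flexibly), and the test names and error messages are invented for the example.

```python
import re
from collections import defaultdict


def signature(message: str) -> str:
    """Normalise a failure message so incidental details don't split groups."""
    sig = re.sub(r"0x[0-9a-fA-F]+", "<hex>", message)   # memory addresses
    sig = re.sub(r"\d+", "<n>", sig)                     # ids, timeouts, status codes
    sig = re.sub(r"'[^']*'|\"[^\"]*\"", "<str>", sig)    # quoted values
    return sig.strip().lower()


def cluster(failures: list) -> dict:
    """Map each signature to the list of test names that failed with it."""
    groups = defaultdict(list)
    for test_name, message in failures:
        groups[signature(message)].append(test_name)
    return dict(groups)


# Hypothetical failures from one CI run.
failures = [
    ("test_checkout_guest", "TimeoutError: locator '#pay-now' not visible after 30000ms"),
    ("test_checkout_saved_card", "TimeoutError: locator '#confirm' not visible after 15000ms"),
    ("test_profile_update", "AssertionError: expected status 200 but got 503"),
    ("test_search_empty", "AssertionError: expected status 200 but got 500"),
    ("test_login_sso", "ConnectionError: refused by host 'auth.internal'"),
]

clusters = cluster(failures)
for sig, tests in sorted(clusters.items(), key=lambda item: -len(item[1])):
    print(f"{len(tests)} x {sig}: {', '.join(tests)}")
```

Note the trade-off in the normalisation: replacing every number merges the 503 and 500 failures into one group, which may or may not share a root cause. Deciding whether that grouping is right is exactly the part that still needs a human.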

The realistic role of AI in QA

The honest framing: AI augments QA engineers; it does not replace them. It speeds up the boilerplate (page objects, fixtures, scaffolding), reduces the toil (triage, log analysis, flaky-test detection), and frees experienced engineers to spend more time where humans still win: strategy, exploration, and digging into complex bugs.

The current tooling landscape reflects this:

  • Big test automation vendors — Mabl, Testim, Functionize — have leaned heavily into AI features as their core differentiation.
  • Open-source tools have followed: Playwright now has an MCP server, Cypress has AI features, frameworks ship with prompt-driven generators.
  • GitHub Copilot has become the default AI coding assistant inside QA teams, alongside Cursor and similar AI-native IDEs.
  • "Agentic" testing (AI agents that autonomously plan, run, and triage tests) is emerging but not production-ready for most teams. Watch the space, but don't bet your release on it yet.

The skills shift for QA engineers

Your job is not disappearing; the shape of it is changing. The skills that matter more now:

  • Prompt engineering. Knowing how to phrase a request so the model produces useful output (covered in detail in Chapter 2).
  • Reviewing AI output. Spotting hallucinated APIs, unsafe assertions, missed edge cases — this is the highest-leverage QA skill of the next few years.
  • Curating context. Choosing the right examples, conventions, and codebase snippets to feed the model so its output fits your project.
  • Strategic test design. As coding becomes commoditised, deciding what to test, where the risk lives, and which test pyramid layer fits each scenario becomes more valuable, not less.

Practitioners who lean into these skills are the ones most likely to see their careers accelerate over the next few years. Those who treat AI as a threat, or as a magic wand, fall behind in different directions.

⚠️ Common Mistakes

  • Treating AI output as ground truth. Every generated test is a hypothesis until reviewed. Hallucinated APIs and silent assertion bugs are real and frequent.
  • Adopting AI to "cut headcount." The teams that succeed use AI to do more with the people they have, not fewer. Pitching AI as a cost-cutting measure to leadership tends to produce brittle results and demoralised engineers.
  • Chasing every new tool. The space moves fast. Pick a small number of tools, get good at them, re-evaluate quarterly — don't try to adopt everything announced this month.

🎯 Practice Task

15–20 minutes. No installation needed.

  1. Open ChatGPT, Claude, or Gemini.
  2. Pick a feature in a product you know well — a login form, a checkout, a search box.
  3. Ask the AI: "List 15 edge cases I should test for this feature." Read the output critically.
  4. Mark each suggestion as one of: (a) genuinely useful, (b) obvious, (c) doesn't apply, (d) wrong.
  5. Note what categories the AI missed — the gaps reveal where human judgement still wins.
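
If you would rather tally step 4 in code than on paper, a few lines of Python will do it. This is optional and only a sketch: the summarise helper is hypothetical, and the marks string is made-up sample input standing in for your own 15 labels.

```python
from collections import Counter

LABELS = {"a": "genuinely useful", "b": "obvious", "c": "doesn't apply", "d": "wrong"}


def summarise(marks: str) -> dict:
    """Count how many suggestions fell into each category from step 4."""
    counts = Counter(marks.lower().split())
    unknown = set(counts) - set(LABELS)
    if unknown:
        raise ValueError(f"Unknown marks: {sorted(unknown)}")
    return {LABELS[key]: counts.get(key, 0) for key in LABELS}


# Sample input only: one letter per suggestion, in the order the AI listed them.
marks = "a b a c a b d a b a c a b a d"

summary = summarise(marks)
for label, count in summary.items():
    print(f"{label}: {count}")
```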

The next lesson maps the broad categories of AI tools relevant to QA work, so you can pick the right category for each problem.
