Bug Reproduction: You Describe the Bug, the AI Reproduces It

A user reports "the cart is broken". Triage typically eats an hour: read the ticket, log into the right account, click around guessing what the user did, fail to reproduce it twice, finally find the trigger, then write up clean steps for the developer. Playwright MCP collapses that loop. Hand the assistant the user's description and it walks the flow autonomously, varies inputs to find the boundary, captures console errors and network failures, and emits a developer-ready bug report with steps and a Playwright trace attached. This lesson walks through the prompt structure, the variation patterns that find the real edge, and the report format that turns "reproduced" into "merged fix."

The time saving here is the most quantifiable in the course. Ten vague bug reports normally cost 10–15 hours of triage. With AI-driven reproduction, the same backlog takes 30–60 minutes. That arithmetic is why this is the first workflow most teams adopt after the install lesson.

A prompt that produces a triageable report

The user-reported bug: "When I have items in my cart and apply the discount code SAVE20, the total goes negative."

Your prompt:

A user reported this bug:
 
"When I have items in my cart and apply the discount code SAVE20, the total
goes negative."
 
Investigate on https://demo.myshop.com using the test account
demo@test.com / demo123. Report:
 
1. Whether you reproduced the bug.
2. The exact steps that triggered it (precise enough for a developer to follow).
3. The cart state at the moment of failure — items, quantities, subtotal, displayed
   total — captured from the page.
4. Any console errors or failed network requests during the failure.
5. Whether the bug is consistent across different cart shapes — try at least
   1 item, 5 items, 20 items, mixed price points, and one item with a sale price.
6. A Playwright trace file path for the successful reproduction.
 
Format the output as a Jira-ready bug report with:
- Title (one line)
- Description
- Steps to reproduce (numbered)
- Expected behaviour
- Actual behaviour
- Environment (deploy SHA from the footer, browser, viewport)
- Attachments (trace file path, screenshot file path)

The assistant runs the flow, varies the cart shape, captures the artefacts, and writes up the result. Five minutes from prompt to triageable ticket, and the variation step often surfaces the real edge case the user couldn't articulate ("only reproduces when the discount value exceeds the cart subtotal, regardless of item count").
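To make "console errors or failed network requests" concrete, here is a minimal sketch of the kind of capture involved. It attaches listeners for three page events and sorts what arrives into buckets. The event names (console, requestfailed, response) and the methods called on their payloads exist in Playwright, but PageLike, ConsoleMessageLike, RequestLike, and ResponseLike are simplified stand-in interfaces invented here so the block has no dependencies, and collectFailureSignals is a hypothetical helper, not something Playwright MCP exposes.

```typescript
// Simplified stand-ins (assumptions) for the Playwright objects used below.
// The real types carry many more members.
interface ConsoleMessageLike { type(): string; text(): string; }
interface RequestLike {
  method(): string;
  url(): string;
  failure(): { errorText: string } | null;
}
interface ResponseLike { status(): number; url(): string; }
interface PageLike { on(event: string, handler: (payload: any) => void): unknown; }

interface FailureSignals {
  consoleErrors: string[];   // console messages of type "error"
  failedRequests: string[];  // requests that never got a response
  errorResponses: string[];  // responses with status 400 or above
}

// Attach the listeners before driving the flow so nothing is missed.
function collectFailureSignals(page: PageLike): FailureSignals {
  const signals: FailureSignals = {
    consoleErrors: [],
    failedRequests: [],
    errorResponses: [],
  };

  page.on("console", (msg: ConsoleMessageLike) => {
    if (msg.type() === "error") signals.consoleErrors.push(msg.text());
  });

  page.on("requestfailed", (req: RequestLike) => {
    const reason = req.failure()?.errorText ?? "unknown";
    signals.failedRequests.push(`${req.method()} ${req.url()} (${reason})`);
  });

  page.on("response", (res: ResponseLike) => {
    if (res.status() >= 400) {
      signals.errorResponses.push(`${res.status()} ${res.url()}`);
    }
  });

  return signals;
}
```

This kind of capture has a limit: a response that returns 200 with a wrong body lands in none of the buckets, which is why the prompt also asks for the cart state as displayed on the page.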

The investigation loop

Step 1 of 6: Vague user report arrives. A free-text description, often missing steps, exact inputs, or environment. The classic "cart is broken" ticket.

The variation step is where this earns its keep

Anyone can re-run the steps from a user report. The genuinely valuable thing is narrowing the condition — "reproduces only when the discount exceeds the subtotal" is far more useful than "reproduces when SAVE20 is applied." The prompt above explicitly asks for variation across cart shapes; that is the single sentence in the prompt that turns a yes/no answer into a precise repro.
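If the same prompt gets reused across tickets, the variation list can be made a required parameter so it is never dropped. The sketch below is one possible way to do that; ReproRequest and buildReproPrompt are hypothetical names invented for illustration, and the wording of the generated prompt simply mirrors the prompts in this lesson.

```typescript
interface ReproRequest {
  userReport: string;    // the user's words, verbatim
  targetUrl: string;     // environment to investigate
  cartShapes: string[];  // variation grid, one entry per shape
}

// Build a reproduction prompt that always contains a variation grid.
function buildReproPrompt(req: ReproRequest): string {
  if (req.cartShapes.length === 0) {
    // Refuse to build a prompt that would only yield a yes/no answer.
    throw new Error("At least one cart shape is required");
  }
  const shapes = req.cartShapes.map((shape) => `- ${shape}`).join("\n");
  return [
    "A user reported this bug:",
    "",
    `"${req.userReport}"`,
    "",
    `Investigate on ${req.targetUrl}.`,
    "",
    "Try reproducing it with each of these cart shapes:",
    "",
    shapes,
    "",
    "For each shape, report whether the bug reproduces. Then propose a hypothesis",
    "for the trigger condition, citing which shapes confirm and which refute it.",
  ].join("\n");
}
```

Throwing on an empty grid is a deliberate choice in this sketch: it makes the missing variation step fail loudly at prompt-building time instead of showing up later as an imprecise ticket.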

A more aggressive variation prompt for stubborn bugs:

Try reproducing the SAVE20 bug with each of these cart shapes:
 
- 1 item at $5
- 1 item at $50
- 1 item at $200
- 5 items totalling $30
- 5 items totalling $300
- 1 item with a sale price (regular $100, sale $50)
- 1 in-stock + 1 out-of-stock item
 
For each shape, report whether the bug reproduces. Then propose a hypothesis
for the trigger condition, citing which shapes confirm and which refute it.

The output looks like:

- $5 item       → no repro (final total $4)
- $50 item      → no repro (final total $30)
- $200 item     → no repro (final total $180)
- 5×$6 = $30    → REPRODUCES (total -$10)
- 5×$60 = $300  → no repro (final total $260)
- $50 sale item → REPRODUCES (total -$10)
- mixed stock   → no repro (only in-stock item discounted)
 
Hypothesis: bug fires when the subtotal AFTER SAVE20 percentage discount drops
below the flat-discount fallback minimum. Two shapes reproduce the bug; the other five do not.

That hypothesis is what closes the bug. The developer doesn't have to guess where to add a console.log; the report already triangulated the trigger.
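A hypothesis of this kind can be checked mechanically against the grid before anyone trusts it. The sketch below treats a hypothesis as a predicate over a cart shape and compares its prediction with what each run showed. A shape confirms the hypothesis when prediction and observation agree in either direction, and refutes it otherwise. The Observation type, checkHypothesis, and the sample data are all invented for illustration; the numbers are not taken from a real run.

```typescript
interface Observation {
  shape: string;
  subtotal: number;     // dollars, before any discount
  reproduced: boolean;  // what the run actually showed
}

// A hypothesis predicts, per shape, whether the bug should reproduce.
type Hypothesis = (o: Observation) => boolean;

interface Verdict {
  confirms: string[];  // prediction matched the observation
  refutes: string[];   // prediction contradicted the observation
}

function checkHypothesis(observations: Observation[], predicts: Hypothesis): Verdict {
  const verdict: Verdict = { confirms: [], refutes: [] };
  for (const o of observations) {
    if (predicts(o) === o.reproduced) {
      verdict.confirms.push(o.shape);
    } else {
      verdict.refutes.push(o.shape);
    }
  }
  return verdict;
}

// Illustrative data only (assumption): four made-up runs.
const sampleObservations: Observation[] = [
  { shape: "1 item at $5", subtotal: 5, reproduced: true },
  { shape: "1 item at $8", subtotal: 8, reproduced: false },
  { shape: "1 item at $50", subtotal: 50, reproduced: false },
  { shape: "5 items totalling $30", subtotal: 30, reproduced: false },
];

// Hypothetical trigger: a flat $10 discount that exceeds the subtotal.
const flatDiscountExceedsSubtotal: Hypothesis = (o) => o.subtotal < 10;

const sampleVerdict = checkHypothesis(sampleObservations, flatDiscountExceedsSubtotal);
// sampleVerdict.refutes is ["1 item at $8"]: predicted a repro, observed none.
```

A single refuting shape is enough to send the hypothesis back for refinement, which is a cheaper discovery here than in the developer's debugger.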

A Jira-ready report shape

Asking for an explicit format is not optional. Without it, the assistant writes prose; with it, you get this:

Title: Cart total goes negative when SAVE20 reduces subtotal below flat-discount fallback
 
Description:
Applying coupon SAVE20 to a cart whose subtotal × 0.80 falls below the flat
$10 fallback discount produces a negative total. Likely the percentage and
flat discount paths both apply instead of one being a max() guard.
 
Steps to reproduce:
1. Log in as demo@test.com / demo123.
2. Add 5× "Basic socks" ($6 each) to cart. Subtotal: $30.
3. Apply coupon SAVE20.
4. Observe cart total: -$10.
 
Expected behaviour:
Total should be max(subtotal × 0.80, subtotal − $10) = max($24, $20) = $24, never negative.
 
Actual behaviour:
Total displays -$10. Checkout button is enabled. Network: POST /api/cart/coupon
returned 200 with body { discount: -40, total: -10 }.
 
Environment:
Deploy: 8a3c9f2 (footer)
Browser: Chromium 142
Viewport: 1280×720
 
Attachments:
- /tmp/playwright-traces/save20-negative-2026-05-08.zip
- /tmp/playwright-screenshots/save20-cart-state.png

That's the artefact you paste into the bug tracker. Total elapsed time, prompt to ticket: under ten minutes including the variation grid.
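If reports are post-processed before they reach the tracker, the same shape can be held as a typed object and rendered on demand. The following is a rough sketch under that assumption; BugReport and formatBugReport are hypothetical names, and the section headings are copied from the format requested in the prompt.

```typescript
interface BugReport {
  title: string;
  description: string;
  steps: string[];
  expected: string;
  actual: string;
  environment: Record<string, string>;  // e.g. Deploy, Browser, Viewport
  attachments: string[];                // file paths
}

// Render a report in the plain-text layout used in this lesson.
function formatBugReport(r: BugReport): string {
  const section = (heading: string, body: string): string => `${heading}:\n${body}`;
  const steps = r.steps.map((step, i) => `${i + 1}. ${step}`).join("\n");
  const environment = Object.keys(r.environment)
    .map((key) => `${key}: ${r.environment[key]}`)
    .join("\n");
  const attachments = r.attachments.map((path) => `- ${path}`).join("\n");

  return [
    `Title: ${r.title}`,
    section("Description", r.description),
    section("Steps to reproduce", steps),
    section("Expected behaviour", r.expected),
    section("Actual behaviour", r.actual),
    section("Environment", environment),
    section("Attachments", attachments),
  ].join("\n\n");
}
```

Keeping steps and attachments as arrays, not free text, makes it straightforward to check later that neither is empty.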

Where this fits with bug-reporting fundamentals

Everything in the bug-reporting lessons of the Manual Software Testing course still applies — severity vs priority, reproducibility, what makes a report actionable. The AI doesn't change what a good bug report contains; it just shortens the path to producing one. Use the existing rubric to grade the AI's output before filing.

Reproductions that don't reproduce

Two outcomes are real findings in their own right:

- Cannot reproduce on staging, can reproduce on production. Suggests environment-specific data or a deploy-skew bug. Capture the staging vs production cart shapes side by side; that often points at a missing migration or a feature flag.
- Cannot reproduce at all. The user might be on an old browser, an outdated build (for example, a cached PWA), or a corner of the data the test account doesn't share. Document what was tried, ask the user for one specific extra detail (browser version, account email if appropriate), and try again.

A "could not reproduce" report with a clear "here's what was tried" list is far more useful than silence.

⚠️ Common mistakes

- Skipping the variation step. Reproducing the user's exact steps is good; finding the boundary is what closes the bug. The prompt difference is one sentence ("vary across cart sizes and price points") and the value difference is hours of developer guesswork.
- Trusting the AI's hypothesis without checking the artefacts. The triangulated trigger condition is a hypothesis, not a verdict. Open the trace, confirm the steps, sanity-check the network response. Most are right; a few are confidently wrong.
- Filing the bug without the trace attached. A repro with steps is helpful; a repro with a Playwright trace is complete. The developer can scrub the timeline, watch the network calls, and see the DOM state at each moment without re-doing the discovery work. Always have the agent save the trace and reference its path in the ticket.
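These three mistakes can be turned into a short pre-filing check. The sketch below is one possible shape for it; TicketDraft and filingProblems are hypothetical names, the threshold of two cart shapes is an arbitrary assumption, and the trace check relies only on the fact that Playwright traces are saved as .zip archives.

```typescript
interface TicketDraft {
  steps: string[];
  shapesTried: number;          // how many cart shapes the run covered
  attachments: string[];        // file paths referenced in the ticket
  hypothesisVerified: boolean;  // set by a human after opening the trace
}

// Return a list of reasons not to file yet; an empty list means ready.
function filingProblems(draft: TicketDraft): string[] {
  const problems: string[] = [];
  if (draft.steps.length === 0) {
    problems.push("no steps to reproduce");
  }
  if (draft.shapesTried < 2) {
    problems.push("variation step skipped");
  }
  const hasTrace = draft.attachments.some((path) => path.toLowerCase().endsWith(".zip"));
  if (!hasTrace) {
    problems.push("no trace archive attached");
  }
  if (!draft.hypothesisVerified) {
    problems.push("hypothesis not checked against the trace");
  }
  return problems;
}
```

The hypothesisVerified flag is deliberately something only a person sets: the check cannot confirm the trigger condition itself, it can only refuse to let an unverified one through.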

🎯 Practice task

Triage three real bug reports with AI assistance. 45 minutes.

1. Open your bug tracker. Pick three vague reports that have been sitting in "cannot reproduce" or "needs investigation" for more than a week, the kind nobody wants to claim.
2. For each, run the prompt template above. Include the variation grid. Be explicit about the artefact format you want back.
3. Read each output critically. For findings the assistant marked "reproduced," open the trace and confirm. For findings it marked "could not reproduce," try one targeted variation yourself before believing it.
4. File or update each ticket with the AI-produced report. Include the trace path. If the bug reproduces, attach a one-line proposed fix area for the developer ("likely in src/cart/coupon.ts"); the AI's hypothesis often points at the right file.
5. Track time elapsed against the time you'd have spent doing this manually. The first time you do it, write that delta down somewhere visible; that number is the case for adopting this workflow team-wide.
Stretch: for one of the reproduced bugs, ask the assistant to "emit a failing Playwright test that reproduces this bug deterministically." That test goes into the suite as a regression check; the bug-closing PR makes it pass. This is the loop that prevents the same bug from coming back.
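The core assertion in such a regression test is small: read the displayed total and require that it is not negative. The sketch below shows only that part, with no browser involved. parseDisplayedTotal and assertTotalNotNegative are hypothetical helpers; in a real Playwright test the input string would come from the page (for example via a locator's textContent()), and the selector for the total is something the generated test would have to supply.

```typescript
// Parse a displayed amount such as "$1,234.50", "-$10", "$-10", or
// "\u2212$10.00" (Unicode minus) into dollars. Returns NaN if the text
// is not a plain dollar amount.
function parseDisplayedTotal(text: string): number {
  const cleaned = text.replace(/\u2212/g, "-").replace(/[,\s]/g, "");
  const match = cleaned.match(/^(-?)\$?(-?)(\d+(?:\.\d+)?)$/);
  if (!match) return NaN;
  const negative = match[1] === "-" || match[2] === "-";
  const amount = Number(match[3]);
  return negative ? -amount : amount;
}

// Throw if the total is unreadable or negative; otherwise do nothing.
function assertTotalNotNegative(text: string): void {
  const total = parseDisplayedTotal(text);
  if (Number.isNaN(total)) {
    throw new Error(`Could not parse total from "${text}"`);
  }
  if (total < 0) {
    throw new Error(`Cart total is negative: ${text}`);
  }
}
```

Treating an unparseable total as a failure, not a pass, keeps the regression check from going quietly green if the page markup changes.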

The next lesson moves to a different kind of execution: visual verification — using vision mode to catch the layout and design regressions that snapshot mode is blind to.