Q36 of 38 · Manual & exploratory
How do you decide when to stop testing and ship?
Short answer
When all critical-path tests pass, no S1/S2 bugs remain open, the risk and impact of the remaining issues are acceptable to stakeholders, and the pre-agreed exit criteria are met. 'All bugs fixed' is never the bar; the bar is 'remaining risk is acceptable to ship.'
Detail
"When can we ship?" is a business decision dressed as a test decision. The tester's job is to make the risk visible; the call belongs to the team — usually the product manager or release manager — with input from QA, engineering, and sometimes legal/security.
The exit-criteria framework:
- Critical paths green. All P0/P1 user journeys (login, checkout, the headline feature) pass automated regression and exploratory smoke testing on at least the supported browsers. Non-negotiable.
- No S1 (blocker) or S2 (high-impact) defects open. S3/S4 defects are acceptable but documented. The list of open defects with severities is the formal "risk ledger" that release sign-off references.
- Pre-agreed coverage reached. "We agreed the regression suite covers cart, payment, and profile; all three are covered."
- Non-functional gates passed. Performance has not regressed beyond the agreed tolerance, there are no new accessibility violations at the agreed acceptance level, and there are no new high-severity security findings.
- Signal from production-like data. Smoke tests pass against staging with production-shape data, not just synthetic test data.
- Stakeholders informed. PM, engineering lead, and on-call know what's shipping, what's not, what's deferred, and what to watch for.
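The checklist above can be sketched as a small release-gate check. This is a minimal illustration, not a standard tool: the names `Defect`, `ReleaseCandidate`, and `evaluate_exit_criteria`, the boolean fields, and the bug keys are all hypothetical, and the severity scale assumes 1 = S1 through 4 = S4. It reports which criteria are unmet and builds the risk ledger of open S3/S4 defects; it does not make the ship decision.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Defect:
    key: str        # hypothetical tracker ID, e.g. "BUG-101"
    severity: int   # assumed scale: 1 = S1 (blocker) ... 4 = S4 (cosmetic)
    summary: str


@dataclass
class ReleaseCandidate:
    # Each flag mirrors one bullet of the exit-criteria list (names are illustrative).
    critical_paths_green: bool
    agreed_coverage_reached: bool
    nonfunctional_gates_passed: bool
    staging_smoke_passed: bool
    stakeholders_informed: bool
    open_defects: List[Defect] = field(default_factory=list)


def evaluate_exit_criteria(rc: ReleaseCandidate) -> Tuple[List[str], List[Defect]]:
    """Return (unmet criteria, risk ledger of open S3/S4 defects)."""
    unmet: List[str] = []
    if not rc.critical_paths_green:
        unmet.append("critical paths green")

    # S1/S2 defects block the release; S3/S4 go into the ledger instead.
    blockers = [d for d in rc.open_defects if d.severity <= 2]
    if blockers:
        keys = ", ".join(d.key for d in blockers)
        unmet.append(f"no S1/S2 defects open (still open: {keys})")

    if not rc.agreed_coverage_reached:
        unmet.append("pre-agreed coverage reached")
    if not rc.nonfunctional_gates_passed:
        unmet.append("non-functional gates passed")
    if not rc.staging_smoke_passed:
        unmet.append("smoke tests pass on production-shape data")
    if not rc.stakeholders_informed:
        unmet.append("stakeholders informed")

    ledger = sorted(
        (d for d in rc.open_defects if d.severity >= 3),
        key=lambda d: d.severity,
    )
    return unmet, ledger


candidate = ReleaseCandidate(
    critical_paths_green=True,
    agreed_coverage_reached=True,
    nonfunctional_gates_passed=True,
    staging_smoke_passed=True,
    stakeholders_informed=True,
    open_defects=[
        Defect("BUG-101", 2, "Payment retry double-charges"),
        Defect("BUG-102", 4, "Misaligned footer icon"),
    ],
)
unmet, ledger = evaluate_exit_criteria(candidate)
print(unmet)                    # ['no S1/S2 defects open (still open: BUG-101)']
print([d.key for d in ledger])  # ['BUG-102']
```

An empty `unmet` list only means the agreed criteria are satisfied. The ledger still goes to the PM or release manager, who decides whether the remaining risk is acceptable.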
Where senior judgement comes in: trade-offs are explicit ("we're shipping with a known cosmetic bug; agreed with PM Tuesday because the public release calendar is harder to move than the bug is to fix later"), the risk ledger is shared, and a post-release plan exists (monitoring in place, rollback plan tested, on-call briefed). "Shipping" is the start of validation, not the end.
What junior testers get wrong: treating "stop testing" as a test-team decision rather than a stakeholder one; and saying "we found no bugs" as if it were an exit criterion. It isn't: it's a measurement gap, because finding nothing may only mean the tests did not look where the bugs are.