Q31 of 42 · Playwright
Walk through how you'd debug a Playwright test that flakes only in CI.
Short answer
Download the trace from the failed CI run and open it with `show-trace`; the DOM snapshots and network log usually reveal the actual failure. Match the local environment to CI (Docker image, viewport, locale). Common causes: timing differences, missing fonts, viewport-dependent layout, slow third parties. Increase observability before increasing retries.
Detail
Playwright's trace viewer makes CI-only flakes dramatically easier to debug than in most other frameworks. The systematic approach:
1. Get the trace. Configure CI to upload test-results/ as artifacts on failure. Download the relevant trace zip locally.
npx playwright show-trace test-results/login-test-chromium/trace.zip
The trace shows everything: timeline, DOM at each step, network log, screenshots, console. Most CI-only failures are diagnosable from the trace alone.
2. Reproduce the environment. Run locally in the exact CI Docker image:
docker run -it -v $(pwd):/work -w /work \
mcr.microsoft.com/playwright:v1.45.0-jammy \
npx playwright test login.spec.ts
If the flake reproduces in the container, the cause is environmental (font, locale, OS, Playwright/browser version). If it doesn't, look for timing or load patterns.
3. Common CI-only causes:
- Timing: CI runners are usually slower than dev laptops; auto-wait still works, but assertions hit their timeout sooner. Bump the per-test timeout where justified, or wait for a specific event (`waitForResponse`) instead of relying on Locator auto-wait alone.
- Missing system fonts: text wraps differently, layout shifts, screenshots diverge. Install fonts in the Docker image.
- Locale / time zone: CI often runs in UTC while a dev machine uses its local time zone. Pin the `TZ` and `LANG` env vars.
- Viewport: unless configured, Playwright uses 1280×720, which may be smaller than the window the layout was developed against, so responsive breakpoints behave differently. Pin `viewport` in config.
- Network speed / third-party services: a real Stripe / Auth0 call that's slow in CI. Stub at the network layer.
- Concurrent test pressure: shared backend state from parallel workers. Ensure each worker has isolated test data.
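The environment pins from the list above could be sketched as a `playwright.config.ts` fragment. The concrete values (`en-US`, `UTC`, 1280×720) are assumptions for illustration; use whatever CI actually runs with. Note that `locale` and `timezoneId` only affect the browser context, so `TZ` and `LANG` still need to be set in the CI environment for the Node test process itself.

```typescript
import { defineConfig } from '@playwright/test';

// Fragment to merge into an existing config; values are illustrative assumptions.
export default defineConfig({
  use: {
    // Same viewport locally and in CI, so responsive breakpoints match.
    viewport: { width: 1280, height: 720 },
    // Browser-side locale and time zone (affects Intl, Date, Accept-Language).
    locale: 'en-US',
    timezoneId: 'UTC',
  },
});
```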
4. Increase observability first. Before adding retries, add:
- `trace: 'on'` (captures a trace on every run and retry, not just the first retry).
- `video: 'retain-on-failure'` to see the page state around the failure.
- Custom `console.log` markers in the test.
- A pre-failure dump of `page.content()` for HTML diffing.
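The last two items can be combined into one small helper. This is a minimal sketch, not a Playwright API: `dumpHtml`, `dumpFileName`, the `HtmlSource` type, and the default directory are all hypothetical names. The directory is chosen on the assumption that dumps should travel with the uploaded test-results/ artifacts.

```typescript
import { mkdir, writeFile } from 'node:fs/promises';
import { join } from 'node:path';

// Structural subset of Playwright's Page: only content() is needed here.
interface HtmlSource {
  content(): Promise<string>;
}

// Turn a free-form label into a safe file name,
// e.g. "Before Submit!" becomes "before-submit.html".
function dumpFileName(label: string): string {
  const slug = label
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-')
    .replace(/^-+|-+$/g, '');
  return `${slug || 'dump'}.html`;
}

// Log a marker and write the current DOM to disk so the CI and local
// HTML can be diffed later. Returns the path of the written file.
async function dumpHtml(
  page: HtmlSource,
  label: string,
  dir = 'test-results/html-dumps', // hypothetical location
): Promise<string> {
  console.log(`[marker] ${label}`);
  await mkdir(dir, { recursive: true });
  const file = join(dir, dumpFileName(label));
  await writeFile(file, await page.content(), 'utf8');
  return file;
}
```

In a spec, calling `await dumpHtml(page, 'before submit')` just before the flaky step leaves both a console marker and an HTML file to compare against a local run.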
5. Then consider retries. retries: 2 in CI is reasonable; chronic 3+ retries is a code smell.
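Steps 4 and 5 might look roughly like this in `playwright.config.ts`. It is a sketch that assumes the CI provider sets a `CI` environment variable, as most do. `trace: 'on'` adds overhead on every test, so it is probably best treated as a temporary setting while chasing the flake.

```typescript
import { defineConfig } from '@playwright/test';

// Fragment to merge into an existing config.
export default defineConfig({
  // Retry only in CI; locally a flaky test should fail loudly.
  retries: process.env.CI ? 2 : 0,
  use: {
    // Record a trace for every run, including each retry.
    trace: 'on',
    // Keep videos only for tests that failed.
    video: 'retain-on-failure',
  },
});
```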
The senior signal: trace-first mentality, environmental matching, observability over retries.