A test that prints expected [] but got [{...}, {...}, {...}] is technically correct and practically useless — nobody reading CI logs at 3am will know what to fix. The previous lesson showed how to catch a11y violations; this one is about how to report them in a form your dev team will actually read and act on. Console logs, attachments to the Playwright HTML report, JSON files for downstream dashboards, summary reports across multiple pages — every team's exact reporting setup is different, but the building blocks are the same. By the end you'll know how to turn raw axe results into the reports that turn a11y debt into a tracked, prioritised backlog.
Step one — readable console output
The simplest improvement over expect(violations).toEqual([]) is logging what failed in human terms:
import { test, expect } from "@playwright/test";
import AxeBuilder from "@axe-core/playwright";
test("a11y — homepage", async ({ page }) => {
await page.goto("/");
const results = await new AxeBuilder({ page }).analyze();
if (results.violations.length > 0) {
console.log(`\n=== Accessibility violations (${results.violations.length}) ===`);
results.violations.forEach(v => {
console.log(`\n[${v.impact?.toUpperCase()}] ${v.id}: ${v.description}`);
console.log(` Help: ${v.help}`);
console.log(` Docs: ${v.helpUrl}`);
console.log(` Affected: ${v.nodes.length} element(s)`);
v.nodes.slice(0, 3).forEach(n => {
console.log(` - ${n.target.join(" ")} → ${n.html.slice(0, 100)}`);
});
});
}
expect(results.violations).toEqual([]);
});
Now a failing run prints something like:
=== Accessibility violations (3) ===
[CRITICAL] aria-required-attr: Required ARIA attributes must be provided
Help: Required ARIA attributes must be provided
Docs: https://dequeuniversity.com/rules/axe/4.7/aria-required-attr
Affected: 1 element(s)
- #search-combobox → <div id="search-combobox" role="combobox">
[SERIOUS] color-contrast: Elements must meet minimum colour contrast
Affected: 4 element(s)
- .promo-banner span → <span class="promo-text">Save 20%!</span>
...
A developer reading this in CI logs has the rule ID, the impact, the help URL, and the exact element. They can fix it without opening the HTML report or running the test locally.
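The logging loop above can be pulled into a reusable formatter so that every test prints the same layout. What follows is a minimal sketch, not part of @axe-core/playwright: `formatViolations`, its `maxNodes` parameter, and the simplified `AxeViolation` type are names assumed for illustration, and the sample data is hand-written.

```typescript
// Simplified shape of the axe result fields used here (an assumption for
// this sketch, not the library's full type definitions).
interface AxeNode {
  target: unknown[];
  html: string;
}

interface AxeViolation {
  id: string;
  impact?: string | null;
  description: string;
  help: string;
  helpUrl: string;
  nodes: AxeNode[];
}

// Hypothetical helper: turns a list of violations into one printable string.
function formatViolations(violations: AxeViolation[], maxNodes = 3): string {
  if (violations.length === 0) return "No accessibility violations found.";

  const lines: string[] = [`=== Accessibility violations (${violations.length}) ===`];
  for (const v of violations) {
    // impact can be missing, so fall back to a visible placeholder
    const impact = (v.impact ?? "unknown").toUpperCase();
    lines.push("", `[${impact}] ${v.id}: ${v.description}`);
    lines.push(`  Help: ${v.help}`);
    lines.push(`  Docs: ${v.helpUrl}`);
    lines.push(`  Affected: ${v.nodes.length} element(s)`);
    for (const n of v.nodes.slice(0, maxNodes)) {
      lines.push(`    - ${n.target.join(" ")} → ${n.html.slice(0, 100)}`);
    }
    // Say how many elements were left out instead of dropping them silently
    const hidden = v.nodes.length - maxNodes;
    if (hidden > 0) lines.push(`    ... and ${hidden} more`);
  }
  return lines.join("\n");
}

// Hand-written sample data, for illustration only
const sample: AxeViolation[] = [
  {
    id: "aria-required-attr",
    impact: "critical",
    description: "Sample description text",
    help: "Required ARIA attributes must be provided",
    helpUrl: "https://dequeuniversity.com/rules/axe/4.7/aria-required-attr",
    nodes: [{ target: ["#search-combobox"], html: '<div id="search-combobox" role="combobox">' }],
  },
];

console.log(formatViolations(sample));
```

In a test, the same helper could be called as `console.log(formatViolations(results.violations))` just before the assertion, assuming the real result objects are compatible with the simplified type above.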
Step two — attaching reports to the HTML reporter
testInfo.attach() adds arbitrary data to a test result. Use it to attach the full axe JSON:
test("a11y — homepage", async ({ page }, testInfo) => {
await page.goto("/");
const results = await new AxeBuilder({ page }).analyze();
await testInfo.attach("accessibility-scan-results", {
body: JSON.stringify(results, null, 2),
contentType: "application/json"
});
expect(results.violations).toEqual([]);
});
After the run, open the HTML report (npx playwright show-report). Click the failing test and the attachment appears in the test's details panel. Open the JSON file to see the full violation details, the affected elements, the inapplicable rules, and the passes (rules that did pass): everything axe knows.
You can attach multiple things:
await testInfo.attach("axe-results.json", {
body: JSON.stringify(results, null, 2),
contentType: "application/json"
});
await testInfo.attach("axe-violations.txt", {
body: results.violations
.map(v => `[${v.impact}] ${v.id}: ${v.help}`)
.join("\n"),
contentType: "text/plain"
});
await testInfo.attach("page-screenshot.png", {
body: await page.screenshot(),
contentType: "image/png"
});
A failing test now has the violation list, the full JSON, and a screenshot of the page state: three artefacts that together show exactly what was broken and when.
Step three — file-system reports for tooling
Sometimes you want the JSON on disk for downstream tools (dashboards, Slack bots, CI summaries):
import { writeFileSync, mkdirSync } from "fs";
import { join } from "path";
test("a11y — homepage", async ({ page }) => {
await page.goto("/");
const results = await new AxeBuilder({ page }).analyze();
mkdirSync("reports/a11y", { recursive: true });
writeFileSync(
join("reports/a11y", "homepage.json"),
JSON.stringify(results, null, 2)
);
expect(results.violations).toEqual([]);
});
A CI job after the test run can sweep the reports/a11y/ folder, aggregate the data, and post a daily summary to Slack:
Today's a11y scan: 5 pages scanned, 3 critical, 12 serious.
Down from yesterday: -2 critical (good!).
Top offender: /checkout (4 critical violations).
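A summary like the one above could be built from the saved JSON with a small aggregation step. The sketch below is one possible shape for it, under stated assumptions: `PageReport`, `countByImpact`, and `buildDailySummary` are hypothetical names, each report is assumed to already be loaded from disk and labelled with its page path, and counts are per rule rather than per affected element. The "down from yesterday" line is left out because it needs a second day's data.

```typescript
// Simplified, assumed shapes: only the fields this sketch needs.
interface ViolationSummary {
  id: string;
  impact?: string | null;
}

interface PageReport {
  page: string; // e.g. "/checkout" (a label chosen when the report is saved)
  violations: ViolationSummary[];
}

// Counts violated rules (not affected elements) at one impact level.
function countByImpact(report: PageReport, impact: string): number {
  return report.violations.filter(v => v.impact === impact).length;
}

// Hypothetical helper that builds the text of the daily message.
function buildDailySummary(reports: PageReport[]): string {
  const critical = reports.reduce((sum, r) => sum + countByImpact(r, "critical"), 0);
  const serious = reports.reduce((sum, r) => sum + countByImpact(r, "serious"), 0);
  const lines = [
    `Today's a11y scan: ${reports.length} pages scanned, ${critical} critical, ${serious} serious.`,
  ];

  // Top offender: the page with the most critical violations
  const ranked = [...reports].sort(
    (a, b) => countByImpact(b, "critical") - countByImpact(a, "critical")
  );
  const top = ranked[0];
  if (top && countByImpact(top, "critical") > 0) {
    lines.push(
      `Top offender: ${top.page} (${countByImpact(top, "critical")} critical violations).`
    );
  }
  return lines.join("\n");
}

// Hand-written sample data, for illustration only
const sampleReports: PageReport[] = [
  { page: "/", violations: [{ id: "color-contrast", impact: "serious" }] },
  {
    page: "/checkout",
    violations: [
      { id: "aria-required-attr", impact: "critical" },
      { id: "label", impact: "critical" },
      { id: "color-contrast", impact: "serious" },
    ],
  },
];

console.log(buildDailySummary(sampleReports));
```

Posting the resulting string to Slack is deliberately not shown, since that depends on how your workspace and CI are set up.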
Step four — a multi-page audit suite
For most teams, a11y testing isn't one test — it's a suite that scans every key page in the app. Generate them programmatically:
import { test, expect } from "@playwright/test";
import AxeBuilder from "@axe-core/playwright";
const pages = [
{ path: "/", name: "homepage" },
{ path: "/products", name: "product-listing" },
{ path: "/products/wireless-headphones", name: "product-detail" },
{ path: "/cart", name: "cart" },
{ path: "/checkout", name: "checkout" },
{ path: "/account", name: "account" },
{ path: "/help", name: "help-center" }
];
for (const { path, name } of pages) {
test(`a11y — ${name}`, async ({ page }, testInfo) => {
await page.goto(path);
const results = await new AxeBuilder({ page })
.withTags(["wcag2a", "wcag2aa", "wcag21a", "wcag21aa"])
.analyze();
await testInfo.attach(`${name}-axe-results.json`, {
body: JSON.stringify(results, null, 2),
contentType: "application/json"
});
const blocking = results.violations.filter(
v => v.impact === "critical" || v.impact === "serious"
);
expect(blocking, `Blocking violations on ${name}: ${blocking.map(v => v.id).join(", ")}`).toEqual([]);
});
}
Seven pages, seven independent tests. The HTML report lists them individually. If, say, five pass and two fail, click into the failures to see the attached JSON and the assertion message naming the rule IDs that caused the failure. This format works well for a weekly a11y review.
The reporting pipeline
CI integration — graduated severity
Most teams don't have the luxury of "fail on every violation from day one." A graduated approach:
test("a11y gate — critical only (always fail)", async ({ page }) => {
await page.goto("/");
const results = await new AxeBuilder({ page }).analyze();
const critical = results.violations.filter(v => v.impact === "critical");
expect(critical).toEqual([]);
});
test("a11y warn — serious (track, but don't fail yet)", async ({ page }, testInfo) => {
await page.goto("/");
const results = await new AxeBuilder({ page }).analyze();
const serious = results.violations.filter(v => v.impact === "serious");
if (serious.length > 0) {
await testInfo.attach("serious-violations.json", {
body: JSON.stringify(serious, null, 2),
contentType: "application/json"
});
console.warn(`${serious.length} serious violations — please fix soon`);
}
// Note: no expect — this test passes regardless
});
The first test gates CI; the second tracks debt. As the team fixes serious issues, the second test's count drops to zero, and you can then delete it or promote it to a real assertion. A new warn-only test for moderate violations takes its place. The bar moves up over time without ever blocking shipping on issues nobody on the team has bandwidth for yet.
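The gate level itself can be made a single setting, so that raising the bar is a one-word change instead of a new test. The sketch below is one way to do that, not an axe or Playwright feature: `splitByGate` and `IMPACT_ORDER` are hypothetical names, and treating violations without an impact as tracked-only is a choice made for this example.

```typescript
// The four impact levels axe reports, ordered from least to most severe.
type Impact = "minor" | "moderate" | "serious" | "critical";
const IMPACT_ORDER: Impact[] = ["minor", "moderate", "serious", "critical"];

// Simplified, assumed shape: only the fields this sketch needs.
interface GatedViolation {
  id: string;
  impact?: Impact | null;
}

// Hypothetical helper: violations at or above the gate block the build,
// everything below it is only tracked.
function splitByGate(
  violations: GatedViolation[],
  gate: Impact
): { blocking: GatedViolation[]; tracked: GatedViolation[] } {
  const threshold = IMPACT_ORDER.indexOf(gate);
  const blocking: GatedViolation[] = [];
  const tracked: GatedViolation[] = [];
  for (const v of violations) {
    // A missing impact gets rank -1, so it is tracked and never blocks
    const rank = v.impact ? IMPACT_ORDER.indexOf(v.impact) : -1;
    if (rank >= threshold) {
      blocking.push(v);
    } else {
      tracked.push(v);
    }
  }
  return { blocking, tracked };
}

// Hand-written sample data, for illustration only
const found: GatedViolation[] = [
  { id: "aria-required-attr", impact: "critical" },
  { id: "color-contrast", impact: "serious" },
  { id: "region", impact: "moderate" },
];

const gated = splitByGate(found, "critical");
console.log("blocking:", gated.blocking.map(v => v.id).join(", "));
console.log("tracked:", gated.tracked.map(v => v.id).join(", "));
```

In a test, the `blocking` list would feed the `expect(...)` assertion and the `tracked` list would be attached or logged, which collapses the two tests above into one scan per page.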
Comparing snapshots over time
Once you have JSON files in reports/a11y/, comparing yesterday's results to today's becomes possible. A simple Node script:
// scripts/a11y-diff.ts
import { existsSync, readFileSync, readdirSync } from "fs";
import { join } from "path";

const todayDir = "reports/a11y/today";
const yesterdayDir = "reports/a11y/yesterday";

for (const file of readdirSync(todayDir)) {
  const t = JSON.parse(readFileSync(join(todayDir, file), "utf-8"));
  // A page scanned for the first time has no baseline,
  // so every violation on it counts as new.
  const yPath = join(yesterdayDir, file);
  const y = existsSync(yPath)
    ? JSON.parse(readFileSync(yPath, "utf-8"))
    : { violations: [] };
  // A violation is "new" if its rule ID did not fail yesterday
  const newViolations = t.violations.filter(
    (v: any) => !y.violations.some((yv: any) => yv.id === v.id)
  );
  if (newViolations.length) {
    console.log(`NEW violations on ${file}:`);
    newViolations.forEach((v: any) => console.log(`  [${v.impact}] ${v.id}`));
  }
}
Run this nightly and post the diff to Slack. New a11y debt is visible the moment it lands, which is much cheaper than catching it months later in an annual audit.
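The script above compares rule IDs only, so a new element that fails a rule which was already failing yesterday goes unreported. A finer-grained comparison could key each finding on the rule and the element together. The sketch below assumes selectors are stable between runs, which is not guaranteed on pages with generated class names; `violationKeys` and `diffViolations` are hypothetical names.

```typescript
// Simplified, assumed shapes: only the fields this sketch needs.
interface DiffNode {
  target: unknown[];
}

interface DiffRule {
  id: string;
  nodes: DiffNode[];
}

// Builds one key per (rule, element) pair, e.g. "color-contrast @ .promo span".
function violationKeys(violations: DiffRule[]): Set<string> {
  const keys = new Set<string>();
  for (const v of violations) {
    for (const n of v.nodes) {
      keys.add(`${v.id} @ ${n.target.join(" ")}`);
    }
  }
  return keys;
}

// Hypothetical helper: which findings appeared, and which went away.
function diffViolations(
  yesterday: DiffRule[],
  today: DiffRule[]
): { added: string[]; fixed: string[] } {
  const before = violationKeys(yesterday);
  const after = violationKeys(today);
  return {
    added: Array.from(after).filter(k => !before.has(k)),
    fixed: Array.from(before).filter(k => !after.has(k)),
  };
}

// Hand-written sample data, for illustration only
const yesterdayViolations: DiffRule[] = [
  { id: "color-contrast", nodes: [{ target: [".promo-banner span"] }, { target: ["#footer a"] }] },
];
const todayViolations: DiffRule[] = [
  { id: "color-contrast", nodes: [{ target: [".promo-banner span"] }, { target: [".new-badge"] }] },
];

const delta = diffViolations(yesterdayViolations, todayViolations);
console.log("added:", delta.added);
console.log("fixed:", delta.fixed);
```

Here the rule ID is unchanged between the two days, yet the diff still reports the new `.new-badge` element and credits the fix for `#footer a`.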
Third-party dashboards
For larger teams, axe-core results can feed into dedicated reporting tools, some commercial and some open source:
- Deque axe DevTools — converts JSON results into a centralised dashboard with trends, owner assignment, and remediation guidance.
- Allure / Monocart reporters — Playwright reporters that aggregate axe attachments into a richer report than the default HTML.
- Custom dashboards — many teams pipe axe JSON into BigQuery / Looker for company-wide a11y metrics.
The pattern is always: produce JSON in tests → ship it somewhere → derive insight there. The Playwright side is the same testInfo.attach + writeFileSync we've covered.
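The "ship it somewhere" step usually wants flat rows rather than nested JSON. The sketch below shows one possible flattening into newline-delimited JSON, a format many warehouses can load. The row layout and the names `toRows` and `toNdjson` are assumptions for illustration, and the `url` and `timestamp` fields are assumed to be present at the top level of the saved results.

```typescript
// Simplified, assumed shapes: only the fields this sketch needs.
interface ScanNode {
  target: unknown[];
}

interface ScanViolation {
  id: string;
  impact?: string | null;
  helpUrl: string;
  nodes: ScanNode[];
}

interface ScanResults {
  url: string;
  timestamp: string;
  violations: ScanViolation[];
}

// One row per affected element (a layout chosen for this sketch).
interface ViolationRow {
  url: string;
  scannedAt: string;
  ruleId: string;
  impact: string;
  target: string;
  helpUrl: string;
}

function toRows(results: ScanResults): ViolationRow[] {
  const rows: ViolationRow[] = [];
  for (const v of results.violations) {
    for (const n of v.nodes) {
      rows.push({
        url: results.url,
        scannedAt: results.timestamp,
        ruleId: v.id,
        impact: v.impact ?? "unknown",
        target: n.target.join(" "),
        helpUrl: v.helpUrl,
      });
    }
  }
  return rows;
}

// Newline-delimited JSON: one serialised row per line.
function toNdjson(rows: ViolationRow[]): string {
  return rows.map(r => JSON.stringify(r)).join("\n");
}

// Hand-written sample data, for illustration only
const exampleScan: ScanResults = {
  url: "https://example.com/checkout",
  timestamp: "2024-01-15T03:00:00.000Z",
  violations: [
    {
      id: "color-contrast",
      impact: "serious",
      helpUrl: "https://dequeuniversity.com/rules/axe/4.7/color-contrast",
      nodes: [{ target: [".promo-banner span"] }, { target: ["#footer a"] }],
    },
  ],
};

console.log(toNdjson(toRows(exampleScan)));
```

One row per element keeps queries simple (count rows by `ruleId`, by `url`, or by day), at the cost of repeating the rule metadata on every row.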
Coming from Cypress?
The mappings:
- cypress-axe's cy.checkA11y(null, null, terminalLog) → console.log after await axeBuilder.analyze().
- cypress-axe's screenshot-on-violation → await page.screenshot() + testInfo.attach.
- Cypress doesn't have a testInfo.attach equivalent for arbitrary artefacts; Playwright's reporter integration is genuinely better for a11y.
If you've used cypress-axe and worked around its console-only output, this is one area where Playwright gives you more out of the box.
⚠️ Common mistakes
- Failing the build on every impact level from day one. Most apps have 20+ existing violations. Failing CI on all of them blocks every PR for weeks while the team triages debt that wasn't caused by this PR. Start with critical-only gating; ratchet up.
- Not persisting the JSON reports anywhere. Without persisting results, you can't tell whether a11y is improving over time. At minimum, use testInfo.attach so the Playwright HTML report has the data; ideally, write JSON to disk and let CI archive it.
- Treating helpUrl as optional. Every axe violation includes a link to Deque's documentation explaining the rule, why it matters, and how to fix it. Print it in your console output. Devs who fix one a11y bug by reading the docs become teammates who can spot the same issue in code review next time.
🎯 Practice task
Build a multi-page a11y reporting suite. 30-40 minutes.
- Create tests/a11y-suite.spec.ts:
import { test, expect } from "@playwright/test";
import AxeBuilder from "@axe-core/playwright";

const pages = [
  { path: "/", name: "login" },
  { path: "/inventory.html", name: "inventory", auth: true },
  { path: "/cart.html", name: "cart", auth: true }
];

for (const { path, name, auth } of pages) {
  test(`a11y — ${name}`, async ({ page }, testInfo) => {
    await page.goto("https://www.saucedemo.com");
    if (auth) {
      await page.getByPlaceholder("Username").fill("standard_user");
      await page.getByPlaceholder("Password").fill("secret_sauce");
      await page.getByRole("button", { name: "Login" }).click();
    }
    await page.goto(`https://www.saucedemo.com${path}`);

    const results = await new AxeBuilder({ page })
      .withTags(["wcag2a", "wcag2aa", "wcag21a", "wcag21aa"])
      .analyze();

    // 1. Console log readable summary
    if (results.violations.length > 0) {
      console.log(`\n=== ${name}: ${results.violations.length} violations ===`);
      results.violations.forEach(v => {
        console.log(`[${v.impact}] ${v.id}: ${v.help}`);
        console.log(`  Docs: ${v.helpUrl}`);
        console.log(`  Affects ${v.nodes.length} element(s)`);
      });
    }

    // 2. Attach full JSON to the HTML report
    await testInfo.attach(`${name}-axe-results.json`, {
      body: JSON.stringify(results, null, 2),
      contentType: "application/json"
    });

    // 3. Filter and assert (only critical fails)
    const critical = results.violations.filter(v => v.impact === "critical");
    expect(
      critical,
      `Critical violations on ${name}: ${critical.map(v => v.id).join(", ")}`
    ).toEqual([]);
  });
}
- Run it: npx playwright test a11y-suite.spec.ts --project=chromium. Open the HTML report (npx playwright show-report). Click any test and the JSON attachment is one click away.
- Inspect the attachment. Open the JSON in your editor. Note the structure: violations, passes, incomplete, inapplicable. The passes array is what your app did get right, which makes it a useful credibility check that the scan really ran, rather than "0 violations because axe didn't actually scan."
- Persist to disk. Add a step that writes the JSON to reports/a11y/${name}.json. Run the suite. Inspect the files. Now imagine running this nightly via CI and shipping the diff to Slack: you've built the foundation of a real a11y observability pipeline.
- Stretch: add a "warning" test (no expect, just console output) that reports serious-impact violations. Run the suite. The critical-gate test still controls pass/fail; the warning test gives the team visibility into the next bar to ratchet up to. This two-tier pattern is what most a11y-mature teams settle on.
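The credibility check from the practice task, using passes to confirm that the scan really ran, can also be automated. The sketch below is one hedged way to express it: `scanLooksValid` is a hypothetical name, and the rule that at least one check must have been evaluated is an assumption about what counts as a real scan.

```typescript
// Simplified, assumed shape: only the four result arrays matter here.
interface RuleOutcome {
  id: string;
}

interface ScanOutcome {
  violations: RuleOutcome[];
  passes: RuleOutcome[];
  incomplete: RuleOutcome[];
  inapplicable: RuleOutcome[];
}

// Hypothetical helper: a scan in which no rule was evaluated at all is
// more likely a blank or unloaded page than a perfectly accessible one.
function scanLooksValid(results: ScanOutcome): boolean {
  const evaluated =
    results.violations.length + results.passes.length + results.incomplete.length;
  return evaluated > 0;
}

// Hand-written sample data, for illustration only
const realScan: ScanOutcome = {
  violations: [],
  passes: [{ id: "document-title" }, { id: "html-has-lang" }],
  incomplete: [],
  inapplicable: [{ id: "video-caption" }],
};
const emptyScan: ScanOutcome = {
  violations: [],
  passes: [],
  incomplete: [],
  inapplicable: [{ id: "video-caption" }],
};

console.log(scanLooksValid(realScan), scanLooksValid(emptyScan));
```

In the suite this could run as an extra assertion before the violation check, so that a page which failed to load cannot pass as "zero violations".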
That closes Chapter 7 — visual and accessibility testing. You now have screenshot diffing, accessibility scanning, and structured reporting wired into the same test runner as your functional and API tests. The next chapter — parallel execution and CI/CD — turns the suite from "runs locally" into "runs in 3 minutes on every PR across multiple machines and shards." Every visual and a11y test you've built here will benefit from those patterns directly.