Allure and HTML Reporting in CI — Playwright with Python

CI green or red is the headline; the test run details are the story. A failing test on its own tells you something broke; a failing test with a screenshot, a network trace, a video, a stack trace, and a history view of the same test passing yesterday tells you what changed and where. This last lesson of chapter 7 covers the three reporters most Playwright Python teams actually run in CI: pytest-html for a quick dashboard, allure-pytest for the rich production view, and JUnit XML for Jenkins compatibility. It also covers the screenshot-on-failure hook that surfaces UI bugs alongside the assertion message, and the GitHub Actions wiring that publishes Allure as a downloadable artefact.

pytest-html — the simple option

Single HTML file, embeds everything, opens in any browser:

pip install pytest-html
pytest --html=reports/report.html --self-contained-html

--self-contained-html inlines all CSS, JavaScript, and images. The output is one file you can email, attach to a Slack message, or upload as a CI artefact without a follow-up "where's the rest of it?" question.

What you see:

Pass/fail summary at the top.
Per-test entries with duration, status, and any captured stdout/stderr.
Expandable detail for failures showing the full traceback.
Optional environment info (Python version, OS, browser).

For a typical PR check, this is enough — green or red, click through the failures, fix and re-run.

Allure — the production-grade option

Allure produces a multi-page web app with feature/severity hierarchies, history graphs, attachment galleries, and per-test step timelines. Install:

pip install allure-pytest

Run:

pytest --alluredir=reports/allure
allure serve reports/allure

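--alluredir writes raw JSON results (plus any attachments). allure serve renders them and starts a local HTTP server with the dashboard; it needs the Allure command-line tool, which is a separate install and does not come with pip install allure-pytest. CI typically runs only the first step (writing JSON), uploads the directory as an artefact, and a follow-up step (or the reviewer locally) generates the HTML.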
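Because the results directory is plain JSON, it can be inspected without rendering anything. The following is a minimal sketch, assuming the usual allure-pytest layout of one `<uuid>-result.json` file per test with a `status` field; the function name `summarise_allure_results` is hypothetical.

```python
import json
from collections import Counter
from pathlib import Path


def summarise_allure_results(results_dir: str) -> Counter:
    """Count test results by status in an Allure results directory."""
    counts: Counter = Counter()
    # Assumption: each executed test produces one "<uuid>-result.json" file.
    for result_file in Path(results_dir).glob("*-result.json"):
        data = json.loads(result_file.read_text(encoding="utf-8"))
        # Typical statuses are passed, failed, broken, and skipped.
        counts[data.get("status", "unknown")] += 1
    return counts
```

Calling `summarise_allure_results("reports/allure")` after a run gives a quick pass/fail count for a CI log line, without needing the Allure CLI.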

What Allure adds over pytest-html:

Trends — pass-rate, duration, flake rate over the last N runs (when history is configured).
Categories — group failures by exception type or message pattern.
Attachments per step — screenshots, JSON dumps, videos attached to specific actions inside a test.
Severity filtering — "show me all CRITICAL Auth tests that failed in the last 5 runs."

Worth the extra setup for any team that runs the suite more than once a day.
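Categories are driven by a categories.json file placed in the results directory before the report is generated. The sketch below writes one from Python; the category names and regular expressions are illustrative assumptions, and the helper `write_categories` is hypothetical. Check the patterns against the messages your own failures produce.

```python
import json
from pathlib import Path

# Hypothetical categories; tune names and patterns to your own failure messages.
CATEGORIES = [
    {
        "name": "Timeouts",
        "matchedStatuses": ["broken", "failed"],
        "messageRegex": ".*Timeout.*",
    },
    {
        "name": "Assertion failures",
        "matchedStatuses": ["failed"],
        "messageRegex": ".*AssertionError.*",
    },
]


def write_categories(results_dir: str) -> Path:
    """Write categories.json next to the raw Allure results."""
    target = Path(results_dir)
    target.mkdir(parents=True, exist_ok=True)
    path = target / "categories.json"
    path.write_text(json.dumps(CATEGORIES, indent=2), encoding="utf-8")
    return path
```

Running `write_categories("reports/allure")` after pytest and before report generation should be enough for the Categories tab to pick the groups up.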

Screenshot on failure — the autouse fixture pattern

Failing tests with no screenshot are debugging-by-imagination. Wire up an autouse fixture that captures a screenshot whenever the test fails:

# tests/conftest.py
import pytest
import allure
from playwright.sync_api import Page
 
 
@pytest.hookimpl(hookwrapper=True, tryfirst=True)
def pytest_runtest_makereport(item, call):
    """Make the test outcome available to fixtures via item.rep_call."""
    outcome = yield
    rep = outcome.get_result()
    setattr(item, f"rep_{rep.when}", rep)
 
 
@pytest.fixture(autouse=True)
def screenshot_on_failure(request, page: Page):
    yield
    if hasattr(request.node, "rep_call") and request.node.rep_call.failed:
        allure.attach(
            page.screenshot(),
            name=f"failure-{request.node.name}",
            attachment_type=allure.attachment_type.PNG,
        )

Two pieces:

pytest_runtest_makereport hook — attaches the test result (rep_setup, rep_call, rep_teardown) to the test item. Without this, the fixture has no way to know whether the test actually failed.
screenshot_on_failure fixture — runs around every test (autouse=True), checks the result after the test body, and attaches a screenshot to Allure on failure.

Now every failed test in the Allure report has a clickable failure-<test_name> PNG attachment. No manual instrumentation in the test bodies.
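allure.attach embeds the PNG in the Allure results only. If the same screenshot should also sit on disk, for instance under reports/ so it is uploaded next to report.html, the pytest node id first has to become a safe file name. A minimal sketch, assuming a hypothetical helper `screenshot_path` and a hypothetical reports/screenshots directory:

```python
import re
from pathlib import Path


def screenshot_path(nodeid: str, out_dir: str = "reports/screenshots") -> Path:
    """Turn a pytest node id into a file-system-safe PNG path."""
    # Node ids contain "/", "::" and "[...]"; collapse each run of unsafe chars to "_".
    slug = re.sub(r"[^A-Za-z0-9_.-]+", "_", nodeid).strip("_")
    return Path(out_dir) / f"failure-{slug}.png"
```

Inside the fixture, `page.screenshot(path=str(screenshot_path(request.node.nodeid)))` would then write the file in addition to the Allure attachment.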

Tracing — Playwright's built-in debugging artefact

Playwright captures a trace: a complete record of every action, every locator query, every network request, and every snapshot, which lets you replay the test offline. Note that browser_context_args does not enable tracing; it sets context options such as video recording:

@pytest.fixture(scope="session")
def browser_context_args(browser_context_args):
    # Extend pytest-playwright's default context options; this records video, not a trace.
    return {**browser_context_args, "record_video_dir": "videos/"}

For traces, screenshots, and video together, use pytest-playwright's built-in flags:

pytest --tracing retain-on-failure --screenshot only-on-failure --video retain-on-failure

The three options:

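--tracing: records a .zip trace file for each test; retain-on-failure keeps it only for failed tests. View with playwright show-trace trace.zip.
--screenshot: captures a PNG; only-on-failure keeps it only for failed tests.
--video: saves a WebM recording of the test; retain-on-failure keeps it only for failed tests.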

Together they form a complete failure forensics kit. Configure once, debug for years.
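The flags cover contexts that pytest-playwright creates for you. For a browser context you create yourself, tracing can be driven through the context's tracing API. This is a minimal sketch: `context.tracing.start` and `context.tracing.stop` are Playwright calls, while the helper names `start_trace` and `stop_trace` and the reports/traces directory are assumptions made for illustration.

```python
from pathlib import Path


def start_trace(context) -> None:
    """Begin recording a trace on a Playwright BrowserContext."""
    context.tracing.start(screenshots=True, snapshots=True, sources=True)


def stop_trace(context, test_name: str, failed: bool, out_dir: str = "reports/traces"):
    """Stop recording; keep the .zip only when the test failed."""
    if not failed:
        context.tracing.stop()  # no path given, so the recording is discarded
        return None
    path = Path(out_dir) / f"{test_name}.zip"
    context.tracing.stop(path=str(path))
    return path
```

Avoid combining this with the --tracing flag on the same context, since a trace that is already recording cannot be started a second time.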

JUnit XML — the Jenkins lingua franca

Jenkins has been around longer than pytest-html or Allure; it consumes JUnit XML natively:

pytest --junitxml=results.xml

A single XML file with test results, durations, and failure messages. Jenkins's JUnit plugin (the "Publish JUnit test result report" post-build action) reads it, generates a dashboard, and tracks pass/fail trends. If your team uses Jenkins, JUnit XML is the easiest reporting target, and most other CI systems can read it too.
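The same file is easy to consume from your own tooling. As a rough sketch, the function below lists failed and errored test cases using only the standard library; the name `failed_tests` is hypothetical, and the element and attribute names follow the common JUnit layout that pytest emits.

```python
import xml.etree.ElementTree as ET


def failed_tests(junit_xml_path: str) -> list[tuple[str, str]]:
    """Return (test id, message) for each failed or errored test case."""
    root = ET.parse(junit_xml_path).getroot()
    failures = []
    # iter() searches at any depth, so a <testsuites> or a bare <testsuite> root both work.
    for case in root.iter("testcase"):
        problem = case.find("failure")
        if problem is None:
            problem = case.find("error")
        if problem is not None:
            test_id = f"{case.get('classname', '')}::{case.get('name', '')}"
            failures.append((test_id, problem.get("message", "")))
    return failures
```

A script like this can post the failing test names to a chat channel or a PR comment without anyone opening the dashboard.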

Combine with the others:

pytest --junitxml=results.xml --html=reports/report.html --alluredir=reports/allure

All three reporters at once. The JSON results, the HTML dashboard, and the JUnit XML file end up in the artefacts.
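When several reporters run in one command, a quick post-run check can confirm that each output was actually written before the upload step. A minimal sketch, assuming the paths from the command above; the helper name `missing_reports` is hypothetical.

```python
from pathlib import Path


def missing_reports(
    html: str = "reports/report.html",
    allure_dir: str = "reports/allure",
    junit: str = "results.xml",
) -> list[str]:
    """Return the report outputs that were not produced by the test run."""
    missing = []
    if not Path(html).is_file():
        missing.append(html)
    if not Path(junit).is_file():
        missing.append(junit)
    allure = Path(allure_dir)
    # The Allure directory counts as present only if it holds at least one result file.
    if not allure.is_dir() or not any(allure.glob("*-result.json")):
        missing.append(allure_dir)
    return missing
```

An empty list means all three reporters produced output; anything else names the path to investigate.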

Uploading reports as CI artefacts

GitHub Actions:

- name: Run tests with all reporters
  run: |
    pytest tests/ \
      --html=reports/report.html --self-contained-html \
      --alluredir=reports/allure \
      --junitxml=reports/junit.xml
 
- uses: actions/upload-artifact@v4
  if: always()
  with:
    name: test-reports
    path: reports/
    retention-days: 30

if: always() ensures the upload runs even when the test step fails — exactly when reports matter most. Reviewers download reports/ from the workflow run page; opening report.html shows the pytest-html dashboard, opening junit.xml is for downstream tooling, and allure/ is the input for a separate Allure-publish step.

Generating the Allure HTML in CI

The two-step Allure workflow: pytest writes raw JSON, a follow-up step (or a separate workflow) renders HTML and publishes it.

- name: Run tests
  run: pytest tests/ --alluredir=allure-results
 
- name: Generate Allure report
  uses: simple-elf/allure-report-action@v1.7
  if: always()
  with:
    allure_results: allure-results
    allure_report: allure-report
 
- uses: actions/upload-artifact@v4
  if: always()
  with:
    name: allure-report
    path: allure-report/
 
- name: Deploy to GitHub Pages
  uses: peaceiris/actions-gh-pages@v3
  if: always()
  with:
    github_token: ${{ secrets.GITHUB_TOKEN }}
    publish_dir: allure-report

Three layers:

allure-report-action generates the HTML dashboard from the raw results.
upload-artifact lets you download the rendered HTML from the workflow run.
peaceiris/actions-gh-pages publishes to GitHub Pages — anyone with the URL can browse the latest report. Combined with Allure's history feature, this gives the team a permanent dashboard.

The reporting pipeline end-to-end

1. pytest runs: Test suite executes with --alluredir, --…

2. Raw artefacts: allure-results/ JSON, report.html single…

3. Allure renders: allure-report-action turns raw JSON into…

4. CI uploads: actions/upload-artifact preserves everyt…

5. Team reviews: Devs grab the HTML report from the run p…

The five-step pipeline turns raw test output into a team-facing dashboard automatically. Set it up once; never touch it again unless the reporters themselves get major version bumps.

Allure history — the killer feature

By default, every Allure run starts fresh — no memory of previous runs. Enable history persistence and the dashboard suddenly shows trend graphs, flake rates, and "this test has been failing for 3 runs":

- name: Get Allure history
  uses: actions/checkout@v4
  if: always()
  with:
    ref: gh-pages
    path: gh-pages
 
- name: Generate Allure report with history
  uses: simple-elf/allure-report-action@v1.7
  if: always()
  with:
    allure_results: allure-results
    allure_report: allure-report
    keep_reports: 20
    gh_pages: gh-pages

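The checkout step fetches the previously published report from the gh-pages branch, and the action merges its history into the new report, keeping the last 20 runs (keep_reports: 20). On the very first run the gh-pages branch does not exist yet, so the checkout step fails unless you create the branch beforehand or mark the step continue-on-error: true. With history in place, the team sees how flake rates trend over time, the metric that matters most when keeping a 500-test suite healthy.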
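To make "flake rate" concrete, here is one possible definition as a sketch. It is an assumption for illustration, not Allure's own calculation: a test counts as flaky when its recent runs contain both a pass and a failure. The input shape (test name mapped to a list of statuses, oldest first) and the function name `flake_rate` are hypothetical.

```python
def flake_rate(history: dict[str, list[str]], window: int = 20) -> float:
    """Share of tests whose last `window` runs include both a pass and a failure."""
    if not history:
        return 0.0
    flaky = 0
    for statuses in history.values():
        recent = set(statuses[-window:])
        # Mixed outcomes inside the window are treated as flakiness.
        if "passed" in recent and recent & {"failed", "broken"}:
            flaky += 1
    return flaky / len(history)
```

The default window of 20 mirrors keep_reports: 20, so the number covers the same runs the dashboard retains.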

Coming from Playwright TypeScript?

The TS course covers Playwright's built-in HTML reporter and Allure separately. The Python equivalents are:

TS npx playwright test --reporter=html → Python pytest --html=reports/report.html
TS npx playwright test --reporter=allure-playwright → Python pytest --alluredir=reports/allure
TS testInfo.attach(...) → Python allure.attach(...)
TS --reporter=junit → Python pytest --junitxml=results.xml

Same conceptual model; same artefact shapes. The Python ecosystem has more reporter options because pytest existed before Playwright — you can layer pytest-html, pytest-md (Markdown summary), pytest-csv, etc. depending on what your team needs.

⚠️ Common mistakes

Skipping the pytest_runtest_makereport hook when using screenshot_on_failure. Without the hook, request.node.rep_call doesn't exist: with the hasattr guard shown earlier, the fixture silently never captures a screenshot, and without the guard it raises AttributeError. The hook is only a few lines, lives in conftest, and you only write it once.
Treating Allure history as automatic. Out of the box, Allure shows zero trend data — every run looks like the first. The history feature requires checking out the previous report and feeding it into the renderer. Set this up early; trend data is what makes Allure pay off.
Generating Allure HTML inside the test job. Generation can take 30-60 seconds and the rendered HTML is much larger than the raw results. Most teams keep the test job lean (only --alluredir) and run a separate job for HTML rendering and publishing. This decouples failure modes: a broken renderer doesn't fail the test job.

🎯 Practice task

Wire up reporting end-to-end in CI. 30-40 minutes.

Install the reporting tools:
```
pip install pytest-html allure-pytest
```
Add the screenshot_on_failure fixture and the pytest_runtest_makereport hook to tests/conftest.py (copy from earlier in this lesson).

Run locally with all three reporters:

pytest tests/ \
  --html=reports/report.html --self-contained-html \
  --alluredir=reports/allure \
  --junitxml=reports/junit.xml

Open reports/report.html in your browser — pytest-html dashboard.

Generate the Allure report locally:
```
# If you have the allure CLI installed (npm install -g allure-commandline or via brew):
allure serve reports/allure
```
A browser opens with the Allure dashboard. Click into a test detail; if you forced a failure earlier in this chapter, the failure screenshot is attached.

Force a failure to see screenshot capture in action. Edit a test to assert the wrong URL. Re-run with the three reporters. Open the Allure dashboard: the failed test has a failure-<test_name> PNG attachment.

Add the artefact upload to your .github/workflows/playwright.yml:

- run: pytest tests/ --alluredir=reports/allure --html=reports/report.html --self-contained-html --junitxml=reports/junit.xml
 
- uses: actions/upload-artifact@v4
  if: always()
  with:
    name: test-reports
    path: reports/
    retention-days: 30

Push, watch the workflow run, download the test-reports artefact from the Actions tab.

Stretch: add the simple-elf/allure-report-action and peaceiris/actions-gh-pages steps for full Allure publishing. Configure gh-pages branch in your repo settings, push, wait for the action to finish — the rendered Allure dashboard is now live at https://<username>.github.io/<repo>/.

You've completed the CI/CD chapter end-to-end — workflows, parallelism, Docker, and reporting. The next chapter zooms out to production framework engineering: project structure, shared utilities, data factories, and the maintenance habits that keep a 30-test prototype evolving smoothly into a 300-test team-shared suite.