Allure and HTML Reporting in CI


CI green or red is the headline; the test run details are the story. A failing test on its own tells you something broke; a failing test with a screenshot, a network trace, a video, a stack trace, and a history view of the same test passing yesterday tells you what changed and where. This last lesson of chapter 7 covers the three reporters most Playwright Python teams actually run in CI: pytest-html for a quick dashboard, allure-pytest for the rich production view, and JUnit XML for Jenkins compatibility. It also covers the screenshot-on-failure hook that surfaces UI bugs alongside the assertion message, and the GitHub Actions wiring that uploads the reports as downloadable artefacts and publishes the Allure dashboard.

pytest-html — the simple option

Single HTML file, embeds everything, opens in any browser:

pip install pytest-html
pytest --html=reports/report.html --self-contained-html

--self-contained-html inlines all CSS, JavaScript, and images. The output is one file you can email, attach to a Slack message, or upload as a CI artefact without a follow-up "where's the rest of it?" question.

What you see:

  • Pass/fail summary at the top.
  • Per-test entries with duration, status, and any captured stdout/stderr.
  • Expandable detail for failures showing the full traceback.
  • Optional environment info (Python version, OS, browser).

For a typical PR check, this is enough — green or red, click through the failures, fix and re-run.

Allure — the production-grade option

Allure produces a multi-page web app with feature/severity hierarchies, history graphs, attachment galleries, and per-test step timelines. Install:

pip install allure-pytest

Run:

pytest --alluredir=reports/allure
allure serve reports/allure

--alluredir writes raw JSON results. allure serve starts a local HTTP server with the rendered dashboard; it belongs to the separate Allure command-line tool, which pip install allure-pytest does not install. CI typically runs only the first step (writing JSON), uploads the directory as an artefact, and a follow-up step (or the reviewer locally) generates the HTML.
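
To make the "raw JSON results" concrete, here is a minimal sketch that tallies test statuses straight from the results directory, without the Allure CLI. It assumes the usual allure-pytest layout (one "<uuid>-result.json" file per test with a top-level "status" field); the function name summarize_allure_results is hypothetical.

```python
import json
from collections import Counter
from pathlib import Path


def summarize_allure_results(results_dir: str) -> Counter:
    """Count test statuses found in an Allure results directory.

    Assumption: each test is stored as "<uuid>-result.json" with a
    top-level "status" such as passed, failed, broken, or skipped.
    Other files (containers, attachments) are ignored.
    """
    counts: Counter = Counter()
    for path in Path(results_dir).glob("*-result.json"):
        data = json.loads(path.read_text(encoding="utf-8"))
        counts[data.get("status", "unknown")] += 1
    return counts
```

Calling summarize_allure_results("reports/allure") after a run gives a quick sanity check that the directory actually contains results before it is uploaded.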

What Allure adds over pytest-html:

  • Trends — pass-rate, duration, flake rate over the last N runs (when history is configured).
  • Categories — group failures by exception type or message pattern.
  • Attachments per step — screenshots, JSON dumps, videos attached to specific actions inside a test.
  • Severity filtering — "show me all CRITICAL Auth tests that failed in the last 5 runs."

Worth the extra setup for any team that runs the suite more than once a day.
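
Categories are driven by a categories.json file placed in the results directory before the report is generated. The sketch below writes one from Python. The two rules and the helper name write_categories are illustrative assumptions; the field names (name, matchedStatuses, messageRegex) follow Allure's categories format, and the regexes are written to match the whole failure message.

```python
import json
from pathlib import Path

# Hypothetical rules: adjust names and regexes to the failures your suite produces.
CATEGORIES = [
    {
        "name": "Timeouts",
        "matchedStatuses": ["broken", "failed"],
        "messageRegex": ".*Timeout.*",
    },
    {
        "name": "Assertion failures",
        "matchedStatuses": ["failed"],
        "messageRegex": ".*AssertionError.*",
    },
]


def write_categories(results_dir: str) -> Path:
    """Write categories.json next to the raw results and return its path."""
    target = Path(results_dir) / "categories.json"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(CATEGORIES, indent=2), encoding="utf-8")
    return target
```

Run write_categories("reports/allure") after pytest and before the report is generated, so the renderer finds the file alongside the results.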

Screenshot on failure — the autouse fixture pattern

Failing tests with no screenshot are debugging-by-imagination. Wire up an autouse fixture that captures a screenshot whenever the test fails:

# tests/conftest.py
import pytest
import allure
from playwright.sync_api import Page
 
 
@pytest.hookimpl(hookwrapper=True, tryfirst=True)
def pytest_runtest_makereport(item, call):
    """Make the test outcome available to fixtures via item.rep_call."""
    outcome = yield
    rep = outcome.get_result()
    setattr(item, f"rep_{rep.when}", rep)
 
 
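@pytest.fixture(autouse=True)
def screenshot_on_failure(request, page: Page):
    """Attach a screenshot to the Allure report when the test body fails."""
    yield  # the test body runs here
    # rep_call is set by pytest_runtest_makereport above; it is absent if setup failed.
    if hasattr(request.node, "rep_call") and request.node.rep_call.failed:
        allure.attach(
            page.screenshot(),  # PNG bytes of the current viewport
            name=f"failure-{request.node.name}",
            attachment_type=allure.attachment_type.PNG,
        )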

Two pieces:

  1. pytest_runtest_makereport hook — attaches the test result (rep_setup, rep_call, rep_teardown) to the test item. Without this, the fixture has no way to know whether the test actually failed.
  2. screenshot_on_failure fixture — runs around every test (autouse=True), checks the result after the test body, and attaches a screenshot to Allure on failure.

Now every failed test in the Allure report has a clickable failure-<test_name> PNG attachment. No manual instrumentation in the test bodies.
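
One gap in the fixture as written: it only looks at rep_call, so a failure in a later fixture's setup (where the page already exists but the test body never runs) produces no screenshot. A possible refinement, sketched here as a plain helper, checks both phases. The name node_failed is hypothetical; it relies only on the rep_setup and rep_call attributes that the hook above stores on the test item.

```python
def node_failed(node) -> bool:
    """Return True if the setup or call phase of a test item failed.

    Expects the rep_<phase> attributes set by pytest_runtest_makereport;
    a missing attribute means that phase has not run and is treated as
    "not failed".
    """
    for phase in ("setup", "call"):
        report = getattr(node, f"rep_{phase}", None)
        if report is not None and report.failed:
            return True
    return False
```

Inside the fixture, the condition would then read if node_failed(request.node): instead of the hasattr check.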

Tracing — Playwright's built-in debugging artefact

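Playwright captures a trace (a complete record of every action, every locator query, every network request, every snapshot) that lets you replay the test offline. Context-level recording options go through browser_context_args; note that the override below turns on video recording for every context, not tracing, which is switched on with the --tracing flag further down: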

@pytest.fixture(scope="session")
def browser_context_args(browser_context_args):
    return {**browser_context_args, "record_video_dir": "videos/"}

Or use pytest-playwright's built-in flags:

pytest --tracing retain-on-failure --screenshot only-on-failure --video retain-on-failure

The three options:

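  • --tracing: records a trace.zip per test; with retain-on-failure, only the traces of failing tests are kept. View with playwright show-trace trace.zip.
  • --screenshot: captures a PNG at the end of the test; with only-on-failure, only for failing tests.
  • --video: records a WebM video of the test; with retain-on-failure, only the videos of failing tests are kept.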

Together they form a complete failure forensics kit. Configure once, debug for years.
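
pytest-playwright writes these artefacts under its output directory (test-results by default, configurable with --output), one subfolder per test. The sketch below groups whatever it finds there by kind, which is handy for a quick listing in CI logs. The helper name collect_artifacts and the classification by file extension are assumptions made for illustration.

```python
from collections import defaultdict
from pathlib import Path

# Assumption: traces are .zip, screenshots .png, videos .webm.
KINDS = {".zip": "trace", ".png": "screenshot", ".webm": "video"}


def collect_artifacts(output_dir: str = "test-results") -> dict[str, list[Path]]:
    """Group Playwright artefacts under output_dir by kind."""
    found: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(Path(output_dir).rglob("*")):
        kind = KINDS.get(path.suffix.lower())
        if kind and path.is_file():
            found[kind].append(path)
    return dict(found)
```

If the directory is missing or empty, the function returns an empty dict, which usually means no test failed or the flags were not passed.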

JUnit XML — the Jenkins lingua franca

Jenkins has been around longer than pytest-html or Allure; it consumes JUnit XML natively:

pytest --junitxml=results.xml

A single XML file with test results, durations, and failure messages. The Jenkins JUnit plugin's "Publish JUnit test result report" post-build action reads it, generates a dashboard, and tracks pass/fail trends. If your team uses Jenkins, JUnit XML is the easiest reporting target, and most other CI systems (GitLab CI, Azure DevOps, CircleCI) can read it too.
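
Because the format is plain XML, it is also easy to consume from your own scripts. The following sketch reads a pytest-generated file with the standard library and returns totals plus the names of failing tests. The function name summarize_junit is hypothetical; the attribute names (tests, failures, errors, skipped) and the failure/error child elements are the ones pytest writes.

```python
import xml.etree.ElementTree as ET


def summarize_junit(xml_path: str) -> dict:
    """Return totals and failing test names from a JUnit XML report."""
    root = ET.parse(xml_path).getroot()
    totals = {"tests": 0, "failures": 0, "errors": 0, "skipped": 0}
    failed_names = []
    # root.iter() also yields the root itself, so this works whether the
    # top-level element is <testsuites> or a single <testsuite>.
    for suite in root.iter("testsuite"):
        for key in totals:
            totals[key] += int(suite.get(key, 0))
        for case in suite.findall("testcase"):
            if case.find("failure") is not None or case.find("error") is not None:
                failed_names.append(f'{case.get("classname")}::{case.get("name")}')
    return {**totals, "failed_names": failed_names}
```

A script like this can post a short failure summary to chat or a PR comment without any extra reporting dependency.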

Combine with the others:

pytest --junitxml=results.xml --html=reports/report.html --alluredir=reports/allure

All three reporters at once. The JSON results, the HTML dashboard, and the JUnit XML file end up in the artefacts.

Uploading reports as CI artefacts

GitHub Actions:

- name: Run tests with all reporters
  run: |
    pytest tests/ \
      --html=reports/report.html --self-contained-html \
      --alluredir=reports/allure \
      --junitxml=reports/junit.xml
 
- uses: actions/upload-artifact@v4
  if: always()
  with:
    name: test-reports
    path: reports/
    retention-days: 30

if: always() ensures the upload runs even when the test step fails — exactly when reports matter most. Reviewers download reports/ from the workflow run page; opening report.html shows the pytest-html dashboard, opening junit.xml is for downstream tooling, and allure/ is the input for a separate Allure-publish step.

Generating the Allure HTML in CI

The two-step Allure workflow: pytest writes raw JSON, a follow-up step (or a separate workflow) renders HTML and publishes it.

- name: Run tests
  run: pytest tests/ --alluredir=allure-results
 
- name: Generate Allure report
  uses: simple-elf/allure-report-action@v1.7
  if: always()
  with:
    allure_results: allure-results
    allure_report: allure-report
 
- uses: actions/upload-artifact@v4
  if: always()
  with:
    name: allure-report
    path: allure-report/
 
- name: Deploy to GitHub Pages
  uses: peaceiris/actions-gh-pages@v3
  if: always()
  with:
    github_token: ${{ secrets.GITHUB_TOKEN }}
    publish_dir: allure-report

Three layers:

  1. allure-report-action generates the HTML dashboard from the raw results.
  2. upload-artifact lets you download the rendered HTML from the workflow run.
  3. peaceiris/actions-gh-pages publishes to GitHub Pages — anyone with the URL can browse the latest report. Combined with Allure's history feature, this gives the team a permanent dashboard.

The reporting pipeline end-to-end

Taken together, the steps above form a pipeline that turns raw test output into a team-facing dashboard automatically. Set it up once; after that it needs attention only when the reporters themselves get major version bumps.

Allure history — the killer feature

By default, every Allure run starts fresh — no memory of previous runs. Enable history persistence and the dashboard suddenly shows trend graphs, flake rates, and "this test has been failing for 3 runs":

- name: Get Allure history
  uses: actions/checkout@v4
  if: always()
  with:
    ref: gh-pages
    path: gh-pages
 
- name: Generate Allure report with history
  uses: simple-elf/allure-report-action@v1.7
  if: always()
  with:
    allure_results: allure-results
    allure_report: allure-report
    keep_reports: 20
    gh_pages: gh-pages

The checkout step fetches the previously published report from the gh-pages branch; the action then merges its history into the new report and keeps the last 20 reports (keep_reports: 20). The team sees how flake rates trend over time, which is the metric that matters most when keeping a 500-test suite healthy.
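
The action hides the underlying mechanism, which is simple: a generated Allure report contains a history/ folder, and copying that folder into the results directory before the next generation is what produces the trend graphs. A minimal sketch of that copy step, useful if you render the report yourself instead of through the action; the function name carry_over_history and the directory arguments are hypothetical.

```python
import shutil
from pathlib import Path


def carry_over_history(previous_report: str, results_dir: str) -> bool:
    """Copy history/ from the previous report into the new results directory.

    Returns False when there is no previous history (for example on the
    very first run), in which case the next report simply has no trends.
    """
    source = Path(previous_report) / "history"
    target = Path(results_dir) / "history"
    if not source.is_dir():
        return False
    shutil.copytree(source, target, dirs_exist_ok=True)
    return True
```

Called as carry_over_history("gh-pages", "allure-results") before generating the report, it mirrors what the history-aware step above does for you.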

Coming from Playwright TypeScript?

The TS course covers Playwright's built-in HTML reporter and Allure separately. The Python equivalents are:

  • TS npx playwright test --reporter=html → Python pytest --html=reports/report.html
  • TS npx playwright test --reporter=allure-playwright → Python pytest --alluredir=reports/allure
  • TS testInfo.attach(...) → Python allure.attach(...)
  • TS --reporter=junit → Python pytest --junitxml=results.xml

Same conceptual model; same artefact shapes. The Python ecosystem has more reporter options because pytest existed before Playwright — you can layer pytest-html, pytest-md (Markdown summary), pytest-csv, etc. depending on what your team needs.

⚠️ Common mistakes

  • Skipping the pytest_runtest_makereport hook when using screenshot_on_failure. Without the hook, request.node.rep_call doesn't exist; because of the hasattr guard the fixture then silently never captures a screenshot (and without the guard it raises AttributeError). The hook is a handful of lines, lives in conftest, and you only write it once.
  • Treating Allure history as automatic. Out of the box, Allure shows zero trend data — every run looks like the first. The history feature requires checking out the previous report and feeding it into the renderer. Set this up early; trend data is what makes Allure pay off.
  • Generating Allure HTML inside the test job. Generation takes 30-60 seconds and the rendered HTML is much larger than the raw results. Most teams keep the test job lean (only --alluredir) and run a separate job for HTML rendering and publishing. Decouples failure modes — a broken renderer doesn't fail the test job.

🎯 Practice task

Wire up reporting end-to-end in CI. 30-40 minutes.

  1. Install the reporting tools:

    pip install pytest-html allure-pytest

  2. Add the screenshot_on_failure fixture and the pytest_runtest_makereport hook to tests/conftest.py (copy from earlier in this lesson).

  3. Run locally with all three reporters:

    pytest tests/ \
      --html=reports/report.html --self-contained-html \
      --alluredir=reports/allure \
      --junitxml=reports/junit.xml

    Open reports/report.html in your browser — pytest-html dashboard.

  4. Generate the Allure report locally:

    # If you have the allure CLI installed (npm install -g allure-commandline or via brew):
    allure serve reports/allure

    A browser opens with the Allure dashboard. Click into a test detail; if you forced a failure earlier in this chapter, the failure screenshot is attached.

  5. Force a failure to see screenshot capture in action. Edit a test to assert the wrong URL. Re-run with the three reporters. Open the Allure dashboard — the failed test has a failure-<name>.png attachment.

  6. Add the artefact upload to your .github/workflows/playwright.yml:

    - run: pytest tests/ --alluredir=reports/allure --html=reports/report.html --self-contained-html --junitxml=reports/junit.xml
     
    - uses: actions/upload-artifact@v4
      if: always()
      with:
        name: test-reports
        path: reports/
        retention-days: 30

    Push, watch the workflow run, download the test-reports artefact from the Actions tab.

  7. Stretch: add the simple-elf/allure-report-action and peaceiris/actions-gh-pages steps for full Allure publishing. Configure gh-pages branch in your repo settings, push, wait for the action to finish — the rendered Allure dashboard is now live at https://<username>.github.io/<repo>/.

You've completed the CI/CD chapter end-to-end — workflows, parallelism, Docker, and reporting. The next chapter zooms out to production framework engineering: project structure, shared utilities, data factories, and the maintenance habits that keep a 30-test prototype evolving smoothly into a 300-test team-shared suite.
