Running Playwright in GitHub Actions

A test suite that only runs locally is a half-built suite. The whole point of automation is that the tests run on every pull request, every merge, and every deploy, without a human remembering to click "run." GitHub Actions is the most common CI for modern web teams; Playwright integrates with it cleanly because both are first-party Microsoft tooling. This lesson covers the canonical workflow file, caching for browsers and dependencies (a large speedup), serving the app under test, environment secrets, and the matrix combinations you'll actually use.

The minimal viable workflow

Every Playwright GitHub Actions workflow boils down to: install Node, install npm deps, install browsers, run tests, upload report. Here's the simplest version that works:

# .github/workflows/playwright.yml
name: Playwright Tests
on: [push, pull_request]
 
jobs:
  test:
    timeout-minutes: 30
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
 
      - uses: actions/setup-node@v4
        with:
          node-version: lts/*
 
      - name: Install dependencies
        run: npm ci
 
      - name: Install Playwright browsers
        run: npx playwright install --with-deps
 
      - name: Run tests
        run: npx playwright test
 
      - uses: actions/upload-artifact@v4
        if: ${{ !cancelled() }}
        with:
          name: playwright-report
          path: playwright-report/
          retention-days: 30

Push this. Open the Actions tab on GitHub. The workflow runs on every push and PR. Whether tests pass or fail, the playwright-report artefact is downloadable from the run summary.

Three things to internalise:

  • npx playwright install --with-deps: installs the browser binaries and the system libraries Linux needs (fonts, codecs, GTK). The --with-deps flag is what makes this work on a fresh Ubuntu runner. Without it, you'll get cryptic "missing library" errors.
  • if: ${{ !cancelled() }} on the artefact upload: this means "upload even on test failure (but not on workflow cancel)." Without it, the step only runs when every earlier step succeeded, so failed runs upload nothing, exactly when you want the report most.
  • timeout-minutes: 30: a job-level guard against runaway runs (the GitHub default is 360 minutes). Set it generously above your expected suite time; the goal is to catch hangs and infinite loops, not to time-bound healthy runs.

Caching for speed

Cold installs are slow: npm ci takes 30-60s, and downloading three browser binaries takes another 30-60s. On every PR, every push, every retry. Cache them:

- uses: actions/setup-node@v4
  with:
    node-version: lts/*
    cache: "npm"   # caches the npm download cache (not node_modules), keyed on package-lock.json
 
- name: Cache Playwright browsers
  uses: actions/cache@v4
  id: pw-cache
  with:
    path: ~/.cache/ms-playwright
    key: pw-${{ runner.os }}-${{ hashFiles('**/package-lock.json') }}
 
- name: Install Playwright browsers (only if cache missed)
  if: steps.pw-cache.outputs.cache-hit != 'true'
  run: npx playwright install   # binaries only; system deps are handled by the next step
 
- name: Install Playwright system deps (always)
  run: npx playwright install-deps

Two caches, two wins:

  • cache: 'npm' on setup-node is the simplest. It caches the npm download cache (not node_modules itself), so npm ci becomes much faster on warm runs.
  • actions/cache@v4 on ~/.cache/ms-playwright caches the browser binaries. The cache key includes a hash of package-lock.json, so it invalidates whenever the lockfile changes, which covers every Playwright upgrade.

The install-deps step installs only the system libraries, and it runs on every job. Those libraries are installed through the OS package manager into system directories outside ~/.cache/ms-playwright, so restoring the cache cannot bring them back. The actual browser binaries (the big download) come from the cache.

After caching, a typical PR run goes from ~3 minutes of setup to ~30 seconds. Multiplied across hundreds of PRs a month, this is a meaningful CI cost saving.
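To make the cache-key behaviour concrete, here is a minimal sketch of how a key of the form pw-<os>-<lockfile hash> changes when the lockfile changes. It is an illustration only: GitHub's hashFiles() uses SHA-256, while this sketch substitutes a small FNV-1a hash to stay dependency-free, and the function names and version numbers are hypothetical.

```typescript
// Stand-in for hashFiles(): GitHub uses SHA-256; FNV-1a is used here only
// so the sketch needs no imports. Returns an 8-character hex string.
function fnv1aHex(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return ("00000000" + hash.toString(16)).slice(-8);
}

// Mirrors the shape of: pw-${{ runner.os }}-${{ hashFiles('**/package-lock.json') }}
function browserCacheKey(runnerOs: string, lockfileContents: string): string {
  return `pw-${runnerOs}-${fnv1aHex(lockfileContents)}`;
}

// Hypothetical lockfile excerpts: same content twice, then a Playwright bump.
const keyBefore = browserCacheKey("Linux", '"@playwright/test": "1.44.0"');
const keyRerun = browserCacheKey("Linux", '"@playwright/test": "1.44.0"');
const keyAfterUpgrade = browserCacheKey("Linux", '"@playwright/test": "1.45.0"');

console.log(keyBefore === keyRerun);        // true: unchanged lockfile, cache hit
console.log(keyBefore === keyAfterUpgrade); // false: changed lockfile, cache miss
```

Because the key covers the whole lockfile, any dependency change produces a miss, not only a Playwright upgrade. That costs an occasional unnecessary download but never serves stale browsers.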

Running against a deployed environment

For staging tests — your app is already deployed to staging.example.com, and you just want to point Playwright at it — set BASE_URL and skip the local server entirely:

- run: npx playwright test
  env:
    BASE_URL: https://staging.example.com

In playwright.config.ts:

use: { baseURL: process.env.BASE_URL || "http://localhost:3000" }

CI sets BASE_URL; local dev uses the default. One config, two environments.
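The fallback logic in that config line can be sketched as a small standalone function. This is an illustration under stated assumptions, not part of Playwright: resolveBaseUrl and the EnvVars type are hypothetical names, and the environment is passed in as a plain object so the behaviour is easy to check.

```typescript
// A plain map of environment variables; in a real config you would pass process.env.
type EnvVars = Record<string, string | undefined>;

// Returns BASE_URL when it is set to a non-empty string, otherwise the fallback.
// Using || (not ??) means an empty BASE_URL also falls back to the default.
function resolveBaseUrl(env: EnvVars, fallback: string = "http://localhost:3000"): string {
  return env["BASE_URL"] || fallback;
}

console.log(resolveBaseUrl({ BASE_URL: "https://staging.example.com" })); // https://staging.example.com
console.log(resolveBaseUrl({}));                                          // http://localhost:3000
console.log(resolveBaseUrl({ BASE_URL: "" }));                            // http://localhost:3000
```

The empty-string case matters in CI: a secret or step output that resolves to nothing arrives as an empty string, and with || the suite quietly targets localhost. If you would rather fail loudly, validate the variable before falling back.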

This pattern is also what you use to run the same tests against per-PR preview deploys (Vercel, Netlify, Render previews):

- run: npx playwright test
  env:
    BASE_URL: ${{ steps.deploy.outputs.preview-url }}

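Here the preview URL comes from the outputs of an earlier deploy step (the one with id: deploy in this snippet); the exact output name depends on the deploy action you use.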

Starting a server in CI

For pre-deploy regression tests — the suite runs against the app built and started in CI, not a deployed instance — let Playwright start the server for you:

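// playwright.config.ts
import { defineConfig } from "@playwright/test";

export default defineConfig({
  webServer: {
    command: "npm run start",              // boots the app under test
    url: "http://localhost:3000",          // polled until the server responds
    reuseExistingServer: !process.env.CI,  // reuse a running server locally, never on CI
    timeout: 120 * 1000                    // allow up to 2 minutes for startup
  },
  use: { baseURL: "http://localhost:3000" }
});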

webServer.command is what Playwright runs to boot the app. webServer.url is the URL Playwright polls until the server answers with a 2xx, 3xx, 400, 401, 402, or 403 status (so not strictly a 200); once it does, tests start. reuseExistingServer: !process.env.CI says "in local dev, if a server is already running, reuse it" (so you can npm start once and run tests in another terminal); on CI, always start a fresh server.
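As a rough sketch of the readiness check, the function below encodes the status codes that Playwright's documentation lists as "server is ready" for webServer.url (2xx, 3xx, 400, 401, 402, 403). It is not Playwright's actual implementation; isReadyStatus and pollsUntilReady are hypothetical names, and the list of statuses stands in for real HTTP polling.

```typescript
// True when a response status would count as "ready" under the documented rule.
function isReadyStatus(status: number): boolean {
  const isSuccessOrRedirect = status >= 200 && status < 400;
  const isAcceptedClientError = status >= 400 && status <= 403;
  return isSuccessOrRedirect || isAcceptedClientError;
}

// Given the statuses observed on successive polls, returns how many polls were
// needed before the server counted as ready, or -1 if it never did.
function pollsUntilReady(observedStatuses: number[]): number {
  for (let i = 0; i < observedStatuses.length; i++) {
    if (isReadyStatus(observedStatuses[i])) {
      return i + 1;
    }
  }
  return -1;
}

console.log(pollsUntilReady([503, 503, 200])); // 3
console.log(pollsUntilReady([401]));           // 1: a login wall still means "up"
console.log(pollsUntilReady([500, 404]));      // -1
```

One practical consequence: if the root path returns 404 until the app finishes booting, point webServer.url at a route that exists, such as a health endpoint.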

In the workflow:

- name: Build app
  run: npm run build
 
- name: Run Playwright tests
  run: npx playwright test
  env:
    CI: true

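Playwright handles starting the server, waiting for it to be ready, and shutting it down after tests finish. No background process management in your workflow. GitHub Actions already sets CI=true on its runners, so the explicit env entry is optional and mainly documents the dependency.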

Environment variables and secrets

Real test suites need API keys, login credentials, database URLs. GitHub Actions has built-in secrets management:

- run: npx playwright test
  env:
    BASE_URL: https://staging.example.com
    TEST_USER_EMAIL: ${{ secrets.TEST_USER_EMAIL }}
    TEST_USER_PASSWORD: ${{ secrets.TEST_USER_PASSWORD }}
    API_TOKEN: ${{ secrets.STAGING_API_TOKEN }}
    CI: true

Secrets are added in repo settings → Secrets and variables → Actions. They're masked in logs, encrypted at rest, and only exposed to workflows that explicitly request them.
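To illustrate what "masked in logs" means, here is a minimal sketch of the idea: every occurrence of a registered secret value is replaced with *** before the line is printed. This is not GitHub's implementation; maskSecrets is a hypothetical name and the secret value is made up.

```typescript
// Replaces each occurrence of every secret value in a log line with "***".
function maskSecrets(line: string, secretValues: string[]): string {
  let masked = line;
  for (const value of secretValues) {
    if (value.length === 0) continue; // an empty string would match everywhere
    masked = masked.split(value).join("***");
  }
  return masked;
}

// Hypothetical secret value, for illustration only.
const registeredSecrets = ["s3cr3t-token-value"];

console.log(maskSecrets("Authorization: Bearer s3cr3t-token-value", registeredSecrets));
// Authorization: Bearer ***
```

Masking of this kind only catches exact matches. A secret that is base64-encoded, split across lines, or otherwise transformed before printing will not be recognised, so avoid logging derived values.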

In your tests:

import { test } from "@playwright/test";

test("login as admin", async ({ page }) => {
  // Credentials come from the env block of the workflow step.
  const email = process.env.TEST_USER_EMAIL!;
  const password = process.env.TEST_USER_PASSWORD!;
  await page.goto("/login");
  await page.getByLabel("Email").fill(email);
  await page.getByLabel("Password").fill(password);
});

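The ! non-null assertion only satisfies the type checker: it is erased at compile time and does nothing at runtime. If the env var is missing, the value passed to fill() is undefined, and whatever error you get comes from deeper in the test and does not name the missing variable. To fail fast with a clear message, check the variables explicitly before using them.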
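Because the non-null assertion is erased at compile time, one way to get a fail-fast error at runtime is a small helper that throws when a variable is missing. This is a sketch, not a Playwright API: requireEnv and the EnvMap type are hypothetical names, and the environment is passed in as a plain object so the helper can be exercised without real secrets.

```typescript
// A plain map of environment variables; in a test you would pass process.env.
type EnvMap = Record<string, string | undefined>;

// Returns the variable's value, or throws an error that names the missing variable.
function requireEnv(name: string, env: EnvMap): string {
  const value = env[name];
  if (value === undefined || value === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Example with a made-up environment object.
const fakeEnv: EnvMap = { TEST_USER_EMAIL: "qa@example.com" };

console.log(requireEnv("TEST_USER_EMAIL", fakeEnv)); // qa@example.com

try {
  requireEnv("TEST_USER_PASSWORD", fakeEnv);
} catch (error) {
  console.log((error as Error).message);
  // Missing required environment variable: TEST_USER_PASSWORD
}
```

In a test file the calls would look like const email = requireEnv("TEST_USER_EMAIL", process.env). Treating an empty string as missing is deliberate, since a secret that is not available to the workflow typically arrives as an empty value.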

[Figure: the full pipeline, visualised]

A production-grade workflow file

Putting every pattern together — sharded, cached, environment-aware, with merged reports:

# .github/workflows/playwright.yml
name: Playwright Tests
on:
  push:
    branches: [main]
  pull_request:
 
jobs:
  test:
    timeout-minutes: 30
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        shardIndex: [1, 2, 3, 4]
        shardTotal: [4]
    steps:
      - uses: actions/checkout@v4
 
      - uses: actions/setup-node@v4
        with:
          node-version: lts/*
          cache: "npm"
 
      - run: npm ci
 
      - name: Cache Playwright browsers
        uses: actions/cache@v4
        id: pw-cache
        with:
          path: ~/.cache/ms-playwright
          key: pw-${{ runner.os }}-${{ hashFiles('**/package-lock.json') }}
 
      - name: Install Playwright browsers
        if: steps.pw-cache.outputs.cache-hit != 'true'
        run: npx playwright install --with-deps
 
      - name: Install Playwright system deps
        if: steps.pw-cache.outputs.cache-hit == 'true'
        run: npx playwright install-deps
 
      - name: Run tests
        run: npx playwright test --shard=${{ matrix.shardIndex }}/${{ matrix.shardTotal }}
        env:
          BASE_URL: ${{ secrets.STAGING_URL }}
          TEST_USER_EMAIL: ${{ secrets.TEST_USER_EMAIL }}
          TEST_USER_PASSWORD: ${{ secrets.TEST_USER_PASSWORD }}
          CI: true
 
      - uses: actions/upload-artifact@v4
        if: ${{ !cancelled() }}
        with:
          name: blob-report-${{ matrix.shardIndex }}
          path: blob-report
          retention-days: 1
 
  merge-reports:
    needs: test
    if: always()
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: lts/*, cache: "npm" }
      - run: npm ci
 
      - uses: actions/download-artifact@v4
        with:
          path: all-blob-reports
          pattern: blob-report-*
          merge-multiple: true
 
      - run: npx playwright merge-reports --reporter html ./all-blob-reports
 
      - uses: actions/upload-artifact@v4
        with:
          name: html-report--final
          path: playwright-report
          retention-days: 14

This is what most production Playwright projects' CI looks like. ~80 lines of YAML and the pattern is set for the lifetime of the project.
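To build intuition for what --shard=i/n does, here is a simplified sketch that cuts an ordered test list into n contiguous, near-equal slices and returns slice i. It is an approximation under stated assumptions: Playwright's real sharding works on test groups (whole files unless fullyParallel is enabled) and may distribute them differently; shardSlice and the test names are hypothetical.

```typescript
// Returns the items belonging to shard `shardIndex` (1-based) out of `shardTotal`.
function shardSlice<T>(items: T[], shardIndex: number, shardTotal: number): T[] {
  if (shardTotal < 1 || shardIndex < 1 || shardIndex > shardTotal) {
    throw new RangeError(`Invalid shard ${shardIndex}/${shardTotal}`);
  }
  const start = Math.floor(((shardIndex - 1) * items.length) / shardTotal);
  const end = Math.floor((shardIndex * items.length) / shardTotal);
  return items.slice(start, end);
}

// Ten hypothetical tests split the way the matrix above would request them.
const exampleTests = ["t1", "t2", "t3", "t4", "t5", "t6", "t7", "t8", "t9", "t10"];
const shardSizes = [1, 2, 3, 4].map((i) => shardSlice(exampleTests, i, 4).length);

console.log(shardSizes);                     // [ 2, 3, 2, 3 ]
console.log(shardSlice(exampleTests, 2, 4)); // [ 't3', 't4', 't5' ]
```

The property that matters for CI is that the shards are disjoint and together cover every test, which is why the merge job can combine the blob reports into one complete run.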

Smoke vs full suite — a CI strategy

Running 200 tests across 4 shards and 3 browsers is comprehensive but slow. A common split:

  • PR pipeline — smoke suite (~20 critical tests), Chromium only, no sharding. Runs in 90 seconds. Fast feedback for every commit.
  • Main branch pipeline — full suite, all browsers, sharded. Runs in 5-10 minutes. Catches regressions before they ship.
  • Nightly pipeline — full suite plus visual snapshots and a11y audits. Runs at 2am. Surfaces issues that don't block PRs.

In Playwright, you tag tests with @smoke and filter:

test("homepage loads @smoke", async ({ page }) => { /* ... */ });
test("admin can configure features (full suite only)", async ({ page }) => { /* ... */ });

In the workflow:

# PR job
- run: npx playwright test --grep @smoke
 
# Main job
- run: npx playwright test

This smoke-versus-full split is a common default for teams running large Playwright suites in production.
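The filtering itself is just a regular-expression match against test titles. The sketch below shows the idea with the two titles from the example; it is a simplification (Playwright matches against the full title, including file and describe names), and selectByGrep is a hypothetical name.

```typescript
// Keeps only the titles that match the grep pattern.
// Use a non-global RegExp: a /g pattern keeps state between test() calls.
function selectByGrep(titles: string[], grep: RegExp): string[] {
  return titles.filter((title) => grep.test(title));
}

const exampleTitles = [
  "homepage loads @smoke",
  "admin can configure features (full suite only)",
];

console.log(selectByGrep(exampleTitles, /@smoke/));
// [ 'homepage loads @smoke' ]
console.log(selectByGrep(exampleTitles, /.*/).length);
// 2
```

Since the tag is matched as plain text, any test whose title happens to contain "@smoke" is selected, so keep tag names distinctive.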

Coming from Cypress?

The mappings:

  • cypress run --record --parallel --key X (Cypress Cloud) → npx playwright test --shard=N/M (no service required).
  • cypress-io/github-action → built-in actions/setup-node + npx playwright install --with-deps.
  • Cypress's video recording (on by default before Cypress 13) → Playwright's trace: 'on-first-retry' plus screenshot: 'only-on-failure' is the closest equivalent.

The migration shape: replace the Cypress Action with the Playwright pattern above; remove Cypress Cloud configuration; drop --record and --parallel flags. Most teams find their CI YAML shrinks substantially.

⚠️ Common mistakes

  • Skipping the browser cache and reinstalling on every run. A 60-second per-run install is 6 hours of CI time across 360 PR runs. The cache step is 5 lines of YAML and makes the same workflow ~5x faster on warm runs.
  • Hardcoding BASE_URL in the test code. The config should read process.env.BASE_URL, and tests should navigate with relative paths such as page.goto("/login"). Hardcoding localhost:3000 means the same suite can't run against staging or a preview deploy without code changes.
  • Setting if: success() on the artefact upload. That's the opposite of what you want — failures are when you most need the report. Use if: ${{ !cancelled() }} (or if: always()) so failed runs upload too.
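The CI-time figure in the first bullet is simple arithmetic, sketched below so you can plug in your own numbers. The function names are hypothetical, and the 10-second warm restore time in the second example is an assumed value, not a measurement.

```typescript
// Total hours spent on browser installs across a number of CI runs.
function installHours(runs: number, installSecondsPerRun: number): number {
  return (runs * installSecondsPerRun) / 3600;
}

// Hours saved when a cold install is replaced by a faster warm cache restore.
function hoursSavedByCache(runs: number, coldSeconds: number, warmSeconds: number): number {
  return installHours(runs, coldSeconds) - installHours(runs, warmSeconds);
}

console.log(installHours(360, 60));          // 6
console.log(hoursSavedByCache(360, 60, 10)); // 5 (assuming a 10-second restore)
```

With sharding, multiply the run count by the number of shards, since every shard job performs its own install or cache restore.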

🎯 Practice task

Wire up a complete CI workflow. 30-40 minutes.

  1. Create .github/workflows/playwright.yml with the production-grade workflow above. Adjust BASE_URL to point at any public site you control or a sandbox like https://www.saucedemo.com.

  2. Add a playwright.config.ts that uses BASE_URL:

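    import { defineConfig } from "@playwright/test";
     
    export default defineConfig({
      // Retry once on CI so that trace: "on-first-retry" has a retry to record.
      retries: process.env.CI ? 1 : 0,
      use: {
        baseURL: process.env.BASE_URL || "https://www.saucedemo.com",
        trace: "on-first-retry",
        screenshot: "only-on-failure"
      },
      reporter: process.env.CI ? [["blob"]] : [["html"]],
      workers: process.env.CI ? 2 : undefined
    });

  3. Add at least one @smoke-tagged test: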

    import { test, expect } from "@playwright/test";
     
    test("homepage loads @smoke", async ({ page }) => {
      await page.goto("/");
      await expect(page.getByPlaceholder("Username")).toBeVisible();
    });
     
    test("inventory shows items after login", async ({ page }) => {
      await page.goto("/");
      await page.getByPlaceholder("Username").fill("standard_user");
      await page.getByPlaceholder("Password").fill("secret_sauce");
      await page.getByRole("button", { name: "Login" }).click();
      await expect(page.locator(".inventory_item")).toHaveCount(6);
    });

  4. Push to a branch and open a PR. Watch the Actions tab: the workflow runs across 4 shards, then the merge job runs.

  5. Download the html-report--final artefact. Open playwright-report/index.html. Verify it shows all your tests in one unified view.

  6. Force a failure. Change one assertion to fail (toHaveCount(99)). Push. Watch CI fail. Open the merged report: the failing test shows the assertion error, plus a screenshot and a trace when your config enables them (screenshot: 'only-on-failure', and trace: 'on-first-retry' together with at least one retry). This is the developer experience you want for every CI failure.

  7. Stretch: add a smoke-only PR job that runs a subset:

    smoke:
      runs-on: ubuntu-latest
      if: github.event_name == 'pull_request'
      steps:
        - # ... same setup ...
        - run: npx playwright test --grep @smoke

    Now PRs get fast smoke feedback (90 seconds) while merges run the full sharded suite (5 minutes). This is the dev-experience-vs-coverage trade-off most teams settle on.

You now have a production-grade CI pipeline. The next and final lesson of this chapter looks at the environment layer — Docker images, pinned browser versions, and the local-vs-CI parity that eliminates "works on my machine" once and for all.
