A test suite that only runs locally is a half-built suite. The whole point of automation is that the tests run on every pull request, every merge, and every deploy, without a human remembering to click "run." GitHub Actions is the most common CI for modern web teams; Playwright integrates with it cleanly because both are first-party Microsoft tooling. This lesson covers the canonical workflow file, the patterns for caching browsers and dependencies (a large speedup), serving the app under test, environment secrets, and the matrix combinations you'll actually use.
The minimal viable workflow
Every Playwright GitHub Actions workflow boils down to: install Node, install npm deps, install browsers, run tests, upload report. Here's the simplest version that works:
# .github/workflows/playwright.yml
name: Playwright Tests
on: [push, pull_request]
jobs:
  test:
    timeout-minutes: 30
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: lts/*
      - name: Install dependencies
        run: npm ci
      - name: Install Playwright browsers
        run: npx playwright install --with-deps
      - name: Run tests
        run: npx playwright test
      - uses: actions/upload-artifact@v4
        if: ${{ !cancelled() }}
        with:
          name: playwright-report
          path: playwright-report/
          retention-days: 30

Push this. Open the Actions tab on GitHub. The workflow runs on every push and PR. If tests fail, the playwright-report/ artefact is downloadable from the run summary.
Three things to internalise:
- npx playwright install --with-deps: installs the browser binaries and the system libraries Linux needs (fonts, codecs, GTK). The --with-deps flag is what makes this work on a fresh Ubuntu runner. Without it, you'll get cryptic "missing library" errors.
- if: ${{ !cancelled() }} on the artefact upload: this means "upload even on test failure (but not on workflow cancel)." Without it, failed runs upload nothing, exactly when you want the report most.
- timeout-minutes: 30: a global runaway protection. Set it generously above your expected suite time; the goal is to catch infinite loops, not to time-bound healthy runs.
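The upload conditions that come up in this lesson (success(), always(), and !cancelled()) differ only in which run outcomes they let through. The following is a minimal TypeScript sketch of that truth table, assuming a simplified model in which a run is reduced to two flags. RunState and stepRuns are hypothetical names for illustration, not part of GitHub Actions.

```typescript
// Simplified model of a workflow run at the moment a step's `if:` is evaluated.
type RunState = { failed: boolean; cancelled: boolean };
type Condition = "success()" | "always()" | "!cancelled()";

// Returns whether a step guarded by `condition` would run in the given state.
function stepRuns(condition: Condition, state: RunState): boolean {
  switch (condition) {
    case "success()":
      return !state.failed && !state.cancelled;
    case "always()":
      return true;
    case "!cancelled()":
      return !state.cancelled;
  }
}

const scenarios: Record<string, RunState> = {
  "tests passed": { failed: false, cancelled: false },
  "tests failed": { failed: true, cancelled: false },
  "run cancelled": { failed: false, cancelled: true },
};

const conditions: Condition[] = ["success()", "always()", "!cancelled()"];
for (const [label, state] of Object.entries(scenarios)) {
  const row = conditions.map((c) => `${c}=${stepRuns(c, state)}`).join("  ");
  console.log(`${label}: ${row}`);
}
```

In this model, only !cancelled() uploads the report for a failed run while still skipping the upload when someone cancels the workflow.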
Caching for speed
Cold installs are slow: npm ci takes 30-60s, and downloading three browser binaries takes another 30-60s. On every PR, every push, every retry. Cache them:
- uses: actions/setup-node@v4
  with:
    node-version: lts/*
    cache: "npm" # caches npm's download cache, keyed on package-lock.json
- name: Install dependencies
  run: npm ci
- name: Cache Playwright browsers
  uses: actions/cache@v4
  id: pw-cache
  with:
    path: ~/.cache/ms-playwright
    key: pw-${{ runner.os }}-${{ hashFiles('**/package-lock.json') }}
- name: Install Playwright browsers (only if cache missed)
  if: steps.pw-cache.outputs.cache-hit != 'true'
  run: npx playwright install --with-deps
- name: Install Playwright system deps (always)
  run: npx playwright install-deps

Two caches, two wins:
- cache: "npm" on setup-node is the simplest. It caches the npm download cache (not node_modules itself), so npm ci becomes much faster on warm runs.
- actions/cache@v4 on ~/.cache/ms-playwright caches the browser binaries. The cache key includes package-lock.json so it invalidates when you upgrade Playwright.
The install-deps step (without --with-deps) installs only the system libraries every run — they're tiny but version-specific to the runner OS, so we don't cache them. The actual browser binaries (the big download) come from the cache.
After caching, a typical PR run goes from ~3 minutes of setup to ~30 seconds. Multiplied across hundreds of PRs a month, this is a meaningful CI cost saving.
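A small sketch shows why keying the browser cache on the lockfile works. It assumes a plain SHA-256 over the lockfile text, which is a simplification of what hashFiles does (GitHub hashes each matched file, then hashes those hashes). browserCacheKey and the lockfile snippets are hypothetical.

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper: mimics the shape of the key `pw-<os>-<lockfile hash>`.
function browserCacheKey(os: string, lockfileContents: string): string {
  const digest = createHash("sha256").update(lockfileContents).digest("hex");
  return `pw-${os}-${digest}`;
}

// Two made-up lockfile snapshots that differ only in the Playwright version.
const keyBeforeUpgrade = browserCacheKey("Linux", '{"@playwright/test":"1.44.0"}');
const keyAfterUpgrade = browserCacheKey("Linux", '{"@playwright/test":"1.45.0"}');

console.log(keyBeforeUpgrade === keyAfterUpgrade); // false: the upgrade yields a new key
```

Any change to package-lock.json produces a new key, not only a Playwright upgrade, so an unrelated dependency bump also costs one cold browser download.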
Running against a deployed environment
For staging tests — your app is already deployed to staging.example.com, and you just want to point Playwright at it — set BASE_URL and skip the local server entirely:
- run: npx playwright test
  env:
    BASE_URL: https://staging.example.com

In playwright.config.ts:

use: { baseURL: process.env.BASE_URL || "http://localhost:3000" }

CI sets BASE_URL; local dev uses the default. One config, two environments.
This pattern is also what you use to run the same tests against per-PR preview deploys (Vercel, Netlify, Render previews):
- run: npx playwright test
  env:
    BASE_URL: ${{ steps.deploy.outputs.preview-url }}

Where the preview URL is captured from a previous deploy step.
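The BASE_URL fallback from the config can also be written as a standalone function, which makes the behaviour easy to check in isolation. This is a sketch under assumptions: resolveBaseUrl is a hypothetical helper, and the URL validation is an addition that the lesson's one-line config does not perform.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical helper mirroring `process.env.BASE_URL || "http://localhost:3000"`.
function resolveBaseUrl(env: Env, fallback = "http://localhost:3000"): string {
  // `||` treats an empty string like a missing variable, so an unset secret
  // (which GitHub Actions expands to "") still falls back to the default.
  const candidate = env["BASE_URL"] || fallback;
  // new URL() throws a TypeError on malformed input, so typos surface early.
  new URL(candidate);
  return candidate;
}

console.log(resolveBaseUrl({})); // http://localhost:3000
console.log(resolveBaseUrl({ BASE_URL: "https://staging.example.com" })); // https://staging.example.com
```

A value without a scheme, such as staging.example.com, would throw here instead of producing confusing navigation errors later.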
Starting a server in CI
For pre-deploy regression tests — the suite runs against the app built and started in CI, not a deployed instance — let Playwright start the server for you:
// playwright.config.ts
import { defineConfig } from "@playwright/test";

export default defineConfig({
  webServer: {
    command: "npm run start",
    url: "http://localhost:3000",
    reuseExistingServer: !process.env.CI,
    timeout: 120 * 1000
  },
  use: { baseURL: "http://localhost:3000" }
});

webServer.command is what Playwright runs to boot the app. webServer.url is the URL Playwright polls until it responds with a status that signals readiness (any 2xx or 3xx, or 400 through 403); once it does, tests start. reuseExistingServer: !process.env.CI says "in local dev, if a server is already running, reuse it" (so you can npm start once and run tests in another terminal); on CI, always start a fresh server.
In the workflow:
- name: Build app
  run: npm run build
- name: Run Playwright tests
  run: npx playwright test
  env:
    CI: true

Playwright handles starting the server, waiting for it to be ready, and shutting it down after tests finish. No background process management in your workflow.
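The "wait until the server is ready" step can be pictured as a polling loop. The sketch below is not Playwright's implementation; it assumes the readiness rule is "2xx, 3xx, or 400 to 403", and looksReady, waitForServer, and the injected probe function are hypothetical names chosen so the loop can run without a real server.

```typescript
// Status codes treated as "server is up" in this sketch: 2xx, 3xx, 400-403.
function looksReady(status: number): boolean {
  return (status >= 200 && status < 400) || (status >= 400 && status <= 403);
}

// probe() returns an HTTP status, or null when the connection is refused.
async function waitForServer(
  probe: () => Promise<number | null>,
  timeoutMs: number,
  intervalMs = 100,
): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    const status = await probe();
    if (status !== null && looksReady(status)) return;
    // Not ready yet: sleep briefly, then poll again.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Server not ready after ${timeoutMs}ms`);
}

// Usage with a fake probe that refuses twice, then answers 200.
let attempts = 0;
const fakeProbe = async () => (++attempts < 3 ? null : 200);
waitForServer(fakeProbe, 5_000, 10).then(() => console.log(`ready after ${attempts} probes`));
```

In a real check, the probe would issue an HTTP request to webServer.url; the 120-second timeout in the config plays the role of timeoutMs.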
Environment variables and secrets
Real test suites need API keys, login credentials, database URLs. GitHub Actions has built-in secrets management:
- run: npx playwright test
  env:
    BASE_URL: https://staging.example.com
    TEST_USER_EMAIL: ${{ secrets.TEST_USER_EMAIL }}
    TEST_USER_PASSWORD: ${{ secrets.TEST_USER_PASSWORD }}
    API_TOKEN: ${{ secrets.STAGING_API_TOKEN }}
    CI: true

Secrets are added in repo settings → Secrets and variables → Actions. They're masked in logs, encrypted at rest, and only exposed to workflows that explicitly request them.
In your tests:
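import { test } from "@playwright/test";

test("login as admin", async ({ page }) => {
  const email = process.env.TEST_USER_EMAIL!;
  const password = process.env.TEST_USER_PASSWORD!;
  await page.goto("/login");
  await page.getByLabel("Email").fill(email);
  await page.getByLabel("Password").fill(password);
});

The ! non-null assertion only satisfies the TypeScript compiler; it performs no check at runtime. If the env var is missing, the value is still undefined when it reaches fill(), and the failure surfaces there rather than as a clear "missing variable" message. To fail fast with a clear error, validate the variables explicitly before using them.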
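One way to get a clear failure for a missing variable is a small guard function. This is a minimal sketch, not a Playwright API: requireEnv is a hypothetical helper, and the env parameter exists only so the function can be exercised without touching the real environment.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical helper: returns the variable's value or throws a descriptive error.
function requireEnv(name: string, env: Env = process.env): string {
  const value = env[name];
  // Treat empty strings as missing too: an unset GitHub secret expands to "".
  if (value === undefined || value === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Usage inside a test file (variable names taken from the workflow above):
//   const email = requireEnv("TEST_USER_EMAIL");
//   const password = requireEnv("TEST_USER_PASSWORD");
```

The return type is string rather than string | undefined, so the ! assertion is no longer needed at the call site.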
[Figure: the full pipeline, visualised]
A production-grade workflow file
Putting every pattern together — sharded, cached, environment-aware, with merged reports:
# .github/workflows/playwright.yml
name: Playwright Tests
on:
  push:
    branches: [main]
  pull_request:
jobs:
  test:
    timeout-minutes: 30
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        shardIndex: [1, 2, 3, 4]
        shardTotal: [4]
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: lts/*
          cache: "npm"
      - run: npm ci
      - name: Cache Playwright browsers
        uses: actions/cache@v4
        id: pw-cache
        with:
          path: ~/.cache/ms-playwright
          key: pw-${{ runner.os }}-${{ hashFiles('**/package-lock.json') }}
      - name: Install Playwright browsers
        if: steps.pw-cache.outputs.cache-hit != 'true'
        run: npx playwright install --with-deps
      - name: Install Playwright system deps
        if: steps.pw-cache.outputs.cache-hit == 'true'
        run: npx playwright install-deps
      - name: Run tests
        run: npx playwright test --shard=${{ matrix.shardIndex }}/${{ matrix.shardTotal }}
        env:
          BASE_URL: ${{ secrets.STAGING_URL }}
          TEST_USER_EMAIL: ${{ secrets.TEST_USER_EMAIL }}
          TEST_USER_PASSWORD: ${{ secrets.TEST_USER_PASSWORD }}
          CI: true
      - uses: actions/upload-artifact@v4
        if: ${{ !cancelled() }}
        with:
          name: blob-report-${{ matrix.shardIndex }}
          path: blob-report
          retention-days: 1
  merge-reports:
    needs: test
    if: always()
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: lts/*, cache: "npm" }
      - run: npm ci
      - uses: actions/download-artifact@v4
        with:
          path: all-blob-reports
          pattern: blob-report-*
          merge-multiple: true
      - run: npx playwright merge-reports --reporter html ./all-blob-reports
      - uses: actions/upload-artifact@v4
        with:
          name: html-report--final
          path: playwright-report
          retention-days: 14

This is what most production Playwright projects' CI looks like. ~80 lines of YAML and the pattern is set for the lifetime of the project.
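To build intuition for --shard=N/M, it helps to see one way a list of tests could be divided so that every test lands in exactly one shard. The sketch below is illustrative only: Playwright's actual distribution strategy is an internal detail and may differ, and shardSlice and the test names are hypothetical.

```typescript
// Illustrative only: splits an ordered list into `total` contiguous chunks
// and returns the chunk for a 1-based shard index.
function shardSlice<T>(items: T[], index: number, total: number): T[] {
  if (index < 1 || index > total) {
    throw new RangeError(`shard ${index}/${total} is out of range`);
  }
  const size = Math.ceil(items.length / total);
  return items.slice((index - 1) * size, index * size);
}

// Ten made-up tests spread over the four shards from the matrix above.
const allTests = ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j"];
for (let shard = 1; shard <= 4; shard++) {
  console.log(`${shard}/4`, shardSlice(allTests, shard, 4));
}
```

Whatever the real algorithm, the property that matters for the merge job is the same: the shards are disjoint and together cover the whole suite, so the merged report lists each test once.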
Smoke vs full suite — a CI strategy
Running 200 tests across 4 shards and 3 browsers is comprehensive but slow. A common split:
- PR pipeline — smoke suite (~20 critical tests), Chromium only, no sharding. Runs in 90 seconds. Fast feedback for every commit.
- Main branch pipeline — full suite, all browsers, sharded. Runs in 5-10 minutes. Catches regressions before they ship.
- Nightly pipeline — full suite plus visual snapshots and a11y audits. Runs at 2am. Surfaces issues that don't block PRs.
In Playwright, you tag tests with @smoke and filter:
import { test } from "@playwright/test";

test("homepage loads @smoke", async ({ page }) => { /* ... */ });
test("admin can configure features (full suite only)", async ({ page }) => { /* ... */ });

# PR job
- run: npx playwright test --grep @smoke
# Main job
- run: npx playwright test

This split is a common default for teams running large Playwright suites in production.
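The --grep filter is, in essence, a regular-expression match against test titles. A simplified sketch of that selection follows; it assumes matching against the bare title only (the real CLI considers more than the title string), and grepTitles and the second smoke title are hypothetical.

```typescript
// Simplified stand-in for `--grep`: keep titles matching a regular expression.
// The pattern must not use the `g` flag, because RegExp.test() is stateful with it.
function grepTitles(titles: string[], pattern: RegExp): string[] {
  return titles.filter((title) => pattern.test(title));
}

const suiteTitles = [
  "homepage loads @smoke",
  "checkout completes @smoke", // made-up title for illustration
  "admin can configure features (full suite only)",
];

console.log(grepTitles(suiteTitles, /@smoke/));
```

Because the tag is just text in the title, a test can carry several tags and be picked up by several pipelines.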
Coming from Cypress?
The mappings:
- cypress run --record --parallel --key X (Cypress Cloud) → npx playwright test --shard=N/M (no service required).
- cypress-io/github-action → built-in actions/setup-node + npx playwright install --with-deps.
- Cypress's video recording (on by default before Cypress 13) → Playwright's trace: 'on-first-retry' plus screenshot: 'only-on-failure' is the equivalent.
The migration shape: replace the Cypress Action with the Playwright pattern above; remove Cypress Cloud configuration; drop --record and --parallel flags. Most teams find their CI YAML shrinks substantially.
⚠️ Common mistakes
- Skipping the browser cache and reinstalling on every run. A 60-second per-run install is 6 hours of CI time across 360 PR runs. The cache step is 5 lines of YAML and makes the same workflow ~5x faster on warm runs.
- Hardcoding BASE_URL in the test code. Tests should read from process.env.BASE_URL. Hardcoding localhost:3000 means the same suite can't run against staging without code changes, defeating the point of having tests.
- Setting if: success() on the artefact upload. That's the opposite of what you want: failures are when you most need the report. Use if: ${{ !cancelled() }} (or if: always()) so failed runs upload too.
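The "6 hours" figure in the first mistake is simple arithmetic, and the same calculation is handy for estimating what a cache saves on your own project. This is a sketch with a hypothetical helper name; only the 60-second and 360-run inputs come from the text.

```typescript
// Total CI time spent on a repeated setup step, in hours.
function setupHours(secondsPerRun: number, runs: number): number {
  return (secondsPerRun * runs) / 3600;
}

console.log(setupHours(60, 360)); // 6
```

Plugging in your own per-run install time and monthly run count gives a quick upper bound on what caching can recover.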
🎯 Practice task
Wire up a complete CI workflow. 30-40 minutes.
- Create .github/workflows/playwright.yml with the production-grade workflow above. Adjust BASE_URL to point at any public site you control or a sandbox like https://www.saucedemo.com.

- Add a playwright.config.ts that uses BASE_URL:

    import { defineConfig } from "@playwright/test";

    export default defineConfig({
      use: {
        baseURL: process.env.BASE_URL || "https://www.saucedemo.com",
        trace: "on-first-retry",
        screenshot: "only-on-failure"
      },
      retries: process.env.CI ? 1 : 0,
      reporter: process.env.CI ? [["blob"]] : [["html"]],
      workers: process.env.CI ? 2 : undefined
    });

- Add at least one @smoke-tagged test:

    import { test, expect } from "@playwright/test";

    test("homepage loads @smoke", async ({ page }) => {
      await page.goto("/");
      await expect(page.getByPlaceholder("Username")).toBeVisible();
    });

    test("inventory shows items after login", async ({ page }) => {
      await page.goto("/");
      await page.getByPlaceholder("Username").fill("standard_user");
      await page.getByPlaceholder("Password").fill("secret_sauce");
      await page.getByRole("button", { name: "Login" }).click();
      await expect(page.locator(".inventory_item")).toHaveCount(6);
    });

- Push to a branch and open a PR. Watch the Actions tab: the workflow runs across 4 shards, then the merge job runs.

- Download the html-report--final artefact. Unzip it and open index.html. Verify it shows all your tests in one unified view.

- Force a failure. Change one assertion to fail (toHaveCount(99)). Push. Watch CI fail. Open the merged report: the failing test shows the assertion error, the auto-captured screenshot, and the trace (recorded on the retry, because the config sets trace: 'on-first-retry' with one retry in CI). This is the developer experience you want for every CI failure.

- Stretch: add a smoke-only PR job that runs a subset:

    smoke:
      runs-on: ubuntu-latest
      if: github.event_name == 'pull_request'
      steps:
        # ... same setup ...
        - run: npx playwright test --grep @smoke

  Now PRs get fast smoke feedback (90 seconds) while merges run the full sharded suite (5 minutes). This is the dev-experience-vs-coverage trade-off most teams settle on.
You now have a production-grade CI pipeline. The next and final lesson of this chapter looks at the environment layer — Docker images, pinned browser versions, and the local-vs-CI parity that eliminates "works on my machine" once and for all.