Parallel Execution and Load Balancing — Cypress with TypeScript

Cypress runs specs serially within a single process. A 60-spec suite that takes 15 seconds per spec runs for 15 minutes on every push. Adding CI workers doesn't help unless something shares the specs out between them. This lesson covers the two ways to actually parallelise: Cypress Cloud's smart orchestration (paid, easiest) and matrix-based spec sharding (free, manual). With four workers, either approach takes a 15-minute suite to under 5 minutes; both rely on you keeping spec files focused enough to be the unit of parallelism.

Why "parallel" needs a coordinator

Cypress's process model is one spec at a time. Three workers running npx cypress run against the same project all execute every spec — three identical 15-minute runs in parallel. That's not parallelism; it's wasted CI minutes.

Real parallelisation means distributing different specs to different workers. Worker 1 runs auth/*.cy.ts, worker 2 runs products/*.cy.ts, worker 3 runs checkout/*.cy.ts. Total wall-clock time drops to roughly the longest single shard.

Two ways to coordinate the distribution:

Cypress Cloud — a paid SaaS service. Workers report into the Cloud; the Cloud assigns specs based on historical timing data; you get balanced shards automatically.
Matrix sharding — split specs by a pattern at the CI level. Each matrix job runs its own subset. Free; manual; works on every CI.
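
The core idea behind both options, giving each worker a disjoint subset of the spec files, can be sketched in a few lines of TypeScript. This is a minimal illustration only: shardRoundRobin and the spec paths are hypothetical, and neither Cypress Cloud nor any plugin is claimed to work exactly this way.

```typescript
// Hypothetical helper for illustration; not part of Cypress or any plugin.
// Returns the specs that worker `index` (0-based) out of `total` workers should run.
function shardRoundRobin(specs: string[], total: number, index: number): string[] {
  if (!Number.isInteger(total) || total < 1) {
    throw new Error("total must be a positive integer");
  }
  if (!Number.isInteger(index) || index < 0 || index >= total) {
    throw new Error("index must be an integer in [0, total)");
  }
  // Sort first so every worker sees the same order, whatever the filesystem returns.
  const ordered = [...specs].sort();
  return ordered.filter((_, i) => i % total === index);
}

// Invented spec paths, used only to show the split.
const allSpecs = [
  "cypress/e2e/auth/login.cy.ts",
  "cypress/e2e/auth/logout.cy.ts",
  "cypress/e2e/checkout/cart.cy.ts",
  "cypress/e2e/checkout/payment.cy.ts",
  "cypress/e2e/products/list.cy.ts",
];

// Worker 0 of 3 gets auth/login.cy.ts and checkout/payment.cy.ts.
console.log(shardRoundRobin(allSpecs, 3, 0));
```

Because every worker sorts the same list and applies the same rule, no coordinator is needed to keep the subsets disjoint. What round-robin cannot do is account for how long each spec takes, which is where duration-aware balancing comes in.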

Cypress Cloud parallelisation

The Cloud-driven setup is the smoothest. Add record: true and parallel: true to the action:

# .github/workflows/cypress.yml
jobs:
  cypress-run:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        containers: [1, 2, 3, 4]   # 4 parallel workers
    steps:
      - uses: actions/checkout@v4
      - uses: cypress-io/github-action@v6
        with:
          record: true
          parallel: true
          group: "CI"
          build: npm run build
          start: npm start
          wait-on: "http://localhost:3000"
        env:
          CYPRESS_RECORD_KEY: ${{ secrets.CYPRESS_RECORD_KEY }}
          GITHUB_TOKEN:       ${{ secrets.GITHUB_TOKEN }}

Four matrix entries spawn four parallel jobs. Each one calls into Cypress Cloud, which hands out specs ("you take spec A, spec C, spec G") using historical durations to balance the load. The four workers finish at roughly the same time, and the test phase completes in about a quarter of the serial duration; each job still pays its own checkout, install, and build time.

The record: true flag uploads results to the Cloud (covered in the next lesson). The parallel: true flag is what activates the load-balancer; without it, record: true alone just uploads results without parallelising.

group: "CI" lets you run the same suite across multiple groups (e.g., one group for Chrome, one for Firefox) and have the Cloud track them separately.

Matrix sharding without Cypress Cloud

If you don't want to pay for Cloud, split specs by glob in a matrix:

strategy:
  fail-fast: false
  matrix:
    spec-group:
      - "cypress/e2e/auth/**/*.cy.ts"
      - "cypress/e2e/products/**/*.cy.ts"
      - "cypress/e2e/checkout/**/*.cy.ts"
      - "cypress/e2e/admin/**/*.cy.ts"
 
steps:
  - uses: actions/checkout@v4
  - uses: cypress-io/github-action@v6
    with:
      spec: ${{ matrix.spec-group }}
      build: npm run build
      start: npm start
      wait-on: "http://localhost:3000"

Four jobs, each running its own folder of specs. No Cloud, no record key, no recurring fee. The downside: balance is manual — if auth/ runs in 30 seconds and checkout/ runs in 10 minutes, the suite is bottlenecked on checkout/.

For a fairer split, lean on a script that distributes spec files round-robin or by historical duration. The OSS package cypress-split does this:

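strategy:
  fail-fast: false
  matrix:
    container: [1, 2, 3, 4]
steps:
  - uses: actions/checkout@v4
  - run: npm ci
  - run: npx cypress run
    env:
      SPLIT: ${{ strategy.job-total }}        # total runners (4)
      SPLIT_INDEX: ${{ strategy.job-index }}  # zero-based index of this runner

cypress-split is a Cypress plugin rather than a standalone runner: you register it in setupNodeEvents in cypress.config.ts, and on each runner it filters the spec list using the SPLIT and SPLIT_INDEX environment variables. Per its README, it splits the sorted spec list into even chunks by default, and it can balance by duration instead if you point SPLIT_FILE at a JSON file of recorded timings. The assignment is deterministic, so every runner agrees on who runs what. It's the closest free alternative to Cloud's smart orchestration.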
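
To make "split by historical duration" concrete, here is a minimal sketch of one common heuristic: sort specs longest-first and always give the next spec to the least-loaded worker. It is an illustration under stated assumptions, not the algorithm that cypress-split or Cypress Cloud actually uses; the SpecTiming and Shard types, the splitByDuration function, and the timings are all invented for the example.

```typescript
// Hypothetical types and helper; timings are invented, not measured.
type SpecTiming = { spec: string; seconds: number };
type Shard = { specs: string[]; totalSeconds: number };

function splitByDuration(timings: SpecTiming[], workers: number): Shard[] {
  if (!Number.isInteger(workers) || workers < 1) {
    throw new Error("workers must be a positive integer");
  }
  const shards: Shard[] = Array.from({ length: workers }, () => ({
    specs: [] as string[],
    totalSeconds: 0,
  }));
  // Longest specs first; tie-break on name so every run gives the same result.
  const ordered = [...timings].sort(
    (a, b) => b.seconds - a.seconds || (a.spec < b.spec ? -1 : a.spec > b.spec ? 1 : 0)
  );
  for (const timing of ordered) {
    // Find the shard with the smallest running total (first shard wins ties).
    let lightest = shards[0];
    for (const shard of shards) {
      if (shard.totalSeconds < lightest.totalSeconds) lightest = shard;
    }
    lightest.specs.push(timing.spec);
    lightest.totalSeconds += timing.seconds;
  }
  return shards;
}

const exampleTimings: SpecTiming[] = [
  { spec: "checkout/payment.cy.ts", seconds: 600 },
  { spec: "auth/login.cy.ts", seconds: 30 },
  { spec: "products/list.cy.ts", seconds: 120 },
  { spec: "admin/users.cy.ts", seconds: 90 },
  { spec: "search/filters.cy.ts", seconds: 200 },
  { spec: "profile/edit.cy.ts", seconds: 160 },
];

// With 2 workers, the 600s spec gets a shard to itself and the other five share the second.
for (const shard of splitByDuration(exampleTimings, 2)) {
  console.log(shard.totalSeconds, shard.specs);
}
```

In this example both shards total 600 seconds, whereas a split by folder would leave one worker idle for most of the run. The heuristic is only as good as the timing data, so durations need refreshing as specs change.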

Sorry-cypress — the open-source Cloud

For teams that want Cloud-style features without the SaaS cost, sorry-cypress is a self-hosted compatible service. Same record: true / parallel: true config in your CI workflow, except you point at a sorry-cypress server you run yourself. Useful for on-prem requirements or large teams whose Cloud bill would dwarf the cost of running a small server.

Don't reach for sorry-cypress unless cost is a real constraint: the Cloud's UX, flake detection, and freedom from maintenance are usually worth the subscription.

Why spec-file size matters

Parallel execution distributes whole spec files. Cypress can't split one spec across two workers — the unit of parallelism is the file. So a 12-minute spec sets a 12-minute floor regardless of how many workers you add:

Total = max(longest_single_spec, total_duration / N_workers)

Real numbers, illustrative:

60-test suite timing, sequential vs parallel

Workers                          Wall-clock
1 (serial)                       900s
2                                480s
4                                260s
8                                180s
16 (capped by longest spec)      170s

Going from 8 to 16 workers saves only 10 seconds (180s to 170s): the longest single spec is now the bottleneck. Adding more workers stops helping; the only further speedup comes from splitting that fat spec.
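
The formula above is easy to turn into a quick estimator for your own suite. The sketch below is a lower bound under simplifying assumptions: it assumes perfectly balanced shards and ignores install, build, and browser start-up time, so real runs will come in higher. The estimateWallClock function and the durations are invented for illustration and are not the measurements listed above.

```typescript
// Hypothetical helper implementing: max(longest_single_spec, total_duration / N_workers).
// Returns a lower bound in seconds; per-job overhead is deliberately ignored.
function estimateWallClock(specSeconds: number[], workers: number): number {
  if (!Number.isInteger(workers) || workers < 1) {
    throw new Error("workers must be a positive integer");
  }
  if (specSeconds.length === 0) return 0;
  const total = specSeconds.reduce((sum, s) => sum + s, 0);
  const longest = Math.max(...specSeconds);
  return Math.max(longest, total / workers);
}

// Invented suite: one 180s spec plus 48 specs of 15s each (900s in total).
const durations: number[] = [180, ...Array.from({ length: 48 }, () => 15)];

for (const workers of [1, 2, 4, 8, 16]) {
  console.log(`${workers} workers: ${estimateWallClock(durations, workers)}s`);
}
// 1 -> 900, 2 -> 450, 4 -> 225, 8 -> 180, 16 -> 180
```

From 8 workers onward the estimate stays at 180 seconds because the single long spec dominates. Splitting that spec in two would lower the floor to 90 seconds.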

The discipline that pays off: keep individual .cy.ts files in the 1-3 minute range. A spec running 8 minutes is a sign you should split it into two files — same coverage, twice the parallelism.

Speed levers beyond parallelism

Parallel execution multiplies whatever speed you already have. Compound it with the techniques you've already learned:

API login + cy.session (chapter 6) — saves 4 seconds per test, 4 minutes on a 60-test suite, and that's before parallelism.
Stub network responses (chapter 4) — kills the dependency on slow staging APIs.
App actions for state setup (chapter 5) — replaces UI-driven setup with sub-second function calls.
No cy.wait(ms) anywhere (chapter 3) — every fixed wait is dead time.
Independent specs — no spec depends on the order of another. Parallelism breaks the moment two specs share state.

A team that does all five gets a 4× speedup from parallelism on top of a suite that's already 3× faster than the naive version. The two multiply to 12×: a 12-minute naive serial run becomes a roughly 1-minute parallel run.

A complete production setup

Putting the whole stack together, a real cypress.yml for a 60-spec suite:

name: Cypress
 
on:
  pull_request:
    branches: [main]
 
jobs:
  cypress-run:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        containers: [1, 2, 3, 4]
    steps:
      - uses: actions/checkout@v4
 
      - uses: cypress-io/github-action@v6
        with:
          record: true
          parallel: true
          group: "CI - Chrome"
          browser: chrome
          build: npm run build
          start: npm start
          wait-on: "http://localhost:3000"
        env:
          CYPRESS_RECORD_KEY:     ${{ secrets.CYPRESS_RECORD_KEY }}
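          # Overrides baseUrl to point at staging; drop this line if the tests should hit the server started above.
          CYPRESS_BASE_URL:       ${{ secrets.STAGING_URL }}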
          CYPRESS_ADMIN_PASSWORD: ${{ secrets.STAGING_ADMIN_PASSWORD }}
          GITHUB_TOKEN:           ${{ secrets.GITHUB_TOKEN }}

Four parallel workers. Cloud-orchestrated load balancing. Secret-injected env vars. PR-only trigger to keep cost down. This is the pattern most production Cypress projects converge on.

⚠️ Common mistakes

Adding record: true without parallel: true. The Cloud receives results from every worker but doesn't distribute specs — every worker still runs every spec. The recording feature works; the speedup doesn't. Both flags are needed for parallelism.
Sharding by glob on folders of wildly different sizes. auth/ runs in 30s and checkout/ runs in 8 minutes; sharding by folder creates a 30-second worker and an 8-minute worker. Either rebalance the folders or use Cloud / cypress-split for duration-aware splitting.
Sharing state between specs and breaking under parallelism. A test in auth/login.cy.ts creates a user; a test in dashboard/welcome.cy.ts assumes the user exists. Serial: works. Parallel: the dashboard spec runs before the auth spec on a different worker and fails. Every spec must be independent — chapter 9 returns to this rule explicitly.

🎯 Practice task

Run a real 4-way parallel suite. 30-40 minutes.

With Cypress Cloud (preferred if budget allows) — sign up, create a project, copy the record key, add it to GitHub Actions secrets as CYPRESS_RECORD_KEY. Update your workflow to add record: true and parallel: true plus a 4-container matrix. Push a PR; confirm four parallel jobs run.
Without Cypress Cloud: install cypress-split (npm install --save-dev cypress-split) and register the plugin in setupNodeEvents in cypress.config.ts. Configure a 4-container matrix and set SPLIT: ${{ strategy.job-total }} and SPLIT_INDEX: ${{ strategy.job-index }} as env on the step that runs npx cypress run. Push; confirm specs are distributed across four workers.
Time the serial baseline — run the full suite in one job (runs-on: ubuntu-latest, no matrix). Note the wall-clock time.
Time the parallel run: re-run with the matrix. Confirm the wall-clock dropped to roughly serial-time / 4 plus per-job setup, and note that it can never drop below the longest single spec.
Identify the bottleneck spec — open Cypress Cloud or read the action output. Which single spec takes longest? Could it be split into two files?
Compound the speedup — replace any UI logins in the bottleneck spec with cy.sessionLogin (chapter 6) or app-actions-based seeding (chapter 5). Re-run; confirm the slowest spec dropped.
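Stretch: add a second matrix dimension for browser (browser: [chrome, firefox]). With 4 containers × 2 browsers, 8 jobs run in parallel. If you're using Cypress Cloud, give each browser its own group (for example group: "CI - ${{ matrix.browser }}") so the Cloud balances each browser's run separately. Confirm the matrix expands as expected and no jobs share state.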

The last lesson of chapter 8 takes a closer look at Cypress Cloud — its dashboard, analytics, and the flake detection that turns "rerun until green" into a managed workflow.