Caching Dependencies for Faster Builds

A sharded 4-runner workflow spends 90 seconds on dependency installation per runner. That's 6 minutes of total compute time spent downloading the same files that were already downloaded in yesterday's build. Multiply across a team that opens 15 PRs per day and you're burning 90 minutes of CI time daily on npm install. Caching eliminates this. On a warm cache, installation takes 3–5 seconds instead of 90.

How CI caching works

A cache stores a directory (like node_modules or ~/.m2/repository) after a build and restores it at the start of the next build. The cache is keyed on a hash, typically of the file that pins your dependencies: a lock file such as package-lock.json, or pom.xml for Maven, which has no separate lock file. When the hash matches, the cache restores in seconds. When that file changes (you add, remove, or upgrade a dependency), the hash changes, the cache misses, and dependencies are installed fresh.

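The invariant is simple: the same lock file always produces the same dependency tree. If the lock file hasn't changed, the packages to install are identical to what was installed last time, so restoring from cache is safe. A pom.xml gives a weaker guarantee than a true lock file, because SNAPSHOT versions and version ranges can resolve differently without the file changing.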

GitHub Actions: built-in caching

The easiest way to cache in GitHub Actions is the built-in support in the setup actions. One property enables it:

- uses: actions/setup-node@v4
  with:
    node-version: '20'
    cache: 'npm'              # caches ~/.npm between runs
 
- uses: actions/setup-java@v4
  with:
    java-version: '21'
    distribution: 'temurin'
    cache: 'maven'            # caches ~/.m2/repository between runs
 
- uses: actions/setup-python@v5
  with:
    python-version: '3.12'
    cache: 'pip'              # caches pip's download cache

These handle everything: computing the cache key from the lock file, restoring on a hit, and saving on a miss. Use built-in caching first — it's less configuration and less to maintain.
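For context, here is a minimal sketch of a complete workflow file with the Node.js variant in place. The file name, workflow name, job name, and test command are assumptions for illustration; only the cache property comes from the snippets above.

```yaml
# .github/workflows/ci.yml (hypothetical file name)
name: CI
on: [push, pull_request]

jobs:
  test:                                # hypothetical job name
    runs-on: ubuntu-latest
    steps:
      # Check out first: the lock file must exist before the cache key is computed
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'                 # restores ~/.npm, keyed on the lock file
      - run: npm ci                    # still runs, but reads packages from the restored cache
      - run: npm test                  # assumed test command
```

The checkout step has to come before the setup step, because the setup action hashes the lock file in the checked-out repository.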

GitHub Actions: manual actions/cache

When built-in caching isn't available — a custom tool, an unusual directory, or Playwright's browser binaries — use actions/cache directly:

- name: Cache Playwright browsers
  uses: actions/cache@v4
  with:
    path: ~/.cache/ms-playwright
    key: playwright-${{ runner.os }}-${{ hashFiles('**/package-lock.json') }}
    restore-keys: |
      playwright-${{ runner.os }}-
 
- run: npm ci
- run: npx playwright install --with-deps chromium

On an ubuntu-latest runner the key is playwright-Linux-<hash>, because runner.os expands to the operating system name (Linux, Windows, or macOS), not the runner label. When package-lock.json changes, the hash changes and the cache misses. restore-keys provides a fallback: if the exact key misses, the action restores the most recent cache entry whose key starts with playwright-Linux-. A partial hit is better than a complete miss: it restores the closest previous cache, and npm ci plus playwright install bring it up to date.
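One possible refinement of the step above is to skip the browser download when the exact key hits. The sketch below relies on the cache-hit output of actions/cache, which is 'true' only on an exact key match; the step id playwright-cache and the step names are hypothetical.

```yaml
- name: Cache Playwright browsers
  id: playwright-cache                 # hypothetical id, needed to read the output below
  uses: actions/cache@v4
  with:
    path: ~/.cache/ms-playwright
    key: playwright-${{ runner.os }}-${{ hashFiles('**/package-lock.json') }}
    restore-keys: |
      playwright-${{ runner.os }}-

- run: npm ci

# Miss or partial hit: download browsers and the OS packages they need
- name: Install Playwright browsers
  if: steps.playwright-cache.outputs.cache-hit != 'true'
  run: npx playwright install --with-deps chromium

# Exact hit: browsers are already restored, but OS packages are not part of the cache
- name: Install OS dependencies only
  if: steps.playwright-cache.outputs.cache-hit == 'true'
  run: npx playwright install-deps chromium
```

A partial hit through restore-keys does not set cache-hit to 'true', so the full install still runs in that case and tops up whatever the older cache is missing.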

Caching Maven repositories:

- uses: actions/cache@v4
  with:
    # The "!" line excludes your own artifacts (com.example) from the cache
    path: |
      ~/.m2/repository
      !~/.m2/repository/com/example
    key: maven-${{ runner.os }}-${{ hashFiles('**/pom.xml') }}
    restore-keys: |
      maven-${{ runner.os }}-

Excluding your own group ID prevents stale local artifacts from being restored from cache.

Caching Gradle:

- uses: actions/cache@v4
  with:
    path: |
      ~/.gradle/caches
      ~/.gradle/wrapper
    key: gradle-${{ runner.os }}-${{ hashFiles('**/*.gradle*', '**/gradle-wrapper.properties') }}
    restore-keys: |
      gradle-${{ runner.os }}-

What and what not to cache

| Cache this | Don't cache this |
| --- | --- |
| ~/.m2/repository (Maven packages) | Build output (target/, build/) |
| ~/.npm or node_modules | Test results and screenshots |
| ~/.cache/ms-playwright (Playwright binaries) | Secrets or credentials |
| ~/.gradle/caches | Files that change on every build |
| ~/.cache/pip (Python packages) | Large generated files (> 1 GB) |

GitHub Actions imposes a 10GB cache limit per repository. Caches that haven't been accessed in 7 days are evicted automatically. Don't cache build artifacts — they change every run and pollute the cache with stale data.

Speed impact: cached vs uncached

Typical setup times for a project that combines Maven, npm, and Playwright:

| Step | Uncached | Cached |
| --- | --- | --- |
| Maven dependency download | 3–5 minutes | 5–15 seconds |
| Playwright browser install | 60–90 seconds | 3–5 seconds |
| npm install (300 packages) | 60–90 seconds | 2–5 seconds |

On a 4-shard workflow, every shard pays the uncached cost independently. With caching, all four shards share a warm cache.
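As a sketch of how sharding and caching combine, the job below runs four shards through a matrix, and every shard uses the same cache configuration. The job name and the choice of Playwright's --shard flag as the test runner are assumptions for illustration.

```yaml
jobs:
  test:                                # hypothetical job name
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        shard: [1, 2, 3, 4]
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'                 # same OS and lock file, so all shards compute the same key
      - run: npm ci
      # Assumed runner: each shard executes one quarter of the suite
      - run: npx playwright test --shard=${{ matrix.shard }}/4
```

Because the key depends only on the operating system and the lock file, the four shards restore the same cache entry once it exists.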

Uncached vs cached pipeline: same test suite, same test count

Uncached (every run)

  • npm install: 90 seconds

    Downloads 200 MB from the npm registry every run

  • Playwright install: 80 seconds

    Downloads browser binaries from Playwright's CDN every run

  • Maven resolve: 4 minutes

    Downloads JARs from Maven Central every run

  • 4 shards × 6 min 50 s setup ≈ 27 min overhead

    Pure waiting time before a test runs

Cached (warm cache)

  • npm install: 5 seconds

    Restores node_modules from cache and skips the network

  • Playwright install: 4 seconds

    Restores ~/.cache/ms-playwright from cache

  • Maven resolve: 12 seconds

    Restores ~/.m2/repository with no network needed

  • 4 shards × 21 sec setup ≈ 1.4 min overhead

    Time saved: about 26 minutes per pipeline run

Jenkins caching

Jenkins doesn't have a built-in equivalent to actions/cache. Two approaches are common:

Persistent workspace: by default, Jenkins keeps the workspace between builds, so node_modules persists across runs on the same agent, and ~/.m2 persists in the agent user's home directory. This is implicit caching. It works as long as the same agent picks up the job, which isn't guaranteed in multi-agent setups.

Job Cacher plugin: for explicit caching in multi-agent setups, install the Job Cacher plugin, which provides the cache step used below. Configuration is similar in concept to actions/cache:

cache(maxCacheSize: 400, caches: [
    arbitraryFileCache(path: '~/.m2/repository', cacheValidityDecidingFile: 'pom.xml'),
    arbitraryFileCache(path: 'node_modules', cacheValidityDecidingFile: 'package-lock.json')
]) {
    sh 'mvn test -B'
}

For Maven specifically, the -Dmaven.repo.local flag lets you pin the local repository to a workspace subdirectory that persists between builds:

sh 'mvn test -Dmaven.repo.local=${WORKSPACE}/.m2-local -B'

⚠️ Common mistakes

  • Caching node_modules instead of ~/.npm. Restoring node_modules directly is faster, but it can cause subtle issues: some packages contain native binaries compiled for the OS and Node.js version of the machine that built the cache. It also gains nothing if you install with npm ci, which deletes any existing node_modules before installing. Caching ~/.npm (the download cache) is safer: npm ci still runs, but reuses cached packages rather than re-downloading them.
  • Not including restore-keys. Without restore-keys, a cache miss (because package-lock.json changed) means a full re-download. With restore-keys: maven-${{ runner.os }}-, a partial cache is restored and only the delta is downloaded — much faster for incremental dependency updates.
  • Caching too aggressively with stale keys. Setting key to something that never changes (key: always-hit) means you keep restoring the same stale cache, which is never refreshed and may have outdated, corrupted, or missing packages. The lock file hash is the right key because it changes exactly when dependencies change.
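The key-related mistakes above can be contrasted in a short sketch. The npm- prefix is a hypothetical naming choice; the path and the hashFiles pattern follow the earlier examples.

```yaml
# Avoid: a constant key. The first saved entry is restored forever,
# because an exact hit is never overwritten with a newer cache.
- uses: actions/cache@v4
  with:
    path: ~/.npm
    key: always-hit

# Prefer: the lock file hash changes exactly when dependencies change,
# and restore-keys falls back to the closest older entry on a miss.
- uses: actions/cache@v4
  with:
    path: ~/.npm
    key: npm-${{ runner.os }}-${{ hashFiles('**/package-lock.json') }}
    restore-keys: |
      npm-${{ runner.os }}-
```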

🎯 Practice task

Add caching to your CI workflow — 20 minutes.

  1. If using setup-node or setup-java: add the cache: property ('npm' or 'maven'). Push. Look at the Actions log for a "Cache restored from key" message on a hit, or a "cache is not found" message on a miss.
  2. If using Playwright: add a separate actions/cache step for ~/.cache/ms-playwright keyed on package-lock.json. Run twice. Confirm the second run shows "Cache restored" and the Playwright install step takes 3–5 seconds instead of 60+.
  3. Compare the "Setup" step duration between the first run (cold cache) and the second run (warm cache). Record the saving.
  4. Stretch: on a sharded workflow (4 shards), measure total setup time without caching (4 × cold install) vs with caching (4 × warm restore). If your setup costs 90 seconds cold and 8 seconds warm, that's 82 seconds × 4 shards = 5.5 minutes saved per pipeline run.

The final lesson in this chapter covers test selection — running only the tests that are relevant to a given code change, the next frontier beyond parallelism and caching.
