Caching Dependencies for Faster Builds

A sharded 4-runner workflow spends 90 seconds on dependency installation per runner. That's 6 minutes of total compute time spent downloading the same files that were already downloaded in yesterday's build. Multiply across a team that opens 15 PRs per day and you're burning 90 minutes of CI time daily on npm install. Caching eliminates this. On a warm cache, installation takes 3–5 seconds instead of 90.

How CI caching works

A cache stores a directory (like node_modules or ~/.m2/repository) after a build and restores it at the start of the next build. The cache is keyed on a hash, typically of your lock file or dependency manifest (package-lock.json for npm, pom.xml for Maven). When the hash matches, the cache restores in seconds. When that file changes (you add, remove, or upgrade a dependency), the hash changes, the cache misses, and dependencies are installed fresh.

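The invariant that makes this safe: the same lock file always produces the same dependency tree. If the lock file hasn't changed, the packages it resolves to are identical to what was installed last time, so restoring from cache is safe. A pom.xml that uses SNAPSHOT versions or version ranges gives a weaker guarantee than a true lock file.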
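To make the key mechanics concrete, here is a minimal Python sketch of a cache keyed on a lock-file hash. It is a simplified model, not the implementation of any CI system: the in-memory `store` dict, the `cache_key` and `restore_or_install` helpers, and the sample lock-file contents are all hypothetical.

```python
import hashlib

def cache_key(prefix: str, lockfile_bytes: bytes) -> str:
    """Build a key of the form '<prefix>-<sha256 of the lock file contents>'."""
    digest = hashlib.sha256(lockfile_bytes).hexdigest()
    return f"{prefix}-{digest}"

def restore_or_install(store: dict, key: str) -> str:
    """Return 'hit' if the key is already stored; otherwise record it and return 'miss'."""
    if key in store:
        return "hit"
    store[key] = "dependency directory snapshot"  # placeholder for the saved files
    return "miss"

store = {}  # hypothetical stand-in for the CI provider's cache storage
lock_v1 = b'{"lodash": "4.17.21"}'
lock_v2 = b'{"lodash": "4.17.21", "left-pad": "1.3.0"}'

print(restore_or_install(store, cache_key("npm-Linux", lock_v1)))  # miss (first build)
print(restore_or_install(store, cache_key("npm-Linux", lock_v1)))  # hit (lock file unchanged)
print(restore_or_install(store, cache_key("npm-Linux", lock_v2)))  # miss (dependency added)
```

Any change to the lock file bytes produces a different digest, which is why a changed dependency list never restores an outdated snapshot under its exact key.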

GitHub Actions: built-in caching

The easiest way to cache in GitHub Actions is the built-in support in the setup actions. One property enables it:

- uses: actions/setup-node@v4
  with:
    node-version: '20'
    cache: 'npm'              # caches ~/.npm between runs
 
- uses: actions/setup-java@v4
  with:
    java-version: '21'
    distribution: 'temurin'
    cache: 'maven'            # caches ~/.m2/repository between runs
 
- uses: actions/setup-python@v5
  with:
    python-version: '3.12'
    cache: 'pip'              # caches pip's download cache

These handle everything: computing the cache key from the lock file, restoring on a hit, and saving on a miss. Use built-in caching first — it's less configuration and less to maintain.

GitHub Actions: manual `actions/cache`

When built-in caching isn't available — a custom tool, an unusual directory, or Playwright's browser binaries — use actions/cache directly:

- name: Cache Playwright browsers
  uses: actions/cache@v4
  with:
    path: ~/.cache/ms-playwright
    key: playwright-${{ runner.os }}-${{ hashFiles('**/package-lock.json') }}
    restore-keys: |
      playwright-${{ runner.os }}-
 
- run: npm ci
- run: npx playwright install --with-deps chromium

The key has the form playwright-Linux-<hash>: runner.os evaluates to Linux, Windows, or macOS, not to a runner label such as ubuntu-latest. When package-lock.json changes, the hash changes and the exact key misses. restore-keys provides a fallback: if the exact key misses, the action restores the most recently created cache whose key starts with playwright-Linux-. A partial hit is better than a complete miss: it restores a recent previous cache, and npm ci plus playwright install bring it up to date.
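The exact-then-prefix lookup can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the actions/cache implementation: `find_cache` is a hypothetical helper, the entries and timestamps are made up, and real lookups are additionally scoped by branch.

```python
from typing import Optional

def find_cache(
    entries: list[tuple[str, int]],
    key: str,
    restore_keys: list[str],
) -> Optional[tuple[str, bool]]:
    """entries holds (cache key, creation timestamp) pairs.

    Returns (matched key, True) for an exact hit, (matched key, False) for a
    prefix fallback, or None when nothing matches.
    """
    for name, _ in entries:
        if name == key:
            return name, True
    # Try each restore key in order; within one prefix, prefer the newest entry.
    for prefix in restore_keys:
        candidates = [(created, name) for name, created in entries if name.startswith(prefix)]
        if candidates:
            return max(candidates)[1], False
    return None

entries = [("playwright-Linux-aaa", 100), ("playwright-Linux-bbb", 200)]

print(find_cache(entries, "playwright-Linux-bbb", ["playwright-Linux-"]))  # ('playwright-Linux-bbb', True)
print(find_cache(entries, "playwright-Linux-ccc", ["playwright-Linux-"]))  # ('playwright-Linux-bbb', False)
print(find_cache(entries, "playwright-Linux-ccc", []))                     # None
```

The third call shows what omitting restore-keys costs: a changed lock file yields no cache at all, even though a nearly identical one exists.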

Caching Maven repositories:

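- uses: actions/cache@v4
  with:
    # The '!' pattern excludes your own artifacts (com.example is a placeholder group ID).
    # A comment cannot go inside the block scalar: it would become part of the path.
    path: |
      ~/.m2/repository
      !~/.m2/repository/com/example
    key: maven-${{ runner.os }}-${{ hashFiles('**/pom.xml') }}
    restore-keys: |
      maven-${{ runner.os }}-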

Excluding your own group ID prevents stale local artifacts from being restored from cache.

Caching Gradle:

- uses: actions/cache@v4
  with:
    path: |
      ~/.gradle/caches
      ~/.gradle/wrapper
    key: gradle-${{ runner.os }}-${{ hashFiles('**/*.gradle*', '**/gradle-wrapper.properties') }}
    restore-keys: |
      gradle-${{ runner.os }}-

What and what not to cache

Cache this	Don't cache this
`~/.m2/repository` (Maven packages)	Build output (`target/`, `build/`)
`~/.npm` or `node_modules`	Test results and screenshots
`~/.cache/ms-playwright` (Playwright binaries)	Secrets or credentials
`~/.gradle/caches`	Files that change on every build
`~/.cache/pip` (Python packages)	Large generated files (>1GB)

GitHub Actions imposes a 10GB cache limit per repository. Caches that haven't been accessed in 7 days are evicted automatically. Don't cache build artifacts — they change every run and pollute the cache with stale data.
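The two limits interact, and a small Python model may help when estimating how many caches a repository can hold. This is a sketch under assumptions: the `Entry` class and `evict` function are hypothetical, and the oldest-access-first eviction order is an assumption about how the size limit is enforced, not a documented guarantee reproduced here.

```python
from dataclasses import dataclass

GIB = 1024 ** 3
DAY = 24 * 60 * 60

@dataclass
class Entry:
    key: str
    size_bytes: int
    last_accessed: int  # Unix timestamp in seconds

def evict(entries, now, limit_bytes=10 * GIB, max_idle_seconds=7 * DAY):
    """Return the entries that survive both rules, most recently accessed first."""
    # Rule 1: drop anything not accessed within the idle window.
    fresh = [e for e in entries if now - e.last_accessed <= max_idle_seconds]
    # Rule 2 (assumed order): keep the most recently accessed entries that fit the limit.
    fresh.sort(key=lambda e: e.last_accessed, reverse=True)
    kept, total = [], 0
    for e in fresh:
        if total + e.size_bytes > limit_bytes:
            break  # this entry and everything older is evicted
        kept.append(e)
        total += e.size_bytes
    return kept

now = 30 * DAY
entries = [
    Entry("maven-new", 6 * GIB, now - 1 * DAY),
    Entry("maven-old", 5 * GIB, now - 2 * DAY),     # would push the total to 11 GiB
    Entry("gradle-idle", 1 * GIB, now - 10 * DAY),  # idle for more than 7 days
]
print([e.key for e in evict(entries, now)])  # ['maven-new']
```

Under this model, two large caches can push each other out on alternating runs, which is one more reason to keep bulky build output out of the cache.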

Speed impact: cached vs uncached

Typical figures for a project that combines a Maven build with npm and Playwright tooling:

Step	Uncached	Cached
Maven dependency download	3–5 minutes	5–15 seconds
Playwright browser install	60–90 seconds	3–5 seconds
npm install (300 packages)	60–90 seconds	2–5 seconds

On a 4-shard workflow, every shard pays the uncached cost independently. With caching, all four shards share a warm cache.

Uncached vs cached pipeline: same test suite, same test count

Uncached (every run)

npm install: 90 seconds
Downloads 200MB from npm registry every run
Playwright install: 80 seconds
Downloads browser binaries from Playwright's CDN every run
Maven resolve: 4 minutes
Downloads JARs from Maven Central every run
4 shards × 6 min 50 s setup ≈ 27 min overhead
Pure waiting time before a test runs

Cached (warm cache)

npm install: 5 seconds
Restores node_modules from cache, skipping the network
Playwright install: 4 seconds
Restores ~/.cache/ms-playwright from cache
Maven resolve: 12 seconds
Restores ~/.m2/repository, so no network is needed
4 shards × 21 sec setup ≈ 1.5 min overhead
Time saved: about 26 minutes per pipeline run

Jenkins caching

Jenkins doesn't have a built-in equivalent to actions/cache. The traditional Jenkins approach relies on one of two mechanisms:

Persistent workspace: by default, Jenkins keeps the workspace between builds, so node_modules persists across runs on the same agent, as does ~/.m2 in the agent user's home directory. This is implicit caching: it works as long as the same agent picks up the job, which isn't guaranteed in multi-agent setups.

Job Cacher plugin: for explicit caching in multi-agent setups, install the Job Cacher plugin, which provides the cache step and the arbitraryFileCache type. Configuration is similar in concept to actions/cache:

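cache(maxCacheSize: 400, caches: [
    // Assumption: paths resolve against the workspace and '~' is not expanded,
    // so the Maven repository is kept in a workspace-local directory.
    arbitraryFileCache(path: '.m2/repository', cacheValidityDecidingFile: 'pom.xml'),
    arbitraryFileCache(path: 'node_modules', cacheValidityDecidingFile: 'package-lock.json')
]) {
    // Point Maven at the cached, workspace-local repository
    sh 'mvn test -B -Dmaven.repo.local=.m2/repository'
}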

For Maven specifically, the -Dmaven.repo.local flag lets you pin the local repository to a workspace subdirectory that persists between builds:

sh 'mvn test -Dmaven.repo.local=${WORKSPACE}/.m2-local -B'

⚠️ Common mistakes

Caching node_modules instead of ~/.npm. Restoring node_modules directly is faster, but it can cause subtle issues: some packages contain native binaries compiled for the OS of the machine that built the cache, and npm ci deletes node_modules before installing, so a restored copy is simply discarded. Caching ~/.npm (the download cache) is safer: npm ci still runs, but reuses cached packages rather than re-downloading them.
Not including restore-keys. Without restore-keys, a cache miss (because package-lock.json changed) means a full re-download. With restore-keys: maven-${{ runner.os }}-, a partial cache is restored and only the delta is downloaded — much faster for incremental dependency updates.
Caching too aggressively with stale keys. Setting key to a constant (key: always-hit) means every run restores the same entry. Because an existing cache entry is never overwritten, that entry is never refreshed, and it drifts further from the current dependencies with every change. The lock file hash is the right key because it changes exactly when dependencies change.
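The stale-key mistake follows from the fact that a cache entry is not overwritten once its key exists. A minimal Python sketch of that behaviour, assuming a dict as a stand-in for cache storage (`save`, `lock_hash_key`, and the "deps v1"/"deps v2" contents are hypothetical):

```python
import hashlib

def save(store: dict, key: str, contents: str) -> bool:
    """Save only if the key is new; an existing entry is left untouched."""
    if key in store:
        return False
    store[key] = contents
    return True

def lock_hash_key(lockfile: str) -> str:
    """Derive a short key from the lock file contents."""
    return "npm-" + hashlib.sha256(lockfile.encode()).hexdigest()[:12]

constant, hashed = {}, {}
for lockfile in ["deps v1", "deps v2"]:  # two successive states of the lock file
    save(constant, "always-hit", lockfile)
    save(hashed, lock_hash_key(lockfile), lockfile)

print(constant["always-hit"])  # deps v1
print(len(hashed))             # 2
```

With the constant key, the second build still restores the first build's dependencies; with the hashed key, each lock-file state gets its own entry.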

🎯 Practice task

Add caching to your CI workflow — 20 minutes.

If using setup-node or setup-java: add the cache: property ('npm' or 'maven'). Push. Look at the Actions log for "Cache restored" vs "Cache not found" messages.
If using Playwright: add a separate actions/cache step for ~/.cache/ms-playwright keyed on package-lock.json. Run twice. Confirm the second run shows "Cache restored" and the Playwright install step takes 3–5 seconds instead of 60+.
Compare the "Setup" step duration between the first run (cold cache) and the second run (warm cache). Record the saving.
Stretch: on a sharded workflow (4 shards), measure total setup time without caching (4 × cold install) vs with caching (4 × warm restore). If your setup costs 90 seconds cold and 8 seconds warm, that's 82 seconds × 4 shards = 5.5 minutes saved per pipeline run.
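For the stretch task, the arithmetic can be checked with a short Python helper. `setup_saving_seconds` is a hypothetical name, and the 15 runs per day is borrowed from the 15-PRs-per-day team in the introduction as an assumption of one pipeline run per PR.

```python
def setup_saving_seconds(cold: float, warm: float, shards: int) -> float:
    """Seconds of setup time saved per pipeline run when every shard hits a warm cache."""
    return (cold - warm) * shards

saved = setup_saving_seconds(cold=90, warm=8, shards=4)
print(saved)                 # 328
print(round(saved / 60, 1))  # 5.5 (minutes per pipeline run)

runs_per_day = 15  # assumption: one pipeline run per PR
print(round(saved * runs_per_day / 60))  # 82 (minutes per day)
```

Substitute your own measured cold and warm durations to record the saving for your pipeline.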

The final lesson in this chapter covers test selection — running only the tests that are relevant to a given code change, the next frontier beyond parallelism and caching.