Q18 of 38 · Performance

How do you handle warm-up periods and cold-cache effects in load tests?

Performance · Senior · performance · warmup · cold-cache · jit · test-design

Short answer

Run an explicit warm-up stage at 10-20% of target load to populate caches, trigger JIT compilation, and prime connection pools, then discard those metrics from the report. The real measurement window starts after warm-up. For cold-start tests, intentionally clear caches first.

Detail

A cold system is not what you want to measure in a typical-load test, and a warm system is not what you want to measure in a cold-start test. Both matter; decide which one you are measuring and design the test accordingly.

Why warm-up is needed:

  • Application JIT: JVM/CLR/V8 compile hot paths only after enough iterations. The first requests can be several times slower than steady state (5-10x over the first 100 or so requests is a common rule of thumb; the real figure depends on runtime and workload).
  • Connection pools: DB, HTTP client, and Redis pools start empty. Each first connection pays the handshake/TLS cost; a warmed pool serves from a ready connection.
  • Caches: application caches (e.g. user lookup), CDN caches, DB query plan caches, and the OS page cache all start empty and benefit from warm-up.
  • Autoscaling: if the system auto-scales on CPU, the test starts on N instances and grows during the run, so you measure mid-scaling, not steady state.
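
To make the cache bullet concrete, here is a toy Python model, not a measurement of any real system. The LruCache class, the 50-key round-robin workload, and the 100-request window are all made up for illustration. It only shows that the hit ratio in the first window of requests is lower than in later windows, which is why early latency numbers are not representative.

```python
from collections import OrderedDict


class LruCache:
    """Tiny LRU cache that only tracks whether a key was already present."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._data = OrderedDict()

    def get(self, key):
        """Return True on a hit, False on a miss (the key is then stored)."""
        if key in self._data:
            self._data.move_to_end(key)
            return True
        self._data[key] = True
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # evict least recently used
        return False


def hit_ratio_by_window(keys, capacity, window):
    """Replay keys against a cold cache; return the hit ratio per window."""
    cache = LruCache(capacity)
    ratios = []
    for start in range(0, len(keys), window):
        chunk = keys[start:start + window]
        hits = sum(cache.get(k) for k in chunk)
        ratios.append(hits / len(chunk))
    return ratios


# Hypothetical workload: 50 distinct user ids requested round-robin.
keys = [i % 50 for i in range(300)]
ratios = hit_ratio_by_window(keys, capacity=100, window=100)
print(ratios)
```

In this model the first window is half misses and every later window is all hits. Real caches warm more gradually, but the shape is the same.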

How to structure:

  • 10-20% target load for 5-10 minutes (the warm-up phase).
  • Ramp to target.
  • Hold for measurement window (30+ minutes).
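  • Report only the measurement window. Most tools let you tag results by phase so the phases can be split in analysis; in k6, for example, thresholds can target tagged sub-metrics, and --summary-trend-stats selects which trend statistics (such as p(95)) the end-of-test summary shows.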
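
The phases above can be sketched as a function that maps elapsed time to a phase name and a target request rate. This is a tool-agnostic illustration in Python, not any load tool's API. The function name target_rate, the 15% warm-up fraction, and the 2-minute linear ramp are assumptions; the 5-minute warm-up and 30-minute hold are the low ends of the ranges given above.

```python
def target_rate(t_s, target_rps, warmup_s=300, ramp_s=120, hold_s=1800,
                warmup_fraction=0.15):
    """Return (phase, requests_per_second) for t_s seconds since test start.

    Phases: 'warmup' at a fraction of target, a linear 'ramp' up to target,
    'measure' at full target (the only phase to report), then 'done'.
    """
    warm_rate = target_rps * warmup_fraction
    if t_s < warmup_s:
        return "warmup", warm_rate
    if t_s < warmup_s + ramp_s:
        progress = (t_s - warmup_s) / ramp_s
        return "ramp", warm_rate + (target_rps - warm_rate) * progress
    if t_s < warmup_s + ramp_s + hold_s:
        return "measure", float(target_rps)
    return "done", 0.0


# Example: a hypothetical 1000 rps target, sampled at a few points in time.
for t in (0, 360, 420, 2220):
    print(t, target_rate(t, target_rps=1000))
```

Returning the phase name alongside the rate means every sample the load generator records can carry that name as a tag, which is what makes it possible to report the measurement window on its own.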

Cold-start testing is the opposite.

  • Restart the service.
  • Drop caches (Redis FLUSHALL, restart, or warm-key eviction).
  • Hit the system from a cold start.
  • Measure first-request latency and time-to-warm.
  • This is an explicit test for incident-recovery scenarios — not the "default" load test.
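
One way to turn "first-request latency and time-to-warm" into numbers is sketched below in Python. The definition of warm used here is an assumption, not a standard: the system counts as warm at the first sample that starts a run of `window` consecutive requests all within `tolerance` of a known steady-state latency. The function name, the default values, and the sample data are hypothetical.

```python
def time_to_warm(samples, steady_ms, tolerance=0.2, window=5):
    """Summarize a cold-start run.

    samples: list of (seconds_since_restart, latency_ms), in time order.
    steady_ms: steady-state latency taken from a separate warm run.
    Returns (first_request_latency_ms, time_to_warm_s), where time_to_warm_s
    is None if the system never stayed within tolerance for a full window.
    """
    if not samples:
        raise ValueError("no samples")
    first_latency = samples[0][1]
    limit = steady_ms * (1 + tolerance)
    for i in range(len(samples) - window + 1):
        chunk = samples[i:i + window]
        if all(latency <= limit for _, latency in chunk):
            return first_latency, chunk[0][0]
    return first_latency, None


# Hypothetical cold-start trace: latency decays toward ~100 ms.
trace = [(0.0, 900), (0.5, 400), (1.0, 250), (1.5, 130), (2.0, 110),
         (2.5, 105), (3.0, 100), (3.5, 102), (4.0, 98)]
result = time_to_warm(trace, steady_ms=100)
print(result)
```

Requiring a full run of in-tolerance requests, rather than a single fast one, avoids declaring the system warm on one lucky cache hit.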

Antipattern: setting up a warm-up but using the same metrics window as the rest of the test, so the JIT-cold first 30 seconds drag the p95 up artificially. Either tag the warm-up phase and exclude it from analysis, or use the load tool's built-in phasing (for example, k6 scenarios with per-scenario tags and thresholds on the tagged sub-metrics, or a JMeter setUp Thread Group).
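
The effect is easy to reproduce with made-up numbers. The Python sketch below assumes each sample carries a phase tag, and compares the p95 over all samples with the p95 over the measurement phase only. The data (10 warm-up requests at 800 ms, 90 steady requests at 100-189 ms) is hypothetical, and the percentile uses the simple nearest-rank method; your load tool may interpolate differently.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list (pct in 1..100)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def p95_by_phase(samples):
    """Group (phase, latency_ms) samples by phase and return p95 per phase."""
    by_phase = {}
    for phase, latency in samples:
        by_phase.setdefault(phase, []).append(latency)
    return {phase: percentile(vals, 95) for phase, vals in by_phase.items()}


# Hypothetical run: 10 slow warm-up requests, then 90 steady-state requests.
samples = [("warmup", 800.0)] * 10 + [("measure", 100.0 + i) for i in range(90)]

mixed_p95 = percentile([latency for _, latency in samples], 95)
steady_p95 = p95_by_phase(samples)["measure"]
print(mixed_p95, steady_p95)
```

With the warm-up samples mixed in, the p95 lands on a warm-up request (800 ms); with them excluded it is 185 ms, which is the number that reflects steady state.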

// WHAT INTERVIEWERS LOOK FOR

Awareness of why warm-up matters (JIT, pools, caches), how to structure phases, and the contrast with cold-start testing as a deliberate scenario.

// COMMON PITFALL

Reporting warm-up and steady-state as one number — the warm-up tail makes p95 look worse than it is in production, and devs chase a phantom regression.