How do you set realistic SLOs for a load test?

Question

Accepted Answer

Start from production data: historical p95/p99 latencies and user-impact studies. Set SLOs tighter than current performance to drive improvement, but loose enough to be achievable. Negotiate tail-latency trade-offs with product, and version SLOs alongside the features they cover.

Made-up SLOs are worthless. A number like "p95 < 500ms" carries weight only if there is a story behind it.

Source 1: production telemetry. What is the p95 in production today? Pull 30 days of data from Datadog or New Relic. If the current p95 is 600ms, an SLO of 500ms is tight but attainable; 200ms is fantasy. Setting the SLO at the current value minus a small improvement (e.g. 10-20% tighter) prevents regression while leaving room to improve.

Source 2: user-impact research. What latency makes users abandon? Amazon famously found that 100ms of added latency cost 1% in sales; Google found that a 400ms delay cut search use by 0.6%. Industry rules of thumb: <100ms feels instant, <1s feels responsive, >3s causes drop-off. Use these to bound the SLO from th
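The arithmetic behind Source 1 can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the sample latencies, the function names (percentile, propose_slo_ms, tightening_fraction) and the nearest-rank percentile definition are all assumptions made for the example. A real monitoring backend may compute percentiles differently, so compare like with like.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of values at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = math.ceil(pct / 100 * len(ordered))
    return ordered[max(rank, 1) - 1]


def propose_slo_ms(samples_ms, pct=95, tighten=0.15):
    """Return (current percentile, proposed SLO), where the SLO is the current value minus a small improvement."""
    if not 0 <= tighten < 1:
        raise ValueError("tighten must be in [0, 1)")
    current = percentile(samples_ms, pct)
    return current, current * (1 - tighten)


def tightening_fraction(current_ms, proposed_ms):
    """How much tighter a proposed SLO is than current performance, as a fraction."""
    return (current_ms - proposed_ms) / current_ms


# Hypothetical latency samples in milliseconds; in practice these would be
# exported from about 30 days of production telemetry.
latencies_ms = [120, 180, 210, 250, 300, 340, 410, 480, 560, 600]

current_p95, target_ms = propose_slo_ms(latencies_ms, pct=95, tighten=0.15)
print(f"current p95: {current_p95} ms, proposed SLO: {target_ms:.0f} ms")

# Sanity-check candidate SLOs against the 10-20% guideline from the answer.
for candidate_ms in (500, 200):
    frac = tightening_fraction(current_p95, candidate_ms)
    verdict = "attainable" if frac <= 0.20 else "unrealistic"
    print(f"p95 < {candidate_ms} ms is {frac:.0%} tighter than today: {verdict}")
```

With these made-up samples, the 500ms candidate falls inside the 10-20% band and the 200ms candidate falls far outside it, matching the reasoning in the answer. The 20% cutoff is only the guideline quoted above, not a hard rule.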
