Why do you report p95 instead of average response time?

Question

Accepted Answer

Averages hide the long tail. A request that's fast for 95% of users but takes 8 seconds for the slowest 5% will look 'fine' on average while enraging a noticeable slice of customers. Percentiles describe the experience real users actually have.

Imagine 100 requests: 95 take 100ms each, and 5 take 8 seconds each. The average is (95 × 100ms + 5 × 8000ms) / 100 = 495ms, which looks acceptable even though no request actually took anywhere near 495ms. The p95 is 100ms (the 95th request in sorted order is still in the fast group) and the p99 is 8 seconds. The average smoothed over a real, user-impacting tail. Note that in this particular example the slow requests start just past the 95th percentile, so here it is the p99 that exposes them.

Percentiles let you describe the slowest users, not just the average one:

p50 (median) is "what does a typical request look like?"
p95 is "what does the slowest 1-in-20 request look like?"
p99 is "what does the slowest 1-in-100 request look like?"

On a busy service, even the slowest 1% can amount to thousands of slow requests per minute, every one of them a customer. Service-level objectives are almost always written against percentiles for this reason: "p95 latency under 500ms" is a meaningful contract; "average latency u
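The 100-request example can be checked with a few lines of Python. This is a minimal sketch, not a production metrics pipeline: the helper name `percentile_nearest_rank` and the variable `latencies_ms` are made up for illustration, and it assumes the nearest-rank definition of a percentile. Libraries that interpolate between samples can report a different p95 for this data, because the jump from 100ms to 8 seconds sits right at the 95th percentile boundary.

```python
import math
from statistics import mean


def percentile_nearest_rank(samples, p):
    """Return the p-th percentile (0 < p <= 100) using the nearest-rank method.

    Hypothetical helper for illustration; real monitoring systems may use
    interpolation or histogram approximations instead.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    # 1-based rank of the smallest value that covers p percent of the samples.
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# The scenario from the answer: 95 fast requests and 5 slow ones.
latencies_ms = [100] * 95 + [8000] * 5

summary = {
    "mean": mean(latencies_ms),
    "p50": percentile_nearest_rank(latencies_ms, 50),
    "p95": percentile_nearest_rank(latencies_ms, 95),
    "p99": percentile_nearest_rank(latencies_ms, 99),
}
print(summary)
```

Under these assumptions the mean comes out at 495ms while the p50 and p95 stay at 100ms and the p99 lands at 8000ms, which matches the arithmetic in the answer: the mean is a number no request experienced, and the tail only shows up in the high percentile.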
