How to read a k6 result without guessing

qa.codes · 13 June 2026 · 8 min read

Intermediate · Performance testers · QA Engineers

performance-testing · k6 · metrics · tutorial

A k6 summary throws a wall of numbers at you and it's tempting to skim to "did it pass". Here's how to actually read the output — which metrics matter, which mislead, and what each one is telling you.

Running a load test is the easy part; reading the result correctly is where people come unstuck. A k6 run prints a dense summary, and if you don't know which lines matter, you end up either rubber-stamping a bad result or panicking over a fine one. The good news is that a handful of metrics tell you almost everything, and once you know what each is for, the wall of numbers becomes a clear story. This assumes you've decided what you're actually testing for first.

The metrics that matter, in order

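http_req_duration: the headline, but read the percentiles. This is response time: sending, waiting, and receiving added together for each request. The trap is reading avg. The average lies; it smooths over the slow tail where users actually suffer. Read p(95) and p(99) instead: "95% of requests finished within X." The default summary prints p(90) and p(95), so add p(99) through the summaryTrendStats option if you want to see it. p95 is the honest number; a healthy avg with an ugly p99 means a meaningful slice of users had a bad time.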
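To see how far the average can drift from the tail, here is a minimal sketch in plain JavaScript. The sample durations are invented for illustration, and the nearest-rank percentile used here is a simplification: k6's own calculation may give slightly different values on real data.

```javascript
// Nearest-rank percentile over a sorted copy of the samples.
// ASSUMPTION: this is a simplified method, not k6's internal calculation.
function percentile(samples, p) {
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(rank, 1) - 1];
}

function average(samples) {
  return samples.reduce((sum, value) => sum + value, 0) / samples.length;
}

// Hypothetical run: 90 requests at 100 ms and 10 requests at 2000 ms.
const durationsMs = [...Array(90).fill(100), ...Array(10).fill(2000)];

console.log('avg  :', average(durationsMs));        // 290
console.log('med  :', percentile(durationsMs, 50)); // 100
console.log('p(95):', percentile(durationsMs, 95)); // 2000
```

An average of 290 ms looks acceptable on its own, yet one request in ten took two seconds. Only the percentile line shows that.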

http_req_failed: the error rate. The share of requests that failed; by default k6 treats any response status outside 200 to 399 as a failure. This is the first thing to check, before any latency number, because fast responses that are actually errors are worthless. A great-looking p95 with a 10% failure rate is a system falling over, not a fast one. An error rate near zero is a precondition for caring about latency at all.

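http_reqs and the rate: throughput. How many requests completed and at what per-second rate. This tells you the load you actually applied (did the test do what you thought?) and the throughput the system sustained. The rate is averaged over the whole run, ramp-up included, so it can sit below the peak the system actually handled.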

vus / vus_max: the load profile. vus is the number of virtual users active at a given moment; vus_max is the most the test could have run. Together they confirm your test ramped the way you intended; a result is only meaningful if you know the load that produced it.

http_req_waiting (TTFB) vs http_req_duration. Waiting is time-to-first-byte (server thinking time); duration is sending plus waiting plus receiving, so it also includes transfer. If waiting accounts for most of the duration, the bottleneck is server-side processing; if the gap between the two is large, it's data transfer or payload size. This split points you at where the slowness lives.
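The split can be made concrete with a small helper. This is a sketch, not a k6 feature: the function name and the 0.8 cutoff are assumptions chosen for illustration, and the input values are made up.

```javascript
// Compares time-to-first-byte with total request duration.
// ASSUMPTION: `serverShare` (0.8) is an arbitrary cutoff, tune it for your system.
function locateBottleneck(waitingMs, durationMs, serverShare = 0.8) {
  if (durationMs <= 0) throw new RangeError('durationMs must be positive');
  const waitingShare = waitingMs / durationMs;
  return {
    waitingShare,
    // Whatever is not waiting is sending plus receiving.
    transferMs: durationMs - waitingMs,
    hint: waitingShare >= serverShare ? 'server-side processing' : 'transfer or payload size',
  };
}

// Hypothetical p(95) values taken from two different runs.
console.log(locateBottleneck(450, 480).hint); // server-side processing
console.log(locateBottleneck(120, 480).hint); // transfer or payload size
```

Feed it the same statistic from both metrics (for example p(95) of each), since mixing an average with a percentile makes the ratio meaningless.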

Thresholds: pass/fail you defined. If you set thresholds (e.g. p(95)<500 on http_req_duration), k6 marks them passed/failed and exits non-zero on failure. This is what turns a run into a CI gate, but it's only as good as the thresholds you chose: a limit set too loosely passes a bad run.
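As a sketch of what that looks like in a script, the fragment below sets thresholds on the two metrics discussed above. Only the p(95)<500 limit comes from the example in the text; the error-rate and p(99) limits are placeholder values, not recommendations. It is a fragment to merge into an existing k6 script, not a complete test.

```javascript
// Fragment of a k6 script: options only, the default function is omitted.
export const options = {
  // Show p(99) in the end-of-test summary alongside the usual stats.
  summaryTrendStats: ['avg', 'min', 'med', 'max', 'p(95)', 'p(99)'],
  thresholds: {
    // ASSUMPTION: placeholder limit, fail the run at 1% failed requests or more.
    http_req_failed: ['rate<0.01'],
    // p(95)<500 is the example from the text; the p(99) limit is a placeholder.
    http_req_duration: ['p(95)<500', 'p(99)<1500'],
  },
};
```

Putting the error-rate threshold next to the latency ones keeps the gate honest: a run cannot pass on speed alone while requests are failing.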

Reading a k6 run

Check http_req_failed first — a fast result with errors is a failing system, not a fast one
Read http_req_duration at p(95)/p(99), not avg — the average hides the slow tail
Confirm vus/http_reqs match the load you intended to apply
Compare http_req_waiting (server time) vs total duration to locate the bottleneck
Check thresholds passed (and that you set meaningful ones)
Compare against a baseline — a number alone isn't a verdict
Watch for results that look good only because the load never actually ramped
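The reading order above can be sketched as a small function. It assumes the summary object has the shape of the data k6 passes to a handleSummary() callback; the `expected` object, its field names, and the sample values are all hypothetical.

```javascript
// Reads a summary in checklist order: errors, percentiles, load, thresholds.
// ASSUMPTION: `summary` follows the shape of the data passed to handleSummary().
// ASSUMPTION: `expected` is a hypothetical description of the intended run.
function readRun(summary, expected) {
  const m = summary.metrics;
  const findings = [];

  // 1. Errors first: a fast run full of failures is not a fast system.
  const errorRate = m.http_req_failed.values.rate;
  if (errorRate > expected.maxErrorRate) {
    findings.push(`error rate ${(errorRate * 100).toFixed(1)}% is above the limit`);
  }

  // 2. Tail latency, not the average.
  const p95 = m.http_req_duration.values['p(95)'];
  if (p95 > expected.maxP95Ms) {
    findings.push(`p(95) ${p95} ms is above the limit`);
  }

  // 3. Did the load actually ramp to what was intended?
  const peakVus = m.vus.values.max;
  if (peakVus < expected.minPeakVus) {
    findings.push(`peak VUs ${peakVus} is below the intended ${expected.minPeakVus}`);
  }

  // 4. Any threshold that k6 itself marked as failed.
  for (const [metric, def] of Object.entries(m)) {
    for (const [rule, result] of Object.entries(def.thresholds || {})) {
      if (!result.ok) findings.push(`threshold failed: ${metric} ${rule}`);
    }
  }

  return { ok: findings.length === 0, findings };
}

// Hand-written sample, not real output: good latency, but 10% of requests failed.
const sampleSummary = {
  metrics: {
    http_req_failed: { values: { rate: 0.1 } },
    http_req_duration: {
      values: { avg: 180, 'p(95)': 320 },
      thresholds: { 'p(95)<500': { ok: true } },
    },
    http_reqs: { values: { count: 36000, rate: 120 } },
    vus: { values: { value: 1, min: 1, max: 50 } },
  },
};

const verdict = readRun(sampleSummary, { maxErrorRate: 0.01, maxP95Ms: 500, minPeakVus: 50 });
console.log(verdict.ok, verdict.findings);
```

The sample passes its latency threshold and still comes back not ok, with a single finding about the error rate. That is the case the first checklist item exists to catch.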

The two mistakes to avoid

First, leading with the average. It's the most prominent-feeling number and the most misleading; train yourself to jump straight to p95/p99 and the error rate. Second, reading a number with no baseline. "p95 is 480ms" is not pass or fail until you know what it was last week and what users need — context is what makes the number mean something, which is why a smoke test compares to a baseline. Read the error rate, then the percentiles, then check them against a known-good run, and a k6 result stops being a wall of numbers and starts being a clear answer to "is this faster, slower, or breaking?"
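A baseline comparison can be as simple as the sketch below. The function name and the 10% tolerance are assumptions for illustration; the right tolerance depends on how noisy your environment is.

```javascript
// Relative change of the current p(95) against a known-good baseline.
// ASSUMPTION: a 10% tolerance band counts as "unchanged"; pick your own.
function compareToBaseline(baselineP95Ms, currentP95Ms, tolerance = 0.1) {
  if (baselineP95Ms <= 0) throw new RangeError('baselineP95Ms must be positive');
  const change = (currentP95Ms - baselineP95Ms) / baselineP95Ms;
  let verdict = 'unchanged';
  if (change > tolerance) verdict = 'slower';
  else if (change < -tolerance) verdict = 'faster';
  return { change, verdict };
}

// The same 480 ms reads differently depending on last week's number.
console.log(compareToBaseline(400, 480).verdict); // slower
console.log(compareToBaseline(600, 480).verdict); // faster
console.log(compareToBaseline(470, 480).verdict); // unchanged
```

The identical 480 ms result is a regression, an improvement, or noise depending only on the baseline it is compared with.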
