What are the key metrics in performance testing?
Performance · Junior · performance · metrics · fundamentals · rps · percentiles
Short answer
Response time (usually reported as percentiles: p50/p95/p99), throughput (requests or transactions per second), error rate (% of failed requests), and resource utilisation (CPU, memory, DB connection pool). Together they describe both user experience and where headroom exists.
Detail
Four metric families show up in almost every performance report:
- Response time: how long a request takes from the client's perspective. Reported as percentiles, not averages, because the slow tail is what users feel. p50 (the median) is "the typical request"; p95 and p99 are the values that 95% and 99% of requests stay under, so they describe the slowest 5% and 1% and show the pain on busy days.
- Throughput: how much work the system completes per unit time. RPS (requests per second) at the HTTP layer, TPS (transactions per second) at the business layer. Throughput tells you capacity; latency tells you experience. A system can be fast and low-throughput, or slow and high-throughput, so neither alone is enough.
- Error rate: percentage of failed requests (HTTP 5xx, timeouts, business-rule violations). A test that ignores errors and reports only latency is misleading: a system can look "fast" because it's failing 30% of requests at the load balancer.
- Resource utilisation: CPU, memory, disk I/O, network bandwidth, database connection pool, queue depth. These are server-side metrics that explain why latency or throughput moved.

The four go together. If p95 latency rises while CPU saturates, the reading is "scale up the app tier." If p95 rises while CPU is idle but the DB connection pool is at 100%, the reading is "increase pool size or fix slow queries." Without all four, conclusions are guesswork.
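To make the client-side definitions concrete, here is a minimal sketch in plain JavaScript that derives percentiles, throughput, and error rate from raw request samples. The function names (`percentile`, `summarise`), the sample shape, and the data are all made up for illustration, and the nearest-rank method is only one of several percentile definitions; load tools may interpolate instead.

```javascript
// Nearest-rank percentile: the smallest value with at least p% of samples at or below it.
// Expects an ascending-sorted array of latencies in milliseconds.
function percentile(sortedMs, p) {
  if (sortedMs.length === 0) return NaN;
  const rank = Math.ceil((p * sortedMs.length) / 100);
  return sortedMs[Math.max(rank, 1) - 1];
}

// samples: [{ ms: number, ok: boolean }]; windowSeconds: length of the measurement window.
function summarise(samples, windowSeconds) {
  const sorted = samples.map((s) => s.ms).sort((a, b) => a - b);
  const failed = samples.filter((s) => !s.ok).length;
  const totalMs = sorted.reduce((sum, ms) => sum + ms, 0);
  return {
    meanMs: totalMs / sorted.length,
    p50: percentile(sorted, 50),
    p95: percentile(sorted, 95),
    p99: percentile(sorted, 99),
    rps: samples.length / windowSeconds, // throughput
    errorRate: failed / samples.length,  // fraction of failed requests
  };
}

// Made-up data: 100 requests in a 10-second window.
// 94 take 100 ms; 6 take 2000 ms, and 2 of those slow ones fail.
const samples = [];
for (let i = 0; i < 94; i++) samples.push({ ms: 100, ok: true });
for (let i = 0; i < 6; i++) samples.push({ ms: 2000, ok: i >= 2 });

const summary = summarise(samples, 10);
console.log(summary);
```

With this data the mean works out to 214 ms while p95 is 2000 ms, which is the reason the list above prefers percentiles: the average hides the slow tail that a real user would hit.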
// EXAMPLE
k6-metrics-snapshot.js
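// k6 reports three of the four metric families out of the box;
// resource utilisation has to come from server-side monitoring.
import http from 'k6/http';

export const options = {
  vus: 50,
  duration: '5m',
  thresholds: {
    http_req_duration: ['p(50)<200', 'p(95)<500', 'p(99)<1500'], // response time
    http_reqs: ['rate>100'], // throughput (RPS)
    http_req_failed: ['rate<0.01'], // error rate
    // resource utilisation comes from the server's monitoring,
    // not the load tool: collect via Datadog, Prometheus, etc.
  },
};

// k6 needs a default function to generate load. TARGET_URL is a placeholder
// environment variable: k6 run -e TARGET_URL=<your endpoint> k6-metrics-snapshot.js
export default function () {
  http.get(__ENV.TARGET_URL);
}

// WHAT INTERVIEWERS LOOK FOR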
Naming the four metric families and showing they're complementary — latency without throughput is incomplete, both without errors is misleading, and none of them explain themselves without resource metrics.
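The claim that latency does not explain itself without resource metrics can be sketched as a small triage rule built on the two readings from the Detail section. Everything here is hypothetical: the `triage` function, the 0.9 saturation cut-off, and the 20% regression margin are illustrative assumptions, not values from any tool.

```javascript
// Hypothetical cut-off: treat a resource as saturated at 90% utilisation or more.
const SATURATED = 0.9;

// latency:   { baselineP95Ms, currentP95Ms }
// resources: { cpu, dbPool } as utilisation fractions between 0 and 1
function triage(latency, resources) {
  // Assumed rule: a p95 more than 20% above baseline counts as a regression.
  const regressed = latency.currentP95Ms > latency.baselineP95Ms * 1.2;
  if (!regressed) return 'p95 within 20% of baseline: no action';
  if (resources.cpu >= SATURATED) return 'CPU saturated: scale up the app tier';
  if (resources.dbPool >= SATURATED) {
    return 'DB pool exhausted: increase pool size or fix slow queries';
  }
  return 'no saturated resource found: check disk I/O, network, queue depth';
}

// Same latency regression, two different causes.
const slow = { baselineP95Ms: 400, currentP95Ms: 900 };
console.log(triage(slow, { cpu: 0.95, dbPool: 0.4 }));
console.log(triage(slow, { cpu: 0.2, dbPool: 1.0 }));
```

The two calls share an identical latency signal and differ only in the resource numbers, which is why the latency figure alone cannot pick the fix.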
// COMMON PITFALL
Reporting only response time. A test that meets latency targets while silently failing 20% of requests passes the report and breaks production.
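One way to guard against this pitfall is a report gate that checks latency and error rate together. The sketch below is an assumption-laden illustration: `evaluateRun` and the `limits` object are hypothetical names, and the limits simply reuse the p95 and error-rate thresholds from the k6 example.

```javascript
// Hypothetical limits, reusing the p(95)<500 and rate<0.01 thresholds from the k6 example.
const limits = { p95Ms: 500, maxErrorRate: 0.01 };

// run: { p95Ms: number, errorRate: number } where errorRate is a fraction between 0 and 1
function evaluateRun(run) {
  const failures = [];
  if (run.p95Ms >= limits.p95Ms) {
    failures.push(`p95 latency ${run.p95Ms} ms is not below ${limits.p95Ms} ms`);
  }
  if (run.errorRate >= limits.maxErrorRate) {
    const pct = (run.errorRate * 100).toFixed(1);
    failures.push(`error rate ${pct}% is not below ${limits.maxErrorRate * 100}%`);
  }
  return { pass: failures.length === 0, failures };
}

// A run that meets its latency target while failing 20% of requests.
const result = evaluateRun({ p95Ms: 310, errorRate: 0.2 });
console.log(result.pass, result.failures);
```

A latency-only check would have passed this run; adding the error-rate condition is what turns it into a failure.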