Q35 of 38 · Performance

How do you test API rate limiting under load?

Performance · Senior · performance · rate-limiting · api-testing · 429 · k6 · load-testing

Short answer

Test three scenarios: normal traffic below the limit (no 429s), burst traffic that hits the limit (429s appear at the expected threshold), and the recovery window (requests succeed again after the window resets). Verify that the Retry-After header is present and that its value matches the time remaining until the limit resets.

Detail

Rate limiting tests are a combination of functional and performance testing. The functional assertion is that 429 responses appear at precisely the documented limit — not before (overly restrictive) and not after (ineffective protection). The performance assertion is that the rate-limiting mechanism itself does not add significant latency to legitimate requests.
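The functional assertion can be sketched as a small check over the status codes a test run collected. This is a minimal illustration, assuming a fixed-window limiter and requests sent in order within a single window; `checkThreshold` and its result labels are hypothetical names, not part of k6 or any library.

```javascript
// Compares where the first 429 appeared against the documented limit.
// `statuses` is the ordered list of HTTP status codes from one window.
function checkThreshold(statuses, documentedLimit) {
  const firstRejected = statuses.indexOf(429);
  if (firstRejected === -1) {
    // No 429 at all: only a failure if we sent more than the limit allows.
    return statuses.length > documentedLimit ? 'too-lenient' : 'inconclusive';
  }
  // firstRejected equals the number of requests accepted before the first 429.
  if (firstRejected < documentedLimit) return 'too-strict';
  if (firstRejected > documentedLimit) return 'too-lenient';
  return 'ok';
}
```

Returning a label rather than a boolean keeps the two failure modes (limiting too early and limiting too late) distinguishable in the test report.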

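Design three k6 scenarios. If they run in parallel, give each its own API key or client identity; otherwise stagger them with startTime, because traffic that shares one identity counts against a single limit and every scenario would see 429s: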

  1. Below limit: VUs send requests at 80% of the rate limit. Assert 0% error rate.
  2. At limit: VUs send requests at exactly the limit. Assert near-0% error rate with acceptable p95 latency.
  3. Over limit: VUs send requests at 150% of the limit. Assert ~33% 429 rate (50 of every 150 requests exceed the limit), verify Retry-After header is present, verify the 429 response body matches documentation.
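The three load levels above can be sketched as a helper that builds a k6 `options.scenarios` object. This is a rough sketch under stated assumptions: the scenario names, durations, VU counts, and the helper functions themselves are illustrative, and the `startTime` offsets assume all scenarios share one client identity (drop the offsets if each scenario uses its own API key). The `constant-arrival-rate` executor is used because it holds the request rate steady regardless of response time.

```javascript
// Fraction of requests expected to be rejected at a given offered rate:
// (offered - limit) / offered, so 150 req/s against a limit of 100 gives 1/3.
function expected429Ratio(offeredRate, limit) {
  if (offeredRate <= limit) return 0;
  return (offeredRate - limit) / offeredRate;
}

// Builds one constant-arrival-rate scenario per load level (80%, 100%, 150%).
// Scenarios are staggered with startTime, leaving a gap for the window to reset.
function buildScenarios(limitPerSecond, durationSeconds = 60, gapSeconds = 60) {
  const levels = [
    { name: 'below_limit', factor: 0.8 },
    { name: 'at_limit', factor: 1.0 },
    { name: 'over_limit', factor: 1.5 },
  ];
  const scenarios = {};
  levels.forEach((level, index) => {
    const rate = Math.round(limitPerSecond * level.factor);
    scenarios[level.name] = {
      executor: 'constant-arrival-rate',
      rate, // iterations started per timeUnit
      timeUnit: '1s',
      duration: `${durationSeconds}s`,
      preAllocatedVUs: Math.max(1, Math.ceil(rate / 2)), // illustrative sizing
      startTime: `${index * (durationSeconds + gapSeconds)}s`,
    };
  });
  return scenarios;
}

// In a k6 script: export const options = { scenarios: buildScenarios(100) };
```

Deriving the expected 429 ratio from the offered rate, rather than hard-coding 33%, keeps the assertion correct if the over-limit factor changes.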

Also test client behaviour: does your application code respect Retry-After and back off, or does it hammer the API and make the problem worse? This is often more impactful than the rate-limiting mechanism itself.
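A well-behaved client might choose its retry delay roughly as follows. This is a hedged sketch, not a reference implementation: `parseRetryAfterMs`, `retryDelayMs`, and the default base and cap values are hypothetical. It relies on the fact that Retry-After may carry either a number of seconds or an HTTP date, and falls back to capped exponential backoff when the header is absent.

```javascript
// Converts a Retry-After header value into a wait in milliseconds.
// Accepts delay seconds ("30") or an HTTP date; returns null when the
// value is missing or cannot be parsed.
function parseRetryAfterMs(value, nowMs = Date.now()) {
  if (value === undefined || value === null) return null;
  const text = String(value).trim();
  if (/^\d+$/.test(text)) return Number(text) * 1000;
  const dateMs = Date.parse(text);
  if (Number.isNaN(dateMs)) return null;
  return Math.max(0, dateMs - nowMs); // a date in the past means retry now
}

// Picks the wait before the next retry: the server's hint when present,
// otherwise exponential backoff (baseMs * 2^attempt) capped at capMs.
function retryDelayMs(retryAfter, attempt, baseMs = 500, capMs = 30000, nowMs = Date.now()) {
  const hinted = parseRetryAfterMs(retryAfter, nowMs);
  if (hinted !== null) return hinted;
  return Math.min(capMs, baseMs * 2 ** attempt);
}
```

A client test can then assert that the gap between a 429 and the next request is at least the value this logic returns, which is what distinguishes backing off from hammering the API.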

// WHAT INTERVIEWERS LOOK FOR

Three-scenario design: below, at, and over limit. Functional assertion on the threshold, not just 'some 429s appear.' Testing client retry behaviour as well as server limiting.