Testing Against API Documentation — API Testing Masterclass

API documentation is a contract. Every consumer reads it, builds against it, and trusts it. When the docs say "POST /users returns 201 with the new user object" and reality says "POST /users returns 200 with {success: true}," every consumer is being lied to. As QA, you have a unique opportunity: be the person who systematically checks that the API actually does what the docs say. This lesson covers what to verify, how to organise the audit, and how to keep docs honest over time.

Why documentation testing matters

Three reasons:

Consumers depend on the docs. Internal teams, third-party integrators, and AI coding assistants all build against documentation. Drift causes broken integrations.
Drift accumulates silently. A field renamed in passing, a status code quietly changed, an endpoint removed: each is a tiny edit that no test catches, on a docs page that nobody re-reads.
The team often doesn't know. Engineers update code; docs are a separate artefact maintained elsewhere. A "small change" can break docs without anyone noticing.

The ideal — generating docs from code — eliminates most drift. But many teams aren't there yet, and the gap between intent and reality stays your problem.

Documentation reality check

What the docs claim vs what the API actually does

Docs say: POST /users → 201 Created. Returns the new user object with id, name, email, createdAt.
API actually does: Returns 200 with {success: true, id}. Status code wrong; payload shape wrong; createdAt missing.

Docs say: GET /users supports ?role= filter. Filters users by role (admin / editor / viewer).
API actually does: ?role= silently ignored. Server returns full unfiltered list, a typical sign of a recent rewrite.

Docs say: 401 returned for bad tokens. Auth failures use 401 with a token_invalid error code.
API actually does: 401 with vague {error: 'auth failed'}. No error code; clients can't programmatically distinguish causes.

Docs say: Response includes avatarUrl. Profile endpoints include an avatarUrl field.
API actually does: Field renamed to avatar_url. Snake_case version exists; camelCase version still in docs.

Docs say: Endpoints documented: 24. Public surface area as listed on the docs page.
API actually does: Endpoints actually exposed: 31. Seven undocumented endpoints: debug routes, internal admin, deprecated stubs.

Each of those discrepancies represents a category of bug that turns up in real APIs. The audit's job is to find them before consumers do.
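The comparison can be made mechanical. The following is a minimal sketch, not a prescribed tool: the function name find_discrepancies and its argument layout are hypothetical, and the id value 42 is a placeholder. It takes the documented status and field names for one endpoint plus the observed status and body, and lists what differs, using the POST /users example above.

```python
def find_discrepancies(doc_status, doc_fields, actual_status, actual_body):
    """Compare one documented response with one observed response.

    Returns a list of human-readable discrepancy strings (empty if they agree).
    """
    findings = []
    if doc_status != actual_status:
        findings.append(f"status: docs say {doc_status}, API returned {actual_status}")
    actual_fields = set(actual_body)
    # Fields the docs promise but the API did not send.
    for field in sorted(set(doc_fields) - actual_fields):
        findings.append(f"missing field: {field}")
    # Fields the API sent that the docs never mention.
    for field in sorted(actual_fields - set(doc_fields)):
        findings.append(f"undocumented field: {field}")
    return findings


# Expectation and observation taken from the POST /users example above.
findings = find_discrepancies(
    doc_status=201,
    doc_fields=["id", "name", "email", "createdAt"],
    actual_status=200,
    actual_body={"success": True, "id": 42},  # 42 is a placeholder value
)
for line in findings:
    print(line)
```

This only compares top-level field names. Nested objects and value types would need a fuller schema check.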

A documentation testing checklist

A systematic pass through any API:

☐ Every documented endpoint exists. Hit each one. Confirm a 2xx (or appropriate documented error) response.
☐ Every documented endpoint matches its documented response shape. Status code, headers, body schema.
☐ Every documented parameter works. Send the parameter, confirm it has the documented effect.
☐ Every documented error code triggers correctly. Send the input that should produce it, confirm the right code comes back.
☐ Every documented example actually works. Copy-paste the curl example, run it, confirm the documented response.
☐ No undocumented endpoints exist (or, if they do, they're internal and not reachable by outside callers).
☐ Auth requirements match. If docs say "Bearer token required," confirm anonymous calls return 401.
☐ Versioning is honest. If /v1/ and /v2/ both exist, confirm the doc-stated differences are accurate.

That eight-item list is enough for a thorough first pass. Subsequent passes can focus on the areas where drift is most expensive.
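As one example of turning a checklist item into a check, here is a rough sketch for "every documented parameter works", applied to the ?role= filter. It assumes things the docs above do not state: that each user object carries id and role fields, and that both lists are complete rather than paginated. The function name role_filter_works is hypothetical.

```python
def role_filter_works(unfiltered, filtered, role):
    """Return True only if `filtered` plausibly reflects GET /users?role=<role>.

    `unfiltered` is the body of GET /users, `filtered` the body of the same
    call with the role parameter. Both are lists of user dicts.
    """
    # Every returned user must carry the requested role.
    if any(user.get("role") != role for user in filtered):
        return False
    # The filtered list must contain exactly the matching users.
    expected_ids = {user["id"] for user in unfiltered if user.get("role") == role}
    return {user["id"] for user in filtered} == expected_ids


users = [
    {"id": 1, "role": "admin"},
    {"id": 2, "role": "editor"},
    {"id": 3, "role": "admin"},
]
admins = [user for user in users if user["role"] == "admin"]

print(role_filter_works(users, admins, "admin"))  # filter applied correctly
print(role_filter_works(users, users, "admin"))   # filter silently ignored
```

Comparing against the unfiltered list is what catches a silently ignored parameter. A check that only looks for a 200 status would pass in both cases.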

Spec-driven automated audits

If the team has an OpenAPI spec (Lesson 3), most of the checklist is automatable:

Schemathesis — reads the spec, generates inputs, calls the API, validates responses. Run it nightly; investigate failures.
Dredd — similar but example-driven. Lighter, narrower coverage.
Custom checks — write a one-off script that lists every documented endpoint, hits each, and asserts on documented status codes.
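A custom check of that kind might look like the sketch below. It assumes the spec has already been loaded into a dict (for example with json.load), that paths need no placeholder substitution such as /users/{id}, and that calling each endpoint with no body or auth is meaningful. A real audit would have to handle all three. The names documented_operations, send_with_urllib and audit_status_codes are hypothetical.

```python
import urllib.error
import urllib.request

HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def documented_operations(spec):
    """Yield (METHOD, path, documented status codes) for each operation in the spec."""
    for path, item in spec.get("paths", {}).items():
        for method, operation in item.items():
            if method.lower() not in HTTP_METHODS:
                continue  # skip non-operation keys such as "parameters"
            codes = {str(code) for code in operation.get("responses", {})}
            yield method.upper(), path, codes


def send_with_urllib(base_url):
    """Build a sender that performs a real HTTP call and returns its status code."""
    def send(method, path):
        request = urllib.request.Request(base_url + path, method=method)
        try:
            with urllib.request.urlopen(request, timeout=10) as response:
                return response.status
        except urllib.error.HTTPError as error:
            return error.code  # 4xx/5xx responses still carry a status to compare
    return send


def audit_status_codes(spec, send):
    """Return (METHOD, path, actual status, documented codes) for each mismatch."""
    mismatches = []
    for method, path, codes in documented_operations(spec):
        status = send(method, path)
        documented = (
            str(status) in codes
            or f"{status // 100}XX" in codes  # OpenAPI range keys such as "2XX"
            or "default" in codes
        )
        if not documented:
            mismatches.append((method, path, status, sorted(codes)))
    return mismatches
```

The sender is passed in so the audit logic can be exercised with a fake before it is pointed at staging, for example audit_status_codes(spec, send_with_urllib(base_url)) with your own base URL.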

A useful pattern: keep a golden OpenAPI file in your repo. Generate a current OpenAPI file from staging. Diff them. Any difference is either an intended change (update the golden file) or drift (file a bug). That single workflow catches whole classes of breakage.
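The golden-file diff can be sketched in a few lines. This version assumes both specs are OpenAPI documents already loaded into dicts, and it only compares which operations exist and which status codes they document. Schema-level changes would need a deeper comparison or a dedicated diff tool. The function names are hypothetical.

```python
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def operation_index(spec):
    """Map (METHOD, path) to the set of documented status codes."""
    index = {}
    for path, item in spec.get("paths", {}).items():
        for method, operation in item.items():
            if method.lower() in HTTP_METHODS:
                codes = {str(code) for code in operation.get("responses", {})}
                index[(method.upper(), path)] = codes
    return index


def diff_specs(golden, current):
    """Report operations removed, added, or changed between two specs."""
    old, new = operation_index(golden), operation_index(current)
    return {
        "removed": sorted(old.keys() - new.keys()),
        "added": sorted(new.keys() - old.keys()),
        "changed": sorted(key for key in old.keys() & new.keys() if old[key] != new[key]),
    }


golden = {"paths": {
    "/users": {"get": {"responses": {"200": {}}}, "post": {"responses": {"201": {}}}},
    "/legacy": {"get": {"responses": {"200": {}}}},
}}
current = {"paths": {
    "/users": {"get": {"responses": {"200": {}}}, "post": {"responses": {"200": {}}}},
    "/debug": {"get": {"responses": {"200": {}}}},
}}

diff = diff_specs(golden, current)
for kind, operations in diff.items():
    print(kind, operations)
```

Each entry in the result is then triaged by hand: either the golden file is updated or a bug is filed.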

When there's no spec

Plenty of APIs have docs in a Markdown file, a Confluence page, or a hand-written HTML page — no machine-readable spec. The audit then has to be manual, but you can still systematise it.

A scrappy approach:

Make a spreadsheet. Columns: endpoint, documented method, documented status, actual status, documented body shape, actual body shape, mismatch?
Walk the docs from top to bottom, filling in expectations.
Walk the API with curl, filling in actuals.
Highlight mismatches. File one bug per category, not one per row.
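If you would rather keep the sheet in version control, the same columns can be written as CSV. This is a small sketch using only the standard library. The row values are illustrative, and the mismatch rule (any difference in status or body shape) is an assumption you may want to loosen.

```python
import csv
import io

COLUMNS = [
    "endpoint", "documented method", "documented status", "actual status",
    "documented body shape", "actual body shape", "mismatch?",
]


def build_audit_csv(rows):
    """Render audit rows as CSV text, computing the 'mismatch?' column."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(COLUMNS)
    for endpoint, method, doc_status, actual_status, doc_shape, actual_shape in rows:
        mismatch = doc_status != actual_status or doc_shape != actual_shape
        writer.writerow([
            endpoint, method, doc_status, actual_status,
            doc_shape, actual_shape, "yes" if mismatch else "no",
        ])
    return buffer.getvalue()


rows = [
    ("/users", "POST", 201, 200, "id,name,email,createdAt", "success,id"),
    ("/users", "GET", 200, 200, "list of users", "list of users"),
]
report = build_audit_csv(rows)
print(report)
```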

A morning spent on this finds real bugs almost every time.

Discrepancies you'll keep finding

After auditing a few APIs, you'll notice the same patterns:

Status code mismatches. Docs claim 201; API returns 200. Often because the team standardised everything to 200 at some point and didn't update docs.
Snake_case vs camelCase. A backend rewrite changed the casing; docs forgot.
Removed but still documented. Deprecated endpoint pulled, doc not updated, integrators still try to use it.
Undocumented endpoints. Internal routes exposed unintentionally. Sometimes innocuous; sometimes a security concern.
Examples that no longer work. Curl snippets that would 404 or 401 if you ran them today.
Auth scope drift. Docs say "requires read scope"; reality requires read:users since the scope rename.

When you find one, expect to find others — drift is rarely localised.
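Casing drift in particular is easy to detect automatically. The sketch below pairs documented field names with actual ones that differ only in naming style, by comparing them with underscores and hyphens removed and case folded. The helper names are hypothetical, and the normalisation rule is a simplification that could in principle pair two genuinely different fields.

```python
def normalise(name):
    """Reduce a field name to a style-independent key."""
    return name.replace("_", "").replace("-", "").lower()


def casing_drift(documented_fields, actual_fields):
    """Return (documented, actual) pairs that differ only in casing style."""
    actual_by_key = {normalise(field): field for field in actual_fields}
    pairs = []
    for field in documented_fields:
        match = actual_by_key.get(normalise(field))
        if match is not None and match != field:
            pairs.append((field, match))
    return pairs


drift = casing_drift(
    ["id", "avatarUrl", "createdAt"],
    ["id", "avatar_url", "created_at"],
)
for documented, actual in drift:
    print(f"docs say {documented}, API sends {actual}")
```

Several pairs from one response usually point to a single root cause, which is a good candidate for an umbrella bug.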

Reporting findings

The output of a docs audit should be actionable:

One bug per discrepancy, or one umbrella bug if they share a root cause.
Each bug clearly states: what the docs say, what the API does, where the doc lives, where the API code lives (if findable), why it matters.
A summary or report that lists all findings so the team sees the scale of the problem.
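One way to keep findings consistent is to record each one in a fixed structure and group them before filing. The sketch below is only an illustration: the Finding fields mirror the list above, and the root_cause field and grouping rule are assumptions about how a team might decide between individual and umbrella bugs.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Finding:
    docs_say: str
    api_does: str
    doc_location: str
    why_it_matters: str
    code_location: Optional[str] = None  # only if findable
    root_cause: Optional[str] = None     # set when several findings share a cause


def group_into_bugs(findings):
    """Return a list of bugs, each a list of findings.

    Findings with the same known root cause become one umbrella bug.
    Findings with no known root cause are filed individually.
    """
    umbrella = defaultdict(list)
    bugs = []
    for finding in findings:
        if finding.root_cause is None:
            bugs.append([finding])
        else:
            umbrella[finding.root_cause].append(finding)
    return bugs + list(umbrella.values())


findings = [
    Finding("avatarUrl", "avatar_url", "Profile docs", "clients read a missing field",
            root_cause="casing rewrite"),
    Finding("createdAt", "created_at", "Users docs", "clients read a missing field",
            root_cause="casing rewrite"),
    Finding("POST /users returns 201", "returns 200", "Users docs",
            "clients checking for 201 treat success as failure"),
]
bugs = group_into_bugs(findings)
print(f"{len(findings)} findings, {len(bugs)} bugs to file")
```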

A useful framing for the team: "I'm not asking who to blame — I'm asking what process change keeps this in sync going forward." Often the answer is "generate the spec from code," and an audit is the catalyst for that investment.

Keeping docs honest over time

A few practices that stop drift from coming back:

Generate docs from code. Frameworks such as FastAPI (built in) and Spring (via the springdoc-openapi library) can emit OpenAPI from route declarations and annotations. Make this your default.
Treat the spec as part of the API contract. Spec changes = code review. PRs that change behaviour without updating the spec get rejected.
Run spec-vs-runtime checks in CI. On every staging deploy, validate that the running API matches the spec.
Quarterly audits. Even with all of the above, schedule a manual audit. Drift sneaks in around the edges of what tooling catches.
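At its core, a spec-vs-runtime check validates a live response body against the schema in the spec. In practice you would likely use a schema validation library or a tool such as Schemathesis for this. The hand-rolled sketch below covers only a flat object schema (required fields and primitive types) to show the idea, and the schema shown is an assumed fragment rather than one taken from a real spec.

```python
JSON_TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "object": dict,
    "array": list,
}


def check_body(schema, body):
    """Return violations of a flat object schema: required fields and property types."""
    problems = []
    for name in schema.get("required", []):
        if name not in body:
            problems.append(f"missing required field: {name}")
    for name, rules in schema.get("properties", {}).items():
        expected = JSON_TYPES.get(rules.get("type"))
        if name not in body or expected is None:
            continue
        value = body[name]
        # bool is a subclass of int in Python, so only accept it for "boolean".
        is_stray_bool = isinstance(value, bool) and rules["type"] != "boolean"
        if not isinstance(value, expected) or is_stray_bool:
            problems.append(f"wrong type for {name}: expected {rules['type']}")
    return problems


user_schema = {
    "type": "object",
    "required": ["id", "name", "email", "createdAt"],
    "properties": {
        "id": {"type": "integer"},
        "name": {"type": "string"},
        "email": {"type": "string"},
        "createdAt": {"type": "string"},
    },
}

problems = check_body(user_schema, {"success": True, "id": 42})
for problem in problems:
    print(problem)
```

In a CI job, a non-empty problems list would fail the build, for example by exiting with a non-zero status.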

⚠️ Common mistakes

Treating "docs are wrong" as a docs problem only. Sometimes it's the docs; sometimes the API drifted from intent. Investigate which side is correct before fixing.
Auditing only the happy path. Error codes, edge cases, and rate limit behaviour are documented less rigorously and drift faster. They're worth a deliberate pass.
Filing one giant "docs are out of date" bug. Hard to action. Break it into specific findings the team can fix in chunks.

🎯 Practice task

Audit a real API against its docs. 45 minutes.

Pick an API you can hit — your team's, GitHub's, Stripe's, or a public one. Find its docs.
Pick five endpoints across different verbs and behaviours.
For each, list: documented method, documented status code(s), documented response shape, documented parameters.
Hit each endpoint with curl. Compare to expectations. Note any discrepancy.
Try one negative case for each: bad input, missing auth, wrong type. Compare actual error to documented error.
Try to find one undocumented endpoint. Common candidates: /health, /metrics, /admin, /debug, /_status. Note any that respond.
Stretch: read the API Testing Concepts cheat sheet and check whether any concept it lists isn't covered by the API's docs. That gap is itself a finding.
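For the undocumented-endpoint step, a short helper keeps the probing systematic. This is a sketch under assumptions: probe is any callable you supply that takes a path and returns an HTTP status code (for instance a thin wrapper around curl or an HTTP client), and anything other than 404 is treated as "responds". A 401 or 403 still suggests the route exists. Only probe APIs you are permitted to test.

```python
CANDIDATE_PATHS = ["/health", "/metrics", "/admin", "/debug", "/_status"]


def find_undocumented(candidates, documented_paths, probe):
    """Return {path: status} for undocumented candidates that do not answer 404."""
    responding = {}
    for path in candidates:
        if path in documented_paths:
            continue  # documented routes are covered by the main audit
        status = probe(path)
        if status != 404:
            responding[path] = status
    return responding


# A fake probe standing in for real HTTP calls, so the sketch runs offline.
def fake_probe(path):
    return {"/health": 200, "/admin": 401}.get(path, 404)


found = find_undocumented(CANDIDATE_PATHS, {"/health"}, fake_probe)
for path, status in found.items():
    print(path, status)
```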

That wraps up Chapter 7. The final theory chapter — Chapter 8 — pulls everything together into a coherent test strategy you can take into any team.