Parsing API Responses


Calling an API is the easy half. The interesting work in QA is what happens next — pulling fields out of the JSON, asserting they exist, asserting they have the right types, asserting they hold the right values, asserting it all came back fast enough. This lesson covers the parsing patterns and the validation patterns that turn a one-line requests.get(...) into a real API test. We'll meet assert, isinstance, response-time checks, and the safe-access tricks that keep tests from crashing on missing keys.

Parsing — .json() is almost always step one

import requests
 
response = requests.get("https://jsonplaceholder.typicode.com/users")
response.raise_for_status()
 
users = response.json()       # Python list of dicts
print(type(users))            # <class 'list'>
print(len(users))             # 10
print(users[0]["name"])       # 'Leanne Graham'

response.json() parses the response body as JSON and returns native Python (list, dict, str, int, float, True/False/None). Once parsed, you index it like any other Python collection — the chapter 3 toolkit applies directly.

.json() raises requests.exceptions.JSONDecodeError (a subclass of ValueError, available in requests 2.27 and later) if the body isn't valid JSON. Common cause: an HTML error page came back with a 500. That's why you call raise_for_status() first: it surfaces the HTTP error before you get a confusing parse error.

Accessing nested data

Most real APIs return nested structures: a list of records, each with sub-objects:

data = {
    "users": [
        {"name": "Alice", "address": {"city": "Lagos"}},
        {"name": "Bob",   "address": {"city": "Berlin"}}
    ],
    "meta": {"total": 2, "page": 1}
}
 
first_name = data["users"][0]["name"]
first_city = data["users"][0]["address"]["city"]
total = data["meta"]["total"]
 
print(first_name, first_city, total)   # Alice Lagos 2

Each ["…"] step drills one level. Mix lists ([0]) and dicts (["key"]) freely — that's exactly how parsed JSON lays out.

Validating the shape — assert and isinstance

Field assertions catch content bugs ("name should be Alice"). Shape assertions catch contract bugs ("the API stopped returning a list and now returns a dict"). Reach for both:

data = response.json()
 
# Shape: top-level must be a dict
assert isinstance(data, dict), f"expected dict at root, got {type(data).__name__}"
 
# Required key
assert "users" in data, "response missing 'users'"
 
# Right type
assert isinstance(data["users"], list), "'users' must be a list"
 
# Non-empty
assert len(data["users"]) > 0, "'users' is empty"

assert raises AssertionError if the condition is false. The optional second argument is the message that goes into the traceback. In a test framework like pytest, that message tells you exactly what broke.

isinstance(value, type) is the right way to check a type. (type(value) is dict works but breaks on subclasses; isinstance doesn't.)
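To make the subclass point concrete, here is a small sketch using collections.OrderedDict, a dict subclass from the standard library. The payload is invented for illustration.

```python
from collections import OrderedDict

# OrderedDict is a dict subclass from the standard library
payload = OrderedDict(users=[])

exact_match = type(payload) is dict            # exact comparison rejects subclasses
accepts_subclass = isinstance(payload, dict)   # isinstance accepts subclasses

print(exact_match, accepts_subclass)   # False True

# A related gotcha: bool is a subclass of int, so True passes an int check
flag_is_int = isinstance(True, int)
print(flag_is_int)                     # True
```

If a field must be an int and never a bool, add "and not isinstance(value, bool)" to the check.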

One warning: assert statements are stripped when the interpreter runs with python -O (optimised mode). For test code that's fine, since pytest suites are not normally run with -O. For production code that needs to validate inputs, use if not condition: raise ValueError(...) instead.
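As a minimal sketch of the production-side alternative, the validator below raises ValueError explicitly, so it keeps working under python -O. The function name validate_age and the 0 to 149 range are assumptions borrowed from the age checks in this lesson, not a prescribed API.

```python
def validate_age(value):
    """Hypothetical input validator that still runs under python -O."""
    # bool is a subclass of int, so reject it explicitly
    if not isinstance(value, int) or isinstance(value, bool):
        raise ValueError(f"age must be an int, got {type(value).__name__}")
    if not 0 <= value < 150:
        raise ValueError(f"age out of range: {value}")
    return value


print(validate_age(42))   # 42

try:
    validate_age("42")
except ValueError as exc:
    print(f"rejected: {exc}")   # rejected: age must be an int, got str
```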

Validating individual fields

Once the shape is right, dig into the values:

user = data["users"][0]
 
assert user["name"] == "Alice"
assert user["email"].endswith("@test.com")
assert user["role"] in ("admin", "tester", "viewer")
assert isinstance(user["age"], int)
assert 0 <= user["age"] < 150

Each line tests one thing. When a test fails, the line that broke shows up in the traceback — and pytest (chapter 7) takes the line apart for you to show the actual values that didn't match. Plain Python asserts produce useful diffs in pytest without any extra helpers.

Processing a list of items

For collection responses you usually need to assert something across all the items, or filter to a subset. Comprehensions (chapter 2) shine here:

data = response.json()
 
# All items have a name
assert all("name" in u for u in data), "every user must have a name"
 
# Active admins only
active_admins = [u for u in data if u["role"] == "admin" and u["is_active"]]
assert len(active_admins) >= 1, "expected at least one active admin"
 
# All emails follow a pattern
assert all(u["email"].endswith("@test.com") for u in data), "off-domain email found"
 
# No duplicates
ids = [u["id"] for u in data]
assert len(ids) == len(set(ids)), "duplicate user ids in response"

all(condition for x in xs) and any(...) are tiny built-ins that read like English. They're the right shape for "every record must …" and "at least one record must …" assertions.
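The snippet above only uses all(), so here is a short sketch of any() next to it. The sample records are invented and stand in for a parsed response.

```python
# Invented sample data standing in for a parsed response
users = [
    {"id": 1, "role": "admin", "is_active": True},
    {"id": 2, "role": "tester", "is_active": False},
    {"id": 3, "role": "viewer", "is_active": True},
]

# "Every record must ..." maps to all()
every_user_has_role = all("role" in u for u in users)

# "At least one record must ..." maps to any()
has_active_admin = any(u["role"] == "admin" and u["is_active"] for u in users)

assert every_user_has_role, "every user must have a role"
assert has_active_admin, "expected at least one active admin"

# all() over an empty list is True, so check for emptiness separately
empty_passes = all("role" in u for u in [])
print(every_user_has_role, has_active_admin, empty_passes)   # True True True
```

That last line is why a non-empty check usually belongs next to any "every record must" assertion.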

Safe access for optional fields

When a field might be missing — say, a verified_at timestamp that only some users have — bracket access raises KeyError. Use .get() with a default:

verified = user.get("verified_at")          # None if missing
city = user.get("address", {}).get("city", "Unknown")

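The {} default keeps the chain alive when "address" is missing, so the next .get() has a dict to call. It does not cover a key that is present but null: .get("address", {}) then returns None and the next .get() raises AttributeError. If the API can send null, write (user.get("address") or {}).get("city", "Unknown"). Two or three levels deep is fine; for more, refactor into a helper.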
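One possible shape for such a helper is sketched below. The name dig and its signature are invented here; it is not part of requests or the standard library. It walks dict keys and list indexes and returns the default when the path is missing, out of range, or interrupted by a null.

```python
def dig(obj, *keys, default=None):
    """Hypothetical helper: follow dict keys and list indexes, or return default."""
    current = obj
    for key in keys:
        try:
            current = current[key]
        except (KeyError, IndexError, TypeError):
            # Missing key, index out of range, or a None/scalar mid-path
            return default
    return current


data = {
    "users": [
        {"name": "Alice", "address": {"city": "Lagos"}},
        {"name": "Bob", "address": None},   # JSON null
        {"name": "Cara"},                   # key missing entirely
    ]
}

print(dig(data, "users", 0, "address", "city", default="Unknown"))   # Lagos
print(dig(data, "users", 1, "address", "city", default="Unknown"))   # Unknown
print(dig(data, "users", 2, "address", "city", default="Unknown"))   # Unknown
print(dig(data, "users", 9, "name", default="Unknown"))              # Unknown
```

Note that a final value that is present but null comes back as None, not as the default; the default only covers paths that cannot be followed.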

Response time as a contract

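API SLAs are part of the contract. Use response.elapsed, a timedelta covering the time from sending the request until the response headers have been parsed (it does not include downloading a large body):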

seconds = response.elapsed.total_seconds()
assert seconds < 2.0, f"response too slow: {seconds:.3f}s"

A response that's correct but slow is still a regression — and one of the easier kinds to catch automatically. Add a soft and hard threshold to your suite (warn at 1s, fail at 2s) and you'll spot performance drift before customers do.
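The soft and hard thresholds could be wired up roughly like this. The helper name check_response_time and the two constants are assumptions for illustration; a plain timedelta stands in for response.elapsed so the sketch runs without a network call.

```python
import warnings
from datetime import timedelta

SOFT_LIMIT = 1.0   # seconds: warn at or above this
HARD_LIMIT = 2.0   # seconds: fail at or above this


def check_response_time(elapsed, soft=SOFT_LIMIT, hard=HARD_LIMIT):
    """Hypothetical helper: warn on the soft limit, fail on the hard limit."""
    seconds = elapsed.total_seconds()
    assert seconds < hard, f"response too slow: {seconds:.3f}s (limit {hard}s)"
    if seconds >= soft:
        warnings.warn(f"response slower than {soft}s: {seconds:.3f}s")
    return seconds


# response.elapsed is a datetime.timedelta, so one stands in for it here
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    fast = check_response_time(timedelta(milliseconds=350))
    slow = check_response_time(timedelta(milliseconds=1400))

print(fast, slow, len(caught))   # 0.35 1.4 1
```

Under pytest, a warning raised this way is collected and reported at the end of the run, so the soft limit stays visible without failing the test.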

Comparing to a fixture

For long-stable endpoints, save an expected response as a JSON fixture and diff against it:

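import json
from pathlib import Path
 
import requests
 
# Assumed base URL: JSONPlaceholder's /users returns a list, which this check expects
BASE = "https://jsonplaceholder.typicode.com"
 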
with Path("fixtures/expected_users.json").open("r", encoding="utf-8") as f:
    expected = json.load(f)
 
response = requests.get(BASE + "/users", timeout=5)
response.raise_for_status()
actual = response.json()
 
# Spot-check fields rather than the whole structure
assert len(actual) == len(expected), "user count changed"
for a, e in zip(actual, expected):
    assert a["id"] == e["id"]
    assert a["email"] == e["email"]

A full equality check (assert actual == expected) is brittle — any new field on the API breaks the test. Field-by-field checks survive harmless schema additions.
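A tiny sketch of that difference, using invented records where the API has gained one extra field (avatar_url is a made-up name):

```python
expected = [{"id": 1, "email": "alice@test.com"}]

# Same record, but the API has since added a harmless "avatar_url" field
actual = [{"id": 1, "email": "alice@test.com", "avatar_url": None}]

full_match = actual == expected   # the extra key breaks whole-structure equality

# Compare only the fields the contract cares about
CONTRACT_FIELDS = ("id", "email")
fields_match = all(
    a[field] == e[field]
    for a, e in zip(actual, expected)
    for field in CONTRACT_FIELDS
)

print(full_match, fields_match)   # False True
```

Because zip() stops at the shorter list, keep the length assertion alongside a field-by-field comparison like this one.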

Handling parse errors gracefully

If you can't trust the response is JSON (a misbehaving server, a scheduled outage page), wrap .json():

try:
    data = response.json()
except requests.exceptions.JSONDecodeError:
    print(f"non-JSON response (status {response.status_code}):")
    print(response.text[:200])
    raise

Re-raising after printing keeps the test failing while giving you the body for diagnosis. We'll cover try/except properly in chapter 6.

A QA example — full API check

Login → fetch users → validate shape, fields, and timing:

import requests
 
BASE = "https://api.example.com"
 
session = requests.Session()
login = session.post(f"{BASE}/login",
                     json={"email": "qa@test.com", "password": "..."},
                     timeout=5)
login.raise_for_status()
 
response = session.get(f"{BASE}/users?role=admin", timeout=5)
response.raise_for_status()
 
data = response.json()
 
# Shape
assert isinstance(data, dict)
assert "users" in data and isinstance(data["users"], list)
assert "meta" in data and isinstance(data["meta"], dict)
assert isinstance(data["meta"].get("total"), int)
 
# Fields
admins = data["users"]
assert len(admins) >= 1, "expected at least one admin"
for a in admins:
    assert isinstance(a.get("id"), int)
    assert isinstance(a.get("email"), str) and "@" in a["email"]
    assert a.get("role") == "admin"
 
# Performance
assert response.elapsed.total_seconds() < 1.5, \
    f"slow response: {response.elapsed.total_seconds():.3f}s"
 
print(f"OK — {len(admins)} admin users, {response.elapsed.total_seconds():.3f}s")

Nine assert statements cover type, key, value, and timing. Only two of them carry a custom message here; adding one to each of the others makes a failure say exactly which contract broke.

The validation flow, drawn

Three layers of validation — status, shape, fields — plus a timing check. Anything fails, the test fails with a specific message; everything passes, you've got a real test.
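The same flow can be sketched as a single function. Everything here is illustrative: the name validate_users_response, its parameters, and the payload shape are assumptions modelled on the full example in this lesson. In a real test the arguments would come from response.status_code, response.json(), and response.elapsed.total_seconds().

```python
def validate_users_response(status_code, payload, elapsed_seconds, max_seconds=2.0):
    """Hypothetical sketch of the flow: status, then shape, then fields, then timing."""
    # Layer 1: status
    assert 200 <= status_code < 300, f"unexpected status {status_code}"

    # Layer 2: shape
    assert isinstance(payload, dict), f"expected dict at root, got {type(payload).__name__}"
    assert isinstance(payload.get("users"), list), "'users' must be a list"

    # Layer 3: fields
    for i, user in enumerate(payload["users"]):
        assert isinstance(user.get("id"), int), f"users[{i}].id must be an int"
        assert isinstance(user.get("email"), str), f"users[{i}].email must be a str"

    # Timing
    assert elapsed_seconds < max_seconds, f"response too slow: {elapsed_seconds:.3f}s"
    return len(payload["users"])


good = {"users": [{"id": 1, "email": "qa@test.com"}]}
print(validate_users_response(200, good, 0.42))   # 1

try:
    validate_users_response(200, {"users": "oops"}, 0.42)
except AssertionError as exc:
    print(f"TEST FAIL: {exc}")   # TEST FAIL: 'users' must be a list
```

Ordering the layers this way means the first failure is also the most fundamental one: there is no point checking fields on a response whose status or shape is already wrong.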

⚠️ Common mistakes

  • Asserting on the whole response equality. assert actual == expected_dict breaks the moment the API adds a harmless field. Pick the fields that matter for your contract and assert each — schema additions then don't break the test.
  • Forgetting to call raise_for_status() (or check status_code). A 500 response that returns an HTML error page makes .json() raise a confusing JSONDecodeError instead of the real "the server is down" signal. Always check the status before parsing.
  • Skipping type checks. assert user["age"] > 0 doesn't fail cleanly if age is the string "7": in Python 3, "7" > 0 raises a TypeError (unlike JS's silent coercion), so the test errors out instead of reporting a failed assertion. Check types first with isinstance(user["age"], int), then check the value; the diagnostic is much clearer.
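The last mistake is easy to reproduce. In this sketch, run_check and the two check functions are invented names; the point is only which exception each style of check produces when age arrives as a string.

```python
def run_check(check, user):
    """Return the name of the exception a check raises, or 'pass'."""
    try:
        check(user)
    except (AssertionError, TypeError) as exc:
        return type(exc).__name__
    return "pass"


def value_only(user):
    # Comparing str with int raises TypeError before assert can evaluate anything
    assert user["age"] > 0


def type_then_value(user):
    # The type check fails first, with a readable message
    assert isinstance(user["age"], int), f"age must be int, got {type(user['age']).__name__}"
    assert user["age"] > 0


bad_user = {"age": "7"}   # the API sent a string where an int was expected
print(run_check(value_only, bad_user))        # TypeError
print(run_check(type_then_value, bad_user))   # AssertionError
```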

🎯 Practice task

Build a real API test against JSONPlaceholder. 25-30 minutes.

  1. Create api_test.py. import requests.
  2. Define a base: BASE = "https://jsonplaceholder.typicode.com".
  3. Call GET /users, with timeout=5. Then response.raise_for_status(). Then users = response.json().
  4. Shape assertions:
    • assert isinstance(users, list)
    • assert len(users) >= 1
    • assert isinstance(users[0], dict)
  5. Field assertions on the first user:
    • assert isinstance(users[0]["id"], int)
    • assert isinstance(users[0]["name"], str) and len(users[0]["name"]) > 0
    • assert "@" in users[0]["email"]
  6. All-items assertions:
    • assert all("id" in u for u in users)
    • assert all(isinstance(u.get("address", {}).get("city"), str) for u in users)
    • ids = [u["id"] for u in users]; assert len(ids) == len(set(ids))
  7. Timing assertion: assert response.elapsed.total_seconds() < 2.0.
  8. Pull just the email addresses with a list comprehension and print them.
  9. Wrap the script's body in a try / except AssertionError as e: print(f"TEST FAIL: {e}"); raise. Confirm a deliberate broken assert (e.g. assert len(users) == 99999) prints a useful message.
  10. Stretch: save the parsed users to users.json with json.dump(users, f, indent=2). On the next run, load the saved file and compare each user's id and email to the live response. Treat any difference as a regression.

You can now write API tests that catch real contract drift. The next chapter shifts gears from procedural code to object-oriented Python — classes, __init__, inheritance, and the dataclasses that model test fixtures cleanly.
