Test Data Generation and Fixtures

Test data is the unsexy half of test automation. Realistic fixtures, boundary-value datasets, database seed scripts — the work is necessary, repetitive, and well-suited to AI assistance. This lesson covers the prompt patterns that produce useful test data, how to structure generated factories, and the validation step that prevents bad data from reaching your test suite.

Generating structured fixture data

The simplest case: you need a set of realistic user records.

Generate 50 test users in JSON format.
Save to tests/fixtures/users.json.
 
Requirements:
- Realistic UK names (varied)
- Valid email addresses using the @testmail.example.com domain
- Ages: varied between 18 and 75
- Roles: mix of admin, user, manager, viewer (roughly 10/70/10/10)
- UK phone numbers and major-city addresses

Once it passes a quick validation check, the generated file can be imported directly into your tests. Specific requirements (email domain, address format, role distribution) produce data that matches your environment rather than generic user1@test.com noise.
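Before wiring the file into a suite, it can help to check it against the same requirements the prompt stated. The following is a minimal sketch, not a definitive validator: the `FixtureUser` shape and the `checkUserFixture` name are assumptions made for illustration, and the checks cover only the email domain, the age range, and the role list.

```typescript
// Hypothetical shape for one generated record; adjust it to match your real type.
interface FixtureUser {
  name: string;
  email: string;
  age: number;
  role: "admin" | "user" | "manager" | "viewer";
}

const ROLES = ["admin", "user", "manager", "viewer"] as const;

// Returns a list of human-readable problems; an empty list means every check passed.
function checkUserFixture(users: FixtureUser[]): string[] {
  const problems: string[] = [];
  // Matches the domain requested in the prompt above.
  const emailPattern = /^[^@\s]+@testmail\.example\.com$/;

  users.forEach((u, i) => {
    if (!emailPattern.test(u.email)) {
      problems.push(`user ${i}: unexpected email ${u.email}`);
    }
    if (!Number.isInteger(u.age) || u.age < 18 || u.age > 75) {
      problems.push(`user ${i}: age ${u.age} out of range`);
    }
    // JSON loaded at runtime is not type-checked, so the role is verified explicitly.
    if (!ROLES.includes(u.role)) {
      problems.push(`user ${i}: unknown role ${u.role}`);
    }
  });

  return problems;
}
```

In a real project the array would come from the parsed contents of tests/fixtures/users.json, and a non-empty result would be a reason to regenerate or hand-fix the file.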

Generating boundary-value datasets

Our registration form accepts age between 18 and 120 (integers only).
Generate a JSON array of test cases covering:
 
- Lower boundary: 17 (invalid), 18 (valid), 19 (valid)
- Upper boundary: 119 (valid), 120 (valid), 121 (invalid)
- Edge cases: 0, -1, null, empty string, decimal (18.5), very large int
 
Each entry: { input, expectedValid, description }
Save to tests/fixtures/age-boundary.json.

This coverage matrix takes about 15 minutes to write by hand and under a minute to generate. The structure (input, expected validity, description) makes it easy to drive a parameterised test directly from the file.
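To make the "drive a parameterised test from the file" idea concrete, here is a small sketch. The `isValidAge` function is a hypothetical stand-in for the real registration rule, and the cases are inlined rather than loaded from tests/fixtures/age-boundary.json so the example stays self-contained.

```typescript
// One fixture entry, mirroring { input, expectedValid, description }.
interface AgeCase {
  input: unknown;
  expectedValid: boolean;
  description: string;
}

// Hypothetical validator: integers from 18 to 120 inclusive are valid.
function isValidAge(input: unknown): boolean {
  return (
    typeof input === "number" &&
    Number.isInteger(input) &&
    input >= 18 &&
    input <= 120
  );
}

// A few entries inlined for illustration; normally these come from the fixture file.
const cases: AgeCase[] = [
  { input: 17, expectedValid: false, description: "just below lower boundary" },
  { input: 18, expectedValid: true, description: "lower boundary" },
  { input: 120, expectedValid: true, description: "upper boundary" },
  { input: 121, expectedValid: false, description: "just above upper boundary" },
  { input: 18.5, expectedValid: false, description: "decimal" },
  { input: null, expectedValid: false, description: "null" },
  { input: "", expectedValid: false, description: "empty string" },
];

// Runs every case and returns the descriptions of those that disagree with the validator.
function failingCases(
  all: AgeCase[],
  validate: (input: unknown) => boolean
): string[] {
  return all
    .filter((c) => validate(c.input) !== c.expectedValid)
    .map((c) => c.description);
}
```

With a test runner such as Jest or Vitest, the same array could be passed to `test.each` so that each entry is reported as its own test, using the description as the test name.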

Building fixture factories

For more complex testing scenarios, a typed factory class beats a static JSON file:

Create a factory class at src/factories/UserFactory.ts.
 
Read src/types/User.ts first to get the correct type shape.
Use @faker-js/faker for realistic data generation.
 
Implement:
- UserFactory.random(): User — generates a random valid user
- UserFactory.admin(): User — admin role, rest random
- UserFactory.withEmail(email: string): User — overrides email
- UserFactory.withRole(role: UserRole): User — builder pattern
- UserFactory.list(count: number): User[] — array of random users
 
If src/factories/ProductFactory.ts exists, match its builder pattern.

Reading the User type before generating means the factory's return type matches your actual interface. Reading an existing factory means the builder pattern is consistent across your codebase.
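For orientation, a factory produced by a prompt like this might look roughly like the sketch below. Several things here are assumptions: the `User` and `UserRole` types are hypothetical stand-ins for whatever src/types/User.ts actually defines, the optional `overrides` parameter on `random` is an addition not listed in the prompt, and a tiny hand-rolled picker replaces @faker-js/faker so the example has no dependencies.

```typescript
// Hypothetical stand-ins for the types in src/types/User.ts.
type UserRole = "admin" | "user" | "manager" | "viewer";

interface User {
  id: number;
  name: string;
  email: string;
  role: UserRole;
}

const FIRST_NAMES = ["Amelia", "Oliver", "Priya", "Callum"];
const LAST_NAMES = ["Hughes", "Patel", "Campbell", "Okafor"];
const ROLES: UserRole[] = ["admin", "user", "manager", "viewer"];

class UserFactory {
  // Incrementing counter keeps ids and emails unique within a test run.
  private static nextId = 1;

  // Picks a random element; a real factory would delegate to a data library here.
  private static pick<T>(items: T[]): T {
    return items[Math.floor(Math.random() * items.length)];
  }

  // Generates a random valid user; any field in `overrides` wins over the random value.
  static random(overrides: Partial<User> = {}): User {
    const id = UserFactory.nextId++;
    const first = UserFactory.pick(FIRST_NAMES);
    const last = UserFactory.pick(LAST_NAMES);
    return {
      id,
      name: `${first} ${last}`,
      email: `${first}.${last}.${id}@testmail.example.com`.toLowerCase(),
      role: UserFactory.pick(ROLES),
      ...overrides,
    };
  }

  static admin(): User {
    return UserFactory.random({ role: "admin" });
  }

  static withEmail(email: string): User {
    return UserFactory.random({ email });
  }

  static withRole(role: UserRole): User {
    return UserFactory.random({ role });
  }

  static list(count: number): User[] {
    return Array.from({ length: count }, () => UserFactory.random());
  }
}
```

Routing every named method through `random` with an override keeps the defaults in one place, so a new field on the type only needs to be handled once.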

Generating API payload fixtures

Read our OpenAPI spec at openapi.yaml, specifically POST /orders.
 
Generate tests/fixtures/order-payloads.json containing:
- A valid order payload
- An order with an out-of-stock item
- An order with an invalid payment method
- An order missing the required shipping address
- An order with a promo code (10-character alphanumeric)
 
Each entry: { description, payload, expectedStatusCode }

Pointing Claude Code at your actual OpenAPI spec means payload shapes match your real API contract — not what it assumes the API should look like.
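One way to consume a fixture file of this shape is a small runner that sends each payload and compares the status code. This is a sketch under assumptions: `SendOrder` is a hypothetical function type standing in for whatever HTTP client your tests use, and the payload is typed loosely because the real fields come from your OpenAPI spec.

```typescript
// Mirrors the { description, payload, expectedStatusCode } entry shape from the prompt.
interface OrderFixture {
  description: string;
  payload: Record<string, unknown>;
  expectedStatusCode: number;
}

// Hypothetical transport: posts a payload to POST /orders and resolves to the status code.
type SendOrder = (payload: Record<string, unknown>) => Promise<number>;

// Sends each fixture in order and collects a message for every status-code mismatch.
async function checkOrderFixtures(
  fixtures: OrderFixture[],
  send: SendOrder
): Promise<string[]> {
  const mismatches: string[] = [];
  for (const fixture of fixtures) {
    const actual = await send(fixture.payload);
    if (actual !== fixture.expectedStatusCode) {
      mismatches.push(
        `${fixture.description}: expected ${fixture.expectedStatusCode}, got ${actual}`
      );
    }
  }
  return mismatches;
}
```

Injecting the transport as a parameter means the same runner can be pointed at a stub in unit tests and at a real test server in integration tests.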

Generating database seed scripts

Generate a PostgreSQL seed script for test environment setup.
Save to tests/fixtures/seed.sql.
 
Read schema.sql first for correct table and column names. Then insert:
- 5 product categories
- 50 products distributed across the categories
- 20 customers
- 100 orders linking customers to products
 
Use deterministic values where possible so tests don't depend on random ordering.

The "read schema.sql first" instruction is critical. When Claude Code writes INSERT statements from memory, it has to guess column names and types, and those guesses are often wrong. Generating after it has read your schema is far more likely to produce valid SQL.
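The "deterministic values" requirement is easier to see in code. The sketch below derives every id, name, and price from the row index, so repeated runs emit identical statements. The table and column names (`categories`, `products`, `category_id`, `price_cents`) are hypothetical; the real ones should come from schema.sql.

```typescript
// Hypothetical table and column names; take the real ones from schema.sql.
function categoryInserts(count: number): string[] {
  return Array.from({ length: count }, (_, i) => {
    const id = i + 1;
    return `INSERT INTO categories (id, name) VALUES (${id}, 'Category ${id}');`;
  });
}

// Products are spread across categories round-robin, with prices derived from the index.
function productInserts(count: number, categoryCount: number): string[] {
  return Array.from({ length: count }, (_, i) => {
    const id = i + 1;
    const categoryId = (i % categoryCount) + 1;
    const priceCents = 500 + i * 25;
    return (
      `INSERT INTO products (id, category_id, name, price_cents) ` +
      `VALUES (${id}, ${categoryId}, 'Product ${id}', ${priceCents});`
    );
  });
}

// Categories come first so the foreign keys in the product rows resolve.
function seedScript(): string {
  return [...categoryInserts(5), ...productInserts(50, 5)].join("\n");
}
```

Because nothing depends on random numbers or the current time, a test can assert on a specific row, such as product 6 belonging to category 1, without first querying for it.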

Define what you need

What shape? How many records? What domains, boundaries, and constraints? Specificity determines quality.

⚠️ Common Mistakes

Trusting generated data without validation. AI generates plausible-looking data, not necessarily correct data. A generated phone number might be the wrong format for your validation rules. Always validate against your actual schema before using fixtures in tests.
Generating PII-shaped data for non-test environments. Realistic-looking fake data is appropriate for test environments only. Equally, never ask Claude Code to generate or work with actual PII from production, even for testing purposes.
Static fixtures for dynamic systems. If your product catalogue changes frequently, a static JSON file of hardcoded IDs will rot. Factory functions that generate fresh data at runtime are more durable.

🎯 Practice Task

Generate a fixture set for a real project. 15–20 minutes.

Identify a test in your suite that depends on hardcoded test data — IDs, names, email addresses.
Ask Claude Code to read the relevant type definition or schema first.
Generate a factory function or fixture file that replaces the hardcoded data.
Validate the output: run it through TypeScript compilation or test an import in a temporary file.
Update one test to use the generated factory instead of hardcoded values.

Chapter 3 covers the next phase of the test lifecycle: what to do when tests fail, when they flake, and when the UI changes underneath them.