Test Data Management — Builders and Factories — API Automation with Rest Assured

Tests need data. Bad tests inline that data — new User("Test", "test@test.com", "admin") — and pay for it twice: collisions when two tests use the same email, and noise on every test method that has to declare what it doesn't care about. Good tests centralise data construction in factories (which produce instances) backed by builders (which let each test override the fields it cares about). The result: every test method names only the fields meaningful to it, and the suite has one place that knows what a "user" looks like. The TypeScript for QA lesson on typed data factories applies the same idea in a different language; the principles travel.

The Lombok @Builder, recapped

Chapter 5 introduced @Builder. It's the foundation for everything in this lesson:

import lombok.AllArgsConstructor;
import lombok.Builder;
import lombok.Data;
import lombok.NoArgsConstructor;
 
@Data
@Builder(toBuilder = true)
@NoArgsConstructor
@AllArgsConstructor
public class CreateUserRequest {
    private String name;
    private String email;
    @Builder.Default private String role = "tester";
    @Builder.Default private boolean active = true;
}

Two refinements over the basic @Builder from Chapter 5:

toBuilder = true generates a toBuilder() method on the instance, so you can clone-and-mutate: existingUser.toBuilder().role("admin").build(). This is what makes variant factories cheap: an admin is a random user with only the role overridden.
@Builder.Default goes on every field with a default value. Without it, the builder ignores inline initialisers and produces the Java default for the type (null, 0, or false).
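
To make the two refinements concrete, here is a hand-written approximation of what the annotations provide. It is a sketch, not Lombok's actual generated code: UserRequestSketch is a hypothetical stand-in for CreateUserRequest, and the real output differs in detail.

```java
// Hand-written approximation of @Builder(toBuilder = true) with @Builder.Default.
// UserRequestSketch is a hypothetical stand-in for CreateUserRequest.
final class UserRequestSketch {
    private final String name;
    private final String email;
    private final String role;
    private final boolean active;

    private UserRequestSketch(Builder b) {
        this.name = b.name;
        this.email = b.email;
        this.role = b.role;
        this.active = b.active;
    }

    static Builder builder() {
        return new Builder();
    }

    // Copy every field into a fresh builder so a caller can override a subset.
    Builder toBuilder() {
        return new Builder().name(name).email(email).role(role).active(active);
    }

    String getName() { return name; }
    String getEmail() { return email; }
    String getRole() { return role; }
    boolean isActive() { return active; }

    static final class Builder {
        private String name;
        private String email;
        // Defaults start on the builder, which is the job @Builder.Default does.
        private String role = "tester";
        private boolean active = true;

        Builder name(String value) { this.name = value; return this; }
        Builder email(String value) { this.email = value; return this; }
        Builder role(String value) { this.role = value; return this; }
        Builder active(boolean value) { this.active = value; return this; }

        UserRequestSketch build() {
            return new UserRequestSketch(this);
        }
    }
}
```

Calling builder().name(...).email(...).build() yields role "tester" and active true; calling toBuilder().role("admin").build() on that instance changes only the role and leaves the original untouched.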

A factory class

The factory wraps the builder for the common cases the suite actually uses:

package com.mycompany.apitests.factories;
 
import com.github.javafaker.Faker;
import com.mycompany.apitests.models.request.CreateUserRequest;
 
import java.util.UUID;
import java.util.concurrent.atomic.AtomicInteger;
 
public final class UserFactory {
 
    private static final Faker faker = new Faker();
    private static final AtomicInteger counter = new AtomicInteger(0);
 
    private UserFactory() {}
 
    public static CreateUserRequest random() {
        int n = counter.incrementAndGet();
        return CreateUserRequest.builder()
            .name(faker.name().fullName())
            .email("test+" + n + "_" + UUID.randomUUID().toString().substring(0, 8) + "@example.com")
            .role("tester")
            .build();
    }
 
    public static CreateUserRequest admin() {
        return random().toBuilder().role("admin").build();
    }
 
    public static CreateUserRequest viewer() {
        return random().toBuilder().role("viewer").build();
    }
 
    public static CreateUserRequest withInvalidEmail() {
        return random().toBuilder().email("not-an-email").build();
    }
 
    public static CreateUserRequest withEmptyName() {
        return random().toBuilder().name("").build();
    }
 
    public static CreateUserRequest withName(String name) {
        return random().toBuilder().name(name).build();
    }
}

Five named variants on a single base, random(). Each is named after what makes the case interesting (an admin, an empty name, a custom name), so the test reads as English: UserFactory.admin() says exactly what it is.

Tests using the factory

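@Test
public void adminCanCreateBook() {
    CreateUserRequest admin = UserFactory.admin();
    // The factory only builds the request; the user must exist before logging in.
    UserApiHelper.createUser(admin);
    // "DefaultPass" assumes the API assigns new users a default password.
    String token = AuthApiHelper.login(admin.getEmail(), "DefaultPass");
 
    given().auth().oauth2(token)
        .body(BookFactory.random())
    .when().post("/books")
    .then().statusCode(201);
}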
 
@Test
public void emptyNameIsRejected() {
    given().spec(Specs.admin)
        .body(UserFactory.withEmptyName())
    .when().post("/users")
    .then().statusCode(400);
}

Two tests, two readable intents. Neither names the email, neither names the role beyond what matters — the factory hides what's irrelevant.

Why uniqueness is non-negotiable

A factory that produces name = "Test User" for every call hits a unique-name constraint on the second test of every run. A factory using email = "test@test.com" hits the same problem. The fix is unconditional: every field with a uniqueness constraint must include a per-call random component. The patterns:

UUID.randomUUID(): full UUID for cases where length doesn't matter.
UUID.randomUUID().toString().substring(0, 8): short UUID for emails and names you'll read in logs.
AtomicInteger.incrementAndGet(): sequential per-process counter, useful for sortable test data. It restarts on every run, so on its own it collides with data left over from earlier runs.
System.currentTimeMillis(): works for a single thread, but two parallel tests or two parallel runs of the same suite can read the same millisecond. Pair it with a UUID for safety.

The factory should choose. Tests should never construct uniqueness themselves — that's exactly the duplication factories exist to prevent.
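
The patterns above can be collected into one small helper that factories call. This is a minimal sketch using only the JDK; UniqueValues and its method names are hypothetical, not part of the lesson's codebase.

```java
import java.util.UUID;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper: one place that decides how uniqueness is produced.
final class UniqueValues {
    private static final AtomicInteger COUNTER = new AtomicInteger();

    private UniqueValues() {}

    // First 8 hex characters of a random UUID: short enough to read in logs.
    static String shortId() {
        return UUID.randomUUID().toString().substring(0, 8);
    }

    // Counter keeps values sortable within a run; the UUID part guards
    // against collisions with other runs and other processes.
    static String email() {
        return "test+" + COUNTER.incrementAndGet() + "_" + shortId() + "@example.com";
    }

    // Generic unique label, e.g. label("user") gives "user-" plus 8 hex characters.
    static String label(String prefix) {
        return prefix + "-" + shortId();
    }
}
```

A factory would then write .email(UniqueValues.email()) and never repeat the UUID arithmetic at each call site.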

Compose factories — no shared state

When a test needs a user, a book, and a bookmark linking them, the factory and helper calls compose:

@Test
public void userCanReadOwnBookmark() {
    CreateUserRequest userReq = UserFactory.random();
    UserResponse user = UserApiHelper.createUser(userReq);
 
    Book book = BookApiHelper.createBook(BookFactory.random());
    BookmarkApiHelper.createBookmark(user.getId(), book.getId());
 
    String token = AuthApiHelper.login(userReq.getEmail(), "DefaultPass");
    given().auth().oauth2(token)
    .when().get("/bookmarks")
    .then()
        .statusCode(200)
        .body("size()", equalTo(1));
}

The test stays readable. Each factory call is one line, each helper call is one line. The composition reads top to bottom as the scenario unfolds.
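
BookFactory.random() is used above without being shown. One possible shape, following the same rules as UserFactory, is sketched below. The request model and its title and isbn fields are assumptions made for illustration; the real book API may look different, and a Faker-generated title would read better than the mechanical one used here to keep the sketch dependency-free.

```java
import java.util.UUID;

// Hypothetical request model: the title and isbn fields are assumptions.
final class CreateBookRequestSketch {
    private final String title;
    private final String isbn;

    CreateBookRequestSketch(String title, String isbn) {
        this.title = title;
        this.isbn = isbn;
    }

    String getTitle() { return title; }
    String getIsbn() { return isbn; }
}

// Hypothetical factory mirroring UserFactory: a random base plus named variants.
final class BookFactorySketch {
    private BookFactorySketch() {}

    // Every call gets its own suffix, so unique constraints never collide.
    static CreateBookRequestSketch random() {
        String suffix = UUID.randomUUID().toString().substring(0, 8);
        return new CreateBookRequestSketch("Test Book " + suffix, "isbn-" + suffix);
    }

    // Variant: the caller fixes the title, the factory still supplies a unique isbn.
    static CreateBookRequestSketch withTitle(String title) {
        return new CreateBookRequestSketch(title, random().getIsbn());
    }
}
```

Because neither factory holds state that the other reads, a test can call them in any order and any number of times.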

Cleanup — closing the loop

A test that creates data without cleaning it up pollutes the environment. Over a long-running suite, the database fills with Test User 4892 rows that nobody owns. The fix is a per-test cleanup hook:

import java.util.ArrayList;
import java.util.List;
 
import org.testng.Assert;
import org.testng.annotations.AfterMethod;
import org.testng.annotations.Test;
 
public class UserApiTest extends BaseApiTest {
 
    // IDs of every user created by the current test method
    private final List<Integer> createdUserIds = new ArrayList<>();
 
    @Test
    public void createUser() {
        CreateUserRequest req = UserFactory.random();
        UserResponse user = UserApiHelper.createUser(req);
        createdUserIds.add(user.getId());
 
        Assert.assertEquals(user.getName(), req.getName());
    }
 
    @AfterMethod(alwaysRun = true)
    public void cleanup() {
        for (int id : createdUserIds) {
            try {
                UserApiHelper.deleteUser(id);
            } catch (AssertionError e) {
                // delete idempotency: 404 is fine if a previous step already removed it
            }
        }
        createdUserIds.clear();
    }
}

Three details:

alwaysRun = true — runs even if the test fails. Without it, failing tests leave their data behind.
List, not single ID — a test that creates three users tracks all three.
Catch the cleanup error — don't let a 404 (the user was deleted by the test itself) cascade into a test failure.
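
One caveat if the suite runs methods in parallel: a plain ArrayList field on the test class is shared by every method running on that instance, so one test's cleanup can see another test's IDs. A per-thread tracker is one way around that. The sketch below is hypothetical (CreatedIds is not part of the lesson's code) and assumes the cleanup hook runs on the same thread as the test method it follows.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: tracks created IDs separately for each thread.
final class CreatedIds {
    private static final ThreadLocal<List<Integer>> IDS =
        ThreadLocal.withInitial(ArrayList::new);

    private CreatedIds() {}

    // Record an ID created by the test running on the current thread.
    static void track(int id) {
        IDS.get().add(id);
    }

    // Return the current thread's IDs and reset the list for the next test.
    static List<Integer> drain() {
        List<Integer> copy = new ArrayList<>(IDS.get());
        IDS.get().clear();
        return copy;
    }
}
```

A test would call CreatedIds.track(user.getId()) after each create, and the cleanup hook would loop over CreatedIds.drain() instead of a shared field.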

For tests that create many resources, a generic "context" object can help:

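import java.util.ArrayList;
import java.util.List;
 
public class TestContext {
    private final List<Runnable> cleanupActions = new ArrayList<>();
 
    public void onCleanup(Runnable action) {
        cleanupActions.add(action);
    }
 
    public void cleanup() {
        // LIFO: clean up children before parents
        for (int i = cleanupActions.size() - 1; i >= 0; i--) {
            try {
                cleanupActions.get(i).run();
            } catch (RuntimeException | AssertionError ignored) {
                // AssertionError is an Error, not an Exception, so it must be named
                // explicitly or one failed delete would abort the remaining actions
            }
        }
        cleanupActions.clear();
    }
}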

Each test calls ctx.onCleanup(() -> UserApiHelper.deleteUser(id)) after each create. The cleanup runs in reverse — so a borrowing-with-book test deletes the borrowing first, then the book.
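
The reverse ordering and the tolerance for a failed step can be checked without an API at all. The sketch below carries a compact copy of the TestContext idea so it compiles on its own; that copy also catches AssertionError, on the assumption that the delete helpers fail through assertions as in the cleanup hook earlier. CleanupOrderDemo and the log strings are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class CleanupOrderDemo {

    // Compact copy of the TestContext idea so this sketch is self-contained.
    static final class TestContext {
        private final List<Runnable> cleanupActions = new ArrayList<>();

        void onCleanup(Runnable action) {
            cleanupActions.add(action);
        }

        void cleanup() {
            // Walk backwards: the last resource registered is removed first.
            for (int i = cleanupActions.size() - 1; i >= 0; i--) {
                try {
                    cleanupActions.get(i).run();
                } catch (RuntimeException | AssertionError ignored) {
                    // A failed step must not stop the remaining steps.
                }
            }
            cleanupActions.clear();
        }
    }

    // Registers actions in creation order and returns the order they ran in.
    static List<String> run() {
        List<String> log = new ArrayList<>();
        TestContext ctx = new TestContext();
        ctx.onCleanup(() -> log.add("delete user"));
        ctx.onCleanup(() -> log.add("delete book"));
        // Simulates a delete helper whose status assertion fails.
        ctx.onCleanup(() -> { throw new AssertionError("expected 204 but was 404"); });
        ctx.onCleanup(() -> log.add("delete borrowing"));
        ctx.cleanup();
        return log;
    }
}
```

CleanupOrderDemo.run() returns the entries in the order borrowing, book, user: the failing action in the middle is swallowed and the parents are still removed after the child.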

Cleanup vs. test isolation

The other end of the spectrum is isolation — every test starts from a clean database (via a per-test seed, a transactional test container, or a /test/reset admin endpoint). When you can have it, isolation is stronger: failures don't leak state, tests run in parallel without conflict, no cleanup hook to forget.

For most teams, full isolation is expensive (database resets are slow, seeded data drifts) and partial cleanup is good enough. Pick by what your environment supports — but at least one of the two strategies must be in place. The third option ("we'll just remember to clean up manually") doesn't survive contact with reality.

The factory lifecycle

Step 1 of 6: the test calls the factory. UserFactory.random() builds a CreateUserRequest with a unique email, a Faker-generated name, and sensible defaults via @Builder.

The pattern is simple but disciplined. The win compounds — over hundreds of tests, the data hygiene is the difference between a suite that runs reliably for years and one that drowns in stale data within months.

Realistic data with Faker

Compare two factories:

// Mechanical
.name("Test User " + counter)
.email("test" + counter + "@test.com")
 
// Realistic, via Faker
.name(faker.name().fullName())
.email("test+" + uuid + "@" + faker.internet().domainName())

The realistic version produces logs that read like real users — "Margaret Stoltenberg," "Joel Wuckert," addresses, phone numbers, sentences for descriptions. The benefit is small per test but cumulative: debugging logs of plausible data is cognitively easier than scrolling past Test User 1, Test User 2, Test User 3. Add the Faker dependency once; the call sites are no longer than the mechanical version.

⚠️ Common mistakes

Hardcoded test data in tests. Every test that says new User("Alice", "alice@test.com", ...) is one bug away from a unique-constraint failure. Centralise in factories and never look back.
No cleanup. A suite that creates 50 users per run and never deletes them accumulates 50,000 stale users after 1,000 runs, a count that a busy CI pipeline can reach in about two months. Use cleanup or true test isolation; at least one must be in place.
Cleanup that assumes the test succeeded. If the test fails halfway through creating a borrowing, the user-delete in cleanup may fail because the borrowing still references the user. Cleanup hooks should swallow expected errors and try every step independently — see the LIFO cleanup pattern.

🎯 Practice task

Build a factory + cleanup hook against any free CRUD API (or your own staging). 30–40 minutes.

Create UserFactory with at least four methods: random(), admin(), withInvalidEmail(), withName(String). Use Lombok's @Builder(toBuilder = true) on the request POJO.
Add Faker (com.github.javafaker:javafaker:1.0.2) to the pom. Use it for name, replacing any mechanical "Test User N" strings. Run a test, read the log, note the readability win.
Uniqueness audit. Pick the email field. Confirm every factory call produces a unique value (UUID-suffixed). Try running two test methods in parallel with TestNG (parallel="methods") — they should not collide.
Cleanup hook. Add @AfterMethod(alwaysRun = true) that iterates a createdUserIds list and calls UserApiHelper.deleteUser on each. Confirm it runs after both passing and failing tests.
Force a failure mid-test. Add a deliberate bad assertion after a successful create. The test fails, but cleanup still runs. Inspect the API: confirm the created user was deleted.
Composing factories. Add a BookFactory.random() and write a test that creates a user, then a book, then a borrowing. Track three IDs. Cleanup in LIFO order.
Generic test context. Refactor the cleanup to use a TestContext.onCleanup(() -> ...) mechanism. Note how individual tests stop tracking lists themselves.
Stretch: add a BorrowingFactory.from(user, book) that constructs a borrowing referencing the user and book IDs. Use it in the composed test. The test method gets shorter; the factory layer absorbs the relationship logic.

Next lesson: running this whole suite in CI — GitHub Actions, environment variables, secrets, and the artefacts that make CI failures debuggable from the browser.