The previous chapters wrote one test method per scenario. That works until you realise you've written testCreateUserAlice, testCreateUserBob, testCreateUserWithEmptyName, testCreateUserWithBadEmail, and so on: eight near-identical methods diverging only in the input row and expected status. TestNG's @DataProvider is the canonical fix: one test method, many data rows, one result per row in the report. The Selenium with Java chapter on data-driven Selenium tests showed the pattern for UI tests; it's identical for API tests, just with given()/when()/then() in the body. The API Testing Masterclass lesson on positive/negative/edge case design supplies the strategy; this lesson supplies the Java tooling for it.
A first data provider
The shape: a method returning Object[][] (a "table" — outer array is rows, inner array is columns), wired to a test by name.
import static io.restassured.RestAssured.given;
import org.testng.annotations.DataProvider;
import org.testng.annotations.Test;
public class CreateUserTests extends BaseApiTest {
@DataProvider(name = "userCreationData")
public Object[][] userCreationData() {
return new Object[][] {
// { name, email, role, expectedStatus }
{ "Alice Smith", "alice@test.com", "admin", 201 },
{ "Bob Jones", "bob@test.com", "tester", 201 },
{ "", "charlie@test.com","tester", 400 }, // empty name
{ "Dave", "not-an-email", "tester", 400 }, // malformed email
{ "Eve", "eve@test.com", "superadmin", 400 }, // invalid role
{ "Frank", "", "tester", 400 }, // empty email
};
}
@Test(dataProvider = "userCreationData")
public void createUserCases(String name, String email, String role, int expectedStatus) {
CreateUserRequest req = new CreateUserRequest(name, email, role);
given().spec(Specs.admin)
.body(req)
.when()
.post("/users")
.then()
.statusCode(expectedStatus);
}
}
Six rows, six test executions, six lines in the report, each named with the row's data so a failure points at the exact bad combination. One method's worth of code, six tests' worth of coverage.
Why this beats six methods
Three concrete wins:
- One change site. Adding an Accept: application/json header to the request once updates all six tests. With separate methods, it's six edits.
- Coverage at a glance. The data table is itself a coverage document: anyone reading the file sees the six scenarios laid out side by side. Six method names hide that pattern.
- Easy to extend. Add a row to the array, get a new test. No cut-and-paste of the body.
The trade-off: data providers can over-collapse if you cram too many different scenarios into one method. The rule: one data provider per behaviour, not per resource. A "create user — happy and validation errors" provider is right; a "create user — and login — and update — and delete" provider is a smell.
Reading the failure output
When row 3 fails ("" for name), TestNG's report shows:
FAILED: createUserCases("", "charlie@test.com", "tester", 400)
java.lang.AssertionError: Expected status code <400> but was <500>
The test name is the data. No mystery about which row broke — and the assertion message tells you the API returned 500 instead of the expected 400 (probably an uncaught NPE; a bug worth filing).
Loading data from JSON
When the table grows past 10 rows, inline Object[][] becomes painful to read. Move it to a JSON file under src/test/resources/testdata/:
[
{ "name": "Alice", "email": "alice@test.com", "role": "admin", "expectedStatus": 201 },
{ "name": "Bob", "email": "bob@test.com", "role": "tester", "expectedStatus": 201 },
{ "name": "", "email": "x@test.com", "role": "tester", "expectedStatus": 400 }
]
Define a small POJO that matches the row shape, deserialise with Jackson, and project to Object[][]:
// Lombok annotations: generate getters/setters plus the no-args constructor Jackson needs
@Data @NoArgsConstructor @AllArgsConstructor
public class UserTestCase {
private String name;
private String email;
private String role;
private int expectedStatus;
}
@DataProvider(name = "userCreationFromJson")
public Object[][] userCreationFromJson() throws IOException {
UserTestCase[] cases = new ObjectMapper().readValue(
new File("src/test/resources/testdata/user-creation.json"),
UserTestCase[].class);
return Arrays.stream(cases)
.map(c -> new Object[]{ c.getName(), c.getEmail(), c.getRole(), c.getExpectedStatus() })
.toArray(Object[][]::new);
}
The win for non-developers: a tester who isn't comfortable in Java can edit the JSON file to add scenarios. The deserialised POJO carries types, so there is no string-to-int parsing in the data provider, just a clean projection.
Loading from CSV or Excel
For QA teams used to spreadsheets, an Excel/CSV provider often goes down better than JSON. The Apache POI library reads Excel; OpenCSV reads CSV. The Selenium with Java course built a reusable ExcelReader you can drop straight into a Rest Assured project:
@DataProvider(name = "userCreationFromExcel")
public Object[][] userCreationFromExcel() throws IOException {
return ExcelReader.readData("src/test/resources/testdata/user-creation.xlsx", "UserTests");
}
The data provider is a one-line wrapper; the heavy lifting is in ExcelReader. CSV is similar:
@DataProvider(name = "userCreationFromCsv")
public Object[][] userCreationFromCsv() throws IOException, CsvException { // OpenCSV 5.x: readAll() also throws CsvException
try (var reader = new CSVReader(new FileReader("src/test/resources/testdata/user-creation.csv"))) {
List<String[]> rows = reader.readAll();
rows.remove(0); // strip header
return rows.stream()
.map(r -> new Object[]{ r[0], r[1], r[2], Integer.parseInt(r[3]) })
.toArray(Object[][]::new);
}
}
CSV's downside is the explicit type parsing; JSON deserialisation handles types for you. Pick based on who's editing the file.
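The explicit parsing step is where a CSV provider tends to break, so it can help to isolate it and make bad rows fail loudly. The following is a minimal JDK-only sketch rather than the OpenCSV version: it assumes a header line, exactly four columns (name, email, role, expectedStatus), and no quoted commas inside fields. CsvRows and toRows are hypothetical names, not part of any library.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: turns raw CSV lines into DataProvider-style rows.
// Assumes a header on the first line and no quoted commas inside fields;
// use a real CSV parser such as OpenCSV for anything less tidy.
final class CsvRows {

    static Object[][] toRows(List<String> lines) {
        List<Object[]> rows = new ArrayList<>();
        for (int i = 1; i < lines.size(); i++) {        // index 0 is the header
            String line = lines.get(i);
            if (line.isBlank()) {
                continue;                                // tolerate trailing blank lines
            }
            String[] cols = line.split(",", -1);         // -1 keeps empty fields such as ""
            if (cols.length != 4) {
                throw new IllegalArgumentException(
                        "CSV line " + (i + 1) + ": expected 4 columns, got " + cols.length);
            }
            int expectedStatus;
            try {
                expectedStatus = Integer.parseInt(cols[3].trim());
            } catch (NumberFormatException e) {
                throw new IllegalArgumentException(
                        "CSV line " + (i + 1) + ": expectedStatus is not a number: " + cols[3], e);
            }
            rows.add(new Object[] { cols[0], cols[1], cols[2], expectedStatus });
        }
        return rows.toArray(new Object[0][]);
    }
}
```

The error messages carry the line number so that whoever edits the spreadsheet can find the offending row without reading a stack trace.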
Designing the data set
A good provider has intentional coverage — every row exists for a reason. The categories worth including, every time, for any input-bearing endpoint:
- Happy paths (the main 201/200 cases): at least one per significant variant.
- Boundary values: empty strings, max-length strings, zero, one, very large numbers.
- Type confusion: strings where numbers go, numbers where strings go (where the API contract says strings).
- Special characters: Unicode (café), emoji, quotes, backslashes, SQL-injection-style input ('; DROP TABLE).
- Missing fields: null where required, missing where optional.
- Forbidden values: invalid enum values ("superadmin" when only admin/tester/viewer are allowed).
- Format violations: malformed emails, bad UUIDs, unparseable dates.
A six-row provider is rarely enough; twelve to fifteen rows is closer to the right shape.
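As an illustration, the categories above could be laid out as a table in which the first column is a short scenario label kept only for readability. This is a sketch under stated assumptions: the 255-character name limit and every expected status are guesses about the API under test, and EdgeCaseRows is a hypothetical class name. Type confusion is left out because it cannot be expressed with String-typed columns; it would need a raw JSON body instead.

```java
// Hypothetical sketch: one labelled row per input category.
// ASSUMPTIONS: the 255-character limit and all expected statuses are
// placeholders; take the real values from your API contract.
final class EdgeCaseRows {

    static final int ASSUMED_MAX_NAME_LENGTH = 255;

    static Object[][] rows() {
        String maxName = "a".repeat(ASSUMED_MAX_NAME_LENGTH);
        String overMaxName = "a".repeat(ASSUMED_MAX_NAME_LENGTH + 1);
        return new Object[][] {
            // { label, name, email, role, expectedStatus }
            { "happy_admin",         "Alice",                   "alice@test.com", "admin",      201 },
            { "boundary_empty_name", "",                        "empty@test.com", "tester",     400 },
            { "boundary_max_name",   maxName,                   "max@test.com",   "tester",     201 },
            { "boundary_over_max",   overMaxName,               "over@test.com",  "tester",     400 },
            { "special_unicode",     "Zoë Café",                "zoe@test.com",   "tester",     201 },
            { "special_sql_like",    "'; DROP TABLE users; --", "sql@test.com",   "tester",     400 },
            { "missing_email_null",  "Frank",                   null,             "tester",     400 },
            { "forbidden_role",      "Eve",                     "eve@test.com",   "superadmin", 400 },
            { "format_bad_email",    "Dave",                    "not-an-email",   "tester",     400 },
        };
    }
}
```

Whether a SQL-like name should be accepted and stored safely (201) or rejected (400) is a contract decision; the row exists mainly to prove the API never answers 500.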
One method, many test runs
One @Test method × six rows = six TestNG executions
| DataProvider row | Inputs | Expected status | Test result | What it catches |
|---|---|---|---|---|
| Row 1 (happy) | Alice / alice@... / admin | 201 | PASS | Smoke regression |
| Row 2 (happy) | Bob / bob@... / tester | 201 | PASS | Non-admin role works |
| Row 3 (empty name) | "" / charlie@... / tester | 400 | PASS / FAIL | Empty-name validation regression |
| Row 4 (bad email) | Dave / not-an-email / tester | 400 | PASS / FAIL | Email-format validation regression |
| Row 5 (bad role) | Eve / eve@... / superadmin | 400 | PASS / FAIL | Privilege-escalation regression |
| Row 6 (empty email) | Frank / "" / tester | 400 | PASS / FAIL | Required-field check |
Each row is its own TestNG result. A regression that breaks empty-name validation lights up exactly row 3 — and the matrix as a whole shows which behaviours the provider covers.
Parallel data rows
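TestNG can run rows in parallel, but the switch lives on the provider, not on the test:
@DataProvider(name = "userCreationData", parallel = true)
public Object[][] userCreationData() { ... }
With parallel = true, TestNG feeds rows to the test method from a thread pool. The pool size defaults to 10 and is set with the data-provider-thread-count attribute on the <suite> element in testng.xml (or -dataproviderthreadcount on the command line); set it to 4 for four concurrent rows. Note that threadPoolSize on @Test does not do this: it only takes effect together with invocationCount. The catch: each row's data must be unique (a duplicate email between two rows running together becomes a 409 race). The factory patterns from Chapter 6 (UUID-suffixed emails) are what make this safe.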
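One way to get UUID-suffixed emails without touching the data table is to post-process the rows just before the provider returns them. The sketch below is an assumption-laden illustration, not the Chapter 6 factory: UniqueEmails and withUniqueEmails are hypothetical names, and the email column index is passed in by the caller. Rows whose email has no "@" (empty or malformed) are left alone, since those rows exist to test validation.

```java
import java.util.UUID;

// Hypothetical helper: makes each well-formed email unique per run.
final class UniqueEmails {

    // Returns a copy of rows in which alice@test.com becomes
    // alice-<8 hex chars>@test.com; the input array is not modified.
    static Object[][] withUniqueEmails(Object[][] rows, int emailIndex) {
        Object[][] result = new Object[rows.length][];
        for (int i = 0; i < rows.length; i++) {
            result[i] = rows[i].clone();
            Object value = result[i][emailIndex];
            if (value instanceof String) {
                String email = (String) value;
                int at = email.indexOf('@');
                if (at > 0) {   // skip "" and "not-an-email": negative cases stay as written
                    String suffix = UUID.randomUUID().toString().substring(0, 8);
                    result[i][emailIndex] =
                            email.substring(0, at) + "-" + suffix + email.substring(at);
                }
            }
        }
        return result;
    }
}
```

In a provider, this would mean returning UniqueEmails.withUniqueEmails(rows, 1) instead of rows, where 1 is the position of the email column in the first example's table.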
Naming rows for readability
By default, TestNG names a row by its values. For long rows, that becomes unreadable. Implementing TestNG's ITest interface (its getTestName() method) lets you supply a name explicitly, but the simpler win is a separate "label" column in the data:
return new Object[][] {
{ "happy_admin", "Alice", "alice@test.com", "admin", 201 },
{ "happy_tester", "Bob", "bob@test.com", "tester", 201 },
{ "empty_name_400", "", "x@test.com", "tester", 400 },
// ...
};
The label is the first parameter. Your test method has to declare it (for example as String label) but otherwise ignores it, since it exists purely for the report. The result reads createUserCases("empty_name_400", ...), which is far easier to scan in a CI log than createUserCases("", "x@test.com", ...).
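A variation on the label idea is to pass the whole row as a single object whose toString() returns only the label. This relies on the assumption that the reporter in use prints parameters via toString(), which is worth checking against your own report output. UserCase and asRows are hypothetical names used for illustration.

```java
// Hypothetical value object: the row travels as one parameter, and
// toString() returns just the label so printed parameters stay short.
final class UserCase {
    final String label;
    final String name;
    final String email;
    final String role;
    final int expectedStatus;

    UserCase(String label, String name, String email, String role, int expectedStatus) {
        this.label = label;
        this.name = name;
        this.email = email;
        this.role = role;
        this.expectedStatus = expectedStatus;
    }

    @Override
    public String toString() {
        return label;
    }

    // Wraps each case in a one-element array, the row shape a DataProvider returns.
    static Object[][] asRows(UserCase... cases) {
        Object[][] rows = new Object[cases.length][];
        for (int i = 0; i < cases.length; i++) {
            rows[i] = new Object[] { cases[i] };
        }
        return rows;
    }
}
```

The test method would then take one UserCase parameter and read name, email, role and expectedStatus from it, at the cost of a slightly less table-like provider.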
⚠️ Common mistakes
- Cramming too many concerns into one provider. A provider for "create user and login and delete" is a workflow, not a data set. Workflows belong in their own test methods. Providers feed one behaviour with multiple inputs.
- Forgetting unique data for parallel runs. Two rows with the same email running concurrently produce intermittent 409 conflicts that look like flakes. Always include a per-row UUID or timestamp in fields with uniqueness constraints.
- Asserting the same thing for every row. If half your rows expect 201 and half expect 400, the test must take the expected status as a parameter. A test that asserts a fixed status is just six near-identical happy-path tests with extra steps.
🎯 Practice task
Build a real, useful data provider against REQRES. 30 minutes.
- Create CreateUserDataProviderTests.java with a @DataProvider returning at least six rows: two happy paths, four validation failures (empty name, missing email, invalid role, very long name).
- Run. Confirm TestNG reports six test results, one per row, with the row's values in the test name.
- Move data to JSON. Create src/test/resources/testdata/create-user.json with the same rows. Build a UserTestCase POJO. Switch the data provider to read the JSON. Confirm the same six results.
- Add a label column. Prepend each row with a snake_case scenario name (happy_admin, empty_name_400). Note how the test report becomes readable.
- Stress the API. Add 10 more rows with edge cases: Unicode names, very long emails, leading/trailing whitespace, SQL-injection-style strings. Run; note which (if any) fail. File a bug for any that produce 500 instead of 400.
- Parallel rows. Change @DataProvider to @DataProvider(name = "...", parallel = true), the provider-level switch that runs rows concurrently. Make sure each row's email is unique (the JSON file is static, so add a UUID.randomUUID() suffix in a small post-processing step). Run. Confirm tests still pass.
- Pull data from CSV. Convert the JSON to a CSV with the same columns. Wire OpenCSV into the data provider. Run; confirm parity with the JSON version.
- Stretch: add a second provider for update tests that depend on a setup created by the test class. The setup creates one user in @BeforeClass; the provider rows describe how to update it. Note that this couples rows to a shared resource; discuss when that's worth it (rarely) vs. when each row should be independent (almost always).
Next lesson: managing test data — builders, factories, and the cleanup hooks that keep a long-running suite from polluting its environment.