@DataProvider Basics

You got an introduction to TestNG in the Selenium course — this course goes much deeper. @DataProvider is where data-driven testing actually happens in TestNG. Instead of copy-pasting a test five times with different inputs, you write the test once and supply a table of data. TestNG calls the method once per row and reports every execution separately — each with its own pass/fail, each visible in the report. The resulting suite tells you exactly which inputs passed and which failed, not just "the login test failed." This lesson covers the full @DataProvider mechanism: the return type, how TestNG maps columns to parameters, how to put providers in their own class, and the two performance levers — lazy iteration and parallel execution.

The basic pattern

package com.mycompany.tests.tests;
 
import org.testng.Assert;
import org.testng.annotations.DataProvider;
import org.testng.annotations.Test;
 
public class LoginApiTest {
 
    @DataProvider(name = "loginCredentials")
    public Object[][] loginData() {
        return new Object[][] {
            // email,              password,       expectedRole, expectedStatus
            {"admin@test.com",  "AdminPass123",  "admin",       200},
            {"user@test.com",   "UserPass123",   "user",        200},
            {"wrong@test.com",  "BadPass",       null,          401},
            {"",                "password",      null,          400},
            {"admin@test.com",  "",              null,          400},
        };
    }
 
    @Test(dataProvider = "loginCredentials",
          description = "Login API accepts valid credentials and rejects invalid ones")
    public void testLogin(String email, String password,
                          String expectedRole, int expectedStatus) {
        // TestNG calls this method 5 times — once per row
        // Each call gets a different set of arguments
        System.out.printf("Testing: email=%s, expectedStatus=%d%n", email, expectedStatus);
 
        // Real implementation would call the API:
        // Response response = given()
        //     .body(Map.of("email", email, "password", password))
        //     .post("/auth/login");
        // Assert.assertEquals(response.statusCode(), expectedStatus);
        // if (expectedRole != null) {
        //     Assert.assertEquals(response.jsonPath().getString("role"), expectedRole);
        // }
        Assert.assertTrue(true, "Placeholder — replace with real API call");
    }
}

Running this produces five separate test results in the report, one per row. The exact labels depend on the reporter or IDE; this lesson refers to them by zero-based invocation index, testLogin[0] through testLogin[4]. A failure on the empty-email row (index 3) shows up as testLogin[3] FAILED while the others remain PASSED. You know exactly which input broke.

How TestNG maps Object[][] to parameters

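Each inner array new Object[]{...} is one test invocation. The array elements map positionally to the @Test method's parameters, left to right, and the element count must equal the parameter count. Boxing and unboxing work (an Integer element fills an int parameter), but values are not converted between types: a String "200" will not fill an int parameter, and the invocation fails with a type mismatch error. A null element is fine for any reference type such as String, but not for a primitive; declare the parameter as a wrapper type such as Integer when the column can be null.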
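The positional mapping can be tried outside TestNG. The sketch below is a simplified stand-in, not TestNG's actual implementation: it assumes the framework ends up calling the test method reflectively with the row as its argument array, and it uses plain java.lang.reflect to show which rows fit the testLogin signature. PositionalMappingDemo and invokeRow are hypothetical names.

```java
import java.lang.reflect.Method;

class PositionalMappingDemo {

    // Stand-in for a @Test method with the same signature as testLogin.
    public static String testLogin(String email, String password,
                                   String expectedRole, int expectedStatus) {
        return email + " -> " + expectedStatus;
    }

    // Invokes testLogin reflectively with one row. Returns the method's result,
    // or a string starting with "rejected" if the row does not fit the signature.
    static String invokeRow(Object[] row) {
        try {
            Method m = PositionalMappingDemo.class.getMethod(
                    "testLogin", String.class, String.class, String.class, int.class);
            return (String) m.invoke(null, row);
        } catch (IllegalArgumentException e) {
            // Thrown for a wrong element count or an incompatible element type.
            return "rejected: " + e.getMessage();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Integer 200 is unboxed to int; null is accepted for the String column.
        System.out.println(invokeRow(new Object[]{"a@test.com", "pw", null, 200}));
        // A String cannot fill the int parameter.
        System.out.println(invokeRow(new Object[]{"a@test.com", "pw", null, "200"}));
        // Three elements for four parameters.
        System.out.println(invokeRow(new Object[]{"a@test.com", "pw", null}));
    }
}
```

Only the first row fits. The other two are rejected before the method body runs, which is why a mismatched row shows up as an invocation error rather than an assertion failure.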

DataProvider in a separate class

Keep data providers out of your test classes: they are data, not tests. A shared data class such as LoginData is far easier to maintain:

package com.mycompany.tests.data;
 
import org.testng.annotations.DataProvider;
 
public class LoginData {
 
    @DataProvider(name = "validLogins")
    public static Object[][] validLogins() {
        return new Object[][] {
            {"admin@test.com",  "AdminPass123"},
            {"user@test.com",   "UserPass123"},
        };
    }
 
    @DataProvider(name = "invalidLogins")
    public static Object[][] invalidLogins() {
        return new Object[][] {
            {"wrong@test.com",  "BadPass"},
            {"",                "password"},
            {"admin@test.com",  ""},
        };
    }
}

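When the provider is in a different class, the @Test must declare dataProviderClass, and the provider method should be static. (The TestNG documentation also allows a non-static provider if the class has a no-argument constructor, but static is the simpler, version-independent choice.)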

@Test(dataProvider = "validLogins",
      dataProviderClass = LoginData.class,
      description = "Valid credentials return HTTP 200")
public void validLoginSucceeds(String email, String password) {
    // ...
}
 
@Test(dataProvider = "invalidLogins",
      dataProviderClass = LoginData.class,
      description = "Invalid credentials return 4xx")
public void invalidLoginRejected(String email, String password) {
    // ...
}

Lazy loading with Iterator

For large datasets (hundreds of rows from a file), returning Object[][] loads everything into memory at once. An Iterator<Object[]> loads rows on demand:

@DataProvider(name = "lazyLoginData")
public java.util.Iterator<Object[]> lazyData() {
    // For brevity the list is built up front, so this example is not lazy yet.
    // A real lazy provider reads the next row from the file inside next().
    java.util.List<Object[]> rows = new java.util.ArrayList<>();
    rows.add(new Object[]{"admin@test.com", "AdminPass123", 200});
    rows.add(new Object[]{"user@test.com",  "UserPass123",  200});
    rows.add(new Object[]{"bad@test.com",   "BadPass",      401});
    return rows.iterator();
}

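For file-backed providers this matters: streaming 5,000 CSV rows through an iterator keeps roughly one row in memory at a time, while materialising them as Object[][] holds all 5,000 rows at once. The iterator only helps if it reads rows on demand; calling iterator() on a fully built list saves nothing.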
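A minimal sketch of an iterator that really is lazy, reading one line per next() call. It assumes a simple three-column CSV (email, password, expected status) with no header, no quoted fields, and no embedded commas; CsvRowIterator is a hypothetical helper, not part of TestNG.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.Iterator;
import java.util.NoSuchElementException;

class CsvRowIterator implements Iterator<Object[]> {

    private final BufferedReader reader;
    private String nextLine; // the single line buffered ahead, or null at end of input

    CsvRowIterator(Reader source) {
        this.reader = new BufferedReader(source);
        advance();
    }

    // Reads one more line; closes the reader once the input is exhausted.
    private void advance() {
        try {
            nextLine = reader.readLine();
            if (nextLine == null) {
                reader.close();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public boolean hasNext() {
        return nextLine != null;
    }

    @Override
    public Object[] next() {
        if (nextLine == null) {
            throw new NoSuchElementException();
        }
        // Naive split; the -1 limit keeps empty columns such as a blank email.
        String[] cols = nextLine.split(",", -1);
        Object[] row = {cols[0], cols[1], Integer.valueOf(cols[2].trim())};
        advance();
        return row;
    }
}
```

A provider could then return new CsvRowIterator(Files.newBufferedReader(Path.of("logins.csv"))), where logins.csv is a placeholder file name. Only the current line is held in memory, regardless of file size.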

Parallel data provider

By default, TestNG runs all iterations of a data provider sequentially on a single thread. Adding parallel = true distributes them across threads:

@DataProvider(name = "loginCredentials", parallel = true)
public Object[][] loginData() {
    return new Object[][] {
        {"admin@test.com",  "AdminPass123", 200},
        {"user@test.com",   "UserPass123",  200},
        {"wrong@test.com",  "BadPass",      401},
    };
}

Control the thread pool in testng.xml:

<suite name="Suite" data-provider-thread-count="4">

This is useful when each iteration makes an independent network call. Each iteration must create its own resources — a shared WebDriver or RestAssured request spec will cause race conditions.
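To make the self-contained rule concrete, this sketch simulates parallel invocations with a plain ExecutorService. The pool is only a stand-in for TestNG's data-provider thread pool, and ParallelInvocationDemo and runInvocations are hypothetical names. Each simulated invocation builds its own request body, and the one piece of deliberately shared state is an AtomicInteger.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

class ParallelInvocationDemo {

    // Runs `rows` simulated invocations on `threads` threads and returns
    // how many completed, counted with a thread-safe AtomicInteger.
    static int runInvocations(int rows, int threads) {
        AtomicInteger completed = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (int i = 0; i < rows; i++) {
                final int rowIndex = i;
                pending.add(pool.submit(() -> {
                    // Per-invocation state: created here, never shared.
                    StringBuilder requestBody = new StringBuilder();
                    requestBody.append("row-").append(rowIndex);
                    // Shared state: safe only because the increment is atomic.
                    // A plain int field with count++ here could lose updates.
                    completed.incrementAndGet();
                }));
            }
            for (Future<?> f : pending) {
                f.get(); // wait for every invocation to finish
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println(runInvocations(1000, 4));
    }
}
```

In a real suite the per-invocation object would be the WebDriver or request spec, created inside the test method (or held in a ThreadLocal) rather than stored in a shared field.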

The DataProvider flow

Define the provider

The @DataProvider method returns Object[][], with one inner array per test invocation. Each element of the inner array maps to one @Test parameter, in order.

Naming iterations in the report

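Default names are methodName[0], methodName[1], which are not very readable. You can override the test name per iteration by having the test class implement org.testng.ITest and returning the current row's name from getTestName(), usually set in a @BeforeMethod that receives the row as Object[].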

The simpler approach is to include a String description column that you use in your assertion message:

@DataProvider(name = "loginScenarios")
public Object[][] loginScenarios() {
    return new Object[][] {
        // scenario name, email, password, expected
        {"admin valid login",  "admin@test.com", "AdminPass123", 200},
        {"user valid login",   "user@test.com",  "UserPass123",  200},
        {"bad password",       "admin@test.com", "wrong",        401},
    };
}
 
@Test(dataProvider = "loginScenarios")
public void testLoginScenario(String scenario, String email,
                               String password, int expected) {
    // Use scenario in assertion messages — shows in failure reports
    // Assert.assertEquals(actualStatus, expected, "Scenario: " + scenario);
    System.out.printf("[%s] email=%s%n", scenario, email);
    Assert.assertTrue(true);
}
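A variation on the description column, offered as a sketch rather than a TestNG feature: bundle each row into a small object whose toString() returns the scenario name. This assumes your reporter or IDE renders parameter values via toString(), which is worth checking in your own setup. LoginScenario and LoginScenarioRows are hypothetical names, and the @DataProvider annotation is left out so the snippet compiles without TestNG on the classpath.

```java
final class LoginScenario {
    final String name;
    final String email;
    final String password;
    final int expectedStatus;

    LoginScenario(String name, String email, String password, int expectedStatus) {
        this.name = name;
        this.email = email;
        this.password = password;
        this.expectedStatus = expectedStatus;
    }

    // Reporters that print parameters with toString() will show the scenario name.
    @Override
    public String toString() {
        return name;
    }
}

class LoginScenarioRows {

    // In a real suite this method would carry @DataProvider(name = "loginScenarios").
    static Object[][] loginScenarios() {
        return new Object[][] {
            {new LoginScenario("admin valid login", "admin@test.com", "AdminPass123", 200)},
            {new LoginScenario("user valid login",  "user@test.com",  "UserPass123",  200)},
            {new LoginScenario("bad password",      "admin@test.com", "wrong",        401)},
        };
    }
}
```

The test method then takes a single LoginScenario parameter. A side benefit is that the row has exactly one column, so the column count can no longer drift from the parameter count.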

⚠️ Common mistakes

  • Forgetting static on a provider in a separate class. With a static method, TestNG can call the provider without instantiating the class. A non-static provider in an external class can fail with a TestNGException, depending on the TestNG version and on whether the class has a no-argument constructor. Make it static.
  • Mismatching column count and method parameters. If the Object[] has 4 elements but the @Test method has 3 parameters, TestNG throws at runtime with a confusing argument-count error. Count the columns and count the parameters — they must match exactly.
  • Mutating shared state inside a parallel DataProvider. @DataProvider(parallel = true) runs invocations on multiple threads. If your test method reads from or writes to a shared field (a list, a counter), you'll get race conditions. Each invocation must be completely self-contained.

🎯 Practice task

Build a data-driven API test. 25–35 minutes.

  1. Create LoginData.java in a data package with two static @DataProvider methods — validLogins (2 rows) and invalidLogins (3 rows).
  2. Write LoginApiTest.java with two @Test methods, each pointing at one of those providers via dataProvider + dataProviderClass. Use System.out.printf to print the arguments — confirm each row appears in the console output.
  3. Run via mvn test. Confirm the report shows 5 separate test results (2 + 3), each individually named.
  4. Intentionally fail one row. Add Assert.fail("Forced failure for wrong@test.com") inside an if ("wrong@test.com".equals(email)) block in the invalid-login test. Run and confirm that only the wrong@test.com invocation is FAILED; the other four results remain PASSED.
  5. Try Iterator. Convert validLogins to return Iterator<Object[]> instead of Object[][]. Run and check that the tests still pass. Add a System.out.println("Loading row") at the point where each row is produced (for a custom iterator, inside next()) and confirm the rows load as TestNG iterates.
  6. Stretch: parallel provider. Add parallel = true to one provider. Add Thread.sleep(500) inside the test method. Run with and without parallel = true. The sequential run takes about 500 ms per row; the parallel run should take close to 500 ms in total, provided the thread count is at least the row count.

Next lesson: loading DataProvider data from external files — Excel, CSV, and JSON — so test data lives outside source code.