Hardcoding test data inside @DataProvider methods works for simple cases, but as soon as the business team needs to add a new scenario, or a QA lead wants to bulk-edit 50 rows without touching Java, the hardcoded approach breaks down. External files solve this: the @DataProvider method becomes a thin reader that loads rows from a CSV, an Excel sheet, or a JSON file. The test logic stays unchanged; the data lives in a format that non-engineers can edit. This lesson covers all three external formats, a centralised DataReader utility that keeps the plumbing in one place, and how to load resources reliably in both local and CI environments.
Reading from CSV
CSV is the simplest external format — a plain text file, one row per test case, columns separated by commas. No dependencies beyond the standard library:
package com.mycompany.tests.data;
import org.testng.annotations.DataProvider;
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;
public class CsvDataProvider {
@DataProvider(name = "csvLoginData")
public Object[][] readLoginCsv() throws Exception {
List<Object[]> rows = new ArrayList<>();
// getResourceAsStream reads from the test classpath — works in CI
try (BufferedReader br = new BufferedReader(
new InputStreamReader(
getClass().getClassLoader()
.getResourceAsStream("testdata/login.csv")))) {
br.readLine(); // skip header row
String line;
while ((line = br.readLine()) != null) {
String[] cols = line.split(",");
rows.add(new Object[]{
cols[0].trim(), // email
cols[1].trim(), // password
Integer.parseInt(cols[2].trim()) // expectedStatus
});
}
}
return rows.toArray(new Object[0][]);
}
}src/test/resources/testdata/login.csv:
email,password,expectedStatus
admin@test.com,AdminPass123,200
user@test.com,UserPass123,200
wrong@test.com,BadPass,401
,password,400
admin@test.com,,400
Always use getClassLoader().getResourceAsStream() — not new FileReader("testdata/login.csv"). The latter resolves relative to the working directory, which differs between IntelliJ and CI environments. The classpath-based approach always finds the file because Maven includes src/test/resources on the test classpath.
Reading from Excel with Apache POI
Excel files are popular with QA teams that want a spreadsheet UI for test data. Add Apache POI to your pom.xml (in test scope):
<dependency>
<groupId>org.apache.poi</groupId>
<artifactId>poi-ooxml</artifactId>
<version>5.2.5</version>
<scope>test</scope>
</dependency>A generic Excel reader utility:
package com.mycompany.tests.util;
import org.apache.poi.ss.usermodel.*;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
public class ExcelReader {
public static Object[][] readSheet(String resourcePath, String sheetName) throws Exception {
try (InputStream is = ExcelReader.class.getClassLoader()
.getResourceAsStream(resourcePath);
Workbook workbook = new XSSFWorkbook(is)) {
Sheet sheet = workbook.getSheet(sheetName);
List<Object[]> rows = new ArrayList<>();
for (int r = 1; r <= sheet.getLastRowNum(); r++) { // row 0 is header
Row row = sheet.getRow(r);
if (row == null) continue;
Object[] cells = new Object[row.getLastCellNum()];
for (int c = 0; c < row.getLastCellNum(); c++) {
Cell cell = row.getCell(c, Row.MissingCellPolicy.RETURN_BLANK_AS_NULL);
cells[c] = cell == null ? null : getCellValue(cell);
}
rows.add(cells);
}
return rows.toArray(new Object[0][]);
}
}
private static Object getCellValue(Cell cell) {
return switch (cell.getCellType()) {
case NUMERIC -> (long) cell.getNumericCellValue(); // avoid 200.0 → use long
case BOOLEAN -> cell.getBooleanCellValue();
default -> cell.getStringCellValue().trim();
};
}
}DataProvider using the utility:
@DataProvider(name = "excelLoginData")
public Object[][] excelLoginData() throws Exception {
return ExcelReader.readSheet("testdata/users.xlsx", "LoginTests");
}The users.xlsx file lives in src/test/resources/testdata/. The LoginTests sheet has columns: Email, Password, ExpectedStatus. Add rows in Excel, run mvn test — TestNG picks up every row automatically.
Reading from JSON with Jackson
JSON is the best format when data has structure (nested objects, typed fields). Add Jackson to pom.xml:
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-databind</artifactId>
<version>2.17.1</version>
<scope>test</scope>
</dependency>Define a POJO for the test case:
package com.mycompany.tests.model;
public class LoginTestCase {
private String email;
private String password;
private String expectedRole;
private int expectedStatus;
// getters and setters (or use Lombok @Data)
public String getEmail() { return email; }
public String getPassword() { return password; }
public String getExpectedRole() { return expectedRole; }
public int getExpectedStatus() { return expectedStatus; }
}src/test/resources/testdata/login-tests.json:
[
{"email": "admin@test.com", "password": "AdminPass123", "expectedRole": "admin", "expectedStatus": 200},
{"email": "user@test.com", "password": "UserPass123", "expectedRole": "user", "expectedStatus": 200},
{"email": "wrong@test.com", "password": "BadPass", "expectedRole": null, "expectedStatus": 401},
{"email": "", "password": "password", "expectedRole": null, "expectedStatus": 400}
]DataProvider:
package com.mycompany.tests.data;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.mycompany.tests.model.LoginTestCase;
import org.testng.annotations.DataProvider;
import java.io.InputStream;
import java.util.Arrays;
public class JsonDataProvider {
@DataProvider(name = "jsonLoginData")
public Object[][] jsonLoginData() throws Exception {
ObjectMapper mapper = new ObjectMapper();
try (InputStream is = getClass().getClassLoader()
.getResourceAsStream("testdata/login-tests.json")) {
LoginTestCase[] cases = mapper.readValue(is, LoginTestCase[].class);
return Arrays.stream(cases)
.map(tc -> new Object[]{
tc.getEmail(),
tc.getPassword(),
tc.getExpectedRole(),
tc.getExpectedStatus()
})
.toArray(Object[][]::new);
}
}
}A centralised DataReader utility
Keep all file-reading logic in one class so test classes never import file I/O:
package com.mycompany.tests.util;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.io.*;
import java.util.*;
public class DataReader {
public static Object[][] fromCsv(String resourcePath) throws Exception {
List<Object[]> rows = new ArrayList<>();
try (BufferedReader br = new BufferedReader(
new InputStreamReader(
DataReader.class.getClassLoader()
.getResourceAsStream(resourcePath)))) {
br.readLine(); // skip header
String line;
while ((line = br.readLine()) != null) {
rows.add(Arrays.stream(line.split(","))
.map(String::trim)
.toArray());
}
}
return rows.toArray(new Object[0][]);
}
public static <T> Object[][] fromJson(String resourcePath, Class<T[]> arrayType) throws Exception {
ObjectMapper mapper = new ObjectMapper();
try (InputStream is = DataReader.class.getClassLoader()
.getResourceAsStream(resourcePath)) {
T[] items = mapper.readValue(is, arrayType);
return Arrays.stream(items)
.map(item -> new Object[]{item})
.toArray(Object[][]::new);
}
}
public static Object[][] fromExcel(String resourcePath, String sheet) throws Exception {
return ExcelReader.readSheet(resourcePath, sheet);
}
}DataProvider using DataReader:
@DataProvider(name = "csvUsers")
public Object[][] csvUsers() throws Exception {
return DataReader.fromCsv("testdata/users.csv");
}Choosing the right format
External data format trade-offs
CSV
No dependencies — standard library only
Git-friendly — plain text diffs
Easy to generate from any system
No type info — everything is a string
No structure for complex objects
Best for simple tabular test data
Excel (.xlsx)
Needs Apache POI dependency
Non-engineers can edit comfortably
Multiple sheets = multiple datasets
Binary format — hard to diff in Git
Type handling needs explicit code
Best when QA team owns the data
JSON
Needs Jackson or Gson dependency
Typed, structured, nested data
Git-friendly — text diffs clearly
Maps directly to POJOs
Familiar to developers
Best for API test payloads and config
⚠️ Common mistakes
- Using
new FileReader("testdata/login.csv"). This resolves relative to the JVM's working directory — the project root in IntelliJ, but potentiallytarget/in CI. The file is never found in CI and the@DataProviderthrowsFileNotFoundException. Always usegetClass().getClassLoader().getResourceAsStream()— it reads from the classpath regardless of working directory. - Forgetting to skip the header row. A CSV
@DataProviderthat tries to parseemail,password,expectedStatusas actual test data will throw aNumberFormatExceptiononInteger.parseInt("expectedStatus"). Always callbr.readLine()once before the loop. - Storing Excel files in
src/main/resources. Maven putssrc/main/resourceson the production classpath, which is correct if production code reads the file. Test data belongs insrc/test/resources— it's on the test classpath only, keeping test infrastructure out of production artefacts.
🎯 Practice task
Load data from all three formats. 35–45 minutes.
- Create
src/test/resources/testdata/and addlogin.csv,users.xlsx, andlogin-tests.json. Each should have at least 4 rows with email, password, and expected status code. - Write
CsvDataProvider,ExcelReaderutility, andJsonDataProviderexactly as shown. Run the CSV-backed test first — confirm all 4 rows execute as separate test results. - Add Apache POI to
pom.xmland run the Excel-backed test. Opentest-output/emailable-report.htmland confirm each row appears as a named result. - Add Jackson and run the JSON-backed test. Verify that
nullvalues in the JSON (forexpectedRole) come through asnullin Java. - Test the classpath loading. Replace
getResourceAsStreamwithnew FileReader("src/test/resources/testdata/login.csv"). Runmvn test— it may work. Now runmvn test -pl . -Dtest=CsvDataProviderTestfrom a different working directory — it fails. Revert togetResourceAsStreamand confirm both scenarios work. - Stretch — add a
DataReaderutility that centralises all three formats. Update your three providers to delegate toDataReader. Confirm all tests still pass.
Next lesson: method injection — making one @DataProvider serve multiple test methods by receiving the calling method as a parameter.