Linear vs Modular vs Data-Driven vs Keyword-Driven


Walk into any QA team and ask "what kind of framework do you use?" and you'll get one of four answers, often delivered with more confidence than accuracy. Most teams say "modular" but operate with linear scripts. A few say "data-driven" because they read one CSV file. Labels matter less than understanding what each type actually gives you, what it costs, and when to choose it. This lesson is a side-by-side comparison of all four. By the end, you'll be able to look at any test codebase, classify it, and know what that classification means for maintainability.

Linear (record-and-play)

A linear framework is a script that runs top to bottom. Every action is inlined: open browser, navigate, find element, type, click, assert, close browser. No reuse, no abstraction.

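// Linear: everything inline, nothing shared (TestNG and Selenium assumed)
import static org.testng.Assert.assertEquals;

import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;
import org.testng.annotations.Test;

public class CheckoutTest {
    @Test
    public void testCheckoutFlow() {
        WebDriver driver = new ChromeDriver();
        driver.get("https://staging.myapp.com/login");
        driver.findElement(By.id("email")).sendKeys("alice@test.com");
        driver.findElement(By.id("password")).sendKeys("s3cr3t");
        driver.findElement(By.id("submit")).click();
        driver.findElement(By.cssSelector(".add-to-cart")).click();
        driver.findElement(By.id("checkout")).click();
        // TestNG argument order is assertEquals(actual, expected)
        assertEquals(driver.getCurrentUrl(), "https://staging.myapp.com/confirmation");
        // Skipped if the assertion above fails, which leaves the browser open
        driver.quit();
    }
}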

The same login block exists in every test that needs a logged-in user. If the email field's id changes, every one of those tests breaks and has to be fixed separately. Selenium IDE's generated code is the canonical example of this type.

When it appears: early exploratory scripts, Selenium IDE exports, proof-of-concept work. When to graduate: as soon as two tests share a step, which almost always happens with test number two.

Modular (function libraries)

The modular approach extracts common steps into shared functions or utility classes. Tests call those functions instead of duplicating code. This is the first real improvement over linear scripts and the foundation everything else builds on.

# Python, modular: shared helpers called by multiple tests
from selenium.webdriver.common.by import By


def login(driver, email, password):
    """Log in through the form; the only place that knows its locators."""
    driver.find_element(By.ID, "email").send_keys(email)
    driver.find_element(By.ID, "password").send_keys(password)
    driver.find_element(By.ID, "submit").click()


def test_checkout_flow(driver):  # `driver` is assumed to be a pytest fixture
    login(driver, "alice@test.com", "s3cr3t")
    driver.find_element(By.CSS_SELECTOR, ".add-to-cart").click()
    # ...

The login logic lives in one function. When the form changes, one function update fixes every test. This is the idea behind page objects in its simplest form: group the helpers for one page into a class and you have a page object.

The remaining gap: the test data ("alice@test.com", "s3cr3t") is still hardcoded in tests. Running the same test for 10 different users means 10 test methods with different literal strings. That's where data-driven comes in.

Data-driven

Data-driven separates test logic from test data. The same test method runs against many inputs from an external source — a JSON file, a CSV, a database query, or an in-code data provider. One test, many cases.

// TypeScript/Playwright, data-driven: same test, multiple data rows
import { test, expect } from "@playwright/test";

const loginCases = [
  { email: "admin@test.com",  password: "Admin123!", role: "admin" },
  { email: "user@test.com",   password: "User123!",  role: "user" },
  { email: "viewer@test.com", password: "View123!",  role: "viewer" },
];

// One test is registered per data row
for (const { email, password, role } of loginCases) {
  test(`logs in as ${role}`, async ({ page }) => {
    // Relative URL: assumes baseURL is set in the Playwright config
    await page.goto("/login");
    await page.fill("#email", email);
    await page.fill("#password", password);
    await page.click("#submit");
    await expect(page).toHaveURL(new RegExp(role));
  });
}

Three tests generated from three data rows. Add a fourth row and you get a fourth test — no new test code. Data lives in one place; logic lives in one place. The same structure works in Java with TestNG @DataProvider, in pytest with @pytest.mark.parametrize, and in JUnit 5 with @ParameterizedTest.

When data-driven earns its cost: when you have 5+ meaningful variations of the same test scenario. For 2 variations, inline is often more readable. For 50, data-driven is the only sane option.
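For the external-source case described in this section, the loading step can be sketched in Python with only the standard library. The CSV text, the column names, and the `load_cases` helper are assumptions made for illustration; a real suite would more likely read the data from a file on disk.

```python
import csv
import io

# Hypothetical data file contents; in a real project this would live on disk.
LOGIN_CASES_CSV = """email,password,role
admin@test.com,Admin123!,admin
user@test.com,User123!,user
viewer@test.com,View123!,viewer
"""


def load_cases(csv_text):
    """Parse CSV text into (email, password, role) tuples, one per data row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(row["email"], row["password"], row["role"]) for row in reader]


CASES = load_cases(LOGIN_CASES_CSV)

# With pytest, these rows would feed a single test function, for example:
#   @pytest.mark.parametrize("email,password,role", CASES)
#   def test_login(driver, email, password, role): ...
# so that adding a CSV row adds a test case without new test code.
```

Keeping the loader separate from the test means the data source can change (CSV, JSON, database) without touching the test logic.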

Keyword-driven

Keyword-driven frameworks express tests as tables of keywords, each keyword mapping to a function that performs an action. The goal: non-technical contributors (BAs, product managers) can read and write tests without understanding code.

# Robot Framework: keyword-driven, readable by non-developers
*** Settings ***
Library    SeleniumLibrary

*** Test Cases ***
Checkout As Admin
    Open Browser    https://staging.myapp.com    chrome
    Login           admin@test.com    Admin123!
    Add To Cart     Wireless Keyboard
    Proceed To Checkout
    Verify Order Confirmation

*** Keywords ***
# Add To Cart, Proceed To Checkout and Verify Order Confirmation are omitted here
Login
    [Arguments]    ${email}    ${password}
    Input Text     id=email    ${email}
    Input Text     id=password    ${password}
    Click Button   id=submit

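Login is a user keyword defined in the Keywords section; Add To Cart, Proceed To Checkout, and Verify Order Confirmation would be defined the same way or in a separate keyword library. Open Browser, Input Text, and Click Button come from SeleniumLibrary. The test table reads as near-plain English. Robot Framework is the best-known implementation; custom keyword libraries on top of WebDriver are another.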

The cost: keyword-driven frameworks are the most complex to build and maintain. The keyword library needs its own development lifecycle. Teams that adopt keyword-driven without a strong reason often end up maintaining two codebases — the keyword library and the test tables — instead of one.
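A custom keyword layer can be sketched in a few lines of Python, which also makes the two-codebase cost visible: the functions are the keyword library, and the `TEST_TABLE` rows are the test. All names here (`KEYWORDS`, `run_table`, the keyword functions, and the `log` list standing in for real browser actions) are hypothetical, and this is not how Robot Framework itself is implemented.

```python
# Keyword library: each keyword name maps to a plain function.
# The functions only record what they would do; a real library
# would drive a browser here instead.

def login(log, email, password):
    log.append(f"login as {email}")


def add_to_cart(log, product):
    log.append(f"add {product} to cart")


def proceed_to_checkout(log):
    log.append("proceed to checkout")


KEYWORDS = {
    "Login": login,
    "Add To Cart": add_to_cart,
    "Proceed To Checkout": proceed_to_checkout,
}

# Test table: the part a non-developer would author.
TEST_TABLE = [
    ("Login", "admin@test.com", "Admin123!"),
    ("Add To Cart", "Wireless Keyboard"),
    ("Proceed To Checkout",),
]


def run_table(table):
    """Execute each row by looking up its keyword and passing the arguments."""
    log = []
    for keyword, *args in table:
        if keyword not in KEYWORDS:
            raise KeyError(f"Unknown keyword: {keyword}")
        KEYWORDS[keyword](log, *args)
    return log
```

Calling `run_table(TEST_TABLE)` returns the list of recorded actions in order. Every new keyword a table author wants means a new function plus a registry entry, which is the maintenance load the paragraph above describes.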

Four framework types at a glance

Linear

  • All steps inlined in every test
  • Zero reuse between tests
  • One UI change breaks every test that touches it
  • Fast to write test number 1
  • Not suitable for a suite you have to maintain

Modular

  • Shared helpers / page objects
  • Tests call functions, not selectors
  • UI changes are fixed in one place
  • Data still lives in test methods
  • Foundation of the other types

Data-Driven

  • Test logic separated from test data
  • One test method, many data rows
  • Scales coverage without new tests
  • Needs a data management strategy
  • Built on top of the modular layer

Keyword-Driven

  • Tests as keyword tables
  • Non-technical contributors can write tests
  • Most complex to build and maintain
  • Robot Framework is the best-known tool
  • Use only when non-dev authoring is required

How they relate — not a hierarchy

These types are not a ladder where keyword-driven is the best. They solve different problems:

  • Modular solves duplication — shared steps in one place.
  • Data-driven solves combinatorial coverage — many inputs, one test.
  • Keyword-driven solves authorship — non-technical contributors.

A team that doesn't need non-technical contributors has no reason to build a keyword layer. A team that only has 3 data variations doesn't need a CSV file — inline the data and move on.

Most modern teams operate a hybrid framework that combines the modular base (page objects) with data-driven capability (parameterised tests or data providers) and optionally a BDD layer (Cucumber Gherkin on top). Keyword-driven in the Robot Framework style is common in enterprise contexts and mobile testing but less common in modern web automation.

⚠️ Common mistakes

  • Calling something modular when it's actually linear. If your "modular" framework has a @BeforeMethod that duplicates 10 lines from another @BeforeMethod in a different class, it's still linear: the duplication has just moved to setup methods.
  • Going data-driven too early. Parameterising tests that only ever run with one dataset creates complexity with no benefit. Add the data layer when you have real data variations to cover.
  • Building a keyword library for an audience that never writes tests. Keyword-driven frameworks are only worth the investment when the non-technical contributors actually use them. Build for the audience you have, not the audience you imagine.

🎯 Practice task

Classify and refactor — 30 minutes.

  1. Classify your current project. Look at three of your existing tests. Which type are they? Look for: duplicate setup code (→ linear), shared helper functions (→ modular), hardcoded data variations (→ missing data-driven), keyword tables (→ keyword-driven).
  2. Modular refactor. Pick one test that duplicates steps from another test. Extract the shared steps into a helper method or page object method. Run both tests — they should still pass. How many lines did you eliminate?
  3. Data-driven refactor. Find a test that has a near-identical duplicate with different input values. Rewrite both as a single parameterised test. In Java, use @DataProvider (TestNG) or @ParameterizedTest (JUnit 5). In Python, use @pytest.mark.parametrize. In TypeScript, use a for...of loop over a data array. Confirm you get the same coverage with less code.
  4. Stretch — keyword vocabulary sketch. Imagine your team's BAs need to write test scenarios for a checkout flow without writing code. Draft a keyword vocabulary on paper: what 8-10 keywords would cover the whole flow? (OPEN_APP, LOGIN_AS, SEARCH_FOR_PRODUCT, etc.) How many functions would you need to implement? This exercise reveals whether keyword-driven would earn its cost for your team.

Next lesson: hybrid frameworks — how modern teams combine modular, data-driven, and optionally BDD into one coherent architecture.
