Most engineers learn test automation one pattern at a time: first page objects, then factories, then ThreadLocal, then reporting. That sequenced approach is the right way to learn, but it leaves one thing untested — the ability to look at a blank repository and make every architectural decision yourself. This capstone closes that gap. You will design and build a production-quality hybrid test automation framework from scratch, for a hypothetical e-commerce application, applying every principle and pattern from this course in a single coherent system.
This is a design-first capstone. Before you write a line of test code, you will produce an architecture diagram, justify your folder structure, and write three Architecture Decision Records explaining the non-obvious choices. The deliverable is not just a working suite — it is a framework that a new engineer could onboard to in under an hour, extend without fear, and scale to thousands of tests without rewriting the core.
The scenario
The application is a mid-sized e-commerce platform with four main areas: authentication, product catalogue, shopping cart, and checkout. The QA team currently has no automated test suite. You are the first automation engineer, building the framework from nothing. The engineering team runs CI on every pull request and expects a smoke suite to complete in under 5 minutes. The test suite will eventually grow to several hundred tests, so the architecture must support parallel execution from day one.
Choose your stack
Select one stack and commit to it. The architecture applies equally to all three; use the one your team actually works with.
Java: Selenium WebDriver + TestNG + Maven + ExtentReports + SLF4J/Log4j2
TypeScript: Playwright + built-in test runner + custom utilities + Playwright HTML report + pino logging
Python: pytest + Playwright (or Selenium) + Allure + structlog
The code examples throughout this capstone use the Java stack as the reference implementation. Every pattern has a direct equivalent in TypeScript and Python — refer to the earlier chapters for those translations.
The 15 deliverables
A complete framework submission covers all of the following:
1. Architecture diagram. A hand-drawn or diagramming-tool sketch showing the five layers (Test, Page/Service, Util, Config, Driver), the direction of dependencies, and the cross-cutting concerns (logging, reporting, retry) that span layers. This is produced before any code — it is the blueprint.
2. Project skeleton with justified folder structure. Every top-level folder exists for a reason. Document those reasons in one sentence per folder — this is the seed of your CONTRIBUTING guide.
3. BaseTest and BasePage classes. BaseTest manages the driver lifecycle. BasePage encapsulates common interactions: waitForVisible, click, type, getText. Neither class contains a single business-logic statement.
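4. Driver management. A DriverManager using ThreadLocal<WebDriver> (Java) or Playwright's per-worker context isolation (TypeScript/Python). The driver is fully disposed in teardown: in Java, call driver.quit() to close the browser and then remove() on the ThreadLocal to drop the reference; in Playwright, call context.close(). In TestNG, mark the teardown method with alwaysRun = true so that it still runs when a previously invoked method failed or was skipped.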
5. Configuration management. A Config singleton with a three-level precedence chain: environment variable overrides properties file overrides default. The suite runs against local, staging, and production environments by setting one variable — ENV=staging mvn test — with no code changes.
6. Page object initialisation pattern. A factory or initialisation convention that makes creating page objects consistent. In Java: PageFactory.initElements(driver, this) in BasePage. In TypeScript: a pages(page) fixture that returns all pages together. In Python: a page_objects fixture.
7. Test data factory using the Builder pattern. A UserBuilder (or dataclass equivalent) that constructs test users with sensible defaults and explicit overrides — UserBuilder.defaults().withRole("admin").withVerifiedEmail(true).build(). The factory is parallel-safe: every call produces a unique user with a UUID-based email.
8. Logging strategy. SLF4J + Log4j2 in Java, pino in TypeScript, structlog in Python. Logging is configured per environment: DEBUG in local runs, INFO in CI. Every test start and end is logged at INFO. Every page navigation is logged at DEBUG. Every exception is logged at ERROR before re-throwing.
9. Reporting integration. ExtentReports (Java), Playwright HTML report (TypeScript), or Allure (Python). The report is generated automatically after every run, in a consistent output path that CI can upload as an artifact.
10. Listener for screenshot-on-failure. A TestListener (Java) or fixture (Python/TypeScript) that captures a full-page screenshot immediately when a test fails and attaches it to the report. The screenshot is named with the test name and a timestamp.
11. Retry analyser. A maximum of 1 retry for genuinely flaky tests. The retry decision is based on a configurable list of exception types — StaleElementReferenceException, TimeoutException — not applied to all tests indiscriminately. Retries are logged at WARN level with the original failure message.
12. Sample tests demonstrating the framework. Three complete tests:
- A smoke test: verifies the application is reachable and login works
- A data-driven test: verifies login with three sets of credentials (valid, locked-out, empty) using TestNG @DataProvider or pytest parametrize
- A cross-browser test: verifies the smoke test passes in Chrome and Firefox, driven by a config parameter
13. CI/CD pipeline. A GitHub Actions workflow (or equivalent) that runs the smoke suite on every pull request, with HEADLESS=true, and uploads the test report as a build artifact. A separate nightly workflow runs the full regression suite.
14. README. Covers prerequisites, clone and run commands, environment variables, folder structure, how to add a test, and how to contribute. A new engineer with the correct tools installed should be able to run the smoke suite in under 10 minutes from a clean checkout.
15. Architecture Decision Records. Three ADRs in docs/adr/:
- 001-driver-management.md: ThreadLocal vs static field vs dependency injection
- 002-reporting-choice.md: chosen reporting library vs the main alternative
- 003-test-data-strategy.md: Builder-based factories vs file-based data vs database seeding
[Figure: framework architecture overview. Labels, grouped as in the diagram: tests (smoke, data-driven, cross-browser); page objects (LoginPage, ProductsPage, CartPage / CheckoutPage); test data (UserBuilder, ProductFactory, parallel-safe UUIDs); configuration (env vars → file → defaults, multi-environment support, browser + headless flags); driver (ThreadLocal / Playwright context, DriverManager, alwaysRun teardown); cross-cutting concerns (SLF4J + Log4j2, ExtentReports / Allure, retry analyser).]
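The three-level precedence chain from deliverable 5 can be sketched with the standard library alone. This is a minimal illustration, not a prescribed implementation: the class shape, the key names (`base.url`, `browser`, `headless`) and the rule that maps `base.url` to the environment variable `BASE_URL` are all assumptions made for the example. The sources are passed in through the constructor so the lookup order is easy to test; a singleton would wrap one instance built from `System.getenv()` and the properties file selected by `ENV`.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Properties;

/** Hypothetical Config sketch: environment variable > properties file > built-in default. */
final class Config {
    private final Map<String, String> env;       // normally System.getenv()
    private final Properties fileProps;          // normally loaded from a per-environment file
    private final Map<String, String> defaults;  // hard-coded fallbacks

    Config(Map<String, String> env, Properties fileProps, Map<String, String> defaults) {
        this.env = env;
        this.fileProps = fileProps;
        this.defaults = defaults;
    }

    /** Resolves a key such as "base.url"; the assumed env var form is BASE_URL. */
    String get(String key) {
        String envKey = key.toUpperCase(Locale.ROOT).replace('.', '_');

        String fromEnv = env.get(envKey);
        if (fromEnv != null && !fromEnv.isEmpty()) {
            return fromEnv;                      // level 1: environment variable wins
        }
        String fromFile = fileProps.getProperty(key);
        if (fromFile != null) {
            return fromFile;                     // level 2: properties file
        }
        String fallback = defaults.get(key);
        if (fallback == null) {
            // Failing loudly beats running tests against an unintended target.
            throw new IllegalStateException("No value configured for " + key);
        }
        return fallback;                         // level 3: default
    }
}
```

Throwing on a missing key is a design choice for the sketch: a misconfigured run stops at startup instead of producing misleading failures later.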
Stretch goals
These extend the framework beyond the minimum viable deliverable. Attempt them only after all 15 core deliverables are complete and verified.
Selenium Grid with Docker Compose. A docker-compose.yml that spins up a Selenium Grid hub with Chrome and Firefox nodes. DriverManager detects a GRID_URL environment variable and creates a RemoteWebDriver instead of a local driver.
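Sharded execution across CI machines. A GitHub Actions matrix that splits the full regression suite across 3 parallel jobs. For Playwright: --shard=N/3. For TestNG: separate XML suite files per shard. With evenly balanced shards, test execution time drops towards one-third; per-job setup (checkout, dependency install, browser start-up) keeps the total somewhat higher.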
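For the TestNG route, one way to decide which test classes go into which shard is a deterministic round-robin split. The sketch below is an assumption about how you might do it, not a TestNG feature: `Sharder` and its `split` method are hypothetical names, and generating the per-shard XML suite files from each list is left out.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper: assigns test classes to shards in a stable, repeatable order. */
final class Sharder {
    private Sharder() {}

    static List<List<String>> split(List<String> testClasses, int shardCount) {
        if (shardCount < 1) {
            throw new IllegalArgumentException("shardCount must be at least 1");
        }
        // Sort first so every CI job computes the same assignment.
        List<String> sorted = new ArrayList<>(testClasses);
        Collections.sort(sorted);

        List<List<String>> shards = new ArrayList<>();
        for (int i = 0; i < shardCount; i++) {
            shards.add(new ArrayList<>());
        }
        // Round-robin: class i goes to shard i mod shardCount.
        for (int i = 0; i < sorted.size(); i++) {
            shards.get(i % shardCount).add(sorted.get(i));
        }
        return shards;
    }
}
```

Round-robin balances the number of classes, not their runtime. If a few classes dominate the run, splitting by recorded duration would balance the shards better.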
Custom Slack reporter for failures. A listener or fixture that posts a Slack message (via webhook) when any test fails in CI, including the test name, failure message, and a link to the report artifact.
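A minimal sketch of the message-building half of such a reporter, using only `java.net.http` from Java 11 onwards. Slack incoming webhooks accept a JSON body with a `text` field; everything else here is an assumption made for illustration: the class name `SlackFailureNotifier`, the message layout, and the hand-rolled escaping, which covers only backslashes, double quotes and newlines. A real framework would more likely use a JSON library.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Hypothetical notifier: builds the webhook request but leaves sending to the caller. */
final class SlackFailureNotifier {
    private SlackFailureNotifier() {}

    /** Builds the JSON payload for one failed test. */
    static String payload(String testName, String failureMessage, String reportUrl) {
        String text = "Test failed: " + testName + "\n" + failureMessage + "\nReport: " + reportUrl;
        return "{\"text\":\"" + escape(text) + "\"}";
    }

    /** Builds a POST request for the given webhook URL. */
    static HttpRequest request(String webhookUrl, String payload) {
        return HttpRequest.newBuilder(URI.create(webhookUrl))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(payload))
                .build();
    }

    // Minimal JSON string escaping; other control characters are not handled.
    private static String escape(String s) {
        return s.replace("\\", "\\\\")
                .replace("\"", "\\\"")
                .replace("\n", "\\n");
    }
}
```

To send the request, pass it to `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.discarding())` from the failure listener, and read the webhook URL from an environment variable so it never lands in the repository.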
Performance metrics collection. A PerformanceListener that measures and logs page load times for every navigation action. Summary report: slowest 5 pages by average load time, across all test runs.
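The aggregation behind that summary can be sketched as follows. The class name `LoadTimeStats` and its methods are hypothetical; how the load time itself is measured (for example, timing each navigation in BasePage) is outside the sketch. The collections are thread-safe because parallel tests would record into the same instance.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Collectors;

/** Hypothetical collector: stores load times per page and ranks pages by average. */
final class LoadTimeStats {
    private final Map<String, List<Long>> samples = new ConcurrentHashMap<>();

    /** Records one measured load time, in milliseconds, for a page. */
    void record(String page, long millis) {
        samples.computeIfAbsent(page, k -> Collections.synchronizedList(new ArrayList<>()))
               .add(millis);
    }

    /** Returns the n pages with the highest average load time, slowest first. */
    List<String> slowest(int n) {
        return samples.entrySet().stream()
                .sorted(Comparator.comparingDouble(
                        (Map.Entry<String, List<Long>> e) -> average(e.getValue())).reversed())
                .limit(n)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    private static double average(List<Long> values) {
        // Iterating a synchronized list requires holding its lock.
        synchronized (values) {
            return values.stream().mapToLong(Long::longValue).average().orElse(0.0);
        }
    }
}
```

Calling `slowest(5)` at the end of a run gives the list for the summary report. Aggregating across several runs would additionally require persisting the samples, which this sketch does not do.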
Self-healing locators with Healenium. Wrap the WebDriver instance (for example ChromeDriver) in SelfHealingDriver from the Healenium library rather than using it directly. When a previously working locator fails, Healenium searches for the closest matching element and updates its stored selector data. Document what it heals and what it cannot.
Visual regression integration. Add Applitools Eyes or Percy to the smoke suite. Three checkpoints: login page, products page, order confirmation. Run the baseline once; subsequent runs flag visual diffs.
Project work
This is a multi-session project. Use the following breakdown across your available time:
Session 1 — Architecture (60 minutes). Draw the architecture diagram. Write the folder structure with one-line justifications. Write the three ADR templates (context, options, decision, consequences) with placeholders — you will fill in decisions as you build. Do not write code yet.
Session 2 — Foundation (90 minutes). Implement Config, DriverManager, BaseTest, BasePage. Write a single "can the browser open the app" smoke test that uses them. Run it locally — headless and headful. Run it twice with thread-count="2". Both runs green. This is your foundation; everything else builds on it.
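The thread-per-driver idea at the heart of this session can be sketched without any dependencies. In the sketch, the `Driver` interface is a stand-in for Selenium's `WebDriver` (only its `quit()` method is mirrored), and the `Supplier` factory is a hypothetical hook for whatever creates your real browser. Treat it as an outline of the pattern under those assumptions, not a finished DriverManager.

```java
import java.util.function.Supplier;

/** Stand-in for Selenium's WebDriver so the sketch compiles on its own. */
interface Driver {
    void quit();
}

/** Hypothetical DriverManager: one driver per thread, cleaned up in teardown. */
final class DriverManager {
    private static final ThreadLocal<Driver> DRIVER = new ThreadLocal<>();

    private DriverManager() {}

    /** Returns the calling thread's driver, creating it on first use. */
    static Driver get(Supplier<Driver> factory) {
        Driver driver = DRIVER.get();
        if (driver == null) {
            driver = factory.get();
            DRIVER.set(driver);
        }
        return driver;
    }

    /** Intended to be called from a teardown method marked alwaysRun = true. */
    static void quit() {
        Driver driver = DRIVER.get();
        if (driver != null) {
            try {
                driver.quit();      // closes the browser
            } finally {
                DRIVER.remove();    // clears the slot so a reused pool thread starts clean
            }
        }
    }
}
```

The `finally` block matters because test runners reuse threads: if `quit()` throws and the reference stays in place, the next test on that thread would inherit a dead browser.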
Session 3 — Page objects and factories (90 minutes). Implement page objects for all four flows. Implement UserBuilder and ProductFactory. Wire them into two test classes: LoginTest and CheckoutTest. Run both in parallel — green. This is the point at which the framework structure is validated.
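A minimal sketch of the UserBuilder this session calls for, matching the call chain shown in deliverable 7. The `User` fields, the default role `customer` and the `example.test` email domain are assumptions chosen for the example; your application's user model will differ.

```java
import java.util.UUID;

/** Immutable test user; the fields are illustrative assumptions. */
final class User {
    final String email;
    final String role;
    final boolean verifiedEmail;

    User(String email, String role, boolean verifiedEmail) {
        this.email = email;
        this.role = role;
        this.verifiedEmail = verifiedEmail;
    }
}

/** Builder with sensible defaults and explicit overrides. */
final class UserBuilder {
    // A fresh UUID per builder keeps parallel tests from colliding on the same account.
    private String email = "user-" + UUID.randomUUID() + "@example.test";
    private String role = "customer";
    private boolean verifiedEmail = false;

    private UserBuilder() {}

    static UserBuilder defaults() {
        return new UserBuilder();
    }

    UserBuilder withRole(String role) {
        this.role = role;
        return this;
    }

    UserBuilder withVerifiedEmail(boolean verifiedEmail) {
        this.verifiedEmail = verifiedEmail;
        return this;
    }

    User build() {
        return new User(email, role, verifiedEmail);
    }
}
```

A test then states only what it cares about, for example `UserBuilder.defaults().withRole("admin").withVerifiedEmail(true).build()`, and inherits everything else from the defaults.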
Session 4 — Infrastructure (60 minutes). Add logging, reporting, screenshot-on-failure, and retry analyser. Trigger a deliberate failure — confirm the screenshot appears in the report and the retry fires once. Fix the deliberate failure. Infrastructure layer is complete.
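The retry decision itself is small enough to sketch in isolation. The `RetryPolicy` class below is hypothetical and uses simple class names as the configurable allow-list; in TestNG the same check would sit inside an `IRetryAnalyzer.retry(ITestResult)` implementation, reading the failure from `result.getThrowable()`. The `System.err` line stands in for a WARN-level logger call.

```java
import java.util.Set;

/** Hypothetical retry policy: retry only allow-listed exception types, up to a limit. */
final class RetryPolicy {
    private final Set<String> retryableTypes;  // simple class names, e.g. loaded from config
    private final int maxRetries;
    private int attempts = 0;

    RetryPolicy(Set<String> retryableTypes, int maxRetries) {
        this.retryableTypes = retryableTypes;
        this.maxRetries = maxRetries;
    }

    /** Returns true if the failed test should be run again. */
    boolean shouldRetry(Throwable failure) {
        if (attempts >= maxRetries) {
            return false;                       // retry budget used up
        }
        if (!retryableTypes.contains(failure.getClass().getSimpleName())) {
            return false;                       // not a known flaky failure type
        }
        attempts++;
        // Placeholder for a WARN-level log entry carrying the original failure message.
        System.err.println("WARN retry " + attempts + "/" + maxRetries
                + " after " + failure.getClass().getSimpleName() + ": " + failure.getMessage());
        return true;
    }
}
```

One policy instance per test method keeps the attempt counter from leaking between tests. With `maxRetries` set to 1, an assertion failure is never retried and a timeout is retried exactly once.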
Session 5 — CI and documentation (60 minutes). Write the GitHub Actions workflow. Push to a remote repository. Observe the smoke suite running in CI. Fix any environment-specific failures (headless flag, environment variables). Complete the README and fill in the ADR decisions based on what you actually built.
The total is approximately 6 hours of focused work. Space it across multiple days — the architecture decisions benefit from time to think.