Q39 of 40 · Core Java

How would you structure code review for Java in a QA team transitioning from manual to automation?

Core JavaLeadcode-reviewteam-leadershipautomation-transitionjavatesting-culture

Short answer

Short answer: Structure reviews in three layers: a lightweight automated gate (compilation, checkstyle, test run) that every PR must pass without human involvement; a human review focusing on test design quality (coverage intent, edge cases, flakiness risk) rather than syntax; and occasional longer design sessions for shared infrastructure (base classes, utilities, test data builders). The goal is to build team capability, not to gatekeep.

Detail

Layer 1 — Automated gate (CI, no human needed)

<!-- pom.xml — enforce style and static analysis automatically -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-checkstyle-plugin</artifactId>
  <executions>
    <execution>
      <phase>validate</phase>
      <goals><goal>check</goal></goals>
    </execution>
  </executions>
</plugin>

Automate everything with an objective answer. Human reviewers shouldn't spend time on style, import order, or null-check consistency — tools own those.

Layer 2 — Human review checklist (pair or async)

Test design (highest value)

  • Does each test verify one clear behaviour? (Arrange/Act/Assert separation)
  • Are assertions specific? (assertEquals(expected, actual) not just assertNotNull)
  • Is the test name descriptive? (givenExpiredToken_whenLogin_thenReturn401)
  • Does the test clean up after itself? (no shared mutable state leaking between tests)
  • Would this test catch a regression that matters?

Java correctness

  • No Thread.sleep() — use explicit waits or polling with timeout
  • Resources closed (try-with-resources for streams, JDBC connections)
  • Checked exceptions handled or declared, not swallowed
  • No raw types — parameterise generics

Flakiness risk

  • Hardcoded delays? Flag them.
  • Depends on execution order? Flag it.
  • Shares state across threads without synchronisation?
// Code review comment template
// Context: what the test is trying to prove
// Concern: why this specific line/pattern is risky
// Suggestion: concrete alternative

// Example:
// Context: login flow test
// Concern: Thread.sleep(3000) makes this test slow and brittle —
//          the element may appear in 200ms or take 5s on CI
// Suggestion: use WebDriverWait.until(ExpectedConditions.visibilityOf(...))

Layer 3 — Design sessions (for shared infra) Hold weekly or fortnightly sessions for changes to:

  • Base test classes, page objects, test data builders
  • Shared fixtures, factories, environment config
  • New framework dependencies

Review these synchronously. Async review on shared code leads to inconsistent patterns multiplied across the suite.

Culture for a transitioning team

  • First PRs from manual testers get mentoring reviews, not blocking feedback.
  • Approval required from 1 peer + 1 senior; but never block on trivial style (automation handles that).
  • Link every review comment to a reason — "this causes flakiness because…" teaches faster than "change this."
  • Track flaky tests by commit that introduced them. Use that data in retrospectives, not blame.

// WHAT INTERVIEWERS LOOK FOR

A tiered approach that separates automated vs human concerns. Focus on test design quality (what the test proves) over Java syntax. Sensitivity to team capability growth — this is about coaching, not gatekeeping. Strong answers mention flakiness tracking and how to handle the first contributions from manual testers without discouraging them.

// COMMON PITFALL

Making all review comments blocking regardless of severity. A transitioning team needs encouragement; a 20-comment review that blocks a first PR demoralises without teaching. Distinguish must-fix (breaks CI, flaky) from nice-to-have (style, naming) and require only the former for merge.