The Streams API turns "loop, filter, transform, collect" into a fluent pipeline. If you've used JavaScript's array.filter(...).map(...).reduce(...) or Python's list comprehensions, the shape will feel familiar. A Java stream takes a collection, threads each element through a chain of operations, and lands in a result — a list, a count, a sum, a Boolean, an Optional. Streams are how modern Java codebases process test data: counting failures, grouping by priority, calculating pass rates, finding the slowest test. This lesson covers the operations you'll use most and the mental model that makes them easy to reason about.
Creating a stream
Most often you stream a collection:
List<String> tests = List.of("LoginTest", "Search", "CheckoutTest");
tests.stream(); // Stream<String>
Or an array:
String[] arr = {"a", "b", "c"};
Arrays.stream(arr); // Stream<String>
Or a literal sequence:
Stream.of("a", "b", "c");
IntStream.range(0, 10); // 0, 1, ..., 9
A Stream<T> is a lazy view over a sequence of T. It doesn't store data; it pulls from the source as you ask for results.
The everyday operations
Three families:
Intermediate (return a new stream, lazy):
- filter(Predicate): keep matching elements
- map(Function): transform each element
- sorted() / sorted(Comparator): sort
- distinct(): drop duplicates
- limit(n) / skip(n): pagination
- peek(Consumer): debug-style "look at each element"
Terminal (run the pipeline, produce a result):
- collect(Collectors.toList()) or .toList(): into a List
- forEach(Consumer): side effect on each element
- count(): long
- findFirst() / findAny(): Optional<T>
- anyMatch(Predicate) / allMatch(...) / noneMatch(...): boolean
- min(Comparator) / max(...) / reduce(...): Optional<T>
Numeric specialised (when working with primitives):
- mapToInt(...), mapToLong(...), mapToDouble(...): switch to IntStream etc.
- sum(), average(), min(), max(): only on numeric streams
A pipeline always has zero or more intermediate operations followed by exactly one terminal operation. No terminal call = nothing runs; the stream is lazy and only does work when you ask for results.
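As a minimal sketch of the three families working together, the example below chains several intermediate operations, ends each pipeline with one terminal operation, and uses a numeric stream for a sum. The class name OperationFamilies, its method names, and the sample test names are illustrative assumptions, not part of the lesson's code.

```java
import java.util.List;

class OperationFamilies {
    // Intermediate operations (distinct, sorted, skip, limit) followed by one terminal (toList).
    static List<String> secondPage(List<String> names, int pageSize) {
        return names.stream()
                .distinct()        // drop duplicate names
                .sorted()          // natural (alphabetical) order
                .skip(pageSize)    // skip the first page
                .limit(pageSize)   // keep one page worth of elements
                .toList();         // terminal: the pipeline runs here
    }

    // Numeric specialisation: mapToInt switches to IntStream, which offers sum().
    static int totalLength(List<String> names) {
        return names.stream()
                .mapToInt(String::length)
                .sum();
    }

    public static void main(String[] args) {
        List<String> names = List.of("Login", "Search", "Login", "Checkout", "Export", "Logout");
        System.out.println(secondPage(names, 2)); // [Login, Logout]
        System.out.println(totalLength(names));   // 36
    }
}
```

The skip/limit pair is only meaningful after sorted() here, because pagination needs a stable order to page through.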
A real QA pipeline
import java.util.*;
import java.util.stream.*;
public class TestStreams {
    record TestResult(String name, String status, String priority, long durationMs) {}
    public static void main(String[] args) {
        List<TestResult> results = List.of(
            new TestResult("Login", "PASSED", "P0", 1450),
            new TestResult("Search", "PASSED", "P2", 820),
            new TestResult("Checkout", "FAILED", "P0", 3120),
            new TestResult("Logout", "PASSED", "P2", 990),
            new TestResult("Export", "FAILED", "P1", 4800),
            new TestResult("BillingReport", "PASSED", "P1", 2100)
        );
        // 1) Count failures
        long failCount = results.stream()
            .filter(r -> r.status().equals("FAILED"))
            .count();
        System.out.println("Failures: " + failCount);
        // 2) Failure names, sorted alphabetically
        List<String> failureNames = results.stream()
            .filter(r -> r.status().equals("FAILED"))
            .map(TestResult::name)
            .sorted()
            .toList();
        System.out.println("Failure names: " + failureNames);
        // 3) Average duration across all tests
        double avg = results.stream()
            .mapToLong(TestResult::durationMs)
            .average()
            .orElse(0);
        System.out.printf("Average: %.0fms%n", avg);
        // 4) Slowest test
        TestResult slowest = results.stream()
            .max(Comparator.comparingLong(TestResult::durationMs))
            .orElseThrow();
        System.out.println("Slowest: " + slowest.name() + " (" + slowest.durationMs() + "ms)");
        // 5) Did every P0 test pass?
        boolean allP0Passed = results.stream()
            .filter(r -> r.priority().equals("P0"))
            .allMatch(r -> r.status().equals("PASSED"));
        System.out.println("All P0 passed? " + allP0Passed);
    }
}
Output:
Failures: 2
Failure names: [Checkout, Export]
Average: 2213ms
Slowest: Export (4800ms)
All P0 passed? false
Read each pipeline as a sentence: "of the results, keep failures, project to names, sort, collect into a list." That readability is half the point — the imperative loop equivalent for each pipeline is 6–10 lines and harder to skim.
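To make the comparison concrete, here is a sketch of the "failure names" query written both ways. The class name LoopVersusStream and its two method names are assumptions made for this illustration; the record mirrors the one used in the lesson.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class LoopVersusStream {
    record TestResult(String name, String status, String priority, long durationMs) {}

    // Imperative version: explicit accumulator, condition, and sort.
    static List<String> failureNamesWithLoop(List<TestResult> results) {
        List<String> names = new ArrayList<>();
        for (TestResult r : results) {
            if (r.status().equals("FAILED")) {
                names.add(r.name());
            }
        }
        Collections.sort(names);
        return names;
    }

    // Stream version: the same steps, one per line.
    static List<String> failureNamesWithStream(List<TestResult> results) {
        return results.stream()
                .filter(r -> r.status().equals("FAILED"))
                .map(TestResult::name)
                .sorted()
                .toList();
    }

    public static void main(String[] args) {
        List<TestResult> results = List.of(
                new TestResult("Login", "PASSED", "P0", 1450),
                new TestResult("Export", "FAILED", "P1", 4800),
                new TestResult("Checkout", "FAILED", "P0", 3120));
        System.out.println(failureNamesWithLoop(results));   // [Checkout, Export]
        System.out.println(failureNamesWithStream(results)); // [Checkout, Export]
    }
}
```

One difference worth noticing: the loop version returns a mutable ArrayList, while the stream version returns the unmodifiable list produced by toList().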
Method references: the TestResult::name shorthand
map(r -> r.name()) is fine; map(TestResult::name) is shorter and reads more naturally. Method references (lesson 3) shine in stream pipelines because most maps are exactly "call one method on each element." When the lambda body is literally one method call, prefer the reference form.
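A small sketch of that equivalence, assuming a trimmed-down two-field TestResult record and a hypothetical class name MethodRefDemo. The lambda and the method reference produce the same list, and references can be chained one per map step.

```java
import java.util.List;

class MethodRefDemo {
    record TestResult(String name, String status) {}

    // Lambda form: the body is literally one method call.
    static List<String> namesWithLambda(List<TestResult> results) {
        return results.stream().map(r -> r.name()).toList();
    }

    // Method reference form: same behaviour, less noise.
    static List<String> namesWithReference(List<TestResult> results) {
        return results.stream().map(TestResult::name).toList();
    }

    // References chain naturally when each step is a single call.
    static List<String> upperCaseNames(List<TestResult> results) {
        return results.stream()
                .map(TestResult::name)
                .map(String::toUpperCase)
                .toList();
    }

    public static void main(String[] args) {
        List<TestResult> results = List.of(
                new TestResult("Login", "PASSED"),
                new TestResult("Checkout", "FAILED"));
        System.out.println(namesWithLambda(results));    // [Login, Checkout]
        System.out.println(namesWithReference(results)); // [Login, Checkout]
        System.out.println(upperCaseNames(results));     // [LOGIN, CHECKOUT]
    }
}
```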
Optional — find without null
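findFirst(), min(), max(), and the one-argument reduce(...) all return Optional<T>, a typed wrapper that's either present or empty. (The reduce overload that takes an identity value returns a plain T instead.) The point is to force you to handle the empty case rather than silently returning null: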
Optional<TestResult> firstFail = results.stream()
    .filter(r -> r.status().equals("FAILED"))
    .findFirst();
if (firstFail.isPresent()) {
    System.out.println("first failure: " + firstFail.get().name());
}
// or handle the empty case inline
String name = firstFail.map(TestResult::name).orElse("(no failures)");
System.out.println(name);
The most common operations on Optional:
- .isPresent(): boolean check
- .orElse(default): value or default
- .orElseThrow(): value or throw NoSuchElementException
- .map(...) / .filter(...): chain operations on the value if present
- .ifPresent(consumer): run code only if present
Optional is one of those features that feels weird until it's saved you from a NullPointerException. Then it stays saved.
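The sketch below exercises the Optional operations listed above in one place. The class OptionalDemo, its method names, the three-field record, and the 1000 ms "slow" threshold are all assumptions chosen for the illustration.

```java
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Optional;

class OptionalDemo {
    record TestResult(String name, String status, long durationMs) {}

    static Optional<TestResult> firstFailure(List<TestResult> results) {
        return results.stream()
                .filter(r -> r.status().equals("FAILED"))
                .findFirst();
    }

    // filter + map + orElse chained on the Optional itself, no null checks needed.
    static String describeSlowFirstFailure(List<TestResult> results) {
        return firstFailure(results)
                .filter(r -> r.durationMs() > 1000) // hypothetical "slow" threshold
                .map(r -> r.name() + " (" + r.durationMs() + "ms)")
                .orElse("(no slow first failure)");
    }

    public static void main(String[] args) {
        List<TestResult> results = List.of(
                new TestResult("Login", "PASSED", 1450),
                new TestResult("Checkout", "FAILED", 3120));
        // ifPresent runs the consumer only when a value exists.
        firstFailure(results).ifPresent(r -> System.out.println("first failure: " + r.name()));
        System.out.println(describeSlowFirstFailure(results));   // Checkout (3120ms)
        System.out.println(describeSlowFirstFailure(List.of())); // (no slow first failure)
        try {
            firstFailure(List.of()).orElseThrow();
        } catch (NoSuchElementException e) {
            System.out.println("orElseThrow() on an empty Optional threw NoSuchElementException");
        }
    }
}
```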
Grouping with Collectors.groupingBy
For grouping data by a key, Collectors.groupingBy is the heavyweight tool:
import java.util.*;
import java.util.stream.*;
Map<String, List<TestResult>> byPriority = results.stream()
    .collect(Collectors.groupingBy(TestResult::priority));
byPriority.forEach((p, list) -> System.out.println(p + " -> " + list.size() + " tests"));
Output:
P0 -> 2 tests
P1 -> 2 tests
P2 -> 2 tests
Or count failures by priority in one pass:
Map<String, Long> failuresByPriority = results.stream()
    .filter(r -> r.status().equals("FAILED"))
    .collect(Collectors.groupingBy(TestResult::priority, Collectors.counting()));
System.out.println(failuresByPriority); // {P0=1, P1=1}
The two-argument groupingBy(keyFn, downstream) lets you group and aggregate in one call. Collectors.counting(), Collectors.summingLong(...), and Collectors.mapping(...) are the building blocks. They feel verbose at first; once you've replaced your fifth nested loop with one of them, you'll see why they exist.
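As a sketch of the other two building blocks, the example below uses summingLong for a total duration per priority and mapping to keep only the names in each group. The class GroupingDemo and its method names are illustrative. It also uses the three-argument groupingBy with TreeMap::new, which is an addition here so that the keys print in sorted order; the plain two-argument form makes no promise about key order.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class GroupingDemo {
    record TestResult(String name, String status, String priority, long durationMs) {}

    // Group by priority, then sum the durations inside each group.
    static Map<String, Long> totalDurationByPriority(List<TestResult> results) {
        return results.stream().collect(Collectors.groupingBy(
                TestResult::priority,
                TreeMap::new, // sorted keys, chosen for predictable printing
                Collectors.summingLong(TestResult::durationMs)));
    }

    // Group by priority, then keep only each test's name.
    static Map<String, List<String>> namesByPriority(List<TestResult> results) {
        return results.stream().collect(Collectors.groupingBy(
                TestResult::priority,
                TreeMap::new,
                Collectors.mapping(TestResult::name, Collectors.toList())));
    }

    public static void main(String[] args) {
        List<TestResult> results = List.of(
                new TestResult("Login", "PASSED", "P0", 1450),
                new TestResult("Search", "PASSED", "P2", 820),
                new TestResult("Checkout", "FAILED", "P0", 3120),
                new TestResult("Logout", "PASSED", "P2", 990),
                new TestResult("Export", "FAILED", "P1", 4800),
                new TestResult("BillingReport", "PASSED", "P1", 2100));
        System.out.println(totalDurationByPriority(results)); // {P0=4570, P1=6900, P2=1810}
        System.out.println(namesByPriority(results));
        // {P0=[Login, Checkout], P1=[Export, BillingReport], P2=[Search, Logout]}
    }
}
```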
Streams are lazy
Intermediate operations don't do any work until a terminal operation starts the pipeline. Notice that peek fires only for the elements the pipeline actually pulls from the source:
long count = Stream.of("a", "bb", "ccc", "dddd")
    .peek(s -> System.out.println("seen: " + s))
    .filter(s -> s.length() >= 2)
    .limit(2)
    .count();
System.out.println("count: " + count);
Output:
seen: a
seen: bb
seen: ccc
count: 2
Only three elements are walked — limit(2) short-circuits the pipeline once it has two matches. Imperative code can't get that for free; you'd have to add a counter and a break. Streams handle it because each operation is asked one element at a time, on demand.
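The same behaviour can be checked with a counter instead of print statements. In this sketch (the class LazinessDemo and its method names are hypothetical), an AtomicInteger records how many elements peek sees: none when the pipeline has no terminal operation, and only as many as limit needs otherwise.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

class LazinessDemo {
    // Builds a pipeline but never calls a terminal operation on it.
    static int seenWithoutTerminal() {
        AtomicInteger seen = new AtomicInteger();
        Stream<String> pipeline = Stream.of("a", "bb", "ccc", "dddd")
                .peek(s -> seen.incrementAndGet())
                .filter(s -> s.length() >= 2);
        // 'pipeline' is deliberately left unused: no terminal call, so peek never fires.
        return seen.get();
    }

    // Runs the pipeline with a limit and reports how many elements were pulled.
    static int seenWithLimit(int limit) {
        AtomicInteger seen = new AtomicInteger();
        Stream.of("a", "bb", "ccc", "dddd")
                .peek(s -> seen.incrementAndGet())
                .filter(s -> s.length() >= 2)
                .limit(limit)
                .toList();
        return seen.get();
    }

    public static void main(String[] args) {
        System.out.println(seenWithoutTerminal()); // 0
        System.out.println(seenWithLimit(2));      // 3
        System.out.println(seenWithLimit(10));     // 4
    }
}
```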
Streams vs loops — when to pick which
Streams shine when you're transforming data — filter, map, group, collect. Plain loops are still the better choice when:
- The body is genuinely complex (multiple conditional branches, mutating multiple counters).
- You need explicit early termination with a return from the enclosing method.
- The collection is small enough that readability wins over expressiveness.
A single-pass stream of 6 results is not faster than a loop. It's not noticeably slower either. The choice is about clarity. If a pipeline reads like a sentence, use it; if it reads like a puzzle, use a loop.
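Here is a sketch of a case where a plain loop arguably reads better: it returns from the enclosing method on the first P0 failure and otherwise updates two counters along the way. The release-gate rule, the class EarlyReturnDemo, and the method name gate are invented for this illustration.

```java
import java.util.List;

class EarlyReturnDemo {
    record TestResult(String name, String status, String priority) {}

    // Several branches, two mutable counters, and an early return: awkward as a single pipeline.
    static String gate(List<TestResult> results) {
        int failed = 0;
        int skipped = 0;
        for (TestResult r : results) {
            if (r.status().equals("FAILED")) {
                if (r.priority().equals("P0")) {
                    return "BLOCKED by " + r.name(); // leave the method immediately
                }
                failed++;
            } else if (r.status().equals("SKIPPED")) {
                skipped++;
            }
        }
        return "OK (" + failed + " non-P0 failures, " + skipped + " skipped)";
    }

    public static void main(String[] args) {
        System.out.println(gate(List.of(
                new TestResult("Search", "SKIPPED", "P2"),
                new TestResult("Export", "FAILED", "P1"))));   // OK (1 non-P0 failures, 1 skipped)
        System.out.println(gate(List.of(
                new TestResult("Export", "FAILED", "P1"),
                new TestResult("Checkout", "FAILED", "P0")))); // BLOCKED by Checkout
    }
}
```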
A pipeline, step by step
Source: List<TestResult>
Six TestResult objects in memory. Calling .stream() returns a Stream<TestResult> view — no data is copied.
Five steps, one fluent expression: results.stream().filter(...).map(...).sorted().toList(). The diagram is also the shape of every stream pipeline you'll write — the verbs change, but the rhythm of source → intermediate → intermediate → terminal stays the same.
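One way to see that rhythm is to give each stage its own variable. The sketch below is the same failure-names pipeline split into named steps; the class PipelineStages, the method name, and the two-field record are assumptions for the illustration. In practice you would normally keep the fluent form.

```java
import java.util.List;
import java.util.stream.Stream;

class PipelineStages {
    record TestResult(String name, String status) {}

    static List<String> sortedFailureNames(List<TestResult> results) {
        Stream<TestResult> source = results.stream();                                   // source
        Stream<TestResult> failures = source.filter(r -> r.status().equals("FAILED")); // intermediate
        Stream<String> names = failures.map(TestResult::name);                          // intermediate
        Stream<String> sorted = names.sorted();                                         // intermediate
        return sorted.toList();                                                         // terminal: work happens here
    }

    public static void main(String[] args) {
        List<TestResult> results = List.of(
                new TestResult("Login", "PASSED"),
                new TestResult("Export", "FAILED"),
                new TestResult("Checkout", "FAILED"));
        System.out.println(sortedFailureNames(results)); // [Checkout, Export]
    }
}
```

Each intermediate variable can be used only once, which is the same single-use rule that applies to any stream.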
⚠️ Common mistakes
- .toList() returns an unmodifiable list. Calling .add(...) on the result throws UnsupportedOperationException. If you need a mutable result, either use collect(Collectors.toList()) (mutable in current implementations, though documented as "no guarantee"), wrap with new ArrayList<>(stream.toList()), or collect with Collectors.toCollection(ArrayList::new) for a guaranteed ArrayList.
- Iterating a stream twice. Streams can be consumed once. After a terminal operation the stream is closed, and reusing it throws IllegalStateException. If you need two passes, build two streams from the source: list.stream().count() and list.stream().anyMatch(...).
- Chaining forEach and expecting a result. list.stream().filter(...).forEach(...) runs the side effect; it doesn't return a list. To collect and print, split into two stages: var kept = list.stream().filter(...).toList(); kept.forEach(System.out::println);.
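The first two mistakes are easy to reproduce. This sketch (the class CommonMistakesDemo and its method names are hypothetical) triggers both exceptions on purpose and shows one way to get a list that is guaranteed to be mutable.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class CommonMistakesDemo {
    // Mistake 1: trying to add to the list returned by toList().
    static boolean addFailsOnToList() {
        List<String> names = Stream.of("Login", "Search").toList();
        try {
            names.add("Checkout");
            return false;
        } catch (UnsupportedOperationException e) {
            return true; // toList() results are unmodifiable
        }
    }

    // A mutable alternative: collect into an explicitly chosen ArrayList.
    static List<String> mutableNames() {
        List<String> names = Stream.of("Login", "Search")
                .collect(Collectors.toCollection(ArrayList::new));
        names.add("Checkout");
        return names;
    }

    // Mistake 2: running a second terminal operation on the same stream.
    static boolean reuseFails() {
        Stream<String> stream = Stream.of("a", "b");
        stream.count(); // first terminal operation consumes the stream
        try {
            stream.anyMatch(s -> s.equals("a"));
            return false;
        } catch (IllegalStateException e) {
            return true; // the stream has already been operated upon
        }
    }

    public static void main(String[] args) {
        System.out.println(addFailsOnToList()); // true
        System.out.println(mutableNames());     // [Login, Search, Checkout]
        System.out.println(reuseFails());       // true
    }
}
```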
🎯 Practice task
Build a real test report with streams. 30 minutes.
- Create StreamReport.java. Define record TestResult(String name, String status, String priority, long durationMs) {}.
- Build a List<TestResult> with at least 8 entries: mix priorities (P0, P1, P2), statuses (PASSED, FAILED, SKIPPED), and durations.
- Compute and print:
  - Failure count with .stream().filter(...).count().
  - Pass rate with (double) passed / total * 100. Use .filter(...).count() twice (or once + arithmetic).
  - Failure names sorted alphabetically with .filter(...).map(TestResult::name).sorted().toList().
  - Slowest test with .max(Comparator.comparingLong(TestResult::durationMs)).orElseThrow().
  - Average duration of P0 tests with .filter(r -> r.priority().equals("P0")).mapToLong(TestResult::durationMs).average().orElse(0).
- Use Collectors.groupingBy(TestResult::priority, Collectors.counting()) to print a count of tests per priority. Verify the totals add up to your list size.
- Use Collectors.groupingBy(TestResult::priority, Collectors.summingLong(TestResult::durationMs)) for a "total duration per priority" view.
- Use .allMatch(r -> r.priority().equals("P0") ? r.status().equals("PASSED") : true) (or a cleaner equivalent) to confirm "all P0 tests passed." Try changing one P0 to FAILED and re-running.
- Stretch: rewrite the slowest-test query with .sorted(Comparator.comparingLong(TestResult::durationMs).reversed()).findFirst(). Confirm both forms produce the same result. The .max(comparator) form is shorter; the sorted-then-findFirst form generalises to "top N" with a .limit(n).
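For the "top N" generalisation mentioned in the stretch task, a minimal sketch might look like the following. The class TopNDemo, the method slowest, and the two-field record are assumptions for the illustration, not the expected solution to the exercise.

```java
import java.util.Comparator;
import java.util.List;

class TopNDemo {
    record TestResult(String name, long durationMs) {}

    // Sort by duration, longest first, then keep the first n names.
    static List<String> slowest(List<TestResult> results, int n) {
        return results.stream()
                .sorted(Comparator.comparingLong(TestResult::durationMs).reversed())
                .limit(n)
                .map(TestResult::name)
                .toList();
    }

    public static void main(String[] args) {
        List<TestResult> results = List.of(
                new TestResult("Login", 1450),
                new TestResult("Checkout", 3120),
                new TestResult("Export", 4800),
                new TestResult("Search", 820));
        System.out.println(slowest(results, 2)); // [Export, Checkout]
    }
}
```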
That closes Chapter 8 — and the data-handling foundation of the course. Chapter 9 is the capstone: putting strings, regex, exceptions, file I/O, OOP, and streams together to build a real test data management utility.