Q15 of 40 · Core Java

Explain the Stream API with a realistic data transformation example.

Core Java · Mid · java-streams · functional-programming · collections · lambda

Short answer

The Stream API provides a declarative pipeline for processing sequences of elements: source → zero-or-more intermediate operations (filter, map, sorted, flatMap) → one terminal operation (collect, forEach, reduce, count). Streams are lazy: intermediate operations don't execute until a terminal operation is called.

Detail

A Stream<T> is not a data structure — it's a pipeline that describes a sequence of transformations. You get a stream from a source (collection, array, Stream.of(), Files.lines()), chain intermediate operations, and end with one terminal operation that triggers execution.

Lazy evaluation: intermediate operations like filter() and map() are lazy — they build a description of work but do nothing until a terminal operation is called. This means a filter().map().findFirst() pipeline can short-circuit after finding the first match, avoiding processing the rest of the stream.
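The laziness and short-circuiting described above can be observed with peek(). The following is a minimal sketch, not a canonical demo: the class name LazyDemo, the visited list, and the sample strings are made up for illustration, and it assumes a sequential stream over a List.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.stream.Stream;

public class LazyDemo {
    // Hypothetical log of the elements the pipeline actually pulled through.
    static final List<String> visited = new ArrayList<>();

    // Builds a pipeline but never calls a terminal operation.
    static int touchedWithoutTerminal(List<String> names) {
        visited.clear();
        Stream<String> pending = names.stream()
            .peek(visited::add)
            .filter(n -> n.length() > 3);
        return visited.size(); // still 0: nothing has executed yet
    }

    // findFirst() is a short-circuiting terminal operation.
    static Optional<String> firstLongName(List<String> names) {
        visited.clear();
        return names.stream()
            .peek(visited::add)            // record each element as it is pulled
            .filter(n -> n.length() > 3)
            .map(String::toUpperCase)
            .findFirst();                  // stops at the first match
    }

    public static void main(String[] args) {
        List<String> names = List.of("ab", "login", "checkout", "search");
        System.out.println(touchedWithoutTerminal(names)); // 0
        System.out.println(firstLongName(names));          // Optional[LOGIN]
        System.out.println(visited);                       // [ab, login]
    }
}
```

Only "ab" and "login" are ever visited; "checkout" and "search" are never pulled from the source because findFirst() already has its answer.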

Key intermediate operations:

  • filter(Predicate) — keep elements matching a predicate
  • map(Function) — transform each element, possibly into a different type
  • flatMap(Function) — map each element to a stream and flatten the results into a single stream
  • sorted() / sorted(Comparator) — sort elements
  • distinct() — remove duplicates
  • limit(n) / skip(n) — windowing
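To show how the intermediate operations listed above compose, here is a small sketch that chains most of them in one pipeline. The class name, method name, and sample tags are illustrative assumptions; String.isBlank() needs Java 11+ and Stream.toList() needs Java 16+.

```java
import java.util.List;

public class IntermediateOpsDemo {
    // Returns the "second page" (page size 2) of unique, sorted, lower-cased tags.
    static List<String> secondPageOfTags(List<List<String>> tagsPerTest) {
        return tagsPerTest.stream()
            .flatMap(List::stream)        // List<List<String>> -> one stream of tags
            .map(String::toLowerCase)     // normalise case
            .filter(t -> !t.isBlank())    // drop empty or whitespace-only tags
            .distinct()                   // remove duplicates
            .sorted()                     // natural (alphabetical) order
            .skip(2)                      // skip the first page
            .limit(2)                     // take at most one page
            .toList();
    }

    public static void main(String[] args) {
        List<List<String>> tags = List.of(
            List.of("Smoke", "api"),
            List.of("UI", "smoke"),
            List.of("regression", " "));
        System.out.println(secondPageOfTags(tags)); // [smoke, ui]
    }
}
```

Note that distinct() and sorted() are stateful: sorted() has to see every element before it can emit the first one, so placing filter() ahead of them keeps the buffered set small.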

Key terminal operations:

  • collect(Collectors.toList()) / toSet() / toMap() — materialise into a collection
  • forEach(Consumer) — side-effect for each element
  • reduce(BinaryOperator) — fold to a single value (returns an Optional, empty for an empty stream)
  • count(), findFirst(), anyMatch(), allMatch() — aggregation / short-circuit
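The terminal operations listed above can be sketched against a small domain type. The Result record and the method names here are hypothetical, chosen only for illustration. One detail worth knowing: Collectors.toMap without a merge function throws IllegalStateException on duplicate keys, so this sketch assumes ids are unique.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collectors;

public class TerminalOpsDemo {
    // Hypothetical domain type for the sketch.
    record Result(String id, boolean passed, long durationMs) {}

    // reduce(BinaryOperator) yields an Optional: empty when the stream is empty.
    static Optional<Long> totalDuration(List<Result> rs) {
        return rs.stream().map(Result::durationMs).reduce(Long::sum);
    }

    // toMap materialises key/value pairs; assumes ids are unique.
    static Map<String, Long> durationById(List<Result> rs) {
        return rs.stream().collect(Collectors.toMap(Result::id, Result::durationMs));
    }

    // anyMatch stops at the first failure it sees.
    static boolean anyFailed(List<Result> rs) {
        return rs.stream().anyMatch(r -> !r.passed());
    }

    // allMatch stops at the first element that violates the predicate.
    static boolean allUnder(List<Result> rs, long limitMs) {
        return rs.stream().allMatch(r -> r.durationMs() < limitMs);
    }

    public static void main(String[] args) {
        List<Result> rs = List.of(
            new Result("t1", true, 120),
            new Result("t2", false, 300));
        System.out.println(totalDuration(rs));   // Optional[420]
        System.out.println(anyFailed(rs));       // true
        System.out.println(allUnder(rs, 200));   // false
    }
}
```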

Streams are not reusable — a stream can be consumed only once. If you need to iterate a result multiple times, collect to a list first.
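The single-use rule is easy to reproduce. The sketch below (class and method names are illustrative) runs a second terminal operation on an already-consumed stream and catches the resulting IllegalStateException, then shows the collect-first alternative.

```java
import java.util.List;
import java.util.stream.Stream;

public class SingleUseDemo {
    // Returns true if a second terminal operation on the same stream throws.
    static boolean secondUseFails() {
        Stream<String> ids = Stream.of("t1", "t2", "t3");
        ids.count();                       // first terminal operation consumes the stream
        try {
            ids.forEach(id -> { });        // second terminal operation on the same object
            return false;
        } catch (IllegalStateException e) {
            return true;                   // stream has already been operated upon or closed
        }
    }

    // Collect once, then iterate the list as often as needed.
    static long[] reuseViaList() {
        List<String> ids = Stream.of("t1", "t2", "t3").toList();
        long all = ids.stream().count();
        long endingInOne = ids.stream().filter(id -> id.endsWith("1")).count();
        return new long[] { all, endingInOne };
    }

    public static void main(String[] args) {
        System.out.println(secondUseFails()); // true
    }
}
```

Each call to ids.stream() on the collected list creates a fresh pipeline, which is why the second method works.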

For test data: streams excel at filtering test cases by tag, grouping scenarios by category, or transforming a list of raw API responses into typed domain objects.

// EXAMPLE

StreamExample.java

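import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.OptionalDouble;
import java.util.stream.Collectors;

record TestCase(String id, String tag, boolean passed, long durationMs) {}

// loadTestResults() is assumed to be defined elsewhere and to return the parsed results
List<TestCase> results = loadTestResults();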

// Collect IDs of failed smoke tests sorted by duration descending
List<String> failedSmokeIds = results.stream()
    .filter(tc -> !tc.passed())               // keep failures
    .filter(tc -> "smoke".equals(tc.tag()))   // smoke tag only
    .sorted(Comparator.comparingLong(TestCase::durationMs).reversed())
    .map(TestCase::id)                        // extract id
    .toList();                                // Java 16+ (immutable list)

// Count failures per tag
Map<String, Long> failuresByTag = results.stream()
    .filter(tc -> !tc.passed())
    .collect(Collectors.groupingBy(TestCase::tag, Collectors.counting()));

// Average duration of passing tests (OptionalDouble is empty if no test passed)
OptionalDouble avgPassMs = results.stream()
    .filter(TestCase::passed)
    .mapToLong(TestCase::durationMs)
    .average();

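// flatMap: map each test case to its steps and flatten into one stream
// (getSteps(id) is assumed to be defined elsewhere and to return a List<String>)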
List<String> allSteps = results.stream()
    .flatMap(tc -> getSteps(tc.id()).stream())
    .distinct()
    .toList();

// WHAT INTERVIEWERS LOOK FOR

Correct lazy-evaluation explanation, knowledge of common intermediate and terminal operations, and a realistic example beyond trivial number lists. Strong answers mention the non-reusability of streams and use records or real domain types in the example.

// COMMON PITFALL

Calling a terminal operation such as collect(Collectors.toList()) and then using the same stream object again throws an IllegalStateException, because streams are single-use. Also common: using forEach() for transformations that should use map(). forEach is for side effects only.
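The forEach-versus-map pitfall can be illustrated side by side. This is a sketch with made-up names and inputs: both methods return the same list for a sequential stream, but the first mutates an external ArrayList from inside the pipeline, which is not safe if the stream is ever made parallel.

```java
import java.util.ArrayList;
import java.util.List;

public class ForEachVsMapDemo {
    // Discouraged: a transformation written as a side effect on an external list.
    static List<String> normaliseWithForEach(List<String> rawIds) {
        List<String> out = new ArrayList<>();
        rawIds.stream().forEach(id -> out.add(id.trim().toUpperCase()));
        return out;
    }

    // Preferred: the transformation is part of the pipeline, the result is collected.
    static List<String> normaliseWithMap(List<String> rawIds) {
        return rawIds.stream()
            .map(String::trim)
            .map(String::toUpperCase)
            .toList();
    }

    public static void main(String[] args) {
        List<String> raw = List.of(" tc-1 ", "tc-2");
        System.out.println(normaliseWithForEach(raw)); // [TC-1, TC-2]
        System.out.println(normaliseWithMap(raw));     // [TC-1, TC-2]
    }
}
```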