ArrayList and HashMap cover most of the territory; the two siblings worth knowing fill the remaining corners. HashSet stores unique values — adding the same value twice is silently ignored, and "is X in the set?" is fast. LinkedList is a doubly-linked list optimised for adding and removing at both ends, which makes it the natural fit for queues. Neither is your everyday default — you'll reach for ArrayList and HashMap 80% of the time — but recognising when one of these two is the right tool is the difference between elegant and over-complicated code.
HashSet — uniqueness for free
HashSet is a collection that enforces uniqueness. Add an element that's already there and add returns false and the set is unchanged:
import java.util.HashSet;
import java.util.Set;
public class UniqueErrors {
public static void main(String[] args) {
Set<String> seenErrors = new HashSet<>();
seenErrors.add("Element not found");
seenErrors.add("Timeout waiting for AJAX");
boolean wasNew = seenErrors.add("Element not found"); // duplicate
System.out.println("was new? " + wasNew); // false
System.out.println("size: " + seenErrors.size()); // 2, not 3
System.out.println("seen Timeout? " + seenErrors.contains("Timeout waiting for AJAX"));
System.out.println("contents: " + seenErrors);
}
}Output:
was new? false
size: 2
seen Timeout? true
contents: [Element not found, Timeout waiting for AJAX]
Three things to internalise:
addreturnsboolean—trueif the element was new,falseif it was already there. Useful for "have I already seen this?" checks in one line.containsis fast — average constant time, likeHashMap.containsKey. BeatsArrayList.contains(which scans every element) once you have more than a handful.- The set is unordered, just like
HashMap. If you need insertion order, useLinkedHashSet; for sorted,TreeSet.
When to reach for HashSet:
- You want to deduplicate a list —
new HashSet<>(list)strips duplicates in one line. - You're tracking whether you've seen something — flaky test fingerprints, executed test cases, error messages.
- You need fast membership tests where a
List.containswould be too slow.
A real QA snippet — collect unique error messages across a test run:
import java.util.HashSet;
import java.util.Set;
public class ErrorBucket {
public static void main(String[] args) {
String[] runErrors = {
"Element not found: #submit",
"Timeout waiting for AJAX",
"Element not found: #submit",
"Stale element reference",
"Timeout waiting for AJAX",
"Element not found: #submit"
};
Set<String> unique = new HashSet<>();
for (String e : runErrors) {
unique.add(e);
}
System.out.println("Raw errors: " + runErrors.length);
System.out.println("Unique errors: " + unique.size());
System.out.println(unique);
}
}Output:
Raw errors: 6
Unique errors: 3
[Stale element reference, Element not found: #submit, Timeout waiting for AJAX]
Six raw failures collapse to three real fingerprints. That collapse is why test reporters often pipe error messages through a Set before showing them.
LinkedList — queues, not random access
LinkedList implements both the List interface (so you can use it like an ArrayList) and the Deque interface — a double-ended queue with cheap operations at both ends:
import java.util.LinkedList;
import java.util.Queue;
public class TestQueueDemo {
public static void main(String[] args) {
LinkedList<String> queue = new LinkedList<>();
queue.addLast("smoke");
queue.addLast("regression");
queue.addLast("e2e");
queue.addFirst("setup");
System.out.println("Queue: " + queue);
while (!queue.isEmpty()) {
String next = queue.removeFirst();
System.out.println("running: " + next);
}
}
}Output:
Queue: [setup, smoke, regression, e2e]
running: setup
running: smoke
running: regression
running: e2e
addFirst, addLast, removeFirst, removeLast — that's the queue/deque API. They're constant-time on LinkedList because the structure is a chain of nodes; adding at the head or tail is just rewiring two pointers.
LinkedList also has get(i) and set(i, value), but they're slow — finding index i walks i nodes from one end. For random access, ArrayList is dramatically faster (constant time vs linear).
The decision rule:
- Mostly random access (
get(i),set(i, v)): useArrayList. Faster, less memory. - Mostly insert/remove at either end: use
LinkedList. - In doubt:
ArrayList. It's faster more often than people expect.
In real QA codebases LinkedList shows up rarely — usually as a queue of pending tasks or a sliding window of recent events. ArrayList covers most of the job.
The Java collections hierarchy
- – ArrayList — default
- – LinkedList — queue/deque
- – Vector — legacy, avoid
- – HashSet — fast, unordered
- – LinkedHashSet — preserves insertion order
- – TreeSet — sorted
- – HashMap — fast, unordered
- – LinkedHashMap — preserves insertion order
- – TreeMap — sorted by key
- LinkedList — doubly-linked, also a List –
- ArrayDeque — array-backed deque –
- PriorityQueue — sorted by priority –
The four families: ordered with duplicates (List), unordered without (Set), keyed (Map), and FIFO/LIFO (Queue/Deque). Each family has a default class (the unordered hash one) and ordered/sorted variants. Memorise the four families; the specific classes are easy to look up.
Programming to the interfaces
Same pattern as List/ArrayList from lesson 1. Declare via the interface, build via the implementation:
Set<String> seenErrors = new HashSet<>();
Map<String, String> config = new HashMap<>();
Queue<String> pending = new LinkedList<>();A method that takes Set<String> accepts any set type. A test runner that returns List<TestResult> can switch from ArrayList to a custom List implementation in the future without changing every caller.
A combined example — deduplicating a queue
import java.util.HashSet;
import java.util.LinkedList;
import java.util.Queue;
import java.util.Set;
public class FlakeQueue {
public static void main(String[] args) {
Queue<String> retryQueue = new LinkedList<>();
Set<String> alreadyEnqueued = new HashSet<>();
String[] flakyTests = {"login", "checkout", "login", "search", "checkout", "login"};
for (String name : flakyTests) {
if (alreadyEnqueued.add(name)) { // add returns true ONLY if it was new
retryQueue.add(name);
}
}
System.out.println("Queue: " + retryQueue);
while (!retryQueue.isEmpty()) {
System.out.println("retrying: " + retryQueue.remove());
}
}
}Output:
Queue: [login, checkout, search]
retrying: login
retrying: checkout
retrying: search
alreadyEnqueued.add(name) does the membership test and the insertion in one call: it returns true if it was newly added, false if it was already present. Using a Set as a "seen" guard alongside a Queue is a common pattern in retry logic and crawlers.
⚠️ Common mistakes
- Reaching for
LinkedListwhenever you hear "list."ArrayListis faster more often than people expect — array-backed structures benefit from CPU caches in ways linked structures don't. UseLinkedListonly when the access pattern is genuinely queue/deque. - Storing custom objects in a
HashSetwithout overridingequalsandhashCode. TwoTestUserobjects with identical fields will both fit in the set becauseObject's defaultequalsis reference equality. If your set should treat "same fields" as duplicates, override both methods. (IntelliJ's Generate → equals() and hashCode() writes them for you.) - Confusing
HashSet's unordered iteration for "random." It isn't random — it's deterministic given the keys' hash codes — but it isn't insertion order or sorted order either. If your output flickers between runs (which can happen across JVM versions), switch toLinkedHashSet.
🎯 Practice task
Build a deduplicating retry queue. 25 minutes.
- Create
RetryDemo.java.import java.util.HashSet;,import java.util.LinkedList;,import java.util.Queue;,import java.util.Set;. - Declare
String[] failed = {"login", "search", "login", "checkout", "search", "login", "logout"};— a list of failed test names with duplicates. - Build
Set<String> seen = new HashSet<>();andQueue<String> retryQueue = new LinkedList<>();. - Loop over
failed. For each name, callseen.add(name)— if it returnstrue, also add toretryQueue. After the loop, the queue should hold every unique failure once, in first-seen order. - Print
retryQueueand confirm:[login, search, checkout, logout]. - Drain the queue with
while (!retryQueue.isEmpty()) { retryQueue.remove() ... }and print each name. - Try replacing
HashSetwithLinkedHashSetandQueuewithArrayDeque— same APIs, different classes. Confirm everything still works. That swappability is what programming to the interface buys you. - Stretch: make a small
class Failure { String name; long when; }with a constructor andequals/hashCodebased only onname. PutFailureobjects (with differentwhenvalues but samename) into aHashSet<Failure>and confirm the set still treats same-name failures as duplicates. That's the hash contract working for custom objects.
Lesson 4 brings the chapter together with the iteration patterns — for-each, Iterator, lambdas, streams — that walk every collection you've now met.