Lists and dicts cover most QA collection work, but two more built-in collections — tuple and set — fill specific niches. A tuple is an ordered, immutable record: data that shouldn't change once set. A set is an unordered bag of unique values, perfect for deduplication and membership checks. This lesson covers both, plus a side-by-side recap of all four built-in collection types so you know which to reach for next time.
Tuples — fixed-shape records
A tuple looks like a list with parentheses instead of brackets:
test_result = ("Login Test", True, 1250) # name, passed, duration
print(test_result[0]) # "Login Test"
print(test_result[2]) # 1250The big difference from a list: tuples are immutable. You can't change, add, or remove items after creation:
test_result[1] = False # TypeError: 'tuple' object does not support item assignmentThat immutability is the whole point. A tuple says "this is a record with this exact shape — don't mess with it."
The parentheses are optional in many places — 1, 2, 3 and (1, 2, 3) are the same tuple. The comma is what makes it a tuple, not the parens. (One quirk: a one-element tuple needs a trailing comma — ("only",). Without the comma, ("only") is just a parenthesised string.)
Unpacking — tuples shine here
The cleanest use of tuples is unpacking — splitting a tuple into named variables in a single statement:
test_result = ("Login Test", True, 1250)
name, passed, duration = test_result
print(name) # "Login Test"
print(passed) # True
print(duration) # 1250This is the same syntax we used for lists in the previous lesson — it works for any sequence. Tuples are the conventional choice when the values are heterogeneous (different types, fixed positions).
Returning multiple values
Python doesn't have a special "multiple return values" feature — it just returns a tuple. The caller unpacks it:
def run_test(name):
# ... pretend we ran it ...
return True, 1250 # actually returns the tuple (True, 1250)
passed, duration_ms = run_test("Login Test")
print(passed) # True
print(duration_ms) # 1250That's the cleanest "return two things" syntax in any mainstream language. Java would need a Pair<Boolean, Integer> class; Python just packs and unpacks tuples.
For functions returning more than two or three values, use a dict or a dataclass instead — covered in chapter 5. Anything beyond a triple becomes hard to remember the order of.
When to use a tuple
- A function returns multiple values (
return passed, duration_ms). - You need a dict key that holds composite info — only immutable types can be dict keys, so
{("staging", "chromium"): "fastest"}works but[…]doesn't. - You want to signal "this record is fixed shape —
(x, y)always, in that order." - You're storing many records and want a small, fast container without method overhead.
When you'd prefer a list: anything that grows, shrinks, or gets reordered. When you'd prefer a dict: anything where the names of fields matter more than positions.
Sets — unordered, unique
A set holds unique values. Curly braces, like dicts, but with values not key/value pairs:
tested_browsers = {"Chrome", "Firefox", "Chrome"}
print(tested_browsers) # {'Chrome', 'Firefox'} — duplicate dropped
print(len(tested_browsers)) # 2Adding a duplicate is silently ignored:
tested_browsers.add("Chrome")
tested_browsers.add("Safari")
print(tested_browsers) # {'Chrome', 'Firefox', 'Safari'}Two things to know up front:
- Sets are unordered. Don't expect insertion order. Don't index with
[0]— that raisesTypeError. If order matters, use a list. - An empty set is
set(), not{}.{}is an empty dict, as we saw in the last lesson.
Set operations — the killer feature
Sets shine when you need to compare two collections. Python supports the mathematical set operations directly:
suite_a = {"login", "checkout", "search", "logout"}
suite_b = {"login", "logout", "profile", "settings"}
# Tests in BOTH suites
common = suite_a & suite_b # {'login', 'logout'}
# Tests in EITHER suite (no duplicates)
all_tests = suite_a | suite_b # {'checkout', 'login', 'logout', 'profile', 'search', 'settings'}
# Tests in A but NOT in B
only_a = suite_a - suite_b # {'checkout', 'search'}
# Tests in exactly ONE suite (symmetric difference)
divergent = suite_a ^ suite_b # {'checkout', 'profile', 'search', 'settings'}| Operator | Method | Meaning |
|---|---|---|
a & b | a.intersection(b) | In both |
a | b | a.union(b) | In either |
a - b | a.difference(b) | In a, not in b |
a ^ b | a.symmetric_difference(b) | In exactly one |
Use the operators for short snippets, the named methods when you want the code to read aloud.
A QA case where sets are exactly right: comparing the tests that passed in two CI runs to find regressions:
passed_yesterday = {"login", "checkout", "search", "logout", "profile"}
passed_today = {"login", "checkout", "logout", "settings"}
new_failures = passed_yesterday - passed_today # {'profile', 'search'}
new_passes = passed_today - passed_yesterday # {'settings'}
print(f"Newly failing: {new_failures}")
print(f"Newly passing: {new_passes}")Output:
Newly failing: {'profile', 'search'}
Newly passing: {'settings'}
Three lines of set arithmetic do what a manual loop comparison would take twenty.
Membership is fast
Checking if "Chrome" in browsers on a list is O(n) — Python scans every element. On a set it's O(1) — sets are hashed. For small lists you'll never notice; for thousands of values, sets are dramatically faster.
seen = set()
for name in test_names:
if name in seen: # constant time
print(f"Duplicate: {name}")
seen.add(name)This "track what we've seen" pattern is one of the most common reasons to reach for a set.
The four built-in collections — a recap
list vs tuple vs set vs dict
list — [a, b, c]
Ordered
Mutable — change, add, remove freely
Duplicates allowed
Index by position: x[0], x[-1]
Use for: a growing collection of similar items
QA example: a run's accumulating test results
tuple — (a, b, c)
Ordered
Immutable — can't change after creation
Duplicates allowed
Index by position; usually unpacked into names
Use for: fixed-shape records, multi-return values
QA example: returning (passed, duration_ms) from a test
set — {a, b, c}
Unordered
Mutable — add, remove freely
No duplicates — added duplicates are silently dropped
Fast O(1) membership: x in s
Use for: deduplication and comparing two collections
QA example: tests that passed today minus those that passed yesterday
dict — {k: v}
Insertion-ordered (since Python 3.7)
Mutable — set/update keys freely
Keys unique; values can repeat
Look up by key: x['name']
Use for: name → value mappings, JSON, config
QA example: launch config, API headers, parsed JSON
The decision tree most QA engineers carry in their head:
- Need a name → value lookup? Dict.
- Need fast "is this value in there" or to compare two groups? Set.
- Need a fixed-shape record or multi-return? Tuple.
- Otherwise, default to list.
⚠️ Common mistakes
- One-element tuple without the trailing comma.
("only")is a string in parentheses;("only",)is a one-element tuple. The comma is what makes the tuple. Forget it andlen(x)returns the string's length, not 1. - Treating
{}as an empty set.{}is an empty dict. To create an empty set you have to writeset(). The literal-set syntax{1, 2, 3}only works once there's at least one element. - Indexing or slicing a set.
tested[0]andtested[:2]raiseTypeError. Sets are unordered — there's no "first" element. Convert to a list (list(tested)[0]) if you absolutely must, but if order matters you probably want a list to begin with.
🎯 Practice task
Use tuples and sets to compare two runs. 20-25 minutes.
- Create
compare_runs.py. - Define two sets —
yesterday = {"login", "checkout", "search", "logout", "profile"}andtoday = {"login", "checkout", "logout", "settings", "profile"}. - Print four lines using the set operators:
- tests in both runs (
&) - tests in either run (
|) - tests that ran yesterday but not today (
-) - tests that ran in exactly one run (
^)
- tests in both runs (
- Build a regression-style report:
regressed = passed_yesterday - passed_today(use the same sets but read them as "tests that passed"). Print the count. - Define a function
def run_test(name: str) -> tuple:that returns the tuple(name, True, 1250). Call it three times with different names and unpack each call's result withname, passed, duration = run_test(...). Print each. - Define
summary = (4, 1, 0)— a(passed, failed, skipped)triple. Unpack and print each piece. Trysummary[0] = 99and observe theTypeError. - Use a set to deduplicate a list with repeats:
raw = ["P0", "P1", "P0", "P2", "P1"], thenunique_priorities = set(raw). Print both lengths. - Stretch: write a function
def diff_runs(a: set, b: set) -> dict:that returns{"both": …, "only_a": …, "only_b": …, "either": …}using all four set operators. Call it with the suites from step 2 and pretty-print the result.
You can now pick the right collection for any QA job. The next lesson stitches together everything we've learned about lists and dicts to read and write JSON — the data format every API and every fixture file speaks.