A single table can hold one kind of fact — a user, an order, a product. The interesting questions a tester needs to answer almost always span multiple tables: "which user placed this order?", "which products are in this order?", "did deleting that user also remove their orders?" The mechanism that links one table to another is a pair of small but powerful concepts: primary keys and foreign keys. They turn a stack of independent spreadsheets into a structured model of the real world.
Primary key — every row has a unique handle
A primary key (PK) is a column whose value uniquely identifies a single row in a table. By convention it's called id, and by convention it's an auto-incrementing integer — the database hands out 1, 2, 3, 4, 5… as you insert new rows.
CREATE TABLE users (
id INTEGER PRIMARY KEY AUTOINCREMENT,
name VARCHAR(100) NOT NULL,
email VARCHAR(255) UNIQUE NOT NULL
);Primary keys obey three rules:
- Unique. No two rows can have the same primary-key value. The database refuses any INSERT that would duplicate it.
- Not NULL. A primary key must always have a value. There's no such thing as "a row with no id."
- One per table. Each table has exactly one primary key. (It can be made up of multiple columns — a composite key — but conceptually it's still one identifier.)
Once a row exists, its id is its forever-handle. Other tables refer to it by that id; queries find it by that id; the application's URLs include it (/users/42).
Foreign key — pointing at another table's row
A foreign key (FK) is a column that holds the primary-key value from another table. It's the database equivalent of saying "this order belongs to user 42." The pointer is just a number; the database remembers which table it points into and enforces the relationship.
CREATE TABLE orders (
id INTEGER PRIMARY KEY AUTOINCREMENT,
user_id INTEGER REFERENCES users(id),
total DECIMAL(10, 2) NOT NULL,
status VARCHAR(20) DEFAULT 'pending',
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);user_id INTEGER REFERENCES users(id) says: "This column holds an id from the users table. The database will enforce that the value points at a real user." Try to insert an order with user_id = 9999 when no user has id 9999 and the database refuses with a foreign-key constraint violation. That refusal — called referential integrity — is one of the database's most useful guarantees, and one of the most important things testers verify.
The foreign keys in our e-commerce schema:
orders.user_id→users.id— every order belongs to a user.order_items.order_id→orders.id— every line item belongs to an order.order_items.product_id→products.id— every line item references a product.products.category_id→categories.id— every product belongs to a category.
Relationship types
Four lines of foreign keys describe three different kinds of relationship:
- One-to-many. One user has many orders; one category has many products. The "many" side holds the foreign key. This is by far the most common type — most relationships you encounter will be one-to-many.
- Many-to-many. One order contains many products, and one product appears in many orders. This is modelled with a junction table —
order_items— which has two foreign keys (order_id,product_id). Each row inorder_itemsrepresents one product appearing in one order. - One-to-one. One user has one user_profile. Less common; usually modelled as two tables sharing the same id, where the profile is a logical extension of the user.
The whole e-commerce model on one page
Drawn as an entity-relationship diagram (ERD), our e-commerce schema looks like this:
Read the chain left to right: a user places many orders; an order contains many line items; each line item references one product; each product belongs to one category. Once you can read an ERD like this, you can read any application's data model — the shapes are always similar.
Why this matters for testing
Three places where keys and relationships create real, testable behaviour:
-
Referential integrity. Try to insert an order with
user_id = 9999(no such user). The database should reject it. If your team's schema doesn't enforce that — sometimes foreign keys are declared loosely or dropped for performance — you have a real bug to surface, because orphaned orders are now possible.-- This INSERT must fail if FK is enforced INSERT INTO orders (user_id, total) VALUES (9999, 50.00); -
Cascade behaviour on delete. Delete a user — what should happen to their orders? Three sensible choices, and which one your schema picks is a correctness question, not a style question:
ON DELETE CASCADE— deleting the user also deletes their orders. Aggressive; sometimes correct (e.g., for sessions or carts).ON DELETE SET NULL— the orders stay butuser_idis set to NULL. Often used for audit-relevant data.- No cascade — the database refuses to delete the user as long as orders reference them. Forces explicit cleanup.
Test the choice. Don't assume it.
-
Join queries. To answer "show me each order with its user's name," you have to combine
ordersanduserson the foreign-key relationship. That's a JOIN, and it's the entire focus of Chapter 3. Foreign keys are what make joins possible.
Spotting orphaned and dangling references
When foreign keys are missing or weakly enforced, you can end up with two classic bugs:
- Orphaned rows. A child row exists with no matching parent — an
order_itemwhoseorder_iddoesn't exist inorders. Usually caused by a buggy DELETE that didn't cascade. - Dangling references. A foreign-key column holds a value that doesn't match any parent — perhaps because the parent was deleted by a tool that bypassed referential checks.
Both are exactly the kind of data-integrity bug testers should be hunting. We'll write the queries to find them in Chapter 5.
⚠️ Common Mistakes
- Assuming
idalways means "primary key of this table." Most of the time it does, butorder_items.idis the primary key of order_items —order_items.order_idis a foreign key. Read the column name suffix:_idusually means a foreign key (user_id,product_id,category_id). - Forgetting that foreign keys can be NULL.
orders.user_idmay be declaredNULL-able if the schema allows guest checkouts. AWHERE user_id = 42query won't match guest orders — and aWHERE user_id IS NULLquery is the right way to find them. - Treating cascade behaviour as "the database's choice." It's a deliberate decision someone made. If a test reveals that deleting a user doesn't delete their cart but does delete their orders, ask: "is that intentional?" The answer is sometimes "no," and you've found a bug.
🎯 Practice Task
20 minutes — sketch a real-world schema using the keys you just learned.
- Pick a domain you know — a library, a school, a fitness app. List 4–5 tables you'd expect. (For a library:
members,books,loans,authors,categories.) - For each table, write down: the primary key and any foreign keys. Use the convention
<other_table>_idfor foreign keys. - Sketch the relationships. For each pair of tables that's connected, label it one-to-many, many-to-many, or one-to-one. Where you find a many-to-many, draw the junction table you'd need.
- Stretch: for one of your foreign keys, decide what should happen when the parent is deleted. Cascade? Set NULL? Refuse? Justify the choice in one sentence.
You now have all the structural concepts you need. Next lesson we install a database and start running real queries — finally typing SQL into a prompt and watching it answer.