Q17 of 26 · SQL

How do you verify referential integrity using SQL after a data migration?

SQLMidsqlreferential-integritymigrationdata-validationforeign-key

Short answer

Short answer: Run LEFT JOIN checks for every foreign key relationship to find child rows with no matching parent, and compare row counts against the source to catch missing records.

Detail

A migration checklist using SQL:

1. Row count comparison (source vs target):

SELECT 'orders' AS tbl, COUNT(*) FROM source_db.orders
UNION ALL
SELECT 'orders',         COUNT(*) FROM target_db.orders;
-- Counts should match

2. Orphan check for every FK relationship:

-- order_items with no matching order
SELECT COUNT(*) AS orphaned_items
FROM   order_items oi
LEFT JOIN orders o ON o.id = oi.order_id
WHERE  o.id IS NULL;
-- Expect 0

-- orders with no matching customer
SELECT COUNT(*) AS orphaned_orders
FROM   orders o
LEFT JOIN customers c ON c.id = o.customer_id
WHERE  c.id IS NULL;

3. Duplicate primary key check:

SELECT id, COUNT(*) FROM orders GROUP BY id HAVING COUNT(*) > 1;
-- Expect 0 rows

4. NULL where NOT NULL is expected:

SELECT COUNT(*) FROM orders WHERE customer_id IS NULL;
-- Expect 0

5. Value range / enum validation:

SELECT DISTINCT status FROM orders
WHERE  status NOT IN ('pending','paid','shipped','completed','cancelled');
-- Expect 0 rows (no unknown statuses)

These checks form a post-migration test suite that catches common data corruption patterns before the application goes live.

// WHAT INTERVIEWERS LOOK FOR

A structured checklist approach: row counts, orphan checks, PK uniqueness, NOT NULL, enum validity. Framing it as a repeatable test suite, not a one-off query.