Q24 of 26 · SQL

How do you use SQL to compare data across environments (e.g., staging vs production)?

SQLSeniorsqlcross-environmentdata-comparisonstagingproduction

Short answer

Short answer: Use database links, linked servers, or export-to-CSV + staging import to run EXCEPT or FULL OUTER JOIN queries that show rows that differ, rows present in one environment but not the other, and aggregate discrepancies.

Detail

Cross-environment comparison is common after a production deployment, migration rollout, or environment refresh.

Approach 1 — EXCEPT on exported snapshots:

-- Export from prod (last 24h orders)
SELECT order_id, customer_id, status, total_amount
FROM   prod_orders_export
ORDER BY order_id;

-- Run EXCEPT against staging
SELECT order_id, customer_id, status, total_amount FROM prod_orders_export
EXCEPT
SELECT order_id, customer_id, status, total_amount FROM staging_orders;
-- Rows in prod but NOT in staging = missing or mismatched

Approach 2 — Summary comparison (no row-by-row data exposure):

-- Compare aggregates only (safer for sensitive data)
SELECT 'prod'    AS env, COUNT(*), SUM(total_amount), MAX(created_at)
FROM   prod_orders_export
UNION ALL
SELECT 'staging' AS env, COUNT(*), SUM(total_amount), MAX(created_at)
FROM   staging_orders;

Approach 3 — Hash comparison for large tables:

-- PostgreSQL: compare MD5 of concatenated row values
SELECT MD5(STRING_AGG(order_id::text || status || total_amount::text, ',' ORDER BY order_id))
FROM orders;
-- Run on both environments; matching hashes = identical data

Practical notes:

  • Never query production directly from a staging DB connection if they're on different networks. Export to a neutral staging table first.
  • Mask or exclude PII columns (email, name, card details) before any cross-environment comparison.
  • Focus on key business fields, not timestamps (created_at will always differ between environments).

// WHAT INTERVIEWERS LOOK FOR

Awareness of EXCEPT / FULL OUTER JOIN for comparison. The PII masking concern. Summary aggregates as a safer alternative to row-by-row comparison.