Guided Walkthrough — Queries for Order Verification, Data Cleanup, Reporting

12 min read

This lesson is the build. Last lesson described what ShopEasy DB Pack should do; this one shows the SQL for each component, with expected results and the reasoning behind every choice. By the end you'll have the core queries you need; the next lesson is your self-review and stretch goals.

The end-to-end flow

Step 1 of 6

01_setup.sql

Seed the database with deterministic test data inside a transaction.

Step 1 — Data setup (01_setup.sql)

Wrap every INSERT in a single transaction so partial failure can't leave you with broken seed data. Use the practice-database schema from Chapter 1, Lesson 4 — this is purely the data layer on top.

BEGIN TRANSACTION;
 
-- 5 categories
INSERT INTO categories (name) VALUES
  ('Electronics'), ('Books'), ('Home'), ('Office'), ('Outdoor');
 
-- 10 users — recognisable +ci@qa.local pattern for safe cleanup
INSERT INTO users (name, email, role, is_active) VALUES
  ('Alice Khan',   'alice+ci@qa.local',   'admin',  TRUE),
  ('Bob Patel',    'bob+ci@qa.local',     'tester', TRUE),
  ('Carol Singh',  'carol+ci@qa.local',   'tester', FALSE),
  ('Dan Müller',   'dan+ci@qa.local',     'viewer', TRUE),
  ('Esha Roy',     'esha+ci@qa.local',    'tester', TRUE),
  ('Farah Ali',    'farah+ci@qa.local',   'tester', TRUE),
  ('Gabriel Sosa', 'gabriel+ci@qa.local', 'admin',  TRUE),
  ('Hina Mehta',   'hina+ci@qa.local',    'tester', TRUE),
  ('Ivan Petrov',  'ivan+ci@qa.local',    'viewer', FALSE),
  ('June Park',    'june+ci@qa.local',    'tester', TRUE);
 
-- 30 products — six per category, mix of in-stock and out-of-stock
-- (truncated for the lesson; expand for your run)
INSERT INTO products (name, price, category_id, in_stock) VALUES
  ('Wireless Mouse',           25.00, 1, TRUE),
  ('Mechanical Keyboard Pro', 120.00, 1, TRUE),
  ('USB-C Hub',                45.00, 1, FALSE),
  ('27-inch Monitor',         320.00, 1, TRUE),
  ('Webcam HD',                75.00, 1, FALSE),
  ('Bluetooth Speaker',        65.00, 1, TRUE),
  ('SQL for Testers',          18.50, 2, TRUE),
  ('Clean Code',               30.00, 2, TRUE);
  -- ... add the remaining 22 products to reach 30
 
-- 20 orders — spread across users, statuses, and the last 60 days
INSERT INTO orders (user_id, total, status, created_at) VALUES
  (1, 145.00, 'completed', DATE('now', '-55 days')),
  (2,  48.50, 'completed', DATE('now', '-42 days')),
  (2, 320.00, 'shipped',   DATE('now', '-38 days')),
  (4,  63.00, 'pending',   DATE('now', '-3 days')),
  (5,  25.00, 'cancelled', DATE('now', '-1 day'));
  -- ... add the remaining 15 orders
 
-- 60 order_items — 1-5 per order
-- For two orders, deliberately make sum(items) != orders.total so the
-- integrity sweep has something to find.
INSERT INTO order_items (order_id, product_id, quantity, price) VALUES
  (1, 1, 1, 25.00),
  (1, 2, 1, 120.00),
  (2, 7, 1, 18.50),
  (2, 5, 1, 75.00),  -- 18.50 + 75.00 = 93.50, but order 2 records 48.50 (mismatch — intentional)
  (3, 4, 1, 320.00);
  -- ... add the remaining 55 line items
 
COMMIT;

The two principles to take away: deterministic data (same INSERTs → same expected results, every time) and identifiable test rows (every test email ends +ci@qa.local, every test product can be filtered down later). Both make cleanup trivial.

Step 2 — Verification queries (02_verification.sql)

Three of the ten, with expected output.

Verify user registration

-- Run after a POST /api/users with email = 'qa-2026-05-06@qa.local'
SELECT id, name, email, role, is_active, created_at
FROM users
WHERE email = 'qa-2026-05-06@qa.local';

Expected: exactly one row, with name and role matching what the API claimed; is_active = TRUE; created_at within the last minute. The test wraps this query and asserts on each field.

Verify order creation — order and items together

-- Given the order id returned by POST /api/orders
SELECT
    o.id,
    o.user_id,
    o.total,
    o.status,
    COUNT(oi.id)                AS item_count,
    SUM(oi.quantity * oi.price) AS items_total
FROM orders o
LEFT JOIN order_items oi ON oi.order_id = o.id
WHERE o.id = 21
GROUP BY o.id, o.user_id, o.total, o.status;

Expected:

| id | user_id | total  | status   | item_count | items_total |
|----|---------|--------|----------|------------|-------------|
| 21 | 5       | 95.00  | pending  | 3          |       95.00 |

The test asserts: order exists, has the expected user, item_count = 3 (matching the cart), items_total = total (no calculation drift).

Verify order total = SUM(items) — the integrity check, scoped to one order

-- Returns 0 rows if the order is consistent;
-- Returns a row with the diff if it isn't.
SELECT
    o.id,
    o.total                            AS recorded,
    SUM(oi.quantity * oi.price)        AS computed,
    o.total - SUM(oi.quantity * oi.price) AS diff
FROM orders o
JOIN order_items oi ON oi.order_id = o.id
WHERE o.id = 21
GROUP BY o.id, o.total
HAVING o.total <> SUM(oi.quantity * oi.price);

Same shape as the database-wide integrity sweep, scoped down to one order. Useful as a per-test post-condition.

Step 3 — Data integrity queries (03_integrity.sql)

Two of the five.

Duplicate emails

-- Returns one row per offending email
SELECT email, COUNT(*) AS dupes
FROM users
GROUP BY email
HAVING COUNT(*) > 1;

Expected against your seed data: zero rows. If anything comes back, either the UNIQUE constraint is missing or a bulk import bypassed it. Both are bugs.

Orphaned order items

-- Returns line items whose parent order doesn't exist
SELECT oi.id, oi.order_id, oi.product_id, oi.quantity
FROM order_items oi
LEFT JOIN orders o ON oi.order_id = o.id
WHERE o.id IS NULL;

Zero rows in a healthy database. Each row that comes back is a real defect — usually caused by a DELETE on orders that didn't cascade.

The other three integrity queries — mismatched totals, dormant users, negative prices — follow the same shapes (Chapter 5, Lesson 3). Write them for your pack.

Step 4 — Reporting queries (04_reporting.sql)

Two of the five.

Orders per status — count and percentage

SELECT
    status,
    COUNT(*)                                                      AS count,
    ROUND(COUNT(*) * 100.0 / (SELECT COUNT(*) FROM orders), 1)    AS percentage
FROM orders
GROUP BY status
ORDER BY count DESC;

Sample result:

| status     | count | percentage |
|------------|-------|------------|
| completed  | 8     |       40.0 |
| shipped    | 5     |       25.0 |
| pending    | 4     |       20.0 |
| cancelled  | 2     |       10.0 |
| processing | 1     |        5.0 |

Drop straight into a Slack message. PM-friendly.

Revenue by category

SELECT
    c.name                              AS category,
    COUNT(DISTINCT o.id)                AS orders_with_this_category,
    SUM(oi.quantity * oi.price)         AS revenue
FROM categories c
JOIN products p     ON p.category_id = c.id
JOIN order_items oi ON oi.product_id = p.id
JOIN orders o       ON o.id          = oi.order_id
GROUP BY c.id, c.name
ORDER BY revenue DESC;

Sample result:

| category    | orders_with_this_category | revenue |
|-------------|---------------------------|---------|
| Electronics | 12                        | 1450.00 |
| Home        |  6                        |  340.00 |
| Books       |  4                        |   97.00 |
| Office      |  3                        |   84.00 |
| Outdoor     |  2                        |   45.00 |

Notice the four-table JOIN. Each ON clause reads naturally — just walk the schema's foreign keys.

The other three reports — record counts, top customers, weekly avg order value — use shapes from Chapter 5, Lesson 4. Build them out in your pack.

Step 5 — Cleanup (05_cleanup.sql)

Reverse the dependency order: children first, then parents.

BEGIN TRANSACTION;
 
DELETE FROM order_items
WHERE order_id IN (
    SELECT id FROM orders WHERE user_id IN (
        SELECT id FROM users WHERE email LIKE '%+ci@qa.local'
    )
);
 
DELETE FROM orders
WHERE user_id IN (
    SELECT id FROM users WHERE email LIKE '%+ci@qa.local'
);
 
DELETE FROM users    WHERE email LIKE '%+ci@qa.local';
DELETE FROM products WHERE name LIKE 'Test%' OR name LIKE 'CI %';
DELETE FROM categories WHERE name IN ('Outdoor', 'Office');  -- if you added these
 
COMMIT;

Two principles: filter on the recognisable suffix (+ci@qa.local) and delete in dependency order. Both protect you from accidentally wiping rows other tests created.

If your environment is single-tenant (a dedicated CI database that only runs your test pack), TRUNCATE is faster:

-- Single-tenant alternative
TRUNCATE TABLE order_items;
TRUNCATE TABLE orders;
TRUNCATE TABLE products;
TRUNCATE TABLE users;
TRUNCATE TABLE categories;

Use TRUNCATE only when you control the whole database. For shared environments, the filtered DELETE is the safe default.

Putting it together

Your CI pipeline now does:

01_setup.sql          → seed the database
{ run application tests, calling 02_verification.sql per assertion }
03_integrity.sql      → fail the build if any integrity query returns rows
04_reporting.sql      → publish the report as a build artefact
05_cleanup.sql        → restore zero state

That's the whole capstone. Five files, around 25 queries, every concept from this course in the same place. Next lesson is your self-review checklist plus the stretch goals — views, stored procedures, and window-function trend queries — that take this pack from "competent" to "team superpower."

// tip to track lessons you complete and pick up where you left off across devices.