Seed Data

Test Data · beginner · aka Seeding

// Definition

A known, controlled dataset loaded into a system before tests run, so every test starts from a predictable state. Seeding is what makes assertions reliable: if the database always begins with "User 1, 3 orders", a test can assert against those exact values instead of whatever happens to be there. It is the opposite of testing against a shared, drifting environment.

// Why it matters

Tests that depend on pre-existing data are flaky by construction: the data changes, the test breaks, and nobody trusts the result. Seed data gives each run an identical starting point, which is the foundation of test independence. QA cares because many "works on my machine" failures trace back to an unseeded or differently seeded environment.

// How to test

// Seed a known state before each test; assert against it deterministically.
// 'db:seed' is a custom task: it must be registered in the Cypress config (setupNodeEvents).
beforeEach(() => {
  cy.task('db:seed', { users: 1, ordersPerUser: 3 }) // reset to a known baseline
})

it('shows the seeded orders', () => {
  cy.visit('/orders')
  cy.get('[data-cy=order-row]').should('have.length', 3) // exact, because seeded
})
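
The `db:seed` task called above is not built into Cypress; each project has to supply it. A minimal sketch of what such a handler might look like follows. It assumes a hypothetical in-memory `db` object in place of a real database, and the names `buildSeed` and `seedTask` are illustrative, not taken from the source. The point is the shape: build rows from counters so they are identical on every run, and replace the old contents instead of appending to them.

```javascript
// Hypothetical in-memory store standing in for a real database.
const db = { users: [], orders: [] };

// Build the same rows on every call: ids come from counters, never from
// random values or timestamps, so assertions can use exact numbers.
function buildSeed({ users = 1, ordersPerUser = 3 } = {}) {
  const seed = { users: [], orders: [] };
  for (let u = 1; u <= users; u++) {
    seed.users.push({ id: u, name: `User ${u}` });
    for (let o = 1; o <= ordersPerUser; o++) {
      seed.orders.push({ id: (u - 1) * ordersPerUser + o, userId: u });
    }
  }
  return seed;
}

// Task handler: replace existing rows instead of appending, so leftovers
// from an earlier test cannot survive into the next one.
function seedTask(options) {
  const seed = buildSeed(options);
  db.users = seed.users;
  db.orders = seed.orders;
  return null; // cy.task handlers must return a value or null, not undefined
}

// In cypress.config.js the handler would be registered inside setupNodeEvents:
//   on('task', { 'db:seed': seedTask })
```

With a real database, the replace step would typically be a truncate followed by inserts; the in-memory version only illustrates the reset-then-load order.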

// Common mistakes

  • Seeding once globally instead of per-test, so tests pollute each other's data
  • Asserting against environment data that wasn't seeded (flaky by design)
  • Seed data that drifts from the real schema, so tests pass on fiction
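
The third mistake, seed data drifting from the real schema, can be caught with a cheap check before seeding. The sketch below is one possible approach, not a prescribed one: `expectedColumns` is a hypothetical, hand-written column list (a real project would more likely derive it from migrations or the live schema), and `findSchemaDrift` is an illustrative helper name.

```javascript
// Hypothetical column list; assumed here for illustration only.
const expectedColumns = {
  users: ['id', 'name'],
  orders: ['id', 'userId'],
};

// Return a list of human-readable problems; an empty list means the seed
// rows use exactly the columns the schema declares.
function findSchemaDrift(seed, columnsByTable) {
  const problems = [];
  for (const [table, rows] of Object.entries(seed)) {
    const columns = columnsByTable[table];
    if (!columns) {
      problems.push(`${table}: table not in schema`);
      continue;
    }
    rows.forEach((row, i) => {
      const keys = Object.keys(row);
      for (const key of keys) {
        if (!columns.includes(key)) {
          problems.push(`${table}[${i}]: unknown column "${key}"`);
        }
      }
      for (const column of columns) {
        if (!keys.includes(column)) {
          problems.push(`${table}[${i}]: missing column "${column}"`);
        }
      }
    });
  }
  return problems;
}
```

Running a check like this in the seed step and failing fast on a non-empty result keeps tests from passing against rows the application could never actually store.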

// Related terms