
What's the difference between blue-green and canary deployments from a testing perspective?

CI/CD & DevOps · Mid · ci-cd · deployment · blue-green · canary · release-strategy

Short answer

Blue-green flips 100% of traffic at once after smoke tests pass on green: rollback is easy, but if the smoke tests missed a bug, every user hits it. Canary shifts traffic gradually (1% → 10% → 100%) while watching error rate and latency: the blast radius is small, but the rollout is slower and needs strong observability.

Detail

Blue-green runs two identical environments. Blue is current production. Green is the new version, deployed and warmed but receiving zero user traffic. You run smoke tests against green; on pass, flip the load balancer to send 100% of traffic to green. Blue stays around for instant rollback.
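The flow above can be sketched in a few lines. This is a toy model, not a real load balancer API: `LoadBalancer`, `promote_if_healthy`, and the check names are hypothetical, and each smoke check is assumed to be a function that takes an environment name and returns pass/fail.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class LoadBalancer:
    """Toy stand-in for a load balancer that sends all traffic to one environment."""
    live: str = "blue"
    history: List[str] = field(default_factory=list)

    def switch_to(self, env: str) -> None:
        self.history.append(self.live)  # remember the previous target for rollback
        self.live = env

    def rollback(self) -> None:
        if self.history:
            self.live = self.history.pop()


def promote_if_healthy(lb: LoadBalancer, candidate: str,
                       smoke_checks: Dict[str, Callable[[str], bool]]) -> List[str]:
    """Run every smoke check against the idle environment; flip only if all pass.

    Returns the names of the failed checks (an empty list means the flip happened).
    """
    failures = [name for name, check in smoke_checks.items() if not check(candidate)]
    if not failures:
        lb.switch_to(candidate)
    return failures


# Hypothetical checks; real ones would call green's endpoints directly.
lb = LoadBalancer()
checks = {"health": lambda env: True, "login": lambda env: env == "green"}
failed = promote_if_healthy(lb, "green", checks)
```

Note that the flip is all-or-nothing: a single failed check leaves blue live, and `rollback()` is just another switch.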

Testing implications:

  • Smoke tests on green need to be comprehensive — once you flip, every user sees it.
  • Database migrations are tricky: if green needs a schema change, the schema has to work with both blue and green during the flip and for as long as rollback to blue is possible (expand-contract pattern).
  • Rollback is fast (flip the LB back).
  • One-shot risk: if smoke missed a bug or the bug only manifests under real production load, every user feels it.
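As an illustration of the expand-contract pattern mentioned above, here is a minimal sketch using an in-memory SQLite database. The table, the column names (`fullname`, `display_name`), and the helper functions are invented for the example; the point is only that the expanded schema serves both versions at once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, fullname TEXT NOT NULL)")
conn.execute("INSERT INTO users (fullname) VALUES ('Ada Lovelace')")

# EXPAND (before green deploys): add the new column as nullable so blue's
# INSERTs, which do not know about it, keep working. Then backfill.
conn.execute("ALTER TABLE users ADD COLUMN display_name TEXT")
conn.execute("UPDATE users SET display_name = fullname WHERE display_name IS NULL")


def blue_insert(name: str) -> None:
    """Old version: only knows the old column."""
    conn.execute("INSERT INTO users (fullname) VALUES (?)", (name,))


def green_insert(name: str) -> None:
    """New version: writes both columns so a rollback to blue loses nothing."""
    conn.execute(
        "INSERT INTO users (fullname, display_name) VALUES (?, ?)", (name, name)
    )


def green_read(user_id: int) -> str:
    """New version: prefers the new column, falls back for rows blue wrote."""
    row = conn.execute(
        "SELECT COALESCE(display_name, fullname) FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    return row[0]


blue_insert("Grace Hopper")   # id 2, written by blue during the overlap
green_insert("Alan Turing")   # id 3, written by green

# CONTRACT (only after blue is retired and rollback is no longer needed):
# backfill once more, then drop `fullname` in a later migration.
```

A useful smoke test on green is exactly this mix: read rows written by the old version and confirm the old version can still read rows written by the new one.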

Canary sends a small percentage of traffic to the new version (1% or one cell), watches error rate and latency vs. baseline for a soak period, then ramps to 10%, 50%, 100%.
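A rough sketch of that ramp loop follows. The function name `run_canary` and its two callbacks are assumptions made for illustration: `set_weight` stands in for whatever changes the traffic split, and `is_healthy` stands in for the soak-period comparison against baseline.

```python
from typing import Callable, Sequence


def run_canary(stages: Sequence[int],
               set_weight: Callable[[int], None],
               is_healthy: Callable[[int], bool]) -> int:
    """Walk through traffic percentages; pull the canary to 0% on the first failure.

    Returns the final canary weight: the last stage on success, 0 after a rollback.
    """
    current = 0
    for pct in stages:
        set_weight(pct)
        current = pct
        # In a real rollout, is_healthy would block for the soak period and
        # compare canary metrics with baseline before returning.
        if not is_healthy(pct):
            set_weight(0)
            return 0
    return current


# Simulated rollout where the canary starts misbehaving at the 50% stage.
weights = []
final = run_canary([1, 10, 50, 100], weights.append, lambda pct: pct < 50)
```

In this simulated run the rollout stops at 50% and traffic is pulled back, so only the earlier stages ever exposed users to the new version.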

Testing implications:

  • Smoke tests are less critical: production traffic is the test, with auto-rollback as the safety net.
  • Rich observability is required: per-version metrics, automated comparison to baseline, alerting on error budget burn.
  • Slower — full rollout takes hours or days.
  • Catches bugs that only real production conditions trigger: DB load, third-party rate limits, real user behaviour.
  • Multiple versions live simultaneously: clients, schemas, and feature flags must handle both.
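The "automated comparison to baseline" point can be made concrete with a small sketch. Everything here is an assumption for illustration: the `VersionStats` shape, the `judge_canary` name, and especially the thresholds, which a real team would derive from its own SLOs and traffic volume.

```python
from dataclasses import dataclass


@dataclass
class VersionStats:
    """Per-version metrics collected over the same soak window."""
    requests: int
    errors: int
    p95_latency_ms: float

    @property
    def error_rate(self) -> float:
        return self.errors / self.requests if self.requests else 0.0


def judge_canary(baseline: VersionStats, canary: VersionStats,
                 min_requests: int = 500,
                 error_rate_margin: float = 0.005,
                 latency_ratio_limit: float = 1.2) -> str:
    """Return 'wait', 'rollback', or 'promote' for the current canary stage."""
    # Too little canary traffic: any comparison would be noise, so keep soaking.
    if canary.requests < min_requests:
        return "wait"
    # Error rate may exceed baseline by at most the absolute margin.
    if canary.error_rate > baseline.error_rate + error_rate_margin:
        return "rollback"
    # p95 latency may exceed baseline by at most the given ratio.
    if canary.p95_latency_ms > baseline.p95_latency_ms * latency_ratio_limit:
        return "rollback"
    return "promote"


baseline = VersionStats(requests=100_000, errors=100, p95_latency_ms=200.0)
verdict = judge_canary(baseline, VersionStats(1_000, 2, 210.0))
```

The "wait" branch matters at the 1% stage, where a handful of requests can swing the error rate wildly in either direction.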

What tends to work in practice is a mix of the two: blue-green for stateless services where rollback is cheap, canary for high-traffic or risky services where blast radius matters.

For a QA practitioner, the canary world demands different skills: synthetic monitoring, SLO-based gating, automated rollback design. Less "test before release," more "observe in production safely."
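One way to picture SLO-based gating is a burn-rate check: compare the observed error rate with the error rate the SLO allows. The sketch below uses that common definition; the function names and the `max_burn_rate` default are hypothetical, not taken from any particular tool.

```python
def burn_rate(errors: int, requests: int, slo_target: float) -> float:
    """How fast the error budget is being spent; 1.0 means exactly on budget."""
    if requests == 0:
        return 0.0
    budget = 1.0 - slo_target  # allowed error fraction, e.g. 0.01 for a 99% SLO
    return (errors / requests) / budget


def should_halt_rollout(errors: int, requests: int, slo_target: float,
                        max_burn_rate: float = 2.0) -> bool:
    """Gate for the next ramp step: halt when the budget burns faster than allowed."""
    return burn_rate(errors, requests, slo_target) > max_burn_rate


# 5 errors in 1,000 requests against a 99% SLO spends budget at about half the allowed pace.
halt = should_halt_rollout(errors=5, requests=1_000, slo_target=0.99)
```

Gating on burn rate rather than raw error counts keeps the rule meaningful as canary traffic grows from 1% to 100%.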

// WHAT INTERVIEWERS LOOK FOR

Articulating the trade-off (blast radius vs. speed/complexity), naming the testing implications of each, and awareness of expand-contract for schema changes.

// COMMON PITFALL

Treating canary as 'just slower' — it requires a fundamentally different observability and gating setup, not just a new traffic-routing config.