Defect density
The number of confirmed defects per thousand lines of code — the headline measure of code quality for a release.
// Formula
Defect density = confirmed defects ÷ KLOC (thousands of lines of code)
// About this metric
Defect density normalises defect counts against code size, making it possible to compare quality across releases, teams, and products of different sizes. The formula is simple: divide the number of confirmed defects by the size of the codebase in KLOC (thousands of lines of code).
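As a minimal sketch of the calculation, the function below divides a defect count by code size in KLOC. The function name, its parameters, and the example figures are illustrative assumptions, not part of any standard tool.

```python
def defect_density(confirmed_defects: int, lines_of_code: int) -> float:
    """Return confirmed defects per thousand lines of code (KLOC)."""
    if confirmed_defects < 0:
        raise ValueError("confirmed_defects must be non-negative")
    if lines_of_code <= 0:
        raise ValueError("lines_of_code must be positive")
    kloc = lines_of_code / 1000  # convert raw line count to KLOC
    return confirmed_defects / kloc


# Hypothetical release: 18 confirmed defects in 45,000 lines of code.
print(defect_density(18, 45_000))  # 0.4 defects/KLOC
```

The line count would normally come from whatever counting tool the team already uses; what matters is counting the same way for every release being compared.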
The metric is most useful tracked over time. A single number tells you little; a trend line tells you whether your process is improving. A defect density that rises from one release to the next signals that either code quality is degrading or testing is getting better at finding latent issues, and you need to know which.
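One way to look at the trend rather than a single value is to compute the density for each period and the change from the previous one. The sketch below assumes a hypothetical `Release` record and made-up figures; it only reports the direction of change and cannot say which of the two causes is behind it.

```python
from typing import List, NamedTuple, Optional, Tuple


class Release(NamedTuple):
    name: str
    confirmed_defects: int
    kloc: float  # codebase size in thousands of lines


def density_trend(releases: List[Release]) -> List[Tuple[str, float, Optional[float]]]:
    """Return (name, density, change vs previous period) for each entry."""
    rows = []
    previous = None
    for r in releases:
        density = r.confirmed_defects / r.kloc
        change = None if previous is None else density - previous
        rows.append((r.name, density, change))
        previous = density
    return rows


# Hypothetical history, for illustration only.
history = [
    Release("1.0", 30, 40.0),
    Release("1.1", 33, 44.0),
    Release("1.2", 48, 48.0),
]

for name, density, change in density_trend(history):
    delta = "n/a" if change is None else f"{change:+.2f}"
    print(f"{name}: {density:.2f} defects/KLOC (change {delta})")
```

A rise flagged this way is a prompt to check whether test coverage or triage policy changed in the same period before concluding that quality dropped.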
Capers Jones' industry surveys place the average for commercial software at around 0.5–1.0 defects/KLOC before release testing, with defect density as low as 0.1 in high-maturity teams. Embedded and safety-critical systems often require processes targeting sub-0.1 densities using formal verification and static analysis.
KLOC as a denominator has known weaknesses: it rewards verbosity and penalises terseness in ways that don't track quality. The same 10 defects score 1.0 defects/KLOC in a 10 KLOC implementation but only 0.5 if the same functionality is written in 20 KLOC. Function points are a more theoretically sound denominator, but KLOC is widely available from standard tooling, which is why it remains the practical default.
// Benchmark
Source: Capers Jones, Software Engineering Best Practices (2023)
Embedded and safety-critical systems typically target densities 5–10× lower.
// When to use this metric
Use defect density when you want a release-gate signal: "are we shipping more or fewer defects per unit of code than last release?" It is a useful lagging indicator for quarterly quality reviews and retrospectives.
Avoid it as a sprint-level metric — the denominator (codebase size) changes slowly while the numerator (confirmed defects) can spike for reasons unrelated to code quality, such as new test coverage or a changed defect triage policy. It also breaks down for very small codebases where a single defect swings the number wildly.
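The small-codebase problem is easy to see with a little arithmetic. The sketch below, using a hypothetical helper and made-up sizes, compares how much one additional confirmed defect moves the density in a 2 KLOC codebase versus a 200 KLOC one.

```python
from typing import Tuple


def density_swing(lines_of_code: int, defects: int) -> Tuple[float, float]:
    """Return the density with `defects` and with one extra defect."""
    kloc = lines_of_code / 1000
    return defects / kloc, (defects + 1) / kloc


# Hypothetical codebases, each starting with 2 confirmed defects.
for loc in (2_000, 200_000):
    before, after = density_swing(loc, 2)
    print(f"{loc:>7} LOC: {before:.3f} -> {after:.3f} defects/KLOC")
```

In the small codebase a single defect shifts the figure by half a defect per KLOC, while in the large one the shift is a hundred times smaller, which is why the metric is noisy for small projects.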
// Common pitfall
Goodhart's law applies hard here. The moment "low defect density" becomes a team target, defect classification drifts: minor issues become "feedback", reproducible bugs become "won't fix", and your number improves while quality doesn't. Use this metric to spot trends, not to set targets.