Q37 of 38 · Performance
How would you build a performance regression detection system that scales across multiple services?
Short answer
Standardise the test-and-report pipeline across services: a common schema for result storage, per-service baselines versioned with the code, and a central dashboard that surfaces regressions across all services in a single daily view.
Detail
In a microservices organisation, the main failure modes for performance regression detection are: each team uses a different format so results can't be compared, baselines aren't maintained so comparisons are against stale data, and there's no central view so regressions are only noticed when users complain.
Standardised schema: agree on a common output format for all performance tests (e.g., OpenMetrics or InfluxDB line protocol) and a set of mandatory metrics every service must report: p50/p95/p99 latency per endpoint, requests per second (RPS), and error rate. Teams can add further metrics but must include the standard set.
Versioned baselines: each service's repository includes a perf-baseline.json generated by the last release's performance test. The CI regression check compares the current run against this file. Regenerating the baseline is a deliberate, reviewed action.
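The CI check described above can be sketched roughly as follows. The layout of perf-baseline.json (an "endpoints" object keyed by endpoint, with p50_ms/p95_ms/p99_ms values), the perf-current.json file name, and the 10% tolerance are all assumptions for illustration; a real pipeline would choose its own.

```python
import json
from pathlib import Path

LATENCY_KEYS = ("p50_ms", "p95_ms", "p99_ms")  # hypothetical field names


def find_regressions(baseline: dict, current: dict, tolerance: float = 0.10) -> list[str]:
    """Return one message per latency metric that exceeds baseline * (1 + tolerance)."""
    problems = []
    for endpoint, base in baseline["endpoints"].items():
        cur = current["endpoints"].get(endpoint)
        if cur is None:
            problems.append(f"{endpoint}: missing from current run")
            continue
        for key in LATENCY_KEYS:
            limit = base[key] * (1 + tolerance)
            if cur[key] > limit:
                problems.append(
                    f"{endpoint}: {key} {cur[key]} exceeds limit {limit:.1f} (baseline {base[key]})"
                )
    return problems


def check_files(baseline_path: str = "perf-baseline.json",
                current_path: str = "perf-current.json") -> int:
    """Compare two result files; the return value is meant to be used as the CI exit code."""
    baseline = json.loads(Path(baseline_path).read_text())
    current = json.loads(Path(current_path).read_text())
    problems = find_regressions(baseline, current)
    for message in problems:
        print(message)
    return 1 if problems else 0
```

A CI step would call check_files() and exit with its return value, so a regression fails the build. Because the baseline file lives in the repository, updating it goes through the same review as any other change.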
Central aggregation: results flow to a shared time-series store (InfluxDB, Datadog, or a Prometheus-compatible backend fed via remote write). A Grafana dashboard shows all services' p95 trends on one screen. A daily automated report highlights any service whose 7-day average has drifted from its baseline by more than an agreed threshold.
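The daily report logic might look something like the sketch below. It assumes the daily p95 values have already been queried from the time-series store into plain lists (oldest first), and the 10% threshold and the function name drift_report are illustrative choices rather than anything prescribed here.

```python
from statistics import mean


def drift_report(daily_p95: dict[str, list[float]],
                 baseline_p95: dict[str, float],
                 threshold: float = 0.10,
                 window: int = 7) -> list[tuple[str, float]]:
    """Return (service, relative drift) pairs whose windowed average drifts past the threshold."""
    flagged = []
    for service, samples in daily_p95.items():
        recent = samples[-window:]  # most recent `window` daily values
        base = baseline_p95.get(service)
        if not recent or not base:
            # No data or no baseline: skipped in this sketch; a real report
            # should probably surface this as its own failure state.
            continue
        drift = (mean(recent) - base) / base
        # Drift in either direction is flagged; an improvement may mean the baseline is stale.
        if abs(drift) > threshold:
            flagged.append((service, drift))
    # Largest drift first, so the worst offenders lead the report.
    return sorted(flagged, key=lambda item: abs(item[1]), reverse=True)
```

Averaging over a week smooths out single noisy runs, at the cost of reacting a few days late to a sudden step change; the per-commit CI check is the faster signal.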
The hardest part is not the tooling but getting teams to maintain baselines and act on regressions. Make the failure state visible (a red service card on the shared dashboard) and assign ownership explicitly.