Q17 of 38 · Performance
How do you load test a system with heavy WebSocket or SSE traffic?
Performance · Senior · performance · websocket · sse · real-time · k6 · gatling
Short answer
Use k6's WebSocket support or Gatling for WebSocket: open many concurrent connections, send and receive messages, and measure connection-establishment time, message round-trip latency, and the concurrent-connection cap. SSE is a single long-lived HTTP response, so keep the stream open and track open connections and per-event latency.
Detail
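Long-lived connections behave differently from request/response: the cost lies in holding connections open and delivering messages over them, not in per-request throughput, so the test design changes substantially.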
What to measure (different metrics from REST):
- Connection establishment time — TCP + TLS + WebSocket handshake. Often dominated by TLS.
- Concurrent connection cap — how many open connections before something breaks (file descriptor limit, memory per connection, broker capacity).
- Message latency — for chat/trading systems, p95/p99 from publisher to subscriber. End-to-end including the broker.
- Broadcast fan-out — when one publish event fans out to N subscribers, how does latency change with N?
- Reconnection storms — what happens when the server restarts and 10k clients reconnect simultaneously?
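One way to get publisher-to-subscriber latency is to embed a send timestamp in each message and compute percentiles on the subscriber side. The sketch below assumes JSON messages, a hypothetical `sent_ns` field, and clocks that are comparable (same host or tightly synchronised); none of these come from a specific tool.

```python
import json
import math
import time


def stamp(payload: dict) -> str:
    """Publisher side: embed the send time (nanoseconds) in the message body."""
    return json.dumps({**payload, "sent_ns": time.time_ns()})


def latency_ms(raw: str, received_ns: int) -> float:
    """Subscriber side: end-to-end latency in milliseconds for one message."""
    return (received_ns - json.loads(raw)["sent_ns"]) / 1e6


def percentile(samples: list, p: float) -> float:
    """Nearest-rank percentile; p is in (0, 100]."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]
```

Collect `latency_ms(...)` for every received message during the run, then report `percentile(samples, 95)` and `percentile(samples, 99)` instead of the mean, since tail latency is what subscribers notice.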
Tools:
- k6 ws (the k6/ws module): opens a WebSocket from a VU; good for RPS-style tests. The connection blocks the VU, so you need many VUs for many connections.
- k6 xk6-websockets: newer extension (also shipped with k6 as k6/experimental/websockets) that handles WebSockets from a single VU more efficiently.
- Gatling: built on an asynchronous, non-blocking engine (originally Akka-based), great at long-lived async; idiomatic for WebSocket fan-out tests.
- Custom Locust scenarios with gevent: flexible if you need application-level message correlation.
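Whichever tool you pick, the core loop is the same: ramp up connections, time each handshake, hold the connections open, and count failures. The following is a tool-agnostic sketch using Python's asyncio rather than k6, Gatling, or Locust; `hold_connections` and `fake_connect` are hypothetical names, and `fake_connect` only simulates a handshake.

```python
import asyncio
import time


async def hold_connections(connect, total, ramp_seconds, hold_seconds):
    """Open `total` connections spread evenly over `ramp_seconds`, hold each
    for `hold_seconds`, and return (handshake times in seconds, failures)."""
    handshakes = []
    failures = 0

    async def client(index):
        nonlocal failures
        # Stagger start times so the ramp is even rather than one burst.
        await asyncio.sleep(ramp_seconds * index / total)
        started = time.perf_counter()
        try:
            await connect(index)
        except Exception:
            failures += 1
            return
        handshakes.append(time.perf_counter() - started)
        # A real client would read and timestamp messages here.
        await asyncio.sleep(hold_seconds)

    await asyncio.gather(*(client(i) for i in range(total)))
    return handshakes, failures


async def fake_connect(index):
    """Stand-in for a real handshake; every 50th attempt fails."""
    await asyncio.sleep(0.001)
    if index % 50 == 49:
        raise ConnectionError("simulated refusal")
    return object()
```

To use it against a real server, replace `fake_connect` with a coroutine that performs the handshake through a WebSocket client library of your choice, then raise `total` step by step until failures or handshake times climb; that knee is the concurrent-connection cap.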
Test scenarios:
- Steady-state subscribers — N clients connected for an hour, baseline latency and stability.
- Connection storm — 10k clients connect within 30s, measure handshake p95 and broker CPU.
- Reconnect storm — kill the server, watch reconnect behaviour. Backoff right? Thundering herd?
- Message broadcast at scale — publish 1k msg/s with 10k subscribers, measure latency distribution.
- Subscriber slowness — what happens when one subscriber stalls? Does the broker buffer? Drop? Kill the connection?
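For the reconnect-storm scenario, the question is whether clients spread their retries out. A small simulation makes the difference visible; this is a sketch assuming exponential backoff with full jitter, with illustrative defaults (1 s base, 30 s cap) and hypothetical function names.

```python
import random
from collections import Counter


def backoff_delay(attempt, rng, base=1.0, cap=30.0):
    """Exponential backoff with full jitter:
    uniform in [0, min(cap, base * 2**attempt)] seconds."""
    return rng.uniform(0.0, min(cap, base * 2 ** attempt))


def peak_reconnects_per_second(clients, attempt, jitter, seed=0,
                               base=1.0, cap=30.0):
    """Largest number of clients that reconnect within the same second."""
    rng = random.Random(seed)
    fixed = min(cap, base * 2 ** attempt)  # delay every client uses without jitter
    delays = [
        backoff_delay(attempt, rng, base, cap) if jitter else fixed
        for _ in range(clients)
    ]
    return max(Counter(int(d) for d in delays).values())
```

Without jitter, all 10k clients land in the same second (the thundering herd); with full jitter on the third retry they spread across roughly eight seconds. In a real test, compare the server's observed reconnect rate against a model like this to check that client backoff is actually in effect.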
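SSE (Server-Sent Events) is simpler: a single long-lived HTTP response (Content-Type text/event-stream) that streams events one way, server→client. It is not long-polling, because the connection stays open across events. Test concurrent open connections and per-event delivery time. Tools without first-class SSE support work fine as long as they read the response incrementally and don't close the stream.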
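If your tool has no SSE support, parsing the stream yourself is a few lines. The sketch below is a simplified `text/event-stream` parser (it handles `data`, `event`, `id`, and comment lines, and ignores `retry` and last-event-ID persistence); the `sent_ms` field used for delivery time is an assumption about the payload, not part of SSE.

```python
import json


def parse_sse(lines):
    """Yield one dict per event from an iterable of text/event-stream lines."""
    event, data = {}, []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the event, if it carried any data.
            if data:
                event["data"] = "\n".join(data)
                yield event
            event, data = {}, []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]
            if field == "data":
                data.append(value)
            elif field in ("event", "id"):
                event[field] = value


def delivery_ms(event, received_ms):
    """Per-event delivery time, assuming the server put `sent_ms` in the JSON data."""
    return received_ms - float(json.loads(event["data"])["sent_ms"])
```

Feed `parse_sse` the decoded lines from any HTTP client that streams the response body, record the receive time as each event is yielded, and aggregate `delivery_ms` the same way as WebSocket message latency.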
// WHAT INTERVIEWERS LOOK FOR
Awareness that long-lived connections need different metrics, knowledge of the relevant tools, and senior-level scenarios such as reconnect storms and slow-subscriber backpressure.
// COMMON PITFALL
Measuring only handshake latency — the system can establish 50k connections fine but collapse under broadcast load. Steady-state and broadcast tests are where production failures hide.