Q17 of 38 · Performance

How do you load test a system with heavy WebSocket or SSE traffic?

Performance · Senior · performance · websocket · sse · real-time · k6 · gatling

Short answer

Use k6's WebSocket module or Gatling for WebSocket: open many concurrent connections, send and receive messages, and measure connection-establishment time, message round-trip latency, and the concurrent-connection cap. SSE is not polling; treat it as a long-lived streaming HTTP response and track open connections and per-event delivery latency.

Detail

Long-lived connections are a different beast from request/response. The test design changes substantially.

What to measure (different metrics from REST):

  • Connection establishment time — TCP + TLS + WebSocket handshake. Often dominated by TLS.
  • Concurrent connection cap — how many open connections before something breaks (file descriptor limit, memory per connection, broker capacity).
  • Message latency — for chat/trading systems, p95/p99 from publisher to subscriber. End-to-end including the broker.
  • Broadcast fan-out — when one publish event fans out to N subscribers, how does latency change with N?
  • Reconnection storms — what happens when the server restarts and 10k clients reconnect simultaneously?
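The message-latency metric above can be reduced to a few lines of arithmetic. This is a minimal sketch, assuming each message carries a publisher-side send timestamp and that publisher and subscriber clocks are comparable (for example, both run inside the same load generator). The function names and the nearest-rank percentile convention are illustrative choices, not something a particular tool prescribes.

```python
import math

def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

def latency_ms(sent_at, received_at):
    """End-to-end latency in milliseconds from two epoch-second timestamps."""
    return (received_at - sent_at) * 1000.0

def summarize(latencies_ms):
    """Collapse raw per-message latencies into the figures worth reporting."""
    return {
        "count": len(latencies_ms),
        "p50": percentile(latencies_ms, 50),
        "p95": percentile(latencies_ms, 95),
        "p99": percentile(latencies_ms, 99),
        "max": max(latencies_ms),
    }

if __name__ == "__main__":
    # Synthetic latencies of 1..100 ms, standing in for real measurements.
    samples = [float(i) for i in range(1, 101)]
    print(summarize(samples))
```

Reporting the maximum alongside p95/p99 is deliberate: in fan-out tests a handful of very late deliveries often points at one stalled subscriber rather than general slowness.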

Tools:

  • k6/ws: opens a WebSocket from a VU and suits simple scripted send/receive flows. The connection blocks the VU until it closes, so you need roughly one VU per connection.
  • xk6-websockets (shipped in k6 as k6/experimental/websockets): newer implementation modelled on the browser WebSocket API. It runs on the VU's event loop, so a single VU can hold several connections.
  • Gatling: asynchronous, non-blocking engine built on Netty, which makes it good at long-lived connections; idiomatic for WebSocket fan-out tests.
  • Custom Locust scenarios with gevent: flexible if you need application-level message correlation.
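The practical difference between "one connection blocks one VU" and an event-loop client is how many sockets a single worker can hold. Below is a tool-agnostic sketch of the event-loop model using only Python's `asyncio`. `fake_connect` is a hypothetical stand-in for a real TCP + TLS + WebSocket handshake (you would swap in an actual client library), and `open_connections` and its parameters are illustrative names.

```python
import asyncio
import time

async def fake_connect(client_id):
    """Hypothetical stand-in for a real handshake; just yields to the loop."""
    await asyncio.sleep(0.001)
    return client_id

async def open_connections(total, max_in_flight, connect=fake_connect):
    """Open `total` connections from one event loop, with at most
    `max_in_flight` handshakes in progress at any moment.
    Returns one handshake duration (seconds) per connection."""
    gate = asyncio.Semaphore(max_in_flight)
    durations = []

    async def one(client_id):
        async with gate:
            start = time.perf_counter()
            await connect(client_id)
            durations.append(time.perf_counter() - start)

    await asyncio.gather(*(one(i) for i in range(total)))
    return durations

if __name__ == "__main__":
    result = asyncio.run(open_connections(total=1000, max_in_flight=100))
    slowest_ms = max(result) * 1000
    print(f"opened {len(result)} connections, slowest handshake {slowest_ms:.1f} ms")
```

Capping in-flight handshakes keeps the load generator itself from becoming the bottleneck; raise the cap deliberately when the goal is a connection storm.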

Test scenarios:

  1. Steady-state subscribers — N clients connected for an hour, baseline latency and stability.
  2. Connection storm — 10k clients connect within 30s, measure handshake p95 and broker CPU.
  3. Reconnect storm — kill the server, watch reconnect behaviour. Backoff right? Thundering herd?
  4. Message broadcast at scale — publish 1k msg/s with 10k subscribers, measure latency distribution.
  5. Subscriber slowness — what happens when one subscriber stalls? Does the broker buffer? Drop? Kill the connection?
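For the reconnect-storm scenario, it helps to know what well-behaved clients look like before judging the result. The sketch below simulates one server restart and compares clients that all wait the same fixed delay with clients that use exponential backoff with full jitter. Full jitter is one common strategy, assumed here for illustration; your clients may do something else. `base`, `cap`, `attempt`, and the function names are hypothetical.

```python
import random

def backoff_delay(attempt, base=1.0, cap=30.0, rng=random):
    """Full-jitter backoff: uniform between 0 and min(cap, base * 2**attempt)."""
    ceiling = min(cap, base * (2 ** attempt))
    return rng.uniform(0.0, ceiling)

def fixed_delay(attempt, base=1.0, cap=30.0, rng=random):
    """No jitter: every client waits exactly the same time (thundering herd)."""
    return min(cap, base * (2 ** attempt))

def peak_reconnects_per_second(clients, delay_fn, attempt=3, seed=42):
    """Simulate `clients` reconnecting after one restart and return the
    largest number of reconnect attempts landing in any 1-second bucket."""
    rng = random.Random(seed)
    buckets = {}
    for _ in range(clients):
        second = int(delay_fn(attempt, rng=rng))
        buckets[second] = buckets.get(second, 0) + 1
    return max(buckets.values())

if __name__ == "__main__":
    print("no jitter  :", peak_reconnects_per_second(10_000, fixed_delay))
    print("full jitter:", peak_reconnects_per_second(10_000, backoff_delay))
```

In a real test the equivalent observation is the server-side connection-accept rate after the restart: a single sharp spike suggests missing jitter, while a spread-out ramp suggests the backoff is doing its job.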

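SSE (Server-Sent Events) is simpler: a single long-lived HTTP response streamed as text/event-stream, unidirectional server→client. It is not long-polling, because the response stays open and events arrive on it as they are published. Test concurrent open connections and per-event delivery time. Tools without first-class SSE support can work, provided they read the response body incrementally instead of buffering it until the stream closes.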
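Per-event delivery time for SSE can be measured by parsing the stream incrementally and timestamping each event as it completes. This is a minimal sketch of the event-stream framing (an event ends at a blank line, `data:` lines accumulate, lines starting with `:` are comments). It assumes the test server embeds a send timestamp in a JSON payload field called `sent_at`, which is a hypothetical convention for the test and not part of SSE. Only the `data`, `event`, and `id` fields are handled.

```python
import json
import time

def parse_sse(lines):
    """Yield one dict per SSE event from an iterable of decoded text lines."""
    event, data, event_id = "message", [], None
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the event, if any data was collected.
            if data:
                yield {"event": event, "id": event_id, "data": "\n".join(data)}
            event, data = "message", []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space is not part of the value
        if field == "data":
            data.append(value)
        elif field == "event":
            event = value
        elif field == "id":
            event_id = value  # the last seen id carries over to later events

def delivery_latencies_ms(lines, clock=time.time):
    """Latency in ms per event, assuming a JSON payload with `sent_at` (epoch seconds)."""
    latencies = []
    for ev in parse_sse(lines):
        payload = json.loads(ev["data"])
        latencies.append((clock() - payload["sent_at"]) * 1000.0)
    return latencies

if __name__ == "__main__":
    stream = [": keep-alive\n", "event: tick\n",
              'data: {"sent_at": %f}\n' % time.time(), "\n"]
    print(delivery_latencies_ms(stream))
```

With a real HTTP client, `lines` would be the client's streaming line iterator over the open response, so each latency is recorded at the moment the event arrives rather than after the stream closes.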

// WHAT INTERVIEWERS LOOK FOR

Awareness that long-lived connections need different metrics, knowledge of relevant tools, and senior scenarios like reconnect storms and slow-subscriber backpressure.

// COMMON PITFALL

Measuring only handshake latency — the system can establish 50k connections fine but collapse under broadcast load. Steady-state and broadcast tests are where production failures hide.