Executors — Per VU, Constant Rate, Ramping — Performance Testing with K6

Every K6 scenario runs on an executor that controls how VUs and iterations are scheduled. Choosing the wrong executor is a common source of misinterpreted load test results. This lesson covers the six executors you configure declaratively in a script (K6 also ships a seventh, externally-controlled, which is driven at runtime and is not covered here), when to use each, and the fundamental distinction between VU-based and arrival-rate-based load models.

The six executors

K6 executor types

VU-based executors

shared-iterations: N total iterations split across VUs (work queue)
per-vu-iterations: each VU runs N iterations independently
constant-vus: N VUs run for a fixed duration
ramping-vus: VU count changes over time via stages
Throughput depends on response time — if server slows, RPS drops
Best for: simulating concurrent users with think time

Arrival-rate executors

constant-arrival-rate: fixed number of iterations started per time unit
ramping-arrival-rate: iteration start rate changes over time via stages
Throughput is independent of response time (as long as enough VUs are available)
K6 spawns more VUs if needed to sustain the rate (up to maxVUs)
If server slows, VUs accumulate — just like real traffic
Best for: API throughput targets, open workload models

shared-iterations

N total iterations distributed across VUs like a work queue. The scenario ends when all iterations are consumed or when maxDuration elapses, whichever comes first.

scenarios: {
  default: {
    executor: 'shared-iterations',
    vus: 10,
    iterations: 100,
    maxDuration: '5m',   // safety timeout
  },
},

Use case: seeding a database with exactly 100 records, running a fixed benchmark, or generating a known number of test transactions. Each VU picks up the next available iteration — faster VUs complete more iterations.
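
The work-queue behaviour can be illustrated with a small simulation in plain JavaScript. This is a simplified model, not K6's internal scheduler: the function name simulateSharedIterations and the per-VU iteration times are hypothetical, chosen only to show why faster VUs end up completing more iterations.

```javascript
// Simplified model of a shared-iterations work queue (not K6 internals).
// iterationSeconds[i] is how long VU i needs per iteration (hypothetical values).
function simulateSharedIterations(iterationSeconds, totalIterations) {
  const completed = iterationSeconds.map(() => 0);
  const freeAt = iterationSeconds.map(() => 0); // time at which each VU is next idle

  for (let i = 0; i < totalIterations; i++) {
    // The next iteration goes to whichever VU becomes idle first
    // (the lowest index wins ties).
    let vu = 0;
    for (let v = 1; v < freeAt.length; v++) {
      if (freeAt[v] < freeAt[vu]) vu = v;
    }
    completed[vu] += 1;
    freeAt[vu] += iterationSeconds[vu];
  }
  return completed;
}

// Two VUs: the first takes 1 s per iteration, the second takes 2 s.
console.log(simulateSharedIterations([1, 2], 9)); // [ 6, 3 ]
```

The total is always the configured iteration count; only the split between VUs varies.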

per-vu-iterations

Each VU runs exactly N iterations. Total iterations = VUs × iterations.

scenarios: {
  default: {
    executor: 'per-vu-iterations',
    vus: 10,
    iterations: 50,
    maxDuration: '5m',
  },
},

With vus: 10, iterations: 50, each VU runs 50 iterations, for 500 total. Compare with shared-iterations, where vus: 10, iterations: 100 yields 100 iterations in total (10 per VU on average).
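
The arithmetic for the two iteration-based executors can be written down as a tiny helper. The function totalIterations is a hypothetical name used for illustration only; it is not part of the K6 API.

```javascript
// Total iterations implied by an iteration-based executor config.
// Hypothetical helper for illustration; not a K6 function.
function totalIterations(executor, vus, iterations) {
  switch (executor) {
    case 'shared-iterations':
      return iterations; // one pool shared by all VUs
    case 'per-vu-iterations':
      return vus * iterations; // every VU runs the full count
    default:
      throw new Error(`not an iteration-based executor: ${executor}`);
  }
}

console.log(totalIterations('per-vu-iterations', 10, 50)); // 500
console.log(totalIterations('shared-iterations', 10, 100)); // 100
```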

Use case: smoke tests where you want a known number of requests per user, integration verification with fixed sample sizes.

constant-vus

The default when you specify vus and duration at the top level. N VUs run the default function continuously for the specified duration.

scenarios: {
  default: {
    executor: 'constant-vus',
    vus: 50,
    duration: '10m',
  },
},

Equivalent to:

export const options = { vus: 50, duration: '10m' };

Use case: steady-state load testing at a known VU count.

ramping-vus

VU count changes over time using stages. This is the executor used when you specify stages at the top level.

scenarios: {
  default: {
    executor: 'ramping-vus',
    startVUs: 0,
    stages: [
      { duration: '3m', target: 100 },
      { duration: '5m', target: 100 },
      { duration: '2m', target: 0 },
    ],
    gracefulRampDown: '30s',
  },
},

gracefulRampDown gives in-flight iterations time to complete when VUs are being removed. It defaults to 30s; an iteration still running when that period ends is interrupted, and setting it to '0s' interrupts iterations as soon as their VU is removed.
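
To see what a stages array means in terms of VU count over time, the sketch below interpolates linearly between stage targets. It is an approximation for reasoning about a test plan, not K6's scheduler: targetVUsAt is a hypothetical helper, and stage durations are given in seconds here (an assumption for simplicity; K6 itself takes strings such as '3m').

```javascript
// Approximate target VU count at a point in time for a ramping-vus plan.
// Hypothetical helper; stage durations are in seconds for simplicity.
function targetVUsAt(startVUs, stages, tSeconds) {
  let from = startVUs;
  let stageStart = 0;
  for (const { seconds, target } of stages) {
    const stageEnd = stageStart + seconds;
    if (tSeconds <= stageEnd) {
      if (seconds === 0) return target; // instantaneous jump
      const progress = (tSeconds - stageStart) / seconds;
      return Math.round(from + (target - from) * progress);
    }
    from = target;
    stageStart = stageEnd;
  }
  return from; // after the last stage the count stays at its final target
}

// The same plan as above: 3m ramp to 100, 5m hold, 2m ramp to 0.
const stages = [
  { seconds: 180, target: 100 },
  { seconds: 300, target: 100 },
  { seconds: 120, target: 0 },
];

console.log(targetVUsAt(0, stages, 90)); // 50 (halfway up the first ramp)
console.log(targetVUsAt(0, stages, 300)); // 100 (hold)
console.log(targetVUsAt(0, stages, 540)); // 50 (halfway down the last ramp)
```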

constant-arrival-rate

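Starts a fixed number of iterations per time unit, independent of response time. With one request per iteration, the rate equals requests per second. If iterations slow down, K6 allocates more VUs (up to maxVUs) to maintain the rate; once maxVUs is reached, iterations that cannot start are counted in the dropped_iterations metric.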

scenarios: {
  default: {
    executor: 'constant-arrival-rate',
    rate: 200,
    timeUnit: '1s',        // 200 iterations started per second
    duration: '10m',
    preAllocatedVUs: 100,  // VU pool size — allocate enough to sustain the rate
    maxVUs: 500,           // hard cap on VU spawning
  },
},

The closed vs open workload model distinction:

With VU-based executors, when the server slows down, VUs spend more time waiting — which inadvertently reduces the request rate. The slow server gets less traffic, which may make it appear to recover. This is a closed workload model — VU count is fixed, RPS is variable.

With constant-arrival-rate, the rate is maintained regardless of latency. When the server slows, K6 spawns more VUs to sustain the rate. VUs accumulate waiting. This is an open workload model — RPS is fixed, VU count is variable. Real production traffic follows the open model: users do not stop making requests because your server is slow.
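
A rough way to size preAllocatedVUs and maxVUs for an open model is Little's law: concurrent VUs is approximately the arrival rate multiplied by the time each iteration takes. The sketch below applies that estimate; requiredVUs and sustainableRate are hypothetical helper names, and the iteration durations are illustrative values, not measurements.

```javascript
// Little's law: concurrent VUs ~ arrival rate x time each iteration takes.
// Hypothetical helpers for planning; not part of the K6 API.
function requiredVUs(ratePerSecond, iterationSeconds) {
  return Math.ceil(ratePerSecond * iterationSeconds);
}

// Highest rate a capped VU pool can sustain at a given iteration duration.
function sustainableRate(ratePerSecond, iterationSeconds, maxVUs) {
  return Math.min(ratePerSecond, maxVUs / iterationSeconds);
}

console.log(requiredVUs(200, 0.25)); // 50  (fast server)
console.log(requiredVUs(200, 2)); // 400 (slow server, same target rate)
console.log(sustainableRate(200, 2, 500)); // 200 (the cap is high enough)
console.log(sustainableRate(200, 4, 500)); // 125 (the cap limits the rate)
```

When the sustainable rate falls below the target, the executor can no longer behave as an open model, which is itself a useful signal about the system under test.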

Use case: API throughput testing ("can this endpoint handle 500 RPS?"), SLA verification against rate requirements, realistic traffic modelling for high-throughput APIs.

ramping-arrival-rate

Like ramping-vus but for arrival rate: the iteration start rate changes over time via stages.

scenarios: {
  default: {
    executor: 'ramping-arrival-rate',
    startRate: 10,
    timeUnit: '1s',
    stages: [
      { duration: '2m', target: 100 },   // ramp from 10 to 100 RPS
      { duration: '5m', target: 100 },   // hold at 100 RPS
      { duration: '2m', target: 200 },   // ramp to 200 RPS
      { duration: '5m', target: 200 },   // hold at 200 RPS
      { duration: '2m', target: 0 },     // ramp down
    ],
    preAllocatedVUs: 50,
    maxVUs: 300,
  },
},

Use case: stress testing with a rate-based model, finding the maximum sustainable RPS before the system degrades.
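
As a sanity check on a rate plan, you can estimate how many iterations it should start in total by summing the area under the rate curve, assuming linear ramps between targets. The helper expectedIterations is hypothetical, and durations are expressed in seconds here for simplicity (K6 itself takes strings such as '2m').

```javascript
// Expected number of started iterations for a ramping-arrival-rate plan,
// assuming linear ramps and a timeUnit of one second.
// Hypothetical helper for illustration; not part of the K6 API.
function expectedIterations(startRate, stages) {
  let from = startRate;
  let total = 0;
  for (const { seconds, target } of stages) {
    total += ((from + target) / 2) * seconds; // area of one trapezoid
    from = target;
  }
  return total;
}

// The same plan as above, starting at 10 iterations per second.
const stages = [
  { seconds: 120, target: 100 },
  { seconds: 300, target: 100 },
  { seconds: 120, target: 200 },
  { seconds: 300, target: 200 },
  { seconds: 120, target: 0 },
];

console.log(expectedIterations(10, stages)); // 126600
```

If the iteration count reported at the end of a run is well below this estimate, the gap is likely to show up as dropped iterations.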

Choosing between VU-based and arrival-rate-based

The practical decision:

Question	Use
"How does the system behave with N concurrent users?"	`constant-vus` or `ramping-vus`
"Can the system sustain N requests per second?"	`constant-arrival-rate`
"What is the maximum RPS before degradation?"	`ramping-arrival-rate`
"Run exactly N transactions total"	`shared-iterations` or `per-vu-iterations`
"Simulate N users, each doing M actions"	`per-vu-iterations`

VU-based tests naturally model interactive users with think time. Arrival-rate-based tests model API traffic, integrations, or mobile clients that have their own retry logic.

⚠️ Common mistakes

Confusing shared-iterations with per-vu-iterations. shared-iterations: vus: 10, iterations: 100 runs 100 total iterations (about 10 per VU). per-vu-iterations: vus: 10, iterations: 100 runs 1,000 total (100 per VU). The naming is explicit but the difference is easy to miss.
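Leaving maxVUs unset or too low on arrival-rate executors. If maxVUs is not set it defaults to preAllocatedVUs, so K6 cannot add VUs when the server slows down; iterations that cannot start are dropped (reported as dropped_iterations) and the actual rate falls below the target. Set maxVUs to a reasonable cap above preAllocatedVUs. If you need 1,000 VUs to hit 200 RPS, your server is too slow to sustain the rate, and the test results tell you that.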
Using constant-vus when you need constant-arrival-rate. If your SLA is "handle 500 RPS" and you test with 100 VUs each making 5 requests/second, you get 500 RPS — but only while the server is fast. If the server slows to 2 requests/second per VU, you are only sending 200 RPS and not testing the SLA. Use constant-arrival-rate for rate-based SLAs.
Misunderstanding gracefulRampDown on ramping-vus. When a stage reduces VU count, K6 waits up to gracefulRampDown (30s by default) for each removed VU's current iteration to finish, then interrupts it. If your iterations can run longer than that, raise the value; setting it to '0s' terminates in-flight iterations immediately.
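
The numbers in the constant-vus mistake follow from the closed-model relationship: each VU starts a new iteration only after the previous one finishes, so throughput is VU count divided by iteration time. A minimal sketch, with closedModelRps as a hypothetical helper name:

```javascript
// Closed model: a VU starts its next iteration only after the previous one ends,
// so throughput falls as iterations take longer.
// Hypothetical helper for illustration; not part of the K6 API.
function closedModelRps(vus, iterationSeconds, requestsPerIteration = 1) {
  return (vus * requestsPerIteration) / iterationSeconds;
}

// 100 VUs, 0.2 s per iteration: each VU manages 5 requests per second.
console.log(Math.round(closedModelRps(100, 0.2))); // 500
// The server slows to 0.5 s per iteration: each VU manages 2 requests per second.
console.log(Math.round(closedModelRps(100, 0.5))); // 200
```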

🎯 Practice task

Compare executor behaviour by measuring throughput under different configurations. 40 minutes.

Use https://httpbin.org/get — it has a consistent, fast response time for executor experiments.

Write a script with a constant-vus scenario: 20 VUs, 1 minute. Add sleep(0.5) in the function. Note the http_reqs count and rate in the output.
Change to constant-arrival-rate: rate: 40, timeUnit: '1s', 1 minute, preAllocatedVUs: 20, maxVUs: 100. Run again. The request rate should be approximately constant at 40 RPS regardless of response time.
Add sleep(2) instead of sleep(0.5) in the VU function. Run the constant-vus scenario and observe how the request rate drops. Run constant-arrival-rate and observe that it still targets 40 RPS by using more VUs (roughly 40 × iteration duration, so a little over 80; if dropped_iterations shows up in the summary, raise maxVUs). This demonstrates the open vs closed workload distinction.
Write a per-vu-iterations scenario: vus: 5, iterations: 10. Add a console.log showing __VU and __ITER. Run and confirm you see exactly 50 log lines (5 VUs × 10 iterations each).
Change to shared-iterations: vus: 5, iterations: 10. Run again. You should now see only 10 total log lines, possibly distributed unevenly among the 5 VUs (faster VUs pick up more iterations).