⚡ Performance QA mock interview
Set a timer, work through each round out loud, then score your answers against the rubric. No one is listening — the goal is honest self-assessment, not a perfect performance.
// ROUND STRUCTURE
- 1. Warm-up (5 min)
Background, tools used, and types of performance tests you have run.
- 2. Test type and design (12 min)
Test type taxonomy, NFR and SLA definition, workload modelling from real traffic data.
- 3. Tooling and scripting (12 min)
k6 or JMeter, parameterisation, correlation, assertions, CI integration.
- 4. Results analysis (10 min)
Key metrics, reading a results chart, bottleneck isolation across infra layers.
- 5. Wrap-up (6 min)
Candidate questions and interviewer summary.
// INTERVIEW QUESTIONS
- 01
Explain the difference between load testing, stress testing, spike testing, and soak testing. When do you use each?
- 02
How do you define NFRs and SLAs for a new feature? What data sources do you use and who do you involve?
- 03
How do you model the workload for a load test — concurrency, throughput, think time, and ramp-up profile?
- 04
Walk me through how you would write a k6 or JMeter test for a login-then-browse-to-product-page flow.
- 05
What is correlation in performance testing, when is it required, and what happens if you skip it?
- 06
A load test shows response time increasing steadily as concurrent users increase from 100 to 500. How do you identify which layer is the bottleneck?
- 07
What is the difference between response time, throughput, and error rate? How do you use each metric to evaluate a test result?
- 08
You run a load test at 500 concurrent users and see a 2% error rate and p99 response time of 4.2 seconds. What is your verdict?
- 09
How do you communicate performance test results to a non-technical product stakeholder?
// EXPECTED ANSWER POINTS
Compare your answers to these points — one per question, in order.
Load test: sustained realistic user load at or above expected peak — confirms the system meets NFRs under normal conditions. Stress test: pushes beyond peak to find the breaking point and observe degradation behaviour. Spike test: sudden sharp traffic increase then return to baseline — tests elasticity, recovery time, and whether the system sheds or queues excess traffic gracefully. Soak/endurance test: sustained moderate load for hours or days — reveals memory leaks, connection pool exhaustion, and gradual degradation that only appear over time. Use spike testing before high-traffic events (product launches, sales); soak testing before long-running production deployments.
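The four test types differ mainly in the shape of the load profile. A minimal sketch of how they might look as k6-style `stages` arrays, assuming an expected peak of 500 virtual users; `PEAK_VUS`, the helper names, and every duration and multiplier are illustrative assumptions, not recommendations:

```javascript
// Illustrative k6-style stage profiles for the four test types.
// PEAK_VUS and all durations and multipliers are assumptions for this sketch.
const PEAK_VUS = 500;
const share = (fraction) => Math.round(PEAK_VUS * fraction);

const profiles = {
  // Load: ramp to expected peak, hold, ramp down.
  load: [
    { duration: '5m', target: PEAK_VUS },
    { duration: '30m', target: PEAK_VUS },
    { duration: '5m', target: 0 },
  ],
  // Stress: keep stepping past peak to find where the system degrades.
  stress: [
    { duration: '10m', target: PEAK_VUS },
    { duration: '10m', target: share(2) },
    { duration: '10m', target: share(3) },
    { duration: '5m', target: 0 },
  ],
  // Spike: jump from a low baseline to well above peak, then drop back
  // and keep running so recovery can be observed.
  spike: [
    { duration: '2m', target: share(0.2) },
    { duration: '30s', target: share(3) },
    { duration: '3m', target: share(3) },
    { duration: '30s', target: share(0.2) },
    { duration: '5m', target: share(0.2) },
  ],
  // Soak: moderate load held for hours to expose leaks and slow degradation.
  soak: [
    { duration: '10m', target: share(0.6) },
    { duration: '8h', target: share(0.6) },
    { duration: '10m', target: 0 },
  ],
};

// Highest virtual user count a profile reaches.
const peakOf = (stages) => Math.max(...stages.map((s) => s.target));
```

In a k6 script, one profile would be selected with something like `export const options = { stages: profiles.spike };`.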
NFR sources: involve product (business SLA commitments to customers), engineering (technical constraints and infrastructure budget), and operations (on-call thresholds that trigger alerts). Data sources: APM tool baselines (Datadog, New Relic: what does p95 response time look like today?), access logs for peak TPS, published user research (a widely cited Google benchmark: 53% of mobile visits are abandoned when a page takes longer than 3 seconds to load), and contractual SLAs with enterprise customers. Typical NFR set: p95 response time threshold, maximum acceptable error rate under load, minimum throughput at peak, CPU and memory headroom limits.
Workload model components: (a) Concurrency — simultaneous virtual users, derived from peak concurrent session count in analytics. (b) Throughput — requests per second, derived from access log analysis at peak hour. (c) Think time — pause between requests per user, models realistic browsing behaviour; without it, the test generates artificially high throughput. (d) Ramp-up — gradual increase to the target user count, avoids an artificial spike at test start that skews early results. Base the model on real traffic: pull peak-hour concurrent users from Google Analytics or your APM tool, add 20-30% headroom above measured peak.
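Concurrency, throughput, and think time are tied together by Little's Law: for a closed workload, virtual users = throughput x (response time + think time). The sketch below applies it, assuming each virtual user sends one request, waits for the response, then pauses for the think time; the function name, the sample figures, and the 25% default headroom are illustrative assumptions:

```javascript
// Little's Law for a closed workload: N = X * (R + Z)
//   N = concurrent virtual users, X = throughput (requests/second),
//   R = average response time (seconds), Z = think time (seconds).
function requiredVirtualUsers(peakRps, avgResponseSec, thinkTimeSec, headroom = 0.25) {
  const atMeasuredPeak = peakRps * (avgResponseSec + thinkTimeSec);
  return {
    atMeasuredPeak: Math.ceil(atMeasuredPeak),
    // Headroom above measured peak (the text suggests 20-30%).
    withHeadroom: Math.ceil(atMeasuredPeak * (1 + headroom)),
  };
}

// Hypothetical figures: 50 req/s at peak hour, 0.5 s responses, 2 s think time.
const model = requiredVirtualUsers(50, 0.5, 2);
console.log(model);
```

The same formula shows why dropping think time inflates throughput: with Z = 0, the same number of users generates several times the request rate.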
k6 structure: the default function executes the flow once per iteration for each virtual user. Load credentials once in the init context with open('users.csv') wrapped in a SharedArray, then POST /login with one row per user. Extract the auth token from the response JSON using response.json('token'). GET /products with params.headers['Authorization'] = 'Bearer ' + token. Add sleep(randomIntBetween(1, 3)) for think time (randomIntBetween comes from the k6-utils jslib, not from k6 core). Assert with check(): check(res, { 'status 200': (r) => r.status === 200, 'response time OK': (r) => r.timings.duration < 500 }). Note that check() only records pass or fail; add thresholds in options if a breach should fail the run. Configure stages in options: {stages:[{duration:'2m',target:100},{duration:'5m',target:100},{duration:'2m',target:0}]}. JMeter equivalent: Thread Group -> CSV Data Set Config -> HTTP Request (login) -> JSON Extractor (token) -> HTTP Header Manager (Bearer token) -> HTTP Request (products) -> Response Assertion.
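The k6 flow described above could be sketched as a single function. To keep it runnable outside k6, the k6 modules (`http`, `check`, `sleep`) are passed in as arguments rather than imported; that injection pattern, `BASE_URL`, the endpoint paths, the JSON login body, and the `token` field name are all assumptions for illustration:

```javascript
// Sketch of the login-then-browse flow. The k6 modules are injected so the
// same function can run inside k6 or against a stub.
// BASE_URL, the paths, and the 'token' field are hypothetical.
const BASE_URL = 'https://test.example.com';

function loginThenBrowse({ http, check, sleep }, user) {
  // Step 1: log in with parameterised credentials (one CSV row per user).
  const loginRes = http.post(
    `${BASE_URL}/login`,
    JSON.stringify({ username: user.username, password: user.password }),
    { headers: { 'Content-Type': 'application/json' } }
  );
  check(loginRes, { 'login status 200': (r) => r.status === 200 });

  // Step 2: correlation. Pull the dynamic token out of this response
  // instead of replaying a recorded one.
  const token = loginRes.json('token');

  // Think time between steps (fixed here; randomise it in a real script).
  sleep(2);

  // Step 3: browse to the product page with the extracted token.
  const productRes = http.get(`${BASE_URL}/products`, {
    headers: { Authorization: `Bearer ${token}` },
  });
  check(productRes, {
    'status 200': (r) => r.status === 200,
    'response time OK': (r) => r.timings.duration < 500,
  });
  return token;
}
```

Inside a real k6 script, the default function would import `http` from `k6/http` and `check` and `sleep` from `k6`, then call `loginThenBrowse({ http, check, sleep }, user)` with a row taken from the SharedArray.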
Correlation: extracting a dynamic server-generated value from one response and injecting it into the next request. Required for session tokens, CSRF tokens, ViewState parameters, or server-generated IDs that change per session. Without correlation, the script replays the static value recorded during script creation — the server has already invalidated it, causing authentication failures, inflated error rates, and misleading results. Identify correlation candidates by running the recorded script twice and comparing requests: any value that differs between runs is a correlation candidate. In k6: parse from response.json() or response.headers. In JMeter: use a Regular Expression Extractor or JSON Extractor post-processor.
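The "record twice and compare" technique can be automated with a simple diff. A minimal sketch, assuming each recording has already been flattened into an object of parameter names and values; the function name and sample data are hypothetical:

```javascript
// Compare the request parameters captured in two recordings of the same flow.
// Any parameter whose value differs between runs is a correlation candidate.
function findCorrelationCandidates(firstRun, secondRun) {
  return Object.keys(firstRun).filter(
    (name) => name in secondRun && firstRun[name] !== secondRun[name]
  );
}

// Hypothetical values captured from two recordings of a checkout request.
const run1 = { csrfToken: 'a81f', sessionId: 'S-1001', page: '1', sku: 'TSHIRT-M' };
const run2 = { csrfToken: 'c02e', sessionId: 'S-1002', page: '1', sku: 'TSHIRT-M' };

const candidates = findCorrelationCandidates(run1, run2);
console.log(candidates);
```

Values that stay identical across runs, such as `page` and `sku` here, are parameterisation candidates at most, not correlation candidates.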
Bottleneck isolation order: (1) Application server CPU and memory — if CPU is pegged near 100%, the app layer is the bottleneck; add capacity or optimise the hot code path. (2) Database — check slow query log, active connection count, lock wait time, and query execution plan at peak load. (3) Connection pool saturation — if the thread or DB connection pool is exhausted, requests queue internally; response time rises with no visible CPU spike. (4) External service dependencies — if a downstream API (payment gateway, email service) slows under load, response time rises without local infrastructure impact. (5) JVM garbage collection pauses — show as periodic spikes in the response time graph correlated with GC log events. Correlate load tool output with APM metrics (Datadog, Dynatrace, CloudWatch) simultaneously — never diagnose from the load tool chart alone.
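The isolation order can be read as a checklist applied to one snapshot of infrastructure metrics taken at peak load. The sketch below is only an illustration of that ordering: every field name and threshold is a hypothetical placeholder to be replaced with values from your own APM tool and baselines.

```javascript
// Walk the isolation order and return every layer whose warning signal is present.
// All field names and thresholds are hypothetical; tune them to your own stack.
function suspectLayers(m) {
  const suspects = [];
  // (1) App server CPU pegged near 100%.
  if (m.appCpuPct >= 90) suspects.push('application CPU');
  // (2) Database: slow queries or lock waits at peak.
  if (m.dbSlowQueriesPerMin > 0 || m.dbLockWaitMs > 100) suspects.push('database');
  // (3) Pool exhausted: requests queue with no visible CPU spike.
  if (m.poolActive >= m.poolMax) suspects.push('connection pool');
  // (4) Downstream dependency much slower than its own baseline.
  if (m.downstreamP95Ms > 2 * m.downstreamBaselineP95Ms) suspects.push('external dependency');
  // (5) Long garbage collection pauses.
  if (m.gcPauseMaxMs > 200) suspects.push('GC pauses');
  return suspects;
}

// Hypothetical snapshot: CPU is moderate, but the DB connection pool is full.
const snapshot = {
  appCpuPct: 45, dbSlowQueriesPerMin: 0, dbLockWaitMs: 12,
  poolActive: 50, poolMax: 50,
  downstreamP95Ms: 180, downstreamBaselineP95Ms: 150, gcPauseMaxMs: 40,
};
console.log(suspectLayers(snapshot));
```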
Response time: duration from request send to full response received — the user experience metric. Assess at p95 and p99, not mean (mean hides tail latency). Throughput: requests the system successfully processed per second — the capacity metric. Error rate: percentage of requests that returned an error — the quality metric. Interpret together: rising response time with stable throughput = saturation approaching. Rising error rate before response time spikes = the error path is being hit (bad requests, quota limits) rather than a true capacity problem. Flat response time with declining throughput under load = the system is shedding load rather than slowing down — may indicate a circuit breaker or queue overflow.
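To see why the mean hides tail latency, the following sketch computes the mean, p95, and p99 of a small sample using the nearest-rank percentile method. The sample data is made up for illustration, and real tools may use a different interpolation method:

```javascript
// Nearest-rank percentile: the value at position ceil(p/100 * n) in sorted order.
function percentile(samples, p) {
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p * sorted.length) / 100));
  return sorted[rank - 1];
}

function mean(samples) {
  return samples.reduce((sum, v) => sum + v, 0) / samples.length;
}

// Hypothetical response times in ms: 95 fast requests and 5 very slow ones.
const durations = [...Array(95).fill(200), ...Array(5).fill(4000)];

const summary = {
  mean: mean(durations),
  p95: percentile(durations, 95),
  p99: percentile(durations, 99),
};
console.log(summary);
```

With this sample, the mean and p95 both look acceptable while p99 exposes the slow requests, which is why the tail percentiles are the ones to assess.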
Verdict depends on the defined NFR. If the SLA states p99 < 2s and error rate < 0.5% under 500 users, this result is a clear fail on both counts (p99 is 2.1x the limit, error rate is 4x the limit): block the release and investigate. If no SLA was defined, this test establishes the current baseline and should prompt an NFR conversation. Next steps regardless: (a) Inspect the 2% errors: are they timeouts (server overwhelmed), 5xx (application errors), or 4xx (test data or auth issue)? Each has a different cause and fix. (b) Identify where response time rises: correlate with infra metrics. (c) Write a clear recommendation: 'p99 is 4.2s against a 2s NFR, 2.1x the limit. Recommended action: do not release until p99 is below 2s and error rate is below 0.5% under 500 users.'
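The pass or fail decision is mechanical once the NFR exists. A minimal sketch that compares the result from the question against the example SLA, assuming every metric is "lower is better"; the function and field names are hypothetical:

```javascript
// Compare a test result against NFR limits (lower is better for every metric).
// Returns the overall verdict plus how far each breached metric is over its limit.
function evaluate(result, nfr) {
  const breaches = [];
  for (const metric of Object.keys(nfr)) {
    if (result[metric] >= nfr[metric]) {
      breaches.push({
        metric,
        actual: result[metric],
        limit: nfr[metric],
        ratio: result[metric] / nfr[metric],
      });
    }
  }
  return { pass: breaches.length === 0, breaches };
}

// Figures from the question and the example SLA above.
const verdict = evaluate(
  { p99Seconds: 4.2, errorRatePct: 2 },
  { p99Seconds: 2, errorRatePct: 0.5 }
);
console.log(verdict.pass, verdict.breaches);
```

In k6 itself, the same gate is usually expressed as `thresholds` in `options`, so a breach fails the run and the CI job.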
Stakeholder translation: convert metrics into business impact. Instead of 'p99 response time is 4.2 seconds at 500 concurrent users,' say 'at our projected peak lunchtime traffic, about 1 in 100 page loads takes longer than 4 seconds, more than twice our 2-second target, and our historical data links that delay to roughly 20% higher cart abandonment.' Quote an abandonment figure only if your own data supports it. Use a traffic-light summary: Green (meets NFR), Amber (within 20% of threshold, monitor), Red (NFR breached, action required). Include the recommendation and the proposed next step: stakeholders need to know what to decide, not just what was measured.
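The traffic-light summary can be generated directly from the results. The sketch below assumes "within 20% of threshold" means the metric still meets the NFR but sits in the top 20% of the allowed range; that interpretation and the function name are assumptions:

```javascript
// Classify a lower-is-better metric against its NFR threshold.
// Assumption: amber means "passing, but within `margin` of the threshold".
function trafficLight(actual, threshold, margin = 0.2) {
  if (actual > threshold) return 'red';                   // NFR breached, action required
  if (actual >= threshold * (1 - margin)) return 'amber'; // close to the limit, monitor
  return 'green';                                         // comfortably meets the NFR
}

// Hypothetical p99 results in seconds against a 2-second target.
console.log(trafficLight(4.2, 2)); // red
console.log(trafficLight(1.7, 2)); // amber
console.log(trafficLight(1.0, 2)); // green
```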
// SCORING RUBRIC
Strong: Distinguishes all four test types fluently with concrete use cases. Builds a realistic workload model from real traffic data rather than guessing virtual users. Explains correlation with a specific dynamic value example. Reads a results chart and names a specific bottleneck layer with supporting evidence from infrastructure metrics. Communicates findings in business-impact terms, not raw metrics.
Acceptable: Correctly distinguishes load vs stress vs soak. Can write a parameterised k6 or JMeter script and knows response time vs throughput vs error rate. May not know correlation by name, or may struggle to name specific bottleneck isolation steps. Results communication is metric-first rather than business-impact-first.
Weak: Confuses load testing and stress testing. Has only run recorded scripts without parameterisation or assertions. Cannot read a results chart beyond 'response time went up.' Does not know what p95 or p99 means. Cannot define an NFR or explain what makes a result a pass or fail without a defined threshold.
// RED FLAGS
Answers or behaviours that signal a weak candidate to the interviewer.
Cannot distinguish between load testing and stress testing with a concrete example of each.
Has only run recorded, un-parameterised scripts — no scripting, no data injection, no correlation.
Believes the goal of a load test is to find the point where the server crashes, with no mention of NFRs or SLAs.
Cannot explain what correlation is or why replaying a static token causes elevated error rates.
Does not know what p95 or p99 response time means — only discusses mean or average latency.
Has never correlated a load tool result with infrastructure metrics (CPU, DB connections, APM data) to isolate a bottleneck.
// FOLLOW-UP QUESTIONS
Questions a strong interviewer adds if you answered the main round well.
- 01
How would you set up a baseline performance test that runs in CI and alerts on regressions?
- 02
What is the difference between client-side and server-side performance testing? When do you use Lighthouse or WebPageTest versus k6 or JMeter?
- 03
How do you performance test a WebSocket or a long-lived streaming connection?
- 04
A soak test shows heap memory growing by 10 MB every 10 minutes. What do you investigate and how?
// SELF-ASSESSMENT CHECKLIST
Tick these off mentally after the mock. Be honest — this is for you, not for the interviewer.
- I correctly distinguished all four test types — load, stress, spike, and soak — and gave a concrete use case for each.
- I described a workload model built from real traffic data — analytics or access logs — not an arbitrary virtual user count.
- I outlined a k6 or JMeter script structure that included parameterisation, correlation or token handling, and response assertions.
- I explained correlation with a specific example of a dynamic value (session token, CSRF, ViewState) and what happens without it.
- I named at least three bottleneck layers in order — application CPU, database, connection pool — and explained what each looks like in a results chart.
- I answered the 2%-error-rate question in terms of SLA compliance rather than just restating the metric.