How do you test for bias and fairness in an AI feature?

Testing AI systems · Senior · testing-ai-systems, bias, fairness, demographic, evaluation, responsible-ai

Short answer

Construct demographically paired inputs that differ only on protected attributes (name, gender, race, nationality) and measure whether the model's output quality, tone, or content differs systematically. A fair model should produce equivalent quality for equivalent inputs regardless of demographic signals.

Detail

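Bias testing requires deliberately constructed, demographically paired inputs. Ad hoc testing rarely surfaces bias, because a disparity only becomes visible when otherwise identical inputs are compared side by side.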

Paired testing: write two versions of the same prompt that differ only in demographic signals (name, pronoun, nationality) and compare the outputs. Does the model's sentiment, length, or willingness to help differ based on the name "Latanya" vs "Emily"? Based on "he" vs "she"?
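
A minimal Python sketch of paired testing, under stated assumptions: the template, the `SIGNALS` values, the word-count metric, and both stub models (`fair_stub`, `biased_stub`) are hypothetical stand-ins for illustration. A real suite would call the actual model, use many names per group, and score tone or helpfulness rather than length alone.

```python
from itertools import product
from typing import Callable, Dict, List

# Hypothetical prompt template; only the demographic slots vary between variants.
TEMPLATE = "{name} says {pronoun} order arrived late. Draft a reply offering help."

# Hypothetical demographic signals (assumption: two values per slot for brevity).
SIGNALS: Dict[str, List[str]] = {
    "name": ["Emily", "Latanya"],
    "pronoun": ["his", "her"],
}

def build_variants(template: str, signals: Dict[str, List[str]]) -> List[Dict[str, str]]:
    """Return one record per combination of demographic signals."""
    keys = list(signals)
    records = []
    for combo in product(*(signals[k] for k in keys)):
        slots = dict(zip(keys, combo))
        records.append({**slots, "prompt": template.format(**slots)})
    return records

def word_count(text: str) -> int:
    """Crude proxy metric: number of whitespace-separated tokens."""
    return len(text.split())

def max_metric_gap(records, generate: Callable[[str], str], metric=word_count) -> int:
    """Largest difference in the metric across all variants of one template."""
    scores = [metric(generate(r["prompt"])) for r in records]
    return max(scores) - min(scores)

def fair_stub(prompt: str) -> str:
    # Stand-in for a model that ignores the demographic signals.
    return "Sorry about the delay. We will refund the shipping cost today."

def biased_stub(prompt: str) -> str:
    # Stand-in for a model that gives a shorter, less helpful reply for one name.
    if "Latanya" in prompt:
        return "Please check the tracking page."
    return fair_stub(prompt)

records = build_variants(TEMPLATE, SIGNALS)
fair_gap = max_metric_gap(records, fair_stub)
biased_gap = max_metric_gap(records, biased_stub)
print(f"fair stub gap: {fair_gap}, biased stub gap: {biased_gap}")
```

Because real models are non-deterministic, a practical version would likely sample each variant several times and compare averages against a tolerance instead of expecting a gap of exactly zero.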

Output distribution analysis: for a classification feature (sentiment, priority, risk score), compute the output distribution separately for inputs associated with each demographic group. Where ground-truth labels are available, compare error rates as well: a fair classifier should not have significantly different false-positive rates across groups.
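
The false-positive-rate comparison can be sketched as follows. This assumes a labeled evaluation set is available; the records, group names "A" and "B", and the binary label encoding are hypothetical, and the sample is far too small to support a significance claim.

```python
from collections import defaultdict

# Hypothetical labeled records: (group, true_label, predicted_label), 1 = flagged.
eval_records = [
    ("A", 0, 0), ("A", 0, 1), ("A", 0, 0), ("A", 0, 0), ("A", 1, 1),
    ("B", 0, 1), ("B", 0, 1), ("B", 0, 0), ("B", 0, 0), ("B", 1, 1),
]

def false_positive_rates(records):
    """Per-group FPR: false positives divided by all true negatives in that group."""
    false_pos = defaultdict(int)
    negatives = defaultdict(int)
    for group, y_true, y_pred in records:
        if y_true == 0:
            negatives[group] += 1
            if y_pred == 1:
                false_pos[group] += 1
    return {g: false_pos[g] / negatives[g] for g in negatives}

rates = false_positive_rates(eval_records)
fpr_gap = max(rates.values()) - min(rates.values())
print(rates, "gap:", fpr_gap)
```

On a realistic sample, the gap would typically be paired with a confidence interval or a statistical test before being treated as evidence of unfairness.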

Representational harm: for generative features, probe whether the model produces stereotyped or demeaning content when demographic groups are mentioned, even without an explicit harmful intent in the prompt.

Intersectionality: test combinations of demographic dimensions, because single-dimension fairness testing misses intersectional harms (e.g., Black women facing compounded bias that testing "Black" or "women" alone would not reveal).
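
To illustrate why single-dimension slices can hide a gap, here is a small sketch on synthetic data. The rows are invented and deliberately arranged so that every single-dimension pass rate ties; `passed` is a hypothetical flag meaning the output met some quality bar.

```python
from collections import defaultdict

# Synthetic rows: (race, gender, passed). Values chosen so the marginals are equal.
rows = [
    ("Black", "woman", 1), ("Black", "woman", 0),
    ("Black", "man", 1), ("Black", "man", 1),
    ("white", "woman", 1), ("white", "woman", 1),
    ("white", "man", 1), ("white", "man", 0),
]

def pass_rate_by(rows, key):
    """Pass rate per slice, where key(race, gender) names the slice."""
    totals = defaultdict(int)
    passes = defaultdict(int)
    for race, gender, passed in rows:
        k = key(race, gender)
        totals[k] += 1
        passes[k] += passed
    return {k: passes[k] / totals[k] for k in totals}

by_race = pass_rate_by(rows, lambda race, gender: race)
by_gender = pass_rate_by(rows, lambda race, gender: gender)
by_both = pass_rate_by(rows, lambda race, gender: (race, gender))

print("by race:", by_race)
print("by gender:", by_gender)
print("by race and gender:", by_both)
```

In this toy data the race and gender slices look identical, while the combined slice shows a lower pass rate for some subgroups. Intersectional slices shrink quickly in size, so a real evaluation needs enough samples per combination for the rates to be meaningful.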

See Bias and fairness testing for a full methodology.

// WHAT INTERVIEWERS LOOK FOR

Paired testing as the core technique. Output distribution analysis for classification tasks. Intersectionality as a specific gap in single-dimension testing. Representational harm beyond just decision discrimination.