What are output property checks and how do you use them to test LLM responses?

Question

Accepted Answer

Property checks test invariants that must hold on every valid output regardless of phrasing: required JSON fields exist, response length is within bounds, banned content is absent, claims cite the source. They replace exact-match assertions for non-deterministic outputs.

Property checks are assertions about constraints and content rules that define a valid response, not a specific valid response.

Common categories:

Structural: does the response parse as valid JSON? Are required top-level fields present and the right type?

Constraint: is the length within the documented range? Does the language match the requested locale?

Safety: does the response contain PII, profanity, or competitor brand names? Use a regex or a secondary classifier.

Groundedness: for RAG features, do all factual claims in the response appear in the retrieved source documents? A grounding check can be a secondary LLM call ("does claim X appear in context Y?") or an embedding similarity check.

Instruction following: i
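
To make the structural, constraint, and safety categories concrete, here is a minimal Python sketch. The field names (answer, sources), the 600-character limit, the SSN-like regex, and the competitor name "AcmeCorp" are all hypothetical choices made for illustration; a real suite would take them from the feature's own specification, and might use a secondary classifier instead of regexes for the safety check.

```python
import json
import re

# Hypothetical rules for illustration only; real fields, bounds, and
# patterns would come from the feature's own specification.
REQUIRED_FIELDS = {"answer": str, "sources": list}
MAX_ANSWER_CHARS = 600
BANNED_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),        # SSN-like number (PII)
    re.compile(r"\bAcmeCorp\b", re.IGNORECASE),  # made-up competitor name
]


def check_properties(raw_output: str) -> list[str]:
    """Return the violated properties; an empty list means the output passes."""
    # Structural: the response must parse as a JSON object.
    try:
        data = json.loads(raw_output)
    except json.JSONDecodeError:
        return ["structural: not valid JSON"]
    if not isinstance(data, dict):
        return ["structural: top-level value is not an object"]

    failures = []
    # Structural: required top-level fields exist and have the right type.
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in data:
            failures.append(f"structural: missing field '{field}'")
        elif not isinstance(data[field], expected_type):
            failures.append(f"structural: field '{field}' has wrong type")

    answer = data.get("answer")
    if isinstance(answer, str):
        # Constraint: length must fall within the documented range.
        if not 1 <= len(answer) <= MAX_ANSWER_CHARS:
            failures.append("constraint: answer length out of bounds")
        # Safety: banned content must be absent.
        for pattern in BANNED_PATTERNS:
            if pattern.search(answer):
                failures.append(f"safety: matched banned pattern {pattern.pattern!r}")
    return failures


# Two differently phrased outputs both pass, which an exact-match
# assertion could not express.
first = '{"answer": "Refunds take 5 days.", "sources": ["policy.md"]}'
second = '{"answer": "You get your money back within five days.", "sources": ["policy.md"]}'
assert check_properties(first) == []
assert check_properties(second) == []

# This output is missing a required field and names the banned brand.
for failure in check_properties('{"answer": "Try AcmeCorp instead."}'):
    print(failure)
```

Returning a list of failures rather than raising on the first one is a design choice in this sketch: it lets a test report every violated property for a given response in a single run. Groundedness is left out because it needs a secondary LLM call or an embedding model rather than a pure function.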

Short answer

Detail

// WHAT INTERVIEWERS LOOK FOR