Retrieval-Augmented Generation (RAG)

AI & LLM Testing

// Definition

A pattern in which an LLM is given relevant context retrieved from an external source (a vector database, a search index, a document store) before it is asked to generate an answer. The LLM does not need to 'know' the answer from training; it reads what was retrieved and synthesises a response. RAG is how chatbots answer questions about your company's docs without those docs being baked into the model. From a QA perspective, RAG systems have two failure surfaces: retrieval (did the system find the right context?) and generation (did the LLM use the context faithfully, or did it hallucinate?). Testing must cover both, and separately, so that a wrong answer can be traced to the stage that caused it.
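To make the two failure surfaces concrete, here is a minimal sketch of testing them separately. Everything in it is an assumption for illustration: the `DOCS` corpus, the `retrieve` function (word overlap standing in for vector search), and the `unsupported_words` check (a crude proxy for faithfulness, not a real hallucination detector). The candidate answers are hard-coded strings rather than live LLM output, so the checks are deterministic.

```python
import re

# Hypothetical in-memory corpus standing in for a vector database or search index.
DOCS = {
    "refunds": "Refunds are issued within 14 days of purchase.",
    "shipping": "Standard shipping takes 3 to 5 business days.",
    "support": "Support is available by email on weekdays.",
}


def tokens(text):
    """Lowercase a string and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, k=1):
    """Rank documents by word overlap with the question (toy stand-in for vector search)."""
    q = tokens(question)
    ranked = sorted(DOCS, key=lambda doc_id: len(q & tokens(DOCS[doc_id])), reverse=True)
    return ranked[:k]


def unsupported_words(answer, context):
    """Return answer words that do not appear in the context: a crude faithfulness signal."""
    return tokens(answer) - tokens(context)


def check_retrieval():
    # Retrieval surface: did the system find the right context?
    assert retrieve("How many days until refunds are issued?") == ["refunds"]


def check_generation():
    # Generation surface: fix the context, then judge the answer against it alone.
    context = DOCS["refunds"]
    faithful = "Refunds are issued within 14 days."
    hallucinated = "Refunds are issued within 30 days."
    assert unsupported_words(faithful, context) == set()
    assert unsupported_words(hallucinated, context) == {"30"}


if __name__ == "__main__":
    check_retrieval()
    check_generation()
```

The generation check is handed a known-good context instead of calling `retrieve`, so a failure there cannot be blamed on retrieval. In a real suite the hard-coded answers would be replaced by model output, and the word-overlap check by whatever faithfulness metric the team trusts.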

// Related terms