RAG (Retrieval-Augmented Generation)
Definition
An architectural pattern in which a system, before the language model answers a query, retrieves relevant documents from a knowledge base (typically through semantic search in a vector database) and inserts them into the prompt as context.
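The retrieve-then-prompt flow in this definition can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the bag-of-words `embed` function stands in for a real embedding model, the in-memory list `KNOWLEDGE_BASE` stands in for a vector database, and the prompt wording is invented. No language model is called; the sketch stops at the prompt that would be sent to one.

```python
import math
import re
from collections import Counter

# Hypothetical stand-in for a knowledge base held in a vector database.
KNOWLEDGE_BASE = [
    "The refund window is 30 days from delivery.",
    "Support is available on weekdays from 9 to 17.",
    "Shipping to Canada takes 5 to 7 business days.",
]


def embed(text: str) -> Counter:
    """Toy embedding (assumption): word counts instead of a learned vector."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, docs: list[str], k: int = 2) -> str:
    """Insert the retrieved documents into the prompt as numbered context."""
    sources = retrieve(query, docs, k)
    context = "\n".join(f"[{i + 1}] {doc}" for i, doc in enumerate(sources))
    return (
        "Answer using only the sources below.\n\n"
        f"Sources:\n{context}\n\n"
        f"Question: {query}"
    )


if __name__ == "__main__":
    print(build_prompt("What is the refund window?", KNOWLEDGE_BASE, k=1))
```

Nothing in this flow forces the model to use the inserted sources correctly. The prompt only makes them available.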
Noise — Signal
RAG is often sold as the solution to three separate problems: freshness, hallucination and compliance. It reliably solves only the first. Hallucinations are reduced, not eliminated: the model can misinterpret retrieved sources, mix contradictory ones or simply ignore the context. And compliance is a property of the overall system (source rights, audit trails, reasoning chains and access control), not of the retrieval architecture. A RAG pipeline alone fulfils no regulatory requirement.
The right question
Not: "Should we introduce RAG to avoid hallucinations?" But: "How do we measure the quality of the retrieved sources, how do we document the reasoning chain of an answer, and who is accountable when the model gets the right context and still uses it wrong?"
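For the first of these questions, one common starting point is to score retrieval against a hand-labelled set of relevant documents with precision@k and recall@k. The sketch below assumes such labels exist; the document IDs and the helper names `precision_at_k` and `recall_at_k` are hypothetical. It measures only whether the right sources were retrieved, not whether the model then used them correctly.

```python
def precision_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Share of the top-k retrieved documents that are labelled relevant."""
    top = retrieved[:k]
    if not top:
        return 0.0
    return sum(1 for doc_id in top if doc_id in relevant) / len(top)


def recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Share of all relevant documents that appear in the top-k results."""
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & relevant) / len(relevant)


if __name__ == "__main__":
    # Hypothetical ranking returned by a retriever for one query.
    retrieved = ["doc3", "doc7", "doc1", "doc9"]
    # Hypothetical human judgement of which documents answer that query.
    relevant = {"doc1", "doc3", "doc4"}

    print(f"precision@3 = {precision_at_k(retrieved, relevant, 3):.2f}")
    print(f"recall@3    = {recall_at_k(retrieved, relevant, 3):.2f}")
```

Averaged over a representative set of queries, these two numbers give a baseline for retrieval quality. The questions about reasoning chains and accountability are organisational and are not answered by a metric like this.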