RAG (Retrieval-Augmented Generation)

An architecture that gives an LLM access to an external knowledge base at query time, before it generates an answer. Instead of relying only on what the model memorized during training, you retrieve relevant chunks from your own documents and inject them into the prompt.
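To make the retrieve-then-inject flow concrete, here is a minimal sketch using only the Python standard library. It is an illustration under stated assumptions, not a production design: the word-count cosine similarity stands in for a real embedding model, the `CHUNKS` data and their ids are invented for the example, and the function names (`retrieve`, `build_prompt`) are hypothetical. The resulting prompt would be sent to whatever LLM you use; that call is left out.

```python
import math
import re
from collections import Counter

# Toy knowledge base. The ids and texts are made up for illustration.
CHUNKS = [
    {"id": "hr-policy#3", "text": "Employees accrue 25 vacation days per year."},
    {"id": "it-guide#1", "text": "Reset your password through the self-service portal."},
    {"id": "hr-policy#7", "text": "Unused vacation days expire on March 31."},
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, k=2):
    """Return up to k chunks most similar to the query, best first.

    Assumption: word overlap is a stand-in for embedding similarity.
    Chunks with zero similarity are dropped.
    """
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(c["text"]))), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for score, c in scored[:k] if score > 0]


def build_prompt(query, retrieved):
    """Inject the retrieved chunks, tagged with their ids, into the prompt."""
    context = "\n".join(f"[{c['id']}] {c['text']}" for c in retrieved)
    return (
        "Answer using only the sources below and cite their ids.\n\n"
        f"Sources:\n{context}\n\nQuestion: {query}\nAnswer:"
    )


if __name__ == "__main__":
    question = "How many vacation days do employees get?"
    print(build_prompt(question, retrieve(question, CHUNKS)))
```

Tagging each chunk with its id inside the prompt is what lets the model cite sources, which the answer can then be checked against.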

The gain is not only fewer hallucinations but also traceability. Each claim in an answer can be tied back to the retrieved source it came from, which matters in corporate and regulated settings. In practice, a RAG system's quality depends less on the model and more on the retrieval engineering: chunking, embeddings, hybrid search, and reranking.
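Two of the retrieval steps named above can be sketched in a few lines. The first function splits text into overlapping word windows, one simple chunking strategy among many. The second merges a keyword ranking and a vector ranking with reciprocal rank fusion (RRF), which is one common way to implement hybrid search; the entry does not prescribe it. The function names, the window sizes, and the document ids in the usage example are assumptions made for illustration, and `k=60` is only the conventional default for RRF.

```python
def chunk_words(text, size=50, overlap=10):
    """Split text into windows of `size` words, each sharing `overlap`
    words with the previous window, so that a sentence cut at a boundary
    still appears whole in at least one chunk."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # this window already reached the end of the text
    return chunks


def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of document ids with reciprocal rank fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with ranks starting at 1. Ties are broken by id for determinism.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda d: (-scores[d], d))


if __name__ == "__main__":
    print(chunk_words("a b c d e f g h i j", size=4, overlap=1))
    keyword_hits = ["d1", "d2", "d3"]  # e.g. from a BM25 index (hypothetical)
    vector_hits = ["d3", "d1", "d4"]   # e.g. from an embedding index (hypothetical)
    print(rrf_fuse([keyword_hits, vector_hits]))
```

RRF uses only rank positions, so it avoids having to normalize keyword scores and vector similarities onto a common scale. A reranker, if used, would then reorder the top of the fused list with a more expensive model.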