Retrieval Augmented Generation (RAG) Patterns
Retrieval Augmented Generation (RAG) is a technique that enhances generative AI outputs by first retrieving relevant information from a knowledge base and then using that information to guide the generation process.
Why RAG?
RAG addresses several limitations of foundation models:
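- Knowledge Cutoffs: Foundation models know only what was in their training data, so they cannot answer from information published after training
- Hallucinations: Models may generate plausible but incorrect information; grounding the prompt in retrieved text reduces this risk
- Sourcing: Models on their own cannot cite where an answer came from; RAG can attribute generated content to the retrieved documents
- Customization: RAG gives models access to domain-specific knowledge without fine-tuning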
Basic RAG Architecture

- Document Processing: Convert documents into chunks and embeddings
- Retrieval: Find relevant information based on the query
- Augmented Prompting: Enhance the prompt with retrieved information
- Generation: Produce the final output using the augmented prompt
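The four stages above can be sketched end to end in a few lines of Python. This is a minimal illustration under stated assumptions, not a production design: the "embedding" is a toy bag-of-words count vector standing in for a real embedding model, the sample documents are invented, and `generate` is a hypothetical placeholder where a call to a foundation model would go. All function names here are illustrative.

```python
import math
import re
from collections import Counter


def chunk(text, max_words=40):
    """Document processing (1/2): split text into fixed-size word chunks."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text):
    """Document processing (2/2): toy embedding as a bag-of-words count vector.
    Assumption: a real system would call an embedding model here instead."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query, index, k=2):
    """Retrieval: return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


def build_prompt(query, passages):
    """Augmented prompting: place the retrieved passages ahead of the question."""
    context = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (
        "Answer using only the context below and cite sources by number.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


def generate(prompt):
    """Generation: hypothetical placeholder for a foundation model call."""
    return f"<model output for a prompt of {len(prompt)} characters>"


# Invented sample knowledge base, for illustration only.
documents = [
    "The billing service retries failed payments three times before alerting support.",
    "Employees accrue vacation days monthly and may carry over five days per year.",
    "The deployment pipeline runs unit tests, then integration tests, then a canary release.",
]

# Build the index once: each entry pairs a chunk with its vector.
index = [(c, embed(c)) for doc in documents for c in chunk(doc)]

query = "How many times are failed payments retried?"
passages = retrieve(query, index, k=1)
prompt = build_prompt(query, passages)
answer = generate(prompt)
```

Numbering the passages in the prompt is one simple way to support the attribution benefit described earlier, since the model can be asked to cite passages by number. In practice the list-based index would typically be replaced by a vector store, and the fixed-size chunking by a strategy suited to the documents.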