elastic · leemthompo · Jan 7, 2025 · Jan 3, 2025 · Jan 6, 2025 · Jan 6, 2025
diff --git a/docs/reference/images/search/rag-schema.svg b/docs/reference/images/search/rag-schema.svg
diff --git a/docs/reference/images/search/rag-venn-diagram.svg b/docs/reference/images/search/rag-venn-diagram.svg
diff --git a/docs/reference/search/search-your-data/retrieval-augmented-generation.asciidoc b/docs/reference/search/search-your-data/retrieval-augmented-generation.asciidoc
@@ -0,0 +1,77 @@
+[[rag-elasticsearch]]
+== Retrieval augmented generation
+
+.🍿 Prefer a video introduction?
+***********************
+Check out https://www.youtube.com/watch?v=OS4ZefUPAks[this short video] from the Elastic Snackable Series.
+***********************
+
+Retrieval augmented generation (RAG) is a technique where additional context is retrieved from an external datastore before prompting a language model to generate a response using the retrieved context.
+This grounds the model with in-context learning.
+Compared to fine-tuning or continuous pre-training, RAG can be implemented more quickly and cheaply, and it has several advantages.
+
+image::images/search/rag-venn-diagram.svg[RAG sits at the intersection of information retrieval and generative AI, align=center, width=500]
+
+RAG sits at the intersection of https://www.elastic.co/what-is/information-retrieval[information retrieval] and generative AI.
+{es} is an excellent tool for implementing RAG, because it offers various retrieval capabilities, such as full-text search, vector search, and hybrid search.
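As a rough illustration of these three retrieval styles, the sketch below builds search request bodies as plain Python dictionaries. The `content` and `content_embedding` field names, the helper function names, and the parameter values are hypothetical. The request shapes (a `match` query, a top-level `knn` section, and an `rrf` retriever) follow the {es} search API, though availability depends on your {es} version.

```python
from typing import Any, Dict, List

# Hypothetical field names; substitute the fields from your own index mapping.
TEXT_FIELD = "content"
VECTOR_FIELD = "content_embedding"


def full_text_body(query: str, size: int = 5) -> Dict[str, Any]:
    """Full-text search with a `match` query."""
    return {"size": size, "query": {"match": {TEXT_FIELD: query}}}


def vector_body(query_vector: List[float], k: int = 5) -> Dict[str, Any]:
    """Approximate kNN search over a dense vector field."""
    return {
        "knn": {
            "field": VECTOR_FIELD,
            "query_vector": query_vector,
            "k": k,
            "num_candidates": 10 * k,  # illustrative value, not a recommendation
        }
    }


def hybrid_body(query: str, query_vector: List[float], k: int = 5) -> Dict[str, Any]:
    """Hybrid search: fuse both result sets with reciprocal rank fusion (RRF)."""
    return {
        "retriever": {
            "rrf": {
                "retrievers": [
                    {"standard": {"query": {"match": {TEXT_FIELD: query}}}},
                    {"knn": vector_body(query_vector, k)["knn"]},
                ]
            }
        }
    }
```

Each dictionary is intended as the body of a search request. The query vector would come from whichever embedding model produced the vectors stored in the index.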
+
+[discrete]
+[[rag-elasticsearch-advantages]]
+=== Advantages of RAG
+
+RAG has several advantages:
+
+* *Improved context:* Enables grounding the language model with additional, up-to-date, and/or private data.
+* *Reduced hallucination:* Helps minimize factual errors by enabling models to cite authoritative sources.
+* *Cost efficiency:* Requires less maintenance compared to finetuning or continuously pretraining models.
-* *Cost efficiency:* Requires less maintenance compared to finetuning or continuously pretraining models.
+* *Cost efficiency:* Requires less maintenance compared to fine-tuning or continuously pre-training models.
+* *Enhanced security:* Controls data access by leveraging {es}'s <<authorization, user authorization>> features, such as role-based access control and field/document-level security.
+* *Simplified response parsing:* Eliminates the need for custom parsing logic by letting the language model handle parsing {es} responses and formatting the retrieved context.
+* *Flexible implementation:* Works with basic 
+// TODO: uncomment when page is live <<full-text-search,full-text search>> 
+full-text search and can be gradually updated to add more advanced and computationally intensive <<semantic-search,semantic search>> capabilities.
-full-text search and can be gradually updated to add more advanced and computationally intensive <<semantic-search,semantic search>> capabilities.
+full-text search, and can be gradually updated to add more advanced and computationally intensive <<semantic-search,semantic search>> capabilities.
+
+[discrete]
+[[rag-elasticsearch-components]]
+=== RAG system overview
+
+The following diagram illustrates a simple RAG system using {es}.
+
+image::images/search/rag-schema.svg[Components of a simple RAG system using Elasticsearch, align=center, width=800]
+
+The system consists of the following components:
-The system consists of the following components:
+The system augments search results using the following process:
+
+. User submits a query
+. {es} retrieves relevant documents using full-text search, vector search, or hybrid search
+. Language model processes the context and generates a response, using custom instructions, such as "Cite a source" or "Provide a concise summary of the `content` field in markdown format"
+. Model returns final response to the user
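The four steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Playground implementation: the `title` and `content` fields, the function names, and the stand-in `fake_retrieve` and `fake_generate` functions are all hypothetical. In a real system the retrieval callable would query {es} and the generation callable would call a language model provider.

```python
from typing import Callable, Dict, List

# Hypothetical document shape: each hit has a `title` and a `content` field.
Hit = Dict[str, str]

# Example of custom instructions passed to the language model.
INSTRUCTIONS = "Answer using only the context below. Cite a source."


def build_prompt(question: str, hits: List[Hit]) -> str:
    """Format the retrieved documents and the question into a single prompt."""
    context = "\n".join(
        f"[{i}] {hit['title']}: {hit['content']}"
        for i, hit in enumerate(hits, start=1)
    )
    return f"{INSTRUCTIONS}\n\nContext:\n{context}\n\nQuestion: {question}"


def answer(
    question: str,
    retrieve: Callable[[str], List[Hit]],
    generate: Callable[[str], str],
) -> str:
    """Run one RAG round trip for a user query (step 1 is the call itself)."""
    hits = retrieve(question)              # step 2: retrieve relevant documents
    prompt = build_prompt(question, hits)  # step 3: add context and instructions
    return generate(prompt)                # steps 3-4: generate and return


# Stand-ins so the sketch runs without a cluster or a model provider.
def fake_retrieve(query: str) -> List[Hit]:
    return [{"title": "RAG overview",
             "content": "RAG retrieves context before generation."}]


def fake_generate(prompt: str) -> str:
    return f"(model output for a prompt of {len(prompt)} characters)"


if __name__ == "__main__":
    print(answer("What is RAG?", fake_retrieve, fake_generate))
```

Passing the retrieval and generation steps in as callables keeps the retrieval method (full-text, vector, or hybrid) and the model provider interchangeable.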
+
+[TIP]
+====
+A more advanced setup might include query rewriting between steps 1 and 2. This intermediate step could use one or more additional language models with different instructions to reformulate queries for more specific and detailed responses.
+====
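The query rewriting step described in the tip could be sketched as follows. All names here are hypothetical, and the toy functions stand in for a rewriting model, an {es} query, and a generation model respectively.

```python
from typing import Callable, List


def answer_with_rewrite(
    question: str,
    rewrite: Callable[[str], str],
    retrieve: Callable[[str], List[str]],
    generate: Callable[[str], str],
) -> str:
    """Rewrite the query, retrieve with the rewritten form, then generate."""
    search_query = rewrite(question)   # extra step between steps 1 and 2
    passages = retrieve(search_query)  # retrieval sees the rewritten query
    context = "\n".join(passages)
    # Design choice in this sketch: the model answers the original question.
    return generate(f"Context:\n{context}\n\nQuestion: {question}")


# Stand-ins for illustration only.
def toy_rewrite(question: str) -> str:
    """Pretend rewriter: lowercases the question and strips the trailing '?'."""
    return question.lower().rstrip("?").strip()


def toy_retrieve(query: str) -> List[str]:
    """Pretend retriever: returns one passage that echoes the query it saw."""
    return [f"passage matching '{query}'"]


def toy_generate(prompt: str) -> str:
    """Pretend model: returns the first context line from the prompt."""
    return prompt.splitlines()[1]


if __name__ == "__main__":
    print(answer_with_rewrite("How does RAG work?",
                              toy_rewrite, toy_retrieve, toy_generate))
```

In this sketch the rewritten query is used only for retrieval. Whether the generation step should also see the rewritten query is a design decision that depends on the application.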
+
+[discrete]
+[[rag-elasticsearch-getting-started]]
+=== Getting started
+
+Start building RAG applications quickly with Playground, which seamlessly integrates {es} with language model providers.
+The Playground UI enables you to build, test, and deploy RAG interfaces on top of your {es} indices.
+
+Playground automatically selects the best retrieval methods for your data, while providing full control over the final {es} queries and language model instructions.
+You can also download the underlying Python code to integrate with your existing applications.
+
+Learn more in the {kibana-ref}/playground.html[Playground documentation] and
+try the https://www.elastic.co/demo-gallery/ai-playground[interactive lab] for hands-on experience.
+
+[discrete]
+[[rag-elasticsearch-learn-more]]
+=== Learn more
+
+Learn more about building RAG systems using {es} in these blog posts:
+
+* https://www.elastic.co/blog/beyond-rag-basics-semantic-search-with-elasticsearch[Beyond RAG Basics: Advanced strategies for AI applications]
+* https://www.elastic.co/search-labs/blog/building-a-rag-system-with-gemma-hugging-face-elasticsearch[Building a RAG system with Gemma, Hugging Face, and Elasticsearch]
+* https://www.elastic.co/search-labs/blog/rag-agent-tool-elasticsearch-langchain[Building an agentic RAG tool with Elasticsearch and LangChain]
+
+
+
diff --git a/docs/reference/search/search-your-data/search-your-data.asciidoc b/docs/reference/search/search-your-data/search-your-data.asciidoc
@@ -48,6 +48,7 @@ include::../../how-to/recipes.asciidoc[]
 include::retrievers-overview.asciidoc[]
 include::knn-search.asciidoc[]
 include::semantic-search.asciidoc[]
+include::retrieval-augmented-generation.asciidoc[]
 include::search-across-clusters.asciidoc[]
 include::search-with-synonyms.asciidoc[]
 include::search-application-overview.asciidoc[]