wip

leemthompo · leemthompo · commit e52188d73c3a · 2025-01-03T12:26:00.000+01:00
diff --git a/docs/reference/images/search/rag-venn-diagram.svg b/docs/reference/images/search/rag-venn-diagram.svg
@@ -0,0 +1,19 @@
+<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 600 400">
+  <!-- Left circle (Information Retrieval) -->
+  <circle cx="220" cy="200" r="150" fill="#4A90E2" opacity="0.6"/>
+  
+  <!-- Right circle (Generative AI) -->
+  <circle cx="380" cy="200" r="150" fill="#50C878" opacity="0.6"/>
+  
+  <!-- Text labels -->
+  <text x="160" y="200" font-family="Arial" font-size="20" fill="#2C3E50" text-anchor="middle">Information
+    <tspan x="160" y="225">Retrieval</tspan>
+  </text>
+  
+  <text x="440" y="200" font-family="Arial" font-size="20" fill="#2C3E50" text-anchor="middle">Generative
+    <tspan x="440" y="225">AI</tspan>
+  </text>
+  
+  <!-- RAG label in intersection -->
+  <text x="300" y="200" font-family="Arial" font-size="28" font-weight="bold" fill="#2C3E50" text-anchor="middle">RAG</text>
+</svg>
diff --git a/docs/reference/search/search-your-data/retrieval-augmented-generation.asciidoc b/docs/reference/search/search-your-data/retrieval-augmented-generation.asciidoc
@@ -0,0 +1,36 @@
+[[rag-elasticsearch]]
+== Retrieval augmented generation
+
+Retrieval augmented generation (RAG) is a technique that retrieves additional context from an external datastore before prompting an LLM.
+The retrieved context grounds the LLM's response through in-context learning.
+Compared to fine-tuning or continuous pretraining, RAG is faster and cheaper to implement.
+
+image::images/search/rag-venn-diagram.svg[RAG sits at the intersection of information retrieval and generative AI, align=center, width=500]
+
+RAG sits at the intersection of information retrieval and generative AI.
+{es} is well suited to implementing RAG because it offers several retrieval capabilities, such as full-text search, vector search, and hybrid search.
+
+[discrete]
+[[rag-elasticsearch-advantages]]
+=== Advantages of RAG
+
+RAG has several advantages:
+
+* It grounds the LLM with additional data that can be up to date, private, or both.
+* It is much cheaper and easier to maintain than fine-tuning or continuously pretraining a model.
+* It helps protect data privacy and security because you control what data the model sees. For example, different indices can have different access controls.
+* You can rely on the language model to parse and format the retrieved context in a style or format of your choice.
+* You can start with a simple BM25-based full-text search system and gradually improve it by adding more advanced semantic and hybrid search capabilities.
+
+[discrete]
+[[rag-elasticsearch-example]]
+=== Example
+
+Here's a simple example of a RAG system using {es}, where a user has a question about the company travel policy:
+
+1. The user asks a natural language question about the company travel policy.
+2. The system retrieves relevant documents from {es}.
+3. The LLM generates a response using the retrieved documents as context.
+
+The result is an answer grounded in current company documents rather than only in the model's training data.
+
diff --git a/docs/reference/search/search-your-data/search-your-data.asciidoc b/docs/reference/search/search-your-data/search-your-data.asciidoc
@@ -48,6 +48,7 @@ include::../../how-to/recipes.asciidoc[]
 include::retrievers-overview.asciidoc[]
 include::knn-search.asciidoc[]
 include::semantic-search.asciidoc[]
+include::retrieval-augmented-generation.asciidoc[]
 include::search-across-clusters.asciidoc[]
 include::search-with-synonyms.asciidoc[]
 include::search-application-overview.asciidoc[]