Commit da060e9 (1 parent: e52188d)

Refactor, add workflow diagram, links

2 files changed: +141 −15 lines

First file: 85 additions, 0 deletions (diff not shown)
Second file: 56 additions, 15 deletions (diff below)
@@ -1,13 +1,18 @@
 [rag-elasticsearch]
 == Retrieval augmented generation
 
-Retrieval augmented generation (RAG) is a technique that retrieves additional context from an external datastore before prompting an LLM.
-This grounds the LLM with in-context learning.
+.🍿 Prefer a video introduction?
+***********************
+Check out https://www.youtube.com/watch?v=OS4ZefUPAks[this short video] from the Elastic Snackable Series.
+***********************
+
+Retrieval augmented generation (RAG) is a technique where additional context is retrieved from an external datastore before prompting a language model to generate a response using the retrieved context.
+This grounds the model with in-context learning.
 Compared to finetuning or continuous pretraining, RAG can be implemented faster and cheaper, and it has several advantages.
 
 image::images/search/rag-venn-diagram.svg[RAG sits at the intersection of information retrieval and generative AI, align=center, width=500]
 
-RAG sits at the intersection of information retrieval and generative AI.
+RAG sits at the intersection of https://www.elastic.co/what-is/information-retrieval[information retrieval] and generative AI.
 {es} is an excellent tool for implementing RAG, because it offers various retrieval capabilities, such as full-text search, vector search, and hybrid search.
 
 [discrete]
@@ -16,21 +21,57 @@ RAG sits at the intersection of information retrieval and generative AI.
 
 RAG has several advantages:
 
-* It enables grounding the LLM with additional, up-to-date and/or private data.
-* It is much cheaper and easier to maintain compared to finetuning or continuously pretraining a model.
-* It ensures data privacy and security because you control what data the model sees. Different indices have different access controls.
-* You can rely on the language model to parse and format the retrieved context in a style or format of your choice.
-* You can start with a simple BM25-based full-text search system and gradually improve it by adding more advanced semantic and hybrid search capabilities.
+* *Improved context:* Enables grounding the LLM with additional, up-to-date, and/or private data.
+* *Reduced hallucination:* Helps minimize factual errors by enabling models to cite authoritative sources.
+* *Cost efficiency:* Requires less maintenance than finetuning or continuously pretraining models.
+* *Enhanced security:* Controls data access by leveraging {es}'s <<authorization,user authorization>> features, such as role-based access control and field/document-level security.
+* *Simplified response parsing:* Eliminates the need for custom parsing logic by letting the language model parse {es} responses and format the retrieved context.
+* *Flexible implementation:* Works with basic
+// TODO: uncomment when page is live <<full-text-search,full-text search>>
+full-text search and can be gradually updated to use advanced <<semantic-search,semantic search>> and hybrid search capabilities.
+
+[discrete]
+[[rag-elasticsearch-components]]
+=== RAG system overview
+
+The following diagram illustrates a simple RAG system using {es}.
+
+image::images/search/rag-schema.svg[Components of a simple RAG system using Elasticsearch, align=center, width=800]
+
+The system works as follows:
+
+. The user submits a query.
+. {es} retrieves relevant documents using full-text search, vector search, or hybrid search.
+. The language model processes the retrieved context and generates a response, following custom instructions such as "Cite a source" or "Provide a concise summary of the `content` field in markdown format".
+. The model returns the final response to the user.
+
+[TIP]
+====
+A more advanced setup might add query rewriting between steps 1 and 2. This intermediate step could use one or more additional language models with different instructions to reformulate the query for more specific and detailed responses.
+====
 
 [discrete]
-[[rag-elasticsearch-example]]
-=== Example
+[[rag-elasticsearch-getting-started]]
+=== Getting started
+
+Start building RAG applications quickly with Playground, which seamlessly integrates {es} with language model providers.
+The Playground UI enables you to build, test, and deploy RAG interfaces on top of your {es} indices.
+
+Playground automatically selects the best retrieval methods for your data, while providing full control over the final {es} queries and language model instructions.
+You can also download the underlying Python code to integrate it with your existing applications.
+
+Learn more in the {kibana-ref}/playground.html[Playground documentation] and
+try the https://www.elastic.co/demo-gallery/ai-playground[interactive lab] for hands-on experience.
+
+[discrete]
+[[rag-elasticsearch-learn-more]]
+=== Learn more
+
+Learn more about building RAG systems using {es} in these blog posts:
 
-Here's a simple example of a RAG system using {es}, where a user has a question about the company travel policy:
+* https://www.elastic.co/blog/beyond-rag-basics-semantic-search-with-elasticsearch[Beyond RAG Basics: Advanced strategies for AI applications]
+* https://www.elastic.co/search-labs/blog/building-a-rag-system-with-gemma-hugging-face-elasticsearch[Building a RAG system with Gemma, Hugging Face, and Elasticsearch]
+* https://www.elastic.co/search-labs/blog/rag-agent-tool-elasticsearch-langchain[Building an agentic RAG tool with Elasticsearch and Langchain]
 
-1. User makes natural language queries about company travel policy
-2. System retrieves relevant documents from {es}
-3. LLM generates response using retrieved context
 
-The result is accurate, up-to-date answers based on company documents.
 
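The retrieve-then-generate workflow added by this commit can be sketched in a few lines of Python. This is a toy, self-contained stand-in, not Elastic code: the term-overlap `retrieve` function substitutes for a real Elasticsearch full-text query, and the assembled prompt (with a "Cite a source" style instruction) would be sent to a language model in a real system. All names and documents here are hypothetical.

```python
# Toy sketch of a RAG loop: retrieve context, then build a grounded prompt.
# retrieve() is a naive term-overlap ranker standing in for an Elasticsearch
# full-text query; a real system would call the search cluster instead.

def retrieve(query: str, docs: dict[str, str], k: int = 1) -> list[str]:
    """Rank documents by naive term overlap with the query (BM25 stand-in)."""
    q_terms = set(query.lower().split())
    scored = sorted(
        docs,
        key=lambda doc_id: len(q_terms & set(docs[doc_id].lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_prompt(query: str, context: list[str]) -> str:
    """Ground the model by inlining retrieved passages into the prompt."""
    joined = "\n".join(f"- {passage}" for passage in context)
    return (
        "Answer using only the context below. Cite a source.\n"
        f"Context:\n{joined}\n"
        f"Question: {query}"
    )

# Hypothetical corpus and query, mirroring the travel-policy example
# from the earlier version of this page.
docs = {
    "travel-policy": "Employees may book economy flights for company travel.",
    "expense-policy": "Meal expenses are reimbursed up to a daily limit.",
}
query = "What flights can employees book for company travel?"
top_ids = retrieve(query, docs)
prompt = build_prompt(query, [docs[i] for i in top_ids])
```

The prompt produced here would be passed to the language model, which generates the final answer from the retrieved passages rather than from its training data alone.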

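The page also mentions hybrid search, which combines full-text and vector retrieval. One common fusion method is reciprocal rank fusion (RRF); the sketch below is a minimal toy illustration operating on two hypothetical result lists. In a real deployment the fusion would typically happen inside the search engine rather than in application code.

```python
# Toy sketch of reciprocal rank fusion (RRF), a common way to merge
# a lexical (full-text) ranking with a semantic (vector) ranking.
# Each hit contributes 1 / (k + rank) to its document's fused score.

def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of document IDs into one ranking."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical result lists from a full-text query and a vector query.
lexical = ["doc-a", "doc-b", "doc-c"]
semantic = ["doc-b", "doc-c", "doc-a"]
fused = rrf([lexical, semantic])  # doc-b wins: ranked high in both lists
```

The constant `k` damps the influence of top ranks so that a document appearing moderately high in both lists can outrank one that is first in only one list.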