Merged
Changes from 5 commits
85 changes: 85 additions & 0 deletions docs/reference/images/search/rag-schema.svg
19 changes: 19 additions & 0 deletions docs/reference/images/search/rag-venn-diagram.svg
@@ -0,0 +1,77 @@
[rag-elasticsearch]
== Retrieval augmented generation

.🍿 Prefer a video introduction?
Contributor:
very cute

***********************
Check out https://www.youtube.com/watch?v=OS4ZefUPAks[this short video] from the Elastic Snackable Series.
***********************

Retrieval augmented generation (RAG) is a technique where additional context is retrieved from an external datastore before prompting a language model to generate a response using the retrieved context.
This grounds the model with in-context learning.
Contributor:
This sentence is a little colloquial - could rephrase to clarify what you mean by "grounds"

Contributor Author:
👍 grounding is the term of art but agree shouldn't assume knowledge

Contributor:
ahhh consider my comment retracted then

Compared to finetuning or continuous pretraining, RAG can be implemented faster and cheaper, and it has several advantages.

image::images/search/rag-venn-diagram.svg[RAG sits at the intersection of information retrieval and generative AI, align=center, width=500]

RAG sits at the intersection of https://www.elastic.co/what-is/information-retrieval[information retrieval] and generative AI.
{es} is an excellent tool for implementing RAG, because it offers various retrieval capabilities, such as full-text search, vector search, and hybrid search.

[discrete]
[[rag-elasticsearch-advantages]]
=== Advantages of RAG

RAG has several advantages:

* *Improved context:* Enables grounding the language model with additional, up-to-date, and/or private data.
* *Reduced hallucination:* Helps minimize factual errors by enabling models to cite authoritative sources.
* *Cost efficiency:* Requires less maintenance compared to finetuning or continuously pretraining models.
Contributor:
Suggested change
* *Cost efficiency:* Requires less maintenance compared to finetuning or continuously pretraining models.
* *Cost efficiency:* Requires less maintenance compared to fine-tuning or continuously pre-training models.

* *Enhanced security:* Controls data access by leveraging {es}'s <<authorization, user authorization>> features, such as role-based access control and field/document-level security.
* *Simplified response parsing:* Eliminates the need for custom parsing logic by letting the language model handle parsing {es} responses and formatting the retrieved context.
* *Flexible implementation:* Works with basic
// TODO: uncomment when page is live <<full-text-search,full-text search>>
full-text search and can be gradually updated to add more advanced and computationally intensive <<semantic-search,semantic search>> capabilities.
Contributor:
Suggested change
full-text search and can be gradually updated to add more advanced and computationally intensive <<semantic-search,semantic search>> capabilities.
full-text search, and can be gradually updated to add more advanced and computationally intensive <<semantic-search,semantic search>> capabilities.


[discrete]
[[rag-elasticsearch-components]]
=== RAG system overview

The following diagram illustrates a simple RAG system using {es}.

image::images/search/rag-schema.svg[Components of a simple RAG system using Elasticsearch, align=center, width=800]

The system consists of the following components:
Contributor:
I think "components" is the wrong word here. possibly bad edit:

Suggested change
The system consists of the following components:
The system augments search results using the following process:

Contributor Author:
agree! I'll reformulate


. User submits a query
. Elasticsearch retrieves relevant documents, using full-text search, vector search, or hybrid search
. Language model processes the context and generates a response, using custom instructions, such as "Cite a source" or "Provide a concise summary of the `content` field in markdown format"
. Model returns final response to the user
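
The retrieval and prompt-building steps above can be sketched in Python. This is a minimal illustration under stated assumptions, not the code that Playground generates: the function names `build_search_body` and `build_prompt`, the `title` and `content` field names, and the sample response are all hypothetical. Only the `match` query and the `hits.hits[]._source` response layout follow the standard {es} search API. Sending the request and calling a language model are deliberately left out, so the block runs without a cluster or a model provider.

```python
def build_search_body(question, size=3):
    """Step 2: build a full-text `match` query against an assumed `content` field."""
    return {"size": size, "query": {"match": {"content": question}}}


def build_prompt(question, response, instruction="Cite a source."):
    """Step 3: combine the instruction, retrieved context, and question into one prompt."""
    passages = []
    for hit in response["hits"]["hits"]:
        source = hit["_source"]
        # `title` and `content` are hypothetical field names for this sketch.
        passages.append(f"[{source['title']}] {source['content']}")
    context = "\n".join(passages)
    return f"{instruction}\n\nContext:\n{context}\n\nQuestion: {question}"


# Hand-written stand-in for an Elasticsearch search response (not real output).
sample_response = {
    "hits": {
        "hits": [
            {
                "_source": {
                    "title": "RAG overview",
                    "content": "RAG retrieves context before generation.",
                }
            }
        ]
    }
}

question = "What is RAG?"
search_body = build_search_body(question)
prompt = build_prompt(question, sample_response)
print(prompt)
```

In a real application, `search_body` would be sent to an index and the resulting prompt passed to a language model, which returns the final response to the user (step 4).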

[TIP]
====
A more advanced setup might include query rewriting between steps 1 and 2. This intermediate step could use one or more additional language models with different instructions to reformulate queries for more specific and detailed responses.
====
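
A query rewriting step of this kind could look roughly like the following Python sketch. Everything here is an assumption made for illustration: `rewrite_query`, `REWRITE_INSTRUCTION`, and `stub_model` are hypothetical names, and `stub_model` stands in for a call to a real language model. The sketch falls back to the original query when the model returns nothing usable.

```python
from typing import Callable

# Hypothetical instruction text for the rewriting model.
REWRITE_INSTRUCTION = "Rewrite the question as a specific, self-contained search query."


def rewrite_query(user_query: str, rewrite_model: Callable[[str], str]) -> str:
    """Ask a language model to reformulate the query before retrieval."""
    prompt = f"{REWRITE_INSTRUCTION}\n\nQuestion: {user_query}"
    rewritten = rewrite_model(prompt).strip()
    # Keep the user's original query if the model returns an empty string.
    return rewritten or user_query


def stub_model(prompt: str) -> str:
    """Stand-in for a real model call; always returns a fixed rewrite."""
    return "elasticsearch hybrid search setup"


rewritten = rewrite_query("how do I set it up?", stub_model)
fallback = rewrite_query("how do I set it up?", lambda prompt: "   ")
print(rewritten)
print(fallback)
```

Passing the model as a plain callable keeps the rewriting step independent of any particular provider, so the same function works with different models or instructions.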

[discrete]
[[rag-elasticsearch-getting-started]]
=== Getting started

Start building RAG applications quickly with Playground, which seamlessly integrates {es} with language model providers.
The Playground UI enables you to build, test, and deploy RAG interfaces on top of your {es} indices.

Playground automatically selects the best retrieval methods for your data, while providing full control over the final {es} queries and language model instructions.
You can also download the underlying Python code to integrate with your existing applications.

Learn more in the {kibana-ref}/playground.html[documentation] and
try the https://www.elastic.co/demo-gallery/ai-playground[interactive lab] for hands-on experience.

[discrete]
[[rag-elasticsearch-learn-more]]
=== Learn more

Learn more about building RAG systems using {es} in these blog posts:

* https://www.elastic.co/blog/beyond-rag-basics-semantic-search-with-elasticsearch[Beyond RAG Basics: Advanced strategies for AI applications]
* https://www.elastic.co/search-labs/blog/building-a-rag-system-with-gemma-hugging-face-elasticsearch[Building a RAG system with Gemma, Hugging Face, and Elasticsearch]
* https://www.elastic.co/search-labs/blog/rag-agent-tool-elasticsearch-langchain[Building an agentic RAG tool with Elasticsearch and Langchain]



@@ -48,6 +48,7 @@ include::../../how-to/recipes.asciidoc[]
include::retrievers-overview.asciidoc[]
include::knn-search.asciidoc[]
include::semantic-search.asciidoc[]
include::retrieval-augmented-generation.asciidoc[]
include::search-across-clusters.asciidoc[]
include::search-with-synonyms.asciidoc[]
include::search-application-overview.asciidoc[]