Commit 75e2bef

committed
updating retrievers documentation to add more examples
1 parent 744eb50 commit 75e2bef

2 files changed

+428
-44
lines changed

docs/reference/search/search-your-data/retrievers-overview.asciidoc

Lines changed: 116 additions & 44 deletions
@@ -67,59 +67,131 @@ When using compound retrievers, only the query element is allowed, which enforce
6767

6868
[discrete]
6969
[[retrievers-overview-example]]
70-
==== Example
70+
==== Examples
7171

72-
The following example demonstrates the powerful queries that we can now compose, and how retrievers simplify this process. We can use any combination of retrievers we want, propagating the
73-
results of a nested retriever to its parent. In this scenario, we'll make use of all 4 (currently) available retrievers, i.e. `standard`, `knn`, `text_similarity_reranker` and `rrf`.
74-
We'll first combine the results of a `semantic` query using the `standard` retriever, and that of a `knn` search on a dense vector field, using `rrf` to get the top 100 results.
75-
Finally, we'll then rerank the top-50 results of `rrf` using the `text_similarity_reranker`
72+
Let's work through some examples to see how we can use and leverage retrievers to build an awesome search experience!
73+
74+
To show the full functionality, in this exercise, we'll assume that we have access to a reranker model through the inference service,
75+
as well as access to ELSER, for building semantic queries!
76+
77+
So, first things first, let's start by setting up these services and have them in place for later use!
78+
79+
[source,js]
80+
----
81+
// Setup rerank task stored as `my-awesome-rerank-model`
82+
PUT _inference/rerank/my-awesome-rerank-model
83+
{
84+
"service": "cohere",
85+
"service_settings": {
86+
"model_id": "rerank-english-v3.0",
87+
"api_key": "{{COHERE_API_KEY}}"
88+
}
89+
}
90+
----
91+
//NOTCONSOLE
92+
93+
[source,js]
94+
----
95+
// Setup ELSER as `my-elser-endpoint`
96+
PUT _inference/sparse_embedding/my-elser-endpoint
97+
{
98+
"service": "elser",
99+
"service_settings": {
100+
"num_allocations": 1,
101+
"num_threads": 1
102+
},
103+
"task_settings": {}
104+
}
105+
----
106+
//NOTCONSOLE
107+
108+
Now that we have our services in place, let's create our `retrievers_example` index, and add some documents to it.
109+
[source,js]
110+
----
111+
PUT retrievers_example
112+
{
113+
"mappings": {
114+
"properties": {
115+
"vector": {
116+
"type": "dense_vector",
117+
"dims": 3,
118+
"similarity": "l2_norm",
119+
"index": true
120+
},
121+
"text": {
122+
"type": "text",
123+
"copy_to": "inference_field"
124+
},
125+
"year": {
126+
"type": "integer"
127+
},
128+
"topic": {
129+
"type": "keyword"
130+
},
131+
"inference_field": {
132+
"type": "semantic_text",
133+
"inference_id": "my-elser-endpoint"
134+
}
135+
}
136+
}
137+
}
138+
----
139+
//NOTCONSOLE
76140

77141
[source,js]
78142
----
79-
GET example-index/_search
143+
POST /retrievers_example/_doc/1
144+
{
145+
"vector": [0.23, 0.67, 0.89],
146+
"text": "Large language models are revolutionizing information retrieval by boosting search precision, deepening contextual understanding, and reshaping user experiences in data-rich environments.",
147+
"year": 2024,
148+
"topic": ["llm", "ai", "information_retrieval"]
149+
}
150+
151+
POST /retrievers_example/_doc/2
152+
{
153+
"vector": [0.12, 0.56, 0.78],
154+
"text": "Artificial intelligence is transforming medicine, from advancing diagnostics and tailoring treatment plans to empowering predictive patient care for improved health outcomes.",
155+
"year": 2023,
156+
"topic": ["ai", "medicine"]
157+
}
158+
159+
POST /retrievers_example/_doc/3
80160
{
81-
"retriever": {
82-
"text_similarity_reranker": {
83-
"retriever": {
84-
"rrf": {
85-
"retrievers": [
86-
{
87-
"standard": {
88-
"query": {
89-
"semantic": {
90-
"field": "inference_field",
91-
"query": "state of the art vector database"
92-
}
93-
}
94-
}
95-
},
96-
{
97-
"knn": {
98-
"query_vector": [
99-
0.54,
100-
...,
101-
0.245
102-
],
103-
"field": "embedding",
104-
"k": 10,
105-
"num_candidates": 15
106-
}
107-
}
108-
],
109-
"rank_window_size": 100,
110-
"rank_constant": 10
111-
}
112-
},
113-
"rank_window_size": 50,
114-
"field": "description",
115-
"inference_text": "what's the best way to create complex pipelines and retrieve documents?",
116-
"inference_id": "my-awesome-rerank-model"
117-
}
118-
}
161+
"vector": [0.45, 0.32, 0.91],
162+
"text": "AI is redefining security by enabling advanced threat detection, proactive risk analysis, and dynamic defenses against increasingly sophisticated cyber threats.",
163+
"year": 2024,
164+
"topic": ["ai", "security"]
119165
}
166+
167+
POST /retrievers_example/_doc/4
168+
{
169+
"vector": [0.34, 0.21, 0.98],
170+
"text": "Elastic introduces Elastic AI Assistant, the open, generative AI sidekick powered by ESRE to democratize cybersecurity and enable users of every skill level.",
171+
"year": 2023,
172+
"topic": ["ai", "elastic", "assistant"]
173+
}
174+
175+
POST /retrievers_example/_doc/5
176+
{
177+
"vector": [0.11, 0.65, 0.47],
178+
"text": "Learn how to spin up a deployment of our hosted Elasticsearch Service and use Elastic Observability to gain deeper insight into the behavior of your applications and systems.",
179+
"year": 2024,
180+
"topic": ["documentation", "observability", "elastic"]
181+
}
182+
120183
----
121184
//NOTCONSOLE
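
Newly indexed documents only become searchable after a refresh. As a minimal sketch, assuming the queries are run immediately after indexing and the periodic refresh has not yet happened, an explicit refresh of `retrievers_example` can be issued first:

[source,js]
----
// Assumption: run right after the indexing requests above, so that
// all five documents are visible to the searches that follow
POST /retrievers_example/_refresh
----
//NOTCONSOLE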
122185

186+
Now that we also have our documents in place, let's try to run some queries using retrievers!
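
Before the full examples, a minimal sketch of the general shape of such a query may help. It combines a `semantic` query on `inference_field` with a `knn` search on `vector`, and merges the two result lists with `rrf`. The query text, the query vector, and the values chosen for `k`, `num_candidates`, `rank_window_size` and `rank_constant` are assumptions picked for this five-document index, not recommended settings.

[source,js]
----
// Sketch only: query text, vector and all numeric settings are illustrative
GET /retrievers_example/_search
{
  "retriever": {
    "rrf": {
      "retrievers": [
        {
          "standard": {
            "query": {
              "semantic": {
                "field": "inference_field",
                "query": "artificial intelligence in security"
              }
            }
          }
        },
        {
          "knn": {
            "field": "vector",
            "query_vector": [0.23, 0.67, 0.89],
            "k": 3,
            "num_candidates": 5
          }
        }
      ],
      "rank_window_size": 10,
      "rank_constant": 1
    }
  }
}
----
//NOTCONSOLE

Each entry in `retrievers` produces its own ranked list, and `rrf` returns the fused top results to the caller or to any parent retriever that wraps it.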
187+
188+
include::retrievers_examples.asciidoc[tag=basic-rrf-retriever-with-semantic-query]
189+
include::retrievers_examples.asciidoc[tag=rrf-retriever-with-collapse]
190+
include::retrievers_examples.asciidoc[tag=rrf-on-top-of-semantic-reranker]
191+
include::retrievers_examples.asciidoc[tag=text-similarity-reranker-on-top-of-rrf]
192+
include::retrievers_examples.asciidoc[tag=chaining-text-similarity-reranker-retrievers]
193+
include::retrievers_examples.asciidoc[tag=rrf-retriever-with-aggs]
194+
123195
[discrete]
124196
[[retrievers-overview-glossary]]
125197
==== Glossary
