[[retrievers-overview]]
=== Retrievers

A retriever is an abstraction that was added to the Search API in *8.14.0* and was made generally available in *8.16.0*.
This abstraction enables the configuration of multi-stage retrieval pipelines within a single `_search` call.
This simplifies your search application logic, because you no longer need to configure complex searches via multiple {es} calls or implement additional client-side logic to combine results from different queries.

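For instance, a single `standard` retriever simply wraps a regular query. The following minimal sketch illustrates the shape of such a request; the index, field, and query text are illustrative placeholders, not part of the example developed below:

[source,js]
----
GET my-index/_search
{
  "retriever": {
    "standard": {
      "query": {
        "match": {
          "title": "blue shoes"
        }
      }
    }
  }
}
----
//NOTCONSOLE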
An RRF retriever is a *compound retriever*, where its `filter` element is
propagated to its sub retrievers.

* <<text-similarity-reranker-retriever,*Text Similarity Re-ranker Retriever*>>. Used for <<semantic-reranking,semantic reranking>>.
Requires first creating a `rerank` task using the <<put-inference-api,{es} Inference API>>.

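As a sketch of that propagation (the index, field names, and filter values here are illustrative placeholders), a `filter` defined on an `rrf` retriever applies to each of its sub retrievers as if it had been defined on them directly:

[source,js]
----
GET my-index/_search
{
  "retriever": {
    "rrf": {
      "filter": {
        "term": {
          "status": "published"
        }
      },
      "retrievers": [
        {
          "standard": {
            "query": {
              "match": {
                "title": "vector database"
              }
            }
          }
        },
        {
          "knn": {
            "field": "embedding",
            "query_vector": [ 0.12, 0.45 ],
            "k": 10,
            "num_candidates": 50
          }
        }
      ]
    }
  }
}
----
//NOTCONSOLE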
[[retrievers-overview-example]]
==== Example

The following example demonstrates the powerful queries that we can now compose, and how retrievers simplify this process.
We can use any combination of retrievers we want, propagating the results of a nested retriever to its parent.
In this scenario, we'll make use of all four (currently) available retrievers: `standard`, `knn`, `text_similarity_reranker`, and `rrf`.
We'll first combine the results of a `semantic` query using the `standard` retriever with those of a `knn` search on a dense vector field, using `rrf` to get the top 100 results.
Finally, we'll rerank the top 50 of those results using the `text_similarity_reranker` retriever.

[source,js]
----
GET example-index/_search
{
  "retriever": {
    "text_similarity_reranker": {
      "retriever": {
        "rrf": {
          "retrievers": [
            {
              "standard": {
                "query": {
                  "semantic": {
                    "field": "inference_field",
                    "query": "state of the art vector database"
                  }
                }
              }
            },
            {
              "knn": {
                "query_vector": [
                  0.54,
                  ...,
                  0.245
                ],
                "field": "embedding",
                "k": 10,
                "num_candidates": 15
              }
            }
          ],
          "rank_window_size": 100,
          "rank_constant": 10
        }
      },
      "rank_window_size": 50,
      "field": "description",
      "inference_text": "what's the best way to create complex pipelines and retrieve documents?",
      "inference_id": "my-awesome-rerank-model"
    }
  }
}
----
//NOTCONSOLE

[discrete]
[[retrievers-overview-glossary]]