docs/reference/elasticsearch/rest-apis/retrievers.md
Lines changed: 3 additions & 153 deletions
@@ -11,7 +11,7 @@ applies_to:
 A retriever is a specification to describe top documents returned from a search. A retriever replaces other elements of the [search API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-search) that also return top documents such as [`query`](/reference/query-languages/querydsl.md) and [`knn`](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-search#search-api-knn). A retriever may have child retrievers where a retriever with two or more children is considered a compound retriever. This allows for complex behavior to be depicted in a tree-like structure, called the retriever tree, which clarifies the order of operations that occur during a search.

 ::::{tip}
-Refer to [*Retrievers*](docs-content://solutions/search/retrievers-overview.md) for a high-level overview of the retrievers abstraction. Refer to [Retrievers examples](retrievers/retrievers-examples.md) for additional examples.
+Refer to [*Retrievers*](docs-content://solutions/search/retrievers-overview.md) for a high-level overview of the retrievers abstraction. Refer to [Retrievers examples](docs-content://solutions/search/retrievers-examples.md) for additional examples.

 ::::

@@ -99,156 +99,6 @@ When using the `linear` retriever, fields can be boosted using the `^` notation:
 GET books/_search
 {
   "retriever": {
-    "knn": { <1>
-      "field": "vector", <2>
-      "query_vector": [10, 22, 77], <3>
-      "k": 10, <4>
-      "num_candidates": 10 <5>
-    }
-  }
-}
-```
-
-1. Configuration for k-nearest neighbor (knn) search, which is based on vector similarity.
-2. Specifies the field name that contains the vectors.
-3. The query vector against which document vectors are compared in the `knn` search.
-4. The number of nearest neighbors to return as top hits. This value must be fewer than or equal to `num_candidates`.
-5. The size of the initial candidate set from which the final `k` nearest neighbors are selected.
-
-
-
-
-## Linear Retriever [linear-retriever]
-
-A retriever that normalizes and linearly combines the scores of other retrievers.
-
-
-#### Parameters [linear-retriever-parameters]
-
-`retrievers`
-: (Required, array of objects)
-
-A list of sub-retriever configurations whose result sets are merged through a weighted sum. Each configuration can have a different weight and normalization depending on the specified retriever.
-
-`normalizer`
-: (Optional, String)
-
-Specifies a normalizer to be applied to all sub-retrievers. This provides a simple way to configure normalization for all retrievers at once.
-
-The `normalizer` can be specified at the top level, at the per-retriever level, or both, with the following rules:
-
-* If only the top-level `normalizer` is specified, it applies to all sub-retrievers.
-* If both a top-level and a per-retriever `normalizer` are specified, the per-retriever normalizer must be identical to the top-level one. If they differ, the request will fail.
-* If only per-retriever normalizers are specified, they can be different for each sub-retriever.
-* If no normalizer is specified at any level, no normalization is applied.
-
-Available values are: `minmax`, `l2_norm`, and `none`. Defaults to `none`.
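
The normalizer rules listed above can be sketched in Python. This is only an illustration of the documented behavior, not Elasticsearch code: the function name `resolve_normalizers` and its argument shapes are hypothetical, and `None` stands for "not specified".

```python
from typing import List, Optional

VALID_NORMALIZERS = {"minmax", "l2_norm", "none"}


def resolve_normalizers(top_level: Optional[str],
                        per_retriever: List[Optional[str]]) -> List[str]:
    """Return the effective normalizer for each sub-retriever.

    Hypothetical helper that mirrors the documented rules.
    """
    resolved = []
    for index, local in enumerate(per_retriever):
        # Both levels set: they must be identical, otherwise the request fails.
        if top_level is not None and local is not None and local != top_level:
            raise ValueError(
                f"sub-retriever {index}: normalizer '{local}' differs from "
                f"top-level normalizer '{top_level}'"
            )
        # A per-retriever value wins when present, then the top-level value,
        # then the documented default of `none`.
        effective = local or top_level or "none"
        if effective not in VALID_NORMALIZERS:
            raise ValueError(f"unknown normalizer '{effective}'")
        resolved.append(effective)
    return resolved


if __name__ == "__main__":
    print(resolve_normalizers("minmax", [None, "minmax"]))
    print(resolve_normalizers(None, ["l2_norm", None]))
```

In this sketch a sub-retriever without its own normalizer inherits the top-level one, and falls back to `none` when nothing is specified at either level.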
-
-Each entry in the `retrievers` array specifies the following parameters:
-
-`retriever`
-: (Required, a `retriever` object)
-
-Specifies the retriever for which we will compute the top documents. The retriever will produce `rank_window_size` results, which will later be merged based on the specified `weight` and `normalizer`.
-
-`weight`
-: (Optional, float)
-
-The weight by which each score of this retriever’s top docs will be multiplied. Must be greater than or equal to 0. Defaults to 1.0.
-
-`normalizer`
-: (Optional, String)
-
-Specifies how we will normalize this specific retriever’s scores before applying the specified `weight`. If a top-level `normalizer` is also specified, this normalizer must be the same. Available values are: `minmax`, `l2_norm`, and `none`. Defaults to `none`.
-
-* `none`
-* `minmax` : A `MinMaxScoreNormalizer` that normalizes scores based on the following formula:
-
-```
-score = (score - min) / (max - min)
-```
-
-* `l2_norm` : An `L2ScoreNormalizer` that normalizes scores using the L2 norm of the score values.
-
-For an example of how to independently configure and apply normalizers to retrievers, see [this hybrid search example](docs-content://solutions/search/retrievers-examples.md#retrievers-examples-linear-retriever) that uses a linear retriever.
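
To make the normalize-then-weight order concrete, here is a minimal Python sketch of how per-retriever scores could be normalized and then merged through a weighted sum. It is not the Elasticsearch implementation. The function names and input shapes are hypothetical, and two behaviors are assumptions the text above does not specify: a document missing from one retriever contributes nothing for that retriever, and `minmax` returns zeros when all scores are equal.

```python
import math


def minmax(scores):
    """score = (score - min) / (max - min), as in the formula above."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # Assumption: avoid division by zero when all scores are equal.
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def l2_norm(scores):
    """Divide each score by the L2 norm of the score vector."""
    norm = math.sqrt(sum(s * s for s in scores))
    if norm == 0.0:
        # Assumption: leave an all-zero score vector unchanged.
        return list(scores)
    return [s / norm for s in scores]


NORMALIZERS = {"none": list, "minmax": minmax, "l2_norm": l2_norm}


def linear_combine(results):
    """Merge results given as (hits, weight, normalizer) triples.

    `hits` maps a document id to the raw score from one sub-retriever.
    """
    combined = {}
    for hits, weight, normalizer in results:
        doc_ids = list(hits)
        normalized = NORMALIZERS[normalizer]([hits[d] for d in doc_ids])
        for doc_id, score in zip(doc_ids, normalized):
            # Normalization happens first, then the weight is applied.
            combined[doc_id] = combined.get(doc_id, 0.0) + weight * score
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    lexical = {"a": 10.0, "b": 5.0, "c": 0.0}
    vector = {"b": 0.75, "d": 0.25}
    print(linear_combine([(lexical, 2.0, "minmax"), (vector, 1.0, "none")]))
```

With these made-up scores, document `b` appears in both result sets and accumulates a contribution from each, which is the point of combining retrievers linearly.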
-
-`rank_window_size`
-: (Optional, integer)
-
-This value determines the size of the individual result sets per query. A higher value will improve result relevance at the cost of performance. The final ranked result set is pruned down to the search request’s [size](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-search#search-size-param). `rank_window_size` must be greater than or equal to `size` and greater than or equal to `1`. Defaults to the `size` parameter.
-
-
-`filter`
-: (Optional, [query object or list of query objects](/reference/query-languages/querydsl.md))
-
-Applies the specified [boolean query filter](/reference/query-languages/query-dsl/query-dsl-bool-query.md) to all of the specified sub-retrievers, according to each retriever’s specifications.
-
-
-
-## RRF Retriever [rrf-retriever]
-
-An [RRF](/reference/elasticsearch/rest-apis/reciprocal-rank-fusion.md) retriever returns top documents based on the RRF formula, equally weighting two or more child retrievers. Reciprocal rank fusion (RRF) is a method for combining multiple result sets with different relevance indicators into a single result set.
-
-
-#### Parameters [rrf-retriever-parameters]
-
-`retrievers`
-: (Required, array of retriever objects)
-
-A list of child retrievers to specify which sets of returned top documents will have the RRF formula applied to them. Each child retriever carries an equal weight as part of the RRF formula. Two or more child retrievers are required.
-
-
-`rank_constant`
-: (Optional, integer)
-
-This value determines how much influence documents in individual result sets per query have over the final ranked result set. A higher value indicates that lower-ranked documents have more influence. This value must be greater than or equal to `1`. Defaults to `60`.
-
-
-`rank_window_size`
-: (Optional, integer)
-
-This value determines the size of the individual result sets per query. A higher value will improve result relevance at the cost of performance. The final ranked result set is pruned down to the search request’s [size](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-search#search-size-param). `rank_window_size` must be greater than or equal to `size` and greater than or equal to `1`. Defaults to the `size` parameter.
-
-
-`filter`
-: (Optional, [query object or list of query objects](/reference/query-languages/querydsl.md))
-
-Applies the specified [boolean query filter](/reference/query-languages/query-dsl/query-dsl-bool-query.md) to all of the specified sub-retrievers, according to each retriever’s specifications.
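
As a rough illustration of how `rank_constant` and `rank_window_size` interact, the following Python sketch applies the standard reciprocal rank fusion formula, in which each document receives `1 / (rank_constant + rank)` from every result set it appears in. It is a sketch under stated assumptions, not the Elasticsearch implementation: the function name `rrf` and its input shape are hypothetical, ranks are assumed to start at 1, and the default window of 10 is a placeholder for the request's `size`.

```python
def rrf(rankings, rank_constant=60, rank_window_size=10):
    """Fuse ranked lists of document ids with reciprocal rank fusion.

    `rankings` holds one list of document ids per child retriever,
    best hit first. Every child retriever carries equal weight.
    """
    if len(rankings) < 2:
        raise ValueError("two or more child retrievers are required")
    if rank_constant < 1 or rank_window_size < 1:
        raise ValueError("rank_constant and rank_window_size must be >= 1")

    scores = {}
    for ranking in rankings:
        # Only the top `rank_window_size` hits of each result set take part.
        for rank, doc_id in enumerate(ranking[:rank_window_size], start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (rank_constant + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    lexical = ["a", "b", "c"]
    vector = ["b", "d", "a"]
    for doc_id, score in rrf([lexical, vector]):
        print(doc_id, round(score, 6))
```

A larger `rank_constant` flattens the difference between `1 / (rank_constant + 1)` and `1 / (rank_constant + 10)`, which is why lower-ranked documents gain relative influence as the value grows.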
-A simple hybrid search example (lexical search + dense vector search) combining a `standard` retriever with a `knn` retriever using RRF:
-
-```console
-GET /restaurants/_search
-{
-  "retriever": {
-    "rrf": { <1>
-      "retrievers": [ <2>
-        {
-          "standard": { <3>
-            "query": {
-              "multi_match": {
-                "query": "Austria",
-                "fields": [
-                  "city",
-                  "region"
-                ]
-              }
-            }
-          }
-        },
-        {
-          "knn": { <4>
-            "field": "vector",
-            "query_vector": [10, 22, 77],
-            "k": 10,
-            "num_candidates": 10
-          }
-        }
-=======
     "linear": {
       "query": "elasticsearch",
       "fields": [
@@ -388,5 +238,5 @@ Note, however, that wildcard field patterns will only resolve to fields that eit

 ### Examples

-- [RRF with the multi-field query format](retrievers/retrievers-examples.md#retrievers-examples-rrf-multi-field-query-format)
-- [Linear retriever with the multi-field query format](retrievers/retrievers-examples.md#retrievers-examples-linear-multi-field-query-format)
+- [RRF with the multi-field query format](docs-content://solutions/search/retrievers-examples.md#retrievers-examples-rrf-multi-field-query-format)
+- [Linear retriever with the multi-field query format](docs-content://solutions/search/retrievers-examples.md#retrievers-examples-linear-multi-field-query-format)