Merged
Changes from 33 commits
Commits
78 commits
825683f
iter
pmpailis Jan 7, 2025
6968760
Merge remote-tracking branch 'origin/main' into add_linear_retriever
pmpailis Jan 7, 2025
6712fc6
Merge remote-tracking branch 'origin/main' into add_linear_retriever
pmpailis Jan 10, 2025
466c026
iter
pmpailis Jan 13, 2025
a4259cd
Merge remote-tracking branch 'origin/main' into add_linear_retriever
pmpailis Jan 13, 2025
a7da4f3
iter
pmpailis Jan 13, 2025
02db9d0
iter
pmpailis Jan 13, 2025
d64effa
iter
pmpailis Jan 14, 2025
b945acf
iter
pmpailis Jan 14, 2025
0c1b235
iter
pmpailis Jan 14, 2025
c97d27b
iter
pmpailis Jan 15, 2025
2d78404
iter
pmpailis Jan 15, 2025
c69b75b
Merge remote-tracking branch 'origin/main' into add_linear_retriever
pmpailis Jan 15, 2025
06d727a
[CI] Auto commit changes from spotless
Jan 15, 2025
822ff1d
iter
pmpailis Jan 16, 2025
f2eb82c
Merge branch 'add_linear_retriever' of github.com:pmpailis/elasticsea…
pmpailis Jan 16, 2025
020cd78
Update docs/changelog/120222.yaml
pmpailis Jan 16, 2025
8d0583a
iter
pmpailis Jan 16, 2025
8ec4110
iter
pmpailis Jan 16, 2025
ceaf3b5
iter
pmpailis Jan 16, 2025
ed78bf2
iter
pmpailis Jan 16, 2025
a70b0d6
iter
pmpailis Jan 16, 2025
9304c7b
iter
pmpailis Jan 16, 2025
05aae70
iter
pmpailis Jan 16, 2025
4fde947
addressing PR comments - removing lower_bound and upper_bound params
pmpailis Jan 16, 2025
e71b25e
fix test
pmpailis Jan 16, 2025
21a78d5
addressing PR comments - removing ScoreNormalizerParser
pmpailis Jan 16, 2025
77fd4e1
removing export from module info
pmpailis Jan 16, 2025
a79b280
iter
pmpailis Jan 16, 2025
ff0c8c3
spotless
pmpailis Jan 16, 2025
8d53d73
addressing PR comments - updating linear component parsing
pmpailis Jan 17, 2025
86db0bc
fix test
pmpailis Jan 17, 2025
d7ab2ce
Merge branch 'main' into add_linear_retriever
pmpailis Jan 17, 2025
cc2c071
iter
pmpailis Jan 19, 2025
90ef7f3
addressing PR comments - adding exception for unknown tokens during p…
pmpailis Jan 19, 2025
30123ac
Merge branch 'add_linear_retriever' of github.com:pmpailis/elasticsea…
pmpailis Jan 19, 2025
ecea688
Merge branch 'main' into add_linear_retriever
pmpailis Jan 19, 2025
7d6feed
iter
pmpailis Jan 19, 2025
83f9614
Merge remote-tracking branch 'origin/main' into add_linear_retriever
pmpailis Jan 21, 2025
512952d
reverting optimization to avoid populating rank docs for explain, as …
pmpailis Jan 21, 2025
4f97a81
moving linear retriever to xpack and adding integ tests
pmpailis Jan 22, 2025
6294917
fixing license
pmpailis Jan 22, 2025
78fdcda
adding integ tests
pmpailis Jan 22, 2025
5b253aa
iter
pmpailis Jan 22, 2025
1eca5fe
add license test
pmpailis Jan 22, 2025
1f36e18
checkstyle
pmpailis Jan 22, 2025
43cd490
Merge remote-tracking branch 'origin/main' into add_linear_retriever
pmpailis Jan 22, 2025
33bc324
[CI] Auto commit changes from spotless
Jan 22, 2025
4d82e28
moving tests
pmpailis Jan 22, 2025
cfcd84f
checkstyle
pmpailis Jan 22, 2025
174e0d0
adding missing writeables for tests
pmpailis Jan 22, 2025
c0943cb
Merge branch 'main' into add_linear_retriever
pmpailis Jan 22, 2025
aeacd33
addressing PR comments
pmpailis Jan 23, 2025
58e2887
fixing tests after refactoring
pmpailis Jan 23, 2025
c8a0e1e
Merge remote-tracking branch 'origin/main' into add_linear_retriever
pmpailis Jan 23, 2025
29438ee
iter
pmpailis Jan 23, 2025
7d3f36c
[CI] Auto commit changes from spotless
Jan 23, 2025
8677263
iter
pmpailis Jan 23, 2025
d961f22
updating parsing to use a static parser
pmpailis Jan 23, 2025
7a31b09
Merge branch 'add_linear_retriever' of github.com:pmpailis/elasticsea…
pmpailis Jan 23, 2025
f973d73
Merge branch 'main' into add_linear_retriever
pmpailis Jan 24, 2025
3640ae1
avoid populating LinearRankDoc metadata if not explain
pmpailis Jan 24, 2025
da84e03
Merge branch 'main' into add_linear_retriever
pmpailis Jan 24, 2025
9259159
addressing PR comments - simplifying linear score computation
pmpailis Jan 27, 2025
ea1787f
addressing PR comments - adding yaml test for linear retriever with i…
pmpailis Jan 27, 2025
ce8f60f
removing custom min max options for normalizer
pmpailis Jan 27, 2025
2bda448
adding assertion for negative weights
pmpailis Jan 27, 2025
669e94d
Merge branch 'main' into add_linear_retriever
pmpailis Jan 27, 2025
8b07ea5
updating tests after latest changes
pmpailis Jan 27, 2025
ccc2f8a
Merge branch 'main' into add_linear_retriever
pmpailis Jan 27, 2025
3ba0587
Update common-parms.asciidoc
pmpailis Jan 27, 2025
3237ef5
Update retrievers-examples.asciidoc
pmpailis Jan 27, 2025
a7425c4
Merge branch 'main' into add_linear_retriever
pmpailis Jan 28, 2025
173f254
setting knn field to flat
pmpailis Jan 28, 2025
42c543a
adding ids to parameter sections for retriever docs
pmpailis Jan 28, 2025
9b40cf6
Merge branch 'main' into add_linear_retriever
pmpailis Jan 28, 2025
95842cc
Merge branch 'main' into add_linear_retriever
pmpailis Jan 28, 2025
21bbb92
Merge branch 'main' into add_linear_retriever
pmpailis Jan 28, 2025
5 changes: 5 additions & 0 deletions docs/changelog/120222.yaml
@@ -0,0 +1,5 @@
pr: 120222
summary: Adding linear retriever to support weighted sums of sub-retrievers
area: "Search"
type: enhancement
issues: []
59 changes: 55 additions & 4 deletions docs/reference/rest-api/common-parms.asciidoc
@@ -1338,7 +1338,7 @@ that lower ranked documents have more influence. This value must be greater than or
equal to `1`. Defaults to `60`.
end::rrf-rank-constant[]

tag::rrf-rank-window-size[]
tag::compound-retriever-rank-window-size[]
`rank_window_size`::
(Optional, integer)
+
@@ -1347,12 +1347,63 @@ query. A higher value will improve result relevance at the cost of performance. The final
ranked result set is pruned down to the search request's <<search-size-param, size>>.
`rank_window_size` must be greater than or equal to `size` and greater than or equal to `1`.
Defaults to the `size` parameter.
end::rrf-rank-window-size[]
end::compound-retriever-rank-window-size[]

tag::rrf-filter[]
tag::compound-retriever-filter[]
`filter`::
(Optional, <<query-dsl, query object or list of query objects>>)
+
Applies the specified <<query-dsl-bool-query, boolean query filter>> to all of the specified sub-retrievers,
according to each retriever's specifications.
end::rrf-filter[]
end::compound-retriever-filter[]

tag::linear-retriever-components[]
`retrievers`::
(Required, array of objects)
+
A list of sub-retriever configurations whose result sets are merged through a weighted sum.
Each configuration can specify its own weight and normalization for the retriever it wraps.

Each entry specifies the following parameters:

* `retriever`::
(Required, a <<retriever, retriever>> object)
+
Specifies the retriever for which the top documents are computed. The retriever produces `rank_window_size`
results, which are later merged based on the specified `weight` and `normalizer`.

* `weight`::
(Optional, float)
+
The weight by which each score of this retriever's top docs is multiplied. Defaults to `1.0`.

* `normalizer`::
(Optional, String or Object)
+
Specifies how the retriever's scores are normalized before the specified `weight` is applied.
Either provide a string to use a normalizer with its default values, or an object to further configure the normalizer
through its specific properties. Available values are `minmax` and `none`. Defaults to `none`.

** `none` : takes no argument
** `minmax` :
A `MinMaxScoreNormalizer` that normalizes scores based on the following formula:
+
```
score = (score - min) / (max - min)
```
Available properties are:
*** `min`::
(Optional, float)
+
The minimum value of the original scores. Defaults to the result set's true minimum value.

*** `max`::
(Optional, float)
+
The maximum value of the original scores. Defaults to the result set's true maximum value.


See also <<retrievers-examples-linear-retriever, this hybrid search example>>, which uses a linear retriever to show how to
independently configure and apply normalizers to retrievers.
end::linear-retriever-components[]
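
The `normalizer` and `weight` parameters documented above can be illustrated with a small sketch. This is not the Elasticsearch implementation: the function names `minmax_normalize` and `apply_weight` are hypothetical, and the handling of the case where `max` equals `min` (returning zeros) is an assumption, since the documented formula is undefined there.

```python
from typing import Optional, Sequence


def minmax_normalize(
    scores: Sequence[float],
    min_score: Optional[float] = None,
    max_score: Optional[float] = None,
) -> list[float]:
    """Apply score = (score - min) / (max - min) to every score.

    When min_score / max_score are omitted, the true min / max of the
    given scores are used, mirroring the documented defaults.
    """
    if not scores:
        return []
    lo = min(scores) if min_score is None else min_score
    hi = max(scores) if max_score is None else max_score
    if hi == lo:
        # Assumption of this sketch: avoid dividing by zero by returning zeros.
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def apply_weight(scores: Sequence[float], weight: float = 1.0) -> list[float]:
    """Multiply each (already normalized) score by the retriever's weight."""
    return [weight * s for s in scores]


# Hypothetical raw scores from one sub-retriever.
raw_scores = [12.0, 9.0, 6.0]
normalized = minmax_normalize(raw_scores)        # [1.0, 0.5, 0.0]
weighted = apply_weight(normalized, weight=2.0)  # [2.0, 1.0, 0.0]
```

Normalization runs before the weight is applied, so with `minmax` and the default bounds each sub-retriever contributes at most its own `weight` to a document's final score.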
27 changes: 22 additions & 5 deletions docs/reference/search/retriever.asciidoc
@@ -28,6 +28,9 @@ A <<standard-retriever, retriever>> that replaces the functionality of a traditi
`knn`::
A <<knn-retriever, retriever>> that replaces the functionality of a <<search-api-knn, knn search>>.

`linear`::
A <<linear-retriever, retriever>> that linearly combines the scores of other retrievers for the top documents.

`rescorer`::
A <<rescorer-retriever, retriever>> that replaces the functionality of the <<rescore, query rescorer>>.

@@ -263,6 +266,20 @@ GET /restaurants/_search
This value must be less than or equal to `num_candidates`.
<5> The size of the initial candidate set from which the final `k` nearest neighbors are selected.

[[linear-retriever]]
==== Linear Retriever
A retriever that normalizes and linearly combines the scores of other retrievers. If the final scores produced by the
weighted combination of all sub-retrievers are negative, a corrective factor equal to the minimum score is applied
so that all scores are positive.
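
A minimal sketch of the weighted combination described above, under stated assumptions. The function name `linear_combine` is hypothetical. The sketch assumes that a document missing from a sub-retriever's results contributes `0` for that retriever, and it reads the "corrective factor" as subtracting the (negative) minimum score from every score so that the lowest score becomes `0`. Neither assumption is confirmed by the text; this is not the actual implementation.

```python
from collections import defaultdict


def linear_combine(result_sets):
    """Combine per-retriever scores into one ranked list.

    result_sets: list of (weight, {doc_id: normalized_score}) pairs,
    one pair per sub-retriever.
    Returns a list of (doc_id, score) tuples, highest score first.
    """
    combined = defaultdict(float)
    for weight, scores in result_sets:
        for doc_id, score in scores.items():
            # Weighted sum; documents absent from a result set add nothing.
            combined[doc_id] += weight * score

    if not combined:
        return []

    lowest = min(combined.values())
    if lowest < 0:
        # Assumed reading of the corrective factor: shift every score
        # by the minimum so that no final score is negative.
        combined = {doc_id: s - lowest for doc_id, s in combined.items()}

    return sorted(combined.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical, already-normalized scores from two sub-retrievers.
lexical = {"a": 1.0, "b": 0.5}
vector = {"b": 0.25, "c": -1.5}
ranked = linear_combine([(2.0, lexical), (1.0, vector)])
# Before the shift: a = 2.0, b = 1.25, c = -1.5; after shifting by 1.5:
# [("a", 3.5), ("b", 2.75), ("c", 0.0)]
```

Shifting every score by the same amount preserves the relative order of documents, so under this reading the correction affects only the reported scores, not the ranking.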

===== Parameters

include::{es-ref-dir}/rest-api/common-parms.asciidoc[tag=linear-retriever-components]

include::{es-ref-dir}/rest-api/common-parms.asciidoc[tag=compound-retriever-rank-window-size]

include::{es-ref-dir}/rest-api/common-parms.asciidoc[tag=compound-retriever-filter]
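
To show how the parameters above fit together, the following sketch builds a search request body that combines a `standard` and a `knn` sub-retriever under a `linear` retriever. The helper name `linear_retriever_body`, the field names `text` and `vector`, and the chosen weights are assumptions for illustration only; the sketch only prints the JSON and does not send it to a cluster.

```python
import json


def linear_retriever_body(query_text, query_vector, size=10):
    """Build a search body using a linear retriever (illustrative only)."""
    return {
        "retriever": {
            "linear": {
                "retrievers": [
                    {
                        # Lexical sub-retriever, min-max normalized, then doubled.
                        "retriever": {
                            "standard": {
                                "query": {"match": {"text": query_text}}
                            }
                        },
                        "weight": 2.0,
                        "normalizer": "minmax",
                    },
                    {
                        # Vector sub-retriever; `normalizer` is omitted,
                        # so it falls back to the documented default `none`.
                        "retriever": {
                            "knn": {
                                "field": "vector",
                                "query_vector": query_vector,
                                "k": size,
                                "num_candidates": 10 * size,
                            }
                        },
                        "weight": 1.0,
                    },
                ],
                # Must be >= size and >= 1; defaults to size when omitted.
                "rank_window_size": size,
            }
        },
        "size": size,
    }


body = linear_retriever_body("hybrid search", [0.1, 0.2, 0.3], size=5)
print(json.dumps(body, indent=2))
```

Each entry in `retrievers` wraps its sub-retriever in a `retriever` object, so `weight` and `normalizer` sit next to the retriever definition rather than inside it.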

[[rrf-retriever]]
==== RRF Retriever

@@ -275,9 +292,9 @@ include::{es-ref-dir}/rest-api/common-parms.asciidoc[tag=rrf-retrievers]

include::{es-ref-dir}/rest-api/common-parms.asciidoc[tag=rrf-rank-constant]

include::{es-ref-dir}/rest-api/common-parms.asciidoc[tag=rrf-rank-window-size]
include::{es-ref-dir}/rest-api/common-parms.asciidoc[tag=compound-retriever-rank-window-size]

include::{es-ref-dir}/rest-api/common-parms.asciidoc[tag=rrf-filter]
include::{es-ref-dir}/rest-api/common-parms.asciidoc[tag=compound-retriever-filter]

[discrete]
[[rrf-retriever-example-hybrid]]
@@ -576,15 +593,15 @@ This example demonstrates how to deploy the {ml-docs}/ml-nlp-rerank.html[Elastic

Follow these steps:

. Create an inference endpoint for the `rerank` task using the <<put-inference-api, Create {infer} API>>.
+
[source,console]
----
PUT _inference/rerank/my-elastic-rerank
Review comment (Member): Is this endpoint available by default yet?

Reply (Contributor Author): Hmm, not sure. I had this as a custom endpoint for testing/documentation purposes :D. We can refactor, though, to use the default endpoint (once it is available, if it is not already).

{
"service": "elasticsearch",
"service_settings": {
"model_id": ".rerank-v1",
"num_threads": 1,
"adaptive_allocations": { <1>
"enabled": true,
@@ -595,7 +612,7 @@ PUT _inference/rerank/my-elastic-rerank
}
----
// TEST[skip:uses ML]
<1> {ml-docs}/ml-nlp-auto-scale.html#nlp-model-adaptive-allocations[Adaptive allocations] will be enabled with the minimum of 1 and the maximum of 10 allocations.
+
. Define a `text_similarity_rerank` retriever:
+
12 changes: 6 additions & 6 deletions docs/reference/search/rrf.asciidoc
@@ -45,7 +45,7 @@ include::{es-ref-dir}/rest-api/common-parms.asciidoc[tag=rrf-retrievers]

include::{es-ref-dir}/rest-api/common-parms.asciidoc[tag=rrf-rank-constant]

include::{es-ref-dir}/rest-api/common-parms.asciidoc[tag=rrf-rank-window-size]
include::{es-ref-dir}/rest-api/common-parms.asciidoc[tag=compound-retriever-rank-window-size]

An example request using RRF:

@@ -791,11 +791,11 @@ A more specific example of highlighting in RRF can also be found in the <<retrie

==== Inner hits in RRF

The `rrf` retriever supports <<inner-hits,inner hits>> functionality, allowing you to retrieve
related nested or parent/child documents alongside your main search results. Inner hits can be
specified as part of any nested sub-retriever and will be propagated to the top-level parent
retriever. Note that the inner hit computation will take place only at the end of the `rrf` retriever's
evaluation on the top matching documents, and not as part of the query execution of the nested
sub-retrievers.

[IMPORTANT]