Returns the document instead of only the index. Defaults to `true`.
=====

[discrete]
[[inference-example-elasticsearch-elser]]
==== ELSER via the `elasticsearch` service
@@ -137,7 +136,7 @@ PUT _inference/sparse_embedding/my-elser-model
       "adaptive_allocations": { <1>
         "enabled": true,
         "min_number_of_allocations": 1,
-        "max_number_of_allocations": 10
+        "max_number_of_allocations": 4
       },
       "num_threads": 1,
       "model_id": ".elser_model_2" <2>
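The `adaptive_allocations` settings changed in this hunk can be sanity-checked before the request is sent. The following Python sketch is purely illustrative (the helper name and defaults are not part of the {es} API); it assembles the updated request body and validates the allocation bounds:

```python
# Illustrative helper: build the body for
# PUT _inference/sparse_embedding/my-elser-model, using the values
# from the diff above. Sending it requires a running cluster.
def elser_endpoint_body(min_alloc: int = 1, max_alloc: int = 4) -> dict:
    if not (1 <= min_alloc <= max_alloc):
        raise ValueError("allocation bounds must satisfy 1 <= min <= max")
    return {
        "service": "elasticsearch",
        "service_settings": {
            "adaptive_allocations": {
                "enabled": True,
                "min_number_of_allocations": min_alloc,
                "max_number_of_allocations": max_alloc,
            },
            "num_threads": 1,
            "model_id": ".elser_model_2",
        },
    }

body = elser_endpoint_body()
print(body["service_settings"]["adaptive_allocations"]["max_number_of_allocations"])  # prints 4
```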
@@ -150,6 +149,34 @@ PUT _inference/sparse_embedding/my-elser-model
 Valid values are `.elser_model_2` and `.elser_model_2_linux-x86_64`.
 For further details, refer to the {ml-docs}/ml-nlp-elser.html[ELSER model documentation].
 
+[discrete]
+[[inference-example-elastic-reranker]]
+==== Elastic Rerank via the `elasticsearch` service
+
+The following example shows how to create an {infer} endpoint called `my-elastic-rerank` to perform a `rerank` task type using the built-in Elastic Rerank cross-encoder model.
+
+The API request below will automatically download the Elastic Rerank model if it isn't already downloaded and then deploy the model.
+Once deployed, the model can be used for semantic re-ranking with a <<text-similarity-reranker-retriever-example-elastic-rerank,`text_similarity_reranker` retriever>>.
+
+[source,console]
+----
+PUT _inference/rerank/my-elastic-rerank
+{
+  "service": "elasticsearch",
+  "service_settings": {
+    "model_id": ".rerank-v1", <1>
+    "num_threads": 1,
+    "adaptive_allocations": { <2>
+      "enabled": true,
+      "min_number_of_allocations": 1,
+      "max_number_of_allocations": 10
+    }
+  }
+}
+----
+// TEST[skip:uses ML]
+<1> The `model_id` must be the ID of the built-in Elastic Rerank model: `.rerank-v1`.
+<2> {ml-docs}/ml-nlp-auto-scale.html#nlp-model-adaptive-allocations[Adaptive allocations] will be enabled with the minimum of 1 and the maximum of 10 allocations.
 
 [discrete]
 [[inference-example-elasticsearch]]
@@ -186,7 +213,7 @@ If using the Python client, you can set the `timeout` parameter to a higher value.
 
 [discrete]
 [[inference-example-eland]]
-==== Models uploaded by Eland via the elasticsearch service
+==== Models uploaded by Eland via the `elasticsearch` service
 
 The following example shows how to create an {infer} endpoint called
 `my-msmarco-minilm-model` to perform a `text_embedding` task type.
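For an Eland-uploaded model, the endpoint creation body pairs the `elasticsearch` service with the model ID that Eland assigned at upload time. A minimal Python sketch (the model ID string below is hypothetical; use the ID Eland actually reported):

```python
# Illustrative: body for PUT _inference/text_embedding/my-msmarco-minilm-model.
# The model_id is hypothetical -- it must match the ID Eland printed when
# it uploaded the model to your cluster.
def eland_model_endpoint_body(model_id: str, num_allocations: int = 1) -> dict:
    return {
        "service": "elasticsearch",
        "service_settings": {
            "num_allocations": num_allocations,
            "num_threads": 1,
            "model_id": model_id,
        },
    }

body = eland_model_endpoint_body("cross-encoder__msmarco-minilm-model")
```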
docs/reference/reranking/semantic-reranking.asciidoc (11 additions, 9 deletions)
@@ -85,14 +85,16 @@ In {es}, semantic re-rankers are implemented using the {es} <<inference-apis,Inference APIs>>.
 
 To use semantic re-ranking in {es}, you need to:
 
-. *Choose a re-ranking model*.
-Currently you can:
-
-** Integrate directly with the <<infer-service-cohere,Cohere Rerank inference endpoint>> using the `rerank` task type
-** Integrate directly with the <<infer-service-google-vertex-ai,Google Vertex AI inference endpoint>> using the `rerank` task type
-** Upload a model to {es} from Hugging Face with {eland-docs}/machine-learning.html#ml-nlp-pytorch[Eland]. You'll need to use the `text_similarity` NLP task type when loading the model using Eland. Refer to {ml-docs}/ml-nlp-model-ref.html#ml-nlp-model-ref-text-similarity[the Elastic NLP model reference] for a list of third-party text similarity models supported by {es} for semantic re-ranking.
-*** Then set up an <<inference-example-eland,{es} service inference endpoint>> with the `rerank` task type
-. *Create a `rerank` task using the <<put-inference-api,{es} Inference API>>*.
+. *Select and configure a re-ranking model*.
+You have the following options:
+.. Use the <<inference-example-elastic-reranker,Elastic Rerank>> cross-encoder model via the inference API's {es} service.
+.. Use the <<infer-service-cohere,Cohere Rerank inference endpoint>> to create a `rerank` endpoint.
+.. Use the <<infer-service-google-vertex-ai,Google Vertex AI inference endpoint>> to create a `rerank` endpoint.
+.. Upload a model to {es} from Hugging Face with {eland-docs}/machine-learning.html#ml-nlp-pytorch[Eland]. You'll need to use the `text_similarity` NLP task type when loading the model using Eland. Then set up an <<inference-example-eland,{es} service inference endpoint>> with the `rerank` task type.
+
+Refer to {ml-docs}/ml-nlp-model-ref.html#ml-nlp-model-ref-text-similarity[the Elastic NLP model reference] for a list of third-party text similarity models supported by {es} for semantic re-ranking.
+
+. *Create a `rerank` endpoint using the <<put-inference-api,{es} Inference API>>*.
 The Inference API creates an inference endpoint and configures your chosen machine learning model to perform the re-ranking task.
 . *Define a `text_similarity_reranker` retriever in your search request*.
 The retriever syntax makes it simple to configure both the retrieval and re-ranking of search results in a single API call.
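The steps above can be sketched as plain request payloads. This Python sketch only assembles the JSON bodies (the endpoint name `my-rerank-endpoint`, index, field, and query values are illustrative; issuing the requests requires a running {es} cluster):

```python
# Step 2 of the list above: create a rerank inference endpoint.
# Corresponds to: PUT _inference/rerank/my-rerank-endpoint
create_endpoint = {
    "service": "elasticsearch",
    "service_settings": {
        "model_id": ".rerank-v1",  # built-in Elastic Rerank model
        "num_threads": 1,
    },
}

# Step 3: reference that endpoint from a text_similarity_reranker retriever.
# Corresponds to: POST my-index/_search
search_request = {
    "retriever": {
        "text_similarity_reranker": {
            "retriever": {
                "standard": {"query": {"match": {"text": "solar eclipse"}}}
            },
            "field": "text",
            "inference_id": "my-rerank-endpoint",
            "inference_text": "solar eclipse",
        }
    }
}
```

The key link between the two steps is that the retriever's `inference_id` must name the endpoint created first.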
@@ -117,7 +119,7 @@ POST _search
       }
     },
     "field": "text",
-    "inference_id": "my-cohere-rerank-model",
+    "inference_id": "elastic-rerank",
     "inference_text": "How often does the moon hide the sun?",
docs/reference/search/retriever.asciidoc (76 additions, 7 deletions)
@@ -11,6 +11,7 @@ This allows for complex behavior to be depicted in a tree-like structure, called a retriever tree.
 [TIP]
 ====
 Refer to <<retrievers-overview>> for a high level overview of the retrievers abstraction.
+Refer to <<retrievers-examples, Retrievers examples>> for additional examples.
 ====
 
 The following retrievers are available:
@@ -382,16 +383,17 @@ Refer to <<semantic-reranking>> for a high level overview of semantic re-ranking.
 
 ===== Prerequisites
 
-To use `text_similarity_reranker` you must first set up a `rerank` task using the <<put-inference-api, Create {infer} API>>.
-The `rerank` task should be set up with a machine learning model that can compute text similarity.
+To use `text_similarity_reranker` you must first set up an inference endpoint for the `rerank` task using the <<put-inference-api, Create {infer} API>>.
+The endpoint should be set up with a machine learning model that can compute text similarity.
 Refer to {ml-docs}/ml-nlp-model-ref.html#ml-nlp-model-ref-text-similarity[the Elastic NLP model reference] for a list of third-party text similarity models supported by {es}.
 
-Currently you can:
+You have the following options:
 
-* Integrate directly with the <<infer-service-cohere,Cohere Rerank inference endpoint>> using the `rerank` task type
-* Integrate directly with the <<infer-service-google-vertex-ai,Google Vertex AI inference endpoint>> using the `rerank` task type
+* Use the built-in <<inference-example-elastic-reranker,Elastic Rerank>> cross-encoder model via the inference API's {es} service.
+* Use the <<infer-service-cohere,Cohere Rerank inference endpoint>> with the `rerank` task type.
+* Use the <<infer-service-google-vertex-ai,Google Vertex AI inference endpoint>> with the `rerank` task type.
 * Upload a model to {es} with {eland-docs}/machine-learning.html#ml-nlp-pytorch[Eland] using the `text_similarity` NLP task type.
-** Then set up an <<inference-example-eland,{es} service inference endpoint>> with the `rerank` task type
+** Then set up an <<inference-example-eland,{es} service inference endpoint>> with the `rerank` task type.
 ** Refer to the <<text-similarity-reranker-retriever-example-eland,example>> on this page for a step-by-step guide.
 
 ===== Parameters
@@ -436,13 +438,70 @@ Note that score calculations vary depending on the model used.
 
 Applies the specified <<query-dsl-bool-query, boolean query filter>> to the child <<retriever, retriever>>.
 If the child retriever already specifies any filters, then this top-level filter is applied in conjunction with the filter defined in the child retriever.
 
+[discrete]
+[[text-similarity-reranker-retriever-example-elastic-rerank]]
+===== Example: Elastic Rerank
+
+This example demonstrates how to deploy the Elastic Rerank model and use it to re-rank search results using the `text_similarity_reranker` retriever.
+
+Follow these steps:
+
+. Create an inference endpoint for the `rerank` task using the <<put-inference-api, Create {infer} API>>.
++
+[source,console]
+----
+PUT _inference/rerank/my-elastic-rerank
+{
+  "service": "elasticsearch",
+  "service_settings": {
+    "model_id": ".rerank-v1",
+    "num_threads": 1,
+    "adaptive_allocations": { <1>
+      "enabled": true,
+      "min_number_of_allocations": 1,
+      "max_number_of_allocations": 10
+    }
+  }
+}
+----
+// TEST[skip:uses ML]
+<1> {ml-docs}/ml-nlp-auto-scale.html#nlp-model-adaptive-allocations[Adaptive allocations] will be enabled with the minimum of 1 and the maximum of 10 allocations.
+
+. Define a `text_similarity_reranker` retriever:
++
+[source,console]
+----
+POST _search
+{
+  "retriever": {
+    "text_similarity_reranker": {
+      "retriever": {
+        "standard": {
+          "query": {
+            "match": {
+              "text": "How often does the moon hide the sun?"
+            }
+          }
+        }
+      },
+      "field": "text",
+      "inference_id": "my-elastic-rerank",
+      "inference_text": "How often does the moon hide the sun?",
 This example enables out-of-the-box semantic search by re-ranking top documents using the Cohere Rerank API.
 This approach eliminates the need to generate and store embeddings for all indexed documents.
-This requires a <<infer-service-cohere,Cohere Rerank inference endpoint>> using the `rerank` task type.
+This requires a <<infer-service-cohere,Cohere Rerank inference endpoint>> that is set up for the `rerank` task type.
 
 [source,console]
 ----
@@ -680,6 +739,12 @@ GET movies/_search
 <1> The `rule` retriever is the outermost retriever, applying rules to the search results that were previously reranked using the `rrf` retriever.
 <2> The `rrf` retriever returns results from all of its sub-retrievers, and the output of the `rrf` retriever is used as input to the `rule` retriever.
 
+[discrete]
+[[retriever-common-parameters]]
+=== Common usage guidelines
+
+[discrete]
+[[retriever-size-pagination]]
 ==== Using `from` and `size` with a retriever tree
 
 The <<search-from-param, `from`>> and <<search-size-param, `size`>>
@@ -688,12 +753,16 @@ parameters are provided globally as part of the general
 They are applied to all retrievers in a retriever tree, unless a specific retriever overrides the `size` parameter using a different parameter such as `rank_window_size`.
 However, the final search hits are always limited to `size`.
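The interaction described above can be sketched as a search body (all values illustrative, and the endpoint name hypothetical): `size` bounds the final hit count globally, while a retriever-level parameter such as `rank_window_size` controls how many candidates the re-ranking stage considers.

```python
# Illustrative search body: size applies globally, while rank_window_size
# overrides the candidate-set size for the re-ranking retriever only.
search_request = {
    "from": 0,
    "size": 10,  # at most 10 hits are returned
    "retriever": {
        "text_similarity_reranker": {
            "retriever": {"standard": {"query": {"match_all": {}}}},
            "field": "text",
            "inference_id": "my-rerank-endpoint",  # hypothetical endpoint name
            "inference_text": "example query",
            "rank_window_size": 100,  # the re-ranker scores the top 100 candidates
        }
    },
}

reranker = search_request["retriever"]["text_similarity_reranker"]
# The final page of hits never exceeds `size`, regardless of how many
# candidates the re-ranker considered.
assert search_request["size"] <= reranker["rank_window_size"]
```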
 
+[discrete]
+[[retriever-aggregations]]
 ==== Using aggregations with a retriever tree
 
 <<search-aggregations, Aggregations>> are globally specified as part of a search request.
 The query used for an aggregation is the combination of all leaf retrievers as `should`
 clauses in a <<query-dsl-bool-query, boolean query>>.
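A minimal sketch of that layout (index and field names are illustrative): the `aggs` section sits at the top level of the request next to the retriever tree, and the aggregation is computed over the combination of the leaf retrievers' queries as `should` clauses.

```python
# Illustrative: aggregations alongside a retriever tree. The aggregation
# runs over the combined (boolean should) query of the two leaf
# standard retrievers below.
search_request = {
    "retriever": {
        "rrf": {
            "retrievers": [
                {"standard": {"query": {"match": {"text": "solar eclipse"}}}},
                {"standard": {"query": {"term": {"topic": "astronomy"}}}},
            ]
        }
    },
    "aggs": {
        "topics": {"terms": {"field": "topic"}}
    },
}
```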
 
+[discrete]
+[[retriever-restrictions]]
 ==== Restrictions on search parameters when specifying a retriever
 
 When a retriever is specified as part of a search, the following elements are not allowed at the top-level.