[DOCS] Add Elastic Rerank usage docs #117625
LGTM. I left a few suggestions for changes to this PR, and several other suggestions for improvements to this doc which don't need to be in this PR, but which I think should be done.
[discrete]
[[inference-example-elastic-reranker]]
==== Elastic Rerank via the `elasticsearch` service

The following example shows how to create an {infer} endpoint called `my-elastic-rerank` to perform a `rerank` task type using the built-in Elastic Rerank cross-encoder model.

The API request below will automatically download the Elastic Rerank model if it isn't already downloaded and then deploy the model.
Once deployed, the model can be used for semantic re-ranking with a <<text-similarity-reranker-retriever-example-elastic-rerank,`text_similarity_reranker` retriever>>.

[source,console]
------------------------------------------------------------
PUT _inference/rerank/my-elastic-rerank
{
  "service": "elasticsearch",
  "service_settings": {
    "model_id": ".rerank-v1", <1>
    "num_threads": 1,
    "adaptive_allocations": { <2>
      "enabled": true,
      "min_number_of_allocations": 1,
      "max_number_of_allocations": 10
    }
  }
}
------------------------------------------------------------
// TEST[skip:TBD]
<1> The `model_id` must be the ID of the built-in Elastic Rerank model: `.rerank-v1`.
<2> {ml-docs}/ml-nlp-auto-scale.html#nlp-model-adaptive-allocations[Adaptive allocations] will be enabled with the minimum of 1 and the maximum of 10 allocations.
this looks great! Maybe we should consider making the max allocations in this example a smaller number like 2 or 4.
sounds good 👍
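For reference, a minimal Python sketch of the request from the diff above, with the allocation limits as parameters so the smaller maximum suggested in review (2 or 4) is easy to try. The helper name `build_rerank_endpoint_request` and its defaults are hypothetical and not part of this PR; the sketch only builds the method, path, and body and does not send anything to a cluster.

```python
import json


def build_rerank_endpoint_request(
    inference_id="my-elastic-rerank",
    min_allocations=1,
    max_allocations=4,  # assumption: a smaller default, per the review suggestion
):
    """Return (method, path, body) for creating the rerank inference endpoint.

    Hypothetical helper: it mirrors the console request in the diff and
    performs no network I/O.
    """
    if not 0 <= min_allocations <= max_allocations:
        raise ValueError("expected 0 <= min_allocations <= max_allocations")

    body = {
        "service": "elasticsearch",
        "service_settings": {
            # Must be the ID of the built-in Elastic Rerank model.
            "model_id": ".rerank-v1",
            "num_threads": 1,
            "adaptive_allocations": {
                "enabled": True,
                "min_number_of_allocations": min_allocations,
                "max_number_of_allocations": max_allocations,
            },
        },
    }
    return "PUT", f"_inference/rerank/{inference_id}", body


if __name__ == "__main__":
    method, path, body = build_rerank_endpoint_request(max_allocations=2)
    print(method, path)
    print(json.dumps(body, indent=2))
```

The resulting body can be passed to whichever HTTP or Elasticsearch client is already in use.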

To use semantic re-ranking in {es}, you need to:

. *Choose a re-ranking model*.
What do you think about changing the wording from "Choose a re-ranking model" to "Decide which re-ranking model to use"? I think language in this section like "Use the built-in model" will make people think it is already available without following step 2 (which is the case for our "default" models, but isn't the case for elser rerank yet). Maybe there's a better way to phrase this.
Absolutely, good point 👍
===== Prerequisites

To use `text_similarity_reranker` you must first set up a `rerank` task using the <<put-inference-api, Create {infer} API>>.
The `rerank` task should be set up with a machine learning model that can compute text similarity.
"The `rerank` task": we should be referring to the objects that are created with the create inference API as "endpoints" rather than "tasks". I don't think this change needs to be in this PR, as many such instances weren't created in this PR, but I think we need to open an issue to reword this across our documentation.
I tried to reword as much as possible in these files; let me know if it looks OK.
seems great!
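To illustrate the prerequisite discussed in this thread, here is a rough Python sketch of a search body that uses a `text_similarity_reranker` retriever pointing at the `my-elastic-rerank` endpoint. None of this comes from the PR diff: the helper name, the `text` field, and the query are hypothetical, and the retriever parameters (`retriever`, `field`, `inference_id`, `inference_text`, `rank_window_size`) reflect my understanding of the retriever API and should be checked against the retriever reference docs.

```python
import json


def build_rerank_search_body(
    query_text,
    field="text",  # hypothetical field name holding the text to re-rank
    inference_id="my-elastic-rerank",
    rank_window_size=100,
):
    """Return a search body that re-ranks first-stage hits via an inference endpoint.

    Hypothetical helper: builds a dict only and sends no request.
    """
    return {
        "retriever": {
            "text_similarity_reranker": {
                # First-stage retrieval: a plain match query on the same field.
                "retriever": {
                    "standard": {"query": {"match": {field: query_text}}}
                },
                "field": field,
                # The rerank endpoint created with the Create inference API.
                "inference_id": inference_id,
                "inference_text": query_text,
                "rank_window_size": rank_window_size,
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_rerank_search_body("how do retrievers work"), indent=2))
```

The `inference_id` is the only link between the search and the endpoint, which is why "endpoint" is the clearer word for the object the Create inference API produces.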

Follow these steps:

. Create an inference endpoint for the `rerank` task using the <<put-inference-api, Create {infer} API>>.
in this case the "`rerank` task" refers to the task type rather than the endpoint-object, so we don't need to change this instance.

"adaptive_allocations": { <1>
  "enabled": true,
  "min_number_of_allocations": 1,
  "max_number_of_allocations": 10
once again, we may want to decrease the max allocations in this example
Co-authored-by: Max Hniebergall <[email protected]>
Pinging @elastic/es-docs (Team:Docs)
LGTM, thanks Liam!
https://github.com/elastic/search-team/issues/8757
Updates:
text-similarity-retriever
retriever APIs under Search API
Makes Elastic Rerank the go-to default choice for re-ranking use cases.
Follow up