5 changes: 5 additions & 0 deletions docs/changelog/117826.yaml
@@ -0,0 +1,5 @@
pr: 117826
summary: "Switch ELSER service to elasticsearch service in semantic search tutorial"
area: Docs
type: doc
issues: ["117829"]
@@ -14,7 +14,7 @@ You don't need to define model related settings and parameters, or create {infer
The recommended way to use <<semantic-search,semantic search>> in the {stack} is to follow the `semantic_text` workflow.
When you need more control over indexing and query settings, you can still use the complete {infer} workflow (refer to <<semantic-search-inference,this tutorial>> to review the process).

This tutorial uses the <<inference-example-elser,`elser` service>> for demonstration, but you can use any service and their supported models offered by the {infer-cap} API.
This tutorial uses the <<inference-example-elasticsearch-elser,`elasticsearch` service>> for demonstration, but you can use any service and its supported models offered by the {infer-cap} API.


[discrete]
@@ -34,24 +34,28 @@ Create an inference endpoint by using the <<put-inference-api>>:
------------------------------------------------------------
PUT _inference/sparse_embedding/my-elser-endpoint <1>
{
"service": "elser", <2>
"service": "elasticsearch", <2>
"service_settings": {
"adaptive_allocations": { <3>
"enabled": true,
"min_number_of_allocations": 3,
"max_number_of_allocations": 10
},
"num_threads": 1
"num_threads": 1,
"model_id": ".elser_model_2" <4>
}
}
------------------------------------------------------------
// TEST[skip:TBD]
<1> The task type is `sparse_embedding` in the path as the ELSER model creates
sparse vectors. The `inference_id` is `my-elser-endpoint`.
<2> The `elser` service is used in this example.
<2> The `elasticsearch` service is used in this example.
<3> This setting enables and configures {ml-docs}/ml-nlp-auto-scale.html#nlp-model-adaptive-allocations[adaptive allocations].
Adaptive allocations make it possible for ELSER to automatically scale up or down resources based on the current load on the process.
<4> The `model_id` must be the ID of one of the built-in ELSER models.
Valid values are `.elser_model_2` and `.elser_model_2_linux-x86_64`.
For further details, refer to the {ml-docs}/ml-nlp-elser.html[ELSER model documentation].
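
The endpoint can then be referenced from a `semantic_text` field when you define the index mapping. The following is a minimal sketch: the `semantic-embedding` index name matches the index queried later in this tutorial, while the `content` field name is an assumption for illustration.

[source,console]
------------------------------------------------------------
PUT semantic-embedding
{
  "mappings": {
    "properties": {
      "content": { <1>
        "type": "semantic_text",
        "inference_id": "my-elser-endpoint" <2>
      }
    }
  }
}
------------------------------------------------------------
// TEST[skip:TBD]
<1> The `content` field name is illustrative; use the field that holds the text you want to make searchable.
<2> The `inference_id` refers to the inference endpoint created in the previous step.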

[NOTE]
====
@@ -282,4 +286,4 @@ query from the `semantic-embedding` index:

* If you want to use `semantic_text` in hybrid search, refer to https://colab.research.google.com/github/elastic/elasticsearch-labs/blob/main/notebooks/search/09-semantic-text.ipynb[this notebook] for a step-by-step guide.
* For more information on how to optimize your ELSER endpoints, refer to {ml-docs}/ml-nlp-elser.html#elser-recommendations[the ELSER recommendations] section in the model documentation.
* To learn more about model autoscaling, refer to the {ml-docs}/ml-nlp-auto-scale.html[trained model autoscaling] page.
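
As a quick illustration of the querying step referenced above, the following sketch runs a `semantic` query against the `semantic-embedding` index. The `content` field name and the query text are assumptions for illustration:

[source,console]
------------------------------------------------------------
GET semantic-embedding/_search
{
  "query": {
    "semantic": {
      "field": "content", <1>
      "query": "How to avoid muscle soreness while running?" <2>
    }
  }
}
------------------------------------------------------------
// TEST[skip:TBD]
<1> The `semantic_text` field to search; `content` is an assumed field name.
<2> An example query string; inference is performed automatically using the endpoint associated with the field.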