Merged
20 changes: 10 additions & 10 deletions docs/reference/inference/service-elser.asciidoc
@@ -80,12 +80,13 @@ Must be a power of 2. Max allowed value is 32.
[[inference-example-elser]]
==== ELSER service example

The following example shows how to create an {infer} endpoint called
`my-elser-model` to perform a `sparse_embedding` task type.
The following example shows how to create an {infer} endpoint called `my-elser-model` to perform a `sparse_embedding` task type.
Refer to the {ml-docs}/ml-nlp-elser.html[ELSER model documentation] for more information.

The request below will automatically download the ELSER model if it isn't
already downloaded and then deploy the model.
NOTE: To optimize your ELSER endpoint for ingest, set the number of threads to `1` (`"num_threads": 1`).
To optimize your ELSER endpoint for search, set the number of threads to a value greater than `1`.

The request below will automatically download the ELSER model if it isn't already downloaded and then deploy the model.

[source,console]
------------------------------------------------------------
@@ -100,7 +101,6 @@ PUT _inference/sparse_embedding/my-elser-model
------------------------------------------------------------
// TEST[skip:TBD]
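
The thread-count guidance in the note can be sketched as a small helper that builds the request body for `PUT _inference/sparse_embedding/my-elser-model`. This is an illustrative sketch, not part of the documented example: the function name `elser_request_body`, the default of `4` search threads, and the fixed `num_allocations` value of `1` are assumptions. The power-of-2 and maximum-of-32 checks mirror the `num_threads` constraint quoted in the hunk header above.

```python
import json


def elser_request_body(optimize_for: str, search_threads: int = 4) -> dict:
    """Build a request body for an ELSER inference endpoint (hypothetical helper).

    optimize_for: "ingest" uses a single thread; "search" uses more than one.
    search_threads: illustrative default; only used when optimizing for search.
    """
    if optimize_for == "ingest":
        num_threads = 1
    elif optimize_for == "search":
        # num_threads must be a power of 2, at most 32, and greater than 1 for search.
        is_power_of_two = search_threads > 0 and (search_threads & (search_threads - 1)) == 0
        if search_threads <= 1 or search_threads > 32 or not is_power_of_two:
            raise ValueError("search_threads must be a power of 2 between 2 and 32")
        num_threads = search_threads
    else:
        raise ValueError("optimize_for must be 'ingest' or 'search'")

    return {
        "service": "elser",
        "service_settings": {
            "num_allocations": 1,  # assumed value, chosen only for illustration
            "num_threads": num_threads,
        },
    }


if __name__ == "__main__":
    # Print the JSON that would be sent as the body of the PUT request.
    print(json.dumps(elser_request_body("ingest"), indent=2))
    print(json.dumps(elser_request_body("search"), indent=2))
```

The printed JSON can be pasted as the body of the console request; how many allocations to pair with each thread setting depends on the deployment and is left to the ELSER recommendations.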


Example response:

[source,console-result]
@@ -130,12 +130,12 @@ If using the Python client, you can set the `timeout` parameter to a higher valu
[[inference-example-elser-adaptive-allocation]]
==== Setting adaptive allocation for the ELSER service

The following example shows how to create an {infer} endpoint called
`my-elser-model` to perform a `sparse_embedding` task type and configure
adaptive allocations.
NOTE: For more information on how to optimize your ELSER endpoints, refer to {ml-docs}/ml-nlp-elser.html#elser-recommendations[the ELSER recommendations] section in the model documentation.
To learn more about model autoscaling, refer to the {ml-docs}/ml-nlp-auto-scale.html[trained model autoscaling] page.

The following example shows how to create an {infer} endpoint called `my-elser-model` to perform a `sparse_embedding` task type and configure adaptive allocations.

The request below will automatically download the ELSER model if it isn't
already downloaded and then deploy the model.
The request below will automatically download the ELSER model if it isn't already downloaded and then deploy the model.

[source,console]
------------------------------------------------------------
@@ -50,7 +50,7 @@ PUT _inference/sparse_embedding/my-elser-endpoint <1>
be used and ELSER creates sparse vectors. The `inference_id` is
`my-elser-endpoint`.
<2> The `elser` service is used in this example.
<3> This setting enables and configures adaptive allocations.
<3> This setting enables and configures {ml-docs}/ml-nlp-auto-scale.html#nlp-model-adaptive-allocations[adaptive allocations].
Adaptive allocations make it possible for ELSER to automatically scale up or down resources based on the current load on the process.
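
As a rough sketch of what the adaptive allocations setting described in callout 3 might look like, the helper below builds a request body with an `adaptive_allocations` object. The helper name `adaptive_elser_body` and the bounds `3` and `10` are hypothetical. The field names `enabled`, `min_number_of_allocations`, and `max_number_of_allocations` are assumed from the {infer} API and should be checked against the reference for your version.

```python
import json


def adaptive_elser_body(min_allocations: int, max_allocations: int, num_threads: int = 1) -> dict:
    """Build an ELSER endpoint body with adaptive allocations (hypothetical helper)."""
    if min_allocations < 0 or max_allocations < min_allocations:
        raise ValueError("expected 0 <= min_allocations <= max_allocations")

    return {
        "service": "elser",
        "service_settings": {
            # Lets the number of allocations scale between the two bounds with load.
            "adaptive_allocations": {
                "enabled": True,
                "min_number_of_allocations": min_allocations,
                "max_number_of_allocations": max_allocations,
            },
            "num_threads": num_threads,
        },
    }


if __name__ == "__main__":
    # Illustrative bounds only; pick values that match your workload.
    print(json.dumps(adaptive_elser_body(3, 10), indent=2))
```

Validating the bounds before sending the request keeps an inverted range from reaching the cluster.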

[NOTE]
@@ -284,6 +284,8 @@ query from the `semantic-embedding` index:

[discrete]
[[semantic-text-further-examples]]
==== Further examples
==== Further examples and reading

If you want to use `semantic_text` in hybrid search, refer to https://colab.research.google.com/github/elastic/elasticsearch-labs/blob/main/notebooks/search/09-semantic-text.ipynb[this notebook] for a step-by-step guide.
* If you want to use `semantic_text` in hybrid search, refer to https://colab.research.google.com/github/elastic/elasticsearch-labs/blob/main/notebooks/search/09-semantic-text.ipynb[this notebook] for a step-by-step guide.
* For more information on how to optimize your ELSER endpoints, refer to {ml-docs}/ml-nlp-elser.html#elser-recommendations[the ELSER recommendations] section in the model documentation.
* To learn more about model autoscaling, refer to the {ml-docs}/ml-nlp-auto-scale.html[trained model autoscaling] page.