Commit f4eba62

[DOCS] Adds link to tutorial and API docs to trained model autoscaling. (elastic#114904) (elastic#114912)
Parent: 5a09346

File tree

2 files changed: +15 additions, -13 deletions


docs/reference/inference/service-elser.asciidoc

Lines changed: 10 additions & 10 deletions
@@ -80,12 +80,13 @@ Must be a power of 2. Max allowed value is 32.
 [[inference-example-elser]]
 ==== ELSER service example
 
-The following example shows how to create an {infer} endpoint called
-`my-elser-model` to perform a `sparse_embedding` task type.
+The following example shows how to create an {infer} endpoint called `my-elser-model` to perform a `sparse_embedding` task type.
 Refer to the {ml-docs}/ml-nlp-elser.html[ELSER model documentation] for more info.
 
-The request below will automatically download the ELSER model if it isn't
-already downloaded and then deploy the model.
+NOTE: If you want to optimize your ELSER endpoint for ingest, set the number of threads to `1` (`"num_threads": 1`).
+If you want to optimize your ELSER endpoint for search, set the number of threads to greater than `1`.
+
+The request below will automatically download the ELSER model if it isn't already downloaded and then deploy the model.
 
 [source,console]
 ------------------------------------------------------------
@@ -100,7 +101,6 @@ PUT _inference/sparse_embedding/my-elser-model
 ------------------------------------------------------------
 // TEST[skip:TBD]
 
-
 Example response:
 
 [source,console-result]
@@ -130,12 +130,12 @@ If using the Python client, you can set the `timeout` parameter to a higher value
 [[inference-example-elser-adaptive-allocation]]
 ==== Setting adaptive allocation for the ELSER service
 
-The following example shows how to create an {infer} endpoint called
-`my-elser-model` to perform a `sparse_embedding` task type and configure
-adaptive allocations.
+NOTE: For more information on how to optimize your ELSER endpoints, refer to {ml-docs}/ml-nlp-elser.html#elser-recommendations[the ELSER recommendations] section in the model documentation.
+To learn more about model autoscaling, refer to the {ml-docs}/ml-nlp-auto-scale.html[trained model autoscaling] page.
+
+The following example shows how to create an {infer} endpoint called `my-elser-model` to perform a `sparse_embedding` task type and configure adaptive allocations.
 
-The request below will automatically download the ELSER model if it isn't
-already downloaded and then deploy the model.
+The request below will automatically download the ELSER model if it isn't already downloaded and then deploy the model.
 
 [source,console]
 ------------------------------------------------------------
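For context, the body of the `PUT` request is elided between these hunks (the diff shows only its surroundings). Based on the documented shape of the ELSER service's create-endpoint request, it presumably resembles the following sketch; the `num_allocations` and `num_threads` values here are illustrative assumptions, not taken from this commit:

[source,console]
------------------------------------------------------------
PUT _inference/sparse_embedding/my-elser-model
{
  "service": "elser",
  "service_settings": {
    "num_allocations": 1,
    "num_threads": 1
  }
}
------------------------------------------------------------

Per the NOTE added in this commit, `"num_threads": 1` favors ingest throughput, while a value greater than `1` favors search latency.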

docs/reference/search/search-your-data/semantic-search-semantic-text.asciidoc

Lines changed: 5 additions & 3 deletions
@@ -50,7 +50,7 @@ PUT _inference/sparse_embedding/my-elser-endpoint <1>
 be used and ELSER creates sparse vectors. The `inference_id` is
 `my-elser-endpoint`.
 <2> The `elser` service is used in this example.
-<3> This setting enables and configures adaptive allocations.
+<3> This setting enables and configures {ml-docs}/ml-nlp-auto-scale.html#nlp-model-adaptive-allocations[adaptive allocations].
 Adaptive allocations make it possible for ELSER to automatically scale up or down resources based on the current load on the process.
 
 [NOTE]
5656
[NOTE]
@@ -267,6 +267,8 @@ query from the `semantic-embedding` index:
 
 [discrete]
 [[semantic-text-further-examples]]
-==== Further examples
+==== Further examples and reading
 
-If you want to use `semantic_text` in hybrid search, refer to https://colab.research.google.com/github/elastic/elasticsearch-labs/blob/main/notebooks/search/09-semantic-text.ipynb[this notebook] for a step-by-step guide.
+* If you want to use `semantic_text` in hybrid search, refer to https://colab.research.google.com/github/elastic/elasticsearch-labs/blob/main/notebooks/search/09-semantic-text.ipynb[this notebook] for a step-by-step guide.
+* For more information on how to optimize your ELSER endpoints, refer to {ml-docs}/ml-nlp-elser.html#elser-recommendations[the ELSER recommendations] section in the model documentation.
+* To learn more about model autoscaling, refer to the {ml-docs}/ml-nlp-auto-scale.html[trained model autoscaling] page.
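The adaptive allocations referenced in both files are configured through the `adaptive_allocations` object in `service_settings`. A minimal sketch of such a request, assuming the standard ELSER service settings (the min/max values below are illustrative, not taken from this commit):

[source,console]
------------------------------------------------------------
PUT _inference/sparse_embedding/my-elser-model
{
  "service": "elser",
  "service_settings": {
    "adaptive_allocations": {
      "enabled": true,
      "min_number_of_allocations": 1,
      "max_number_of_allocations": 10
    },
    "num_threads": 1
  }
}
------------------------------------------------------------

With `"enabled": true`, the number of allocations scales up and down with the load on the process, bounded by the configured minimum and maximum.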
