@@ -130,12 +130,12 @@ If using the Python client, you can set the `timeout` parameter to a higher valu
 [[inference-example-elser-adaptive-allocation]]
 ==== Setting adaptive allocation for the ELSER service
 
-The following example shows how to create an {infer} endpoint called
-`my-elser-model` to perform a `sparse_embedding` task type and configure
-adaptive allocations.
+NOTE: For more information on how to optimize your ELSER endpoints, refer to {ml-docs}/ml-nlp-elser.html#elser-recommendations[the ELSER recommendations] section in the model documentation.
+To learn more about model autoscaling, refer to the {ml-docs}/ml-nlp-auto-scale.html[trained model autoscaling] page.
+
+The following example shows how to create an {infer} endpoint called `my-elser-model` to perform a `sparse_embedding` task type and configure adaptive allocations.
 
-The request below will automatically download the ELSER model if it isn't
-already downloaded and then deploy the model.
+The request below will automatically download the ELSER model if it isn't already downloaded and then deploy the model.
docs/reference/search/search-your-data/semantic-search-semantic-text.asciidoc
5 additions & 3 deletions
@@ -50,7 +50,7 @@ PUT _inference/sparse_embedding/my-elser-endpoint <1>
 be used and ELSER creates sparse vectors. The `inference_id` is
 `my-elser-endpoint`.
 <2> The `elser` service is used in this example.
-<3> This setting enables and configures adaptive allocations.
+<3> This setting enables and configures {ml-docs}/ml-nlp-auto-scale.html#nlp-model-adaptive-allocations[adaptive allocations].
 Adaptive allocations make it possible for ELSER to automatically scale up or down resources based on the current load on the process.
 
 [NOTE]
@@ -267,6 +267,8 @@ query from the `semantic-embedding` index:
 
 [discrete]
 [[semantic-text-further-examples]]
-==== Further examples
+==== Further examples and reading
 
-If you want to use `semantic_text` in hybrid search, refer to https://colab.research.google.com/github/elastic/elasticsearch-labs/blob/main/notebooks/search/09-semantic-text.ipynb[this notebook] for a step-by-step guide.
+* If you want to use `semantic_text` in hybrid search, refer to https://colab.research.google.com/github/elastic/elasticsearch-labs/blob/main/notebooks/search/09-semantic-text.ipynb[this notebook] for a step-by-step guide.
+* For more information on how to optimize your ELSER endpoints, refer to {ml-docs}/ml-nlp-elser.html#elser-recommendations[the ELSER recommendations] section in the model documentation.
+* To learn more about model autoscaling, refer to the {ml-docs}/ml-nlp-auto-scale.html[trained model autoscaling] page.
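
For readers of this diff who want to picture the request the callouts refer to, here is a minimal Python sketch that only builds the method, path, and JSON body for an ELSER endpoint with adaptive allocations; it does not contact a cluster. The helper name `build_elser_endpoint_request` is hypothetical, and the body field names (`adaptive_allocations`, `enabled`, `min_number_of_allocations`, `max_number_of_allocations`, `num_threads`) are assumptions not shown in this diff, so check them against the {infer} API reference before use.

```python
import json


def build_elser_endpoint_request(inference_id, min_allocations, max_allocations):
    """Return (method, path, body) for creating an ELSER inference endpoint.

    Hypothetical helper for illustration; it performs no network I/O.
    """
    if min_allocations > max_allocations:
        raise ValueError("min_allocations must not exceed max_allocations")
    body = {
        # The `elser` service, as named in callout <2> of the diff.
        "service": "elser",
        "service_settings": {
            # Assumed field names for the setting described in callout <3>.
            "adaptive_allocations": {
                "enabled": True,
                "min_number_of_allocations": min_allocations,
                "max_number_of_allocations": max_allocations,
            },
            # Assumed setting: threads per allocation.
            "num_threads": 1,
        },
    }
    # The path shape follows the hunk header in the diff.
    path = f"_inference/sparse_embedding/{inference_id}"
    return "PUT", path, body


method, path, body = build_elser_endpoint_request("my-elser-endpoint", 1, 4)
print(method, path)
print(json.dumps(body, indent=2))
```

The resulting body could be sent with any HTTP client or pasted into Kibana Dev Tools; the allocation bounds of 1 and 4 are placeholder values, not recommendations.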