docs/reference/inference/service-elser.asciidoc
+5 -3 (5 additions & 3 deletions)
@@ -98,12 +98,12 @@ Must be a power of 2. Max allowed value is 32.

 [discrete]
 [[inference-example-elser-adaptive-allocation]]
-==== Setting adaptive allocations for the ELSER service
+==== ELSER service example

 NOTE: For more information on how to optimize your ELSER endpoints, refer to {ml-docs}/ml-nlp-elser.html#elser-recommendations[the ELSER recommendations] section in the model documentation.
 To learn more about model autoscaling, refer to the {ml-docs}/ml-nlp-auto-scale.html[trained model autoscaling] page.

-The following example shows how to create an {infer} endpoint called `my-elser-model` to perform a `sparse_embedding` task type and configure adaptive allocations.
+The following example shows how to create an {infer} endpoint called `my-elser-model` to perform a `sparse_embedding` task type and configure adaptive allocations (recommended).

 The request below will automatically download the ELSER model if it isn't already downloaded and then deploy the model.

@@ -126,11 +126,13 @@ PUT _inference/sparse_embedding/my-elser-model

 [discrete]
 [[inference-example-elser]]
-==== ELSER service example
+==== Creating ELSER service without adaptive allocations

 The following example shows how to create an {infer} endpoint called `my-elser-model` to perform a `sparse_embedding` task type.
 Refer to the {ml-docs}/ml-nlp-elser.html[ELSER model documentation] for more info.

+The following example shows how to create an {infer} endpoint called `my-elser-model` to perform a `sparse_embedding` task type when adaptive allocations isn't required or {ml-docs}/ml-nlp-auto-scale.html[trained model autoscaling] isn't available.
+
 NOTE: If you want to optimize your ELSER endpoint for ingest, set the number of threads to `1` (`"num_threads": 1`).
 If you want to optimize your ELSER endpoint for search, set the number of threads to greater than `1`.
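
Reviewer sketch (not part of the diff): the two documented variants differ only in the `service_settings` of the request body, which the Python helper below tries to make concrete. The function name `elser_endpoint_body` is hypothetical, and the field names other than `num_threads` (`service`, `adaptive_allocations`, `min_number_of_allocations`, `max_number_of_allocations`, `num_allocations`) are assumptions taken from the wider ELSER service documentation rather than from the lines changed here. The helper only builds a dictionary; it does not contact a cluster.

```python
import json


def elser_endpoint_body(num_threads=1, num_allocations=1, adaptive=None):
    """Build a body for PUT _inference/sparse_embedding/<endpoint-id>.

    adaptive: optional (min, max) tuple; when given, adaptive allocations
    are enabled instead of a fixed num_allocations (assumed field names).
    """
    # The hunk context states: must be a power of 2, max allowed value is 32.
    if num_threads < 1 or num_threads > 32 or num_threads & (num_threads - 1):
        raise ValueError("num_threads must be a power of 2 and at most 32")

    settings = {"num_threads": num_threads}
    if adaptive is not None:
        min_alloc, max_alloc = adaptive
        settings["adaptive_allocations"] = {
            "enabled": True,
            "min_number_of_allocations": min_alloc,
            "max_number_of_allocations": max_alloc,
        }
    else:
        settings["num_allocations"] = num_allocations
    return {"service": "elser", "service_settings": settings}


# Ingest-oriented endpoint: one thread, adaptive allocations (recommended path).
ingest_body = elser_endpoint_body(num_threads=1, adaptive=(3, 10))

# Search-oriented endpoint: more than one thread, fixed allocations.
search_body = elser_endpoint_body(num_threads=4, num_allocations=1)

print(json.dumps(ingest_body, indent=2))
print(json.dumps(search_body, indent=2))
```

Either dictionary could then be sent as the JSON body of `PUT _inference/sparse_embedding/my-elser-model`; the allocation bounds `3` and `10` are illustrative values only.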