Commit 84c4115

[DOCS] Adds deployment ID param documentation to trained model APIs (#96174) (#96199)
1 parent b86af25 commit 84c4115

9 files changed, +90 -36 lines changed

docs/reference/ingest/processors/inference.asciidoc

Lines changed: 7 additions & 6 deletions
Original file line number | Diff line number | Diff line change
@@ -15,11 +15,11 @@ ingested in the pipeline.
1515
.{infer-cap} Options
1616
[options="header"]
1717
|======
18-
| Name | Required | Default | Description
19-
| `model_id` | yes | - | (String) The ID or alias for the trained model.
20-
| `target_field` | no | `ml.inference.<processor_tag>` | (String) Field added to incoming documents to contain results objects.
21-
| `field_map` | no | If defined the model's default field map | (Object) Maps the document field names to the known field names of the model. This mapping takes precedence over any default mappings provided in the model configuration.
22-
| `inference_config` | no | The default settings defined in the model | (Object) Contains the inference type and its options.
18+
| Name | Required | Default | Description
19+
| `model_id`          | yes       | -       | (String) The ID or alias for the trained model, or the ID of the deployment.
20+
| `target_field` | no | `ml.inference.<processor_tag>` | (String) Field added to incoming documents to contain results objects.
21+
| `field_map` | no | If defined the model's default field map | (Object) Maps the document field names to the known field names of the model. This mapping takes precedence over any default mappings provided in the model configuration.
22+
| `inference_config` | no | The default settings defined in the model | (Object) Contains the inference type and its options.
2323
include::common-options.asciidoc[]
2424
|======
2525

@@ -28,7 +28,7 @@ include::common-options.asciidoc[]
2828
--------------------------------------------------
2929
{
3030
"inference": {
31-
"model_id": "flight_delay_regression-1571767128603",
31+
"model_id": "model_deployment_for_inference",
3232
"target_field": "FlightDelayMin_prediction_infer",
3333
"field_map": {
3434
"your_field": "my_field"
@@ -384,6 +384,7 @@ include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenizati
384384
[discrete]
385385
[[inference-processor-config-example]]
386386
==== {infer-cap} processor examples
387+
387388
[source,js]
388389
--------------------------------------------------
389390
"inference":{

docs/reference/ml/ml-shared.asciidoc

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -498,6 +498,10 @@ that document will not be used for training, but a prediction with the trained
498498
model will be generated for it. It is also known as continuous target variable.
499499
end::dependent-variable[]
500500

501+
tag::deployment-id[]
502+
A unique identifier for the deployment of the model.
503+
end::deployment-id[]
504+
501505
tag::desc-results[]
502506
If true, the results are sorted in descending order.
503507
end::desc-results[]

docs/reference/ml/trained-models/apis/clear-trained-model-deployment-cache.asciidoc

Lines changed: 7 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -6,12 +6,12 @@
66
<titleabbrev>Clear trained model deployment cache</titleabbrev>
77
++++
88

9-
Clears a trained model deployment cache on all nodes where the trained model is assigned.
9+
Clears the {infer} cache on all nodes where the deployment is assigned.
1010

1111
[[clear-trained-model-deployment-cache-request]]
1212
== {api-request-title}
1313

14-
`POST _ml/trained_models/<model_id>/deployment/cache/_clear`
14+
`POST _ml/trained_models/<deployment_id>/deployment/cache/_clear`
1515

1616
[[clear-trained-model-deployment-cache-prereq]]
1717
== {api-prereq-title}
@@ -22,16 +22,16 @@ Requires the `manage_ml` cluster privilege. This privilege is included in the
2222
[[clear-trained-model-deployment-cache-desc]]
2323
== {api-description-title}
2424

25-
A trained model deployment may have an inference cache enabled. As requests are handled by each allocated node,
26-
their responses may be cached on that individual node. Calling this API clears the caches without restarting the
27-
deployment.
25+
A trained model deployment may have an inference cache enabled. As requests are
26+
handled by each allocated node, their responses may be cached on that individual
27+
node. Calling this API clears the caches without restarting the deployment.
2828

2929
[[clear-trained-model-deployment-cache-path-params]]
3030
== {api-path-parms-title}
3131

32-
`<model_id>`::
32+
`deployment_id`::
3333
(Required, string)
34-
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=model-id]
34+
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=deployment-id]
3535

3636
[[clear-trained-model-deployment-cache-example]]
3737
== {api-examples-title}

docs/reference/ml/trained-models/apis/get-trained-models-stats.asciidoc

Lines changed: 14 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -16,11 +16,11 @@ Retrieves usage information for trained models.
1616

1717
`GET _ml/trained_models/_all/_stats` +
1818

19-
`GET _ml/trained_models/<model_id>/_stats` +
19+
`GET _ml/trained_models/<model_id_or_deployment_id>/_stats` +
2020

21-
`GET _ml/trained_models/<model_id>,<model_id_2>/_stats` +
21+
`GET _ml/trained_models/<model_id_or_deployment_id>,<model_id_2_or_deployment_id_2>/_stats` +
2222

23-
`GET _ml/trained_models/<model_id_pattern*>,<model_id_2>/_stats`
23+
`GET _ml/trained_models/<model_id_pattern*_or_deployment_id_pattern*>,<model_id_2_or_deployment_id_2>/_stats`
2424

2525

2626
[[ml-get-trained-models-stats-prereq]]
@@ -33,17 +33,20 @@ Requires the `monitor_ml` cluster privilege. This privilege is included in the
3333
[[ml-get-trained-models-stats-desc]]
3434
== {api-description-title}
3535

36-
You can get usage information for multiple trained models in a single API
37-
request by using a comma-separated list of model IDs or a wildcard expression.
36+
You can get usage information for multiple trained models or trained model
37+
deployments in a single API request by using a comma-separated list of model
38+
IDs, deployment IDs, or a wildcard expression.
3839

3940

4041
[[ml-get-trained-models-stats-path-params]]
4142
== {api-path-parms-title}
4243

43-
`<model_id>`::
44+
`<model_id_or_deployment_id>`::
4445
(Optional, string)
45-
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=model-id-or-alias]
46-
46+
The unique identifier of the model or the deployment. If a model has multiple
47+
deployments, and the ID of one of the deployments matches the model ID, then the
48+
model ID takes precedence; the results are returned for all deployments of the
49+
model.
4750

4851
[[ml-get-trained-models-stats-query-params]]
4952
== {api-query-parms-title}
@@ -116,6 +119,9 @@ The detailed allocation state related to the nodes.
116119
The desired number of nodes for model allocation.
117120
======
118121

122+
`deployment_id`:::
123+
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=deployment-id]
124+
119125
`error_count`:::
120126
(integer)
121127
The sum of `error_count` for all nodes in the deployment.
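
The path parameter description added to get-trained-models-stats.asciidoc above says that when an ID matches both a model and one of its deployments, the model ID takes precedence and stats are returned for all deployments of that model. A minimal Python sketch of that rule follows. The function name and the `deployments_by_model` mapping are hypothetical and exist only for illustration; they are not part of any Elasticsearch API, and the real resolution happens on the server.

```python
from typing import Dict, List


def resolve_stats_targets(
    identifier: str, deployments_by_model: Dict[str, List[str]]
) -> List[str]:
    """Return the deployment IDs whose stats would be reported.

    deployments_by_model is a hypothetical mapping of model ID to the
    deployment IDs started for that model.
    """
    # A model ID match takes precedence: every deployment of the model is
    # included, even if one deployment shares the model's ID.
    if identifier in deployments_by_model:
        return list(deployments_by_model[identifier])
    # Otherwise treat the identifier as a deployment ID.
    for deployment_ids in deployments_by_model.values():
        if identifier in deployment_ids:
            return [identifier]
    return []


deployments = {"my_model": ["my_model", "my_model_for_search"]}
print(resolve_stats_targets("my_model", deployments))
print(resolve_stats_targets("my_model_for_search", deployments))
```

In this sketch, asking for `my_model` yields both deployments, while asking for `my_model_for_search` yields only that deployment.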

docs/reference/ml/trained-models/apis/infer-trained-model.asciidoc

Lines changed: 9 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -16,6 +16,7 @@ directly from the {infer} cache.
1616
== {api-request-title}
1717

1818
`POST _ml/trained_models/<model_id>/_infer`
19+
`POST _ml/trained_models/<deployment_id>/_infer`
1920

2021
////
2122
[[infer-trained-model-prereq]]
@@ -32,8 +33,15 @@ directly from the {infer} cache.
3233
== {api-path-parms-title}
3334

3435
`<model_id>`::
35-
(Required, string)
36+
(Optional, string)
3637
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=model-id-or-alias]
38+
If you specify the `model_id` in the API call, and the model has multiple
39+
deployments, a random deployment will be used. If the `model_id` matches the ID
40+
of one of the deployments, that deployment will be used.
41+
42+
`<deployment_id>`::
43+
(Optional, string)
44+
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=deployment-id]
3745

3846
[[infer-trained-model-query-params]]
3947
== {api-query-parms-title}
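
The `<model_id>` description added to infer-trained-model.asciidoc above states that a model ID with multiple deployments resolves to a random deployment, unless the ID matches one of the deployment IDs, in which case that deployment is used. The following Python sketch illustrates that selection order only. The function name, the `deployments_by_model` mapping, and the `rng` parameter are assumptions made for the example; the actual selection is performed by Elasticsearch.

```python
import random
from typing import Dict, List, Optional


def pick_deployment_for_infer(
    identifier: str,
    deployments_by_model: Dict[str, List[str]],
    rng: Optional[random.Random] = None,
) -> Optional[str]:
    """Pick the deployment that would serve an _infer call (illustrative)."""
    rng = rng or random.Random()
    # An exact deployment ID match wins, including the case where the
    # model ID is also the ID of one of its deployments.
    for deployment_ids in deployments_by_model.values():
        if identifier in deployment_ids:
            return identifier
    # Otherwise the identifier is a model ID: choose any of its deployments.
    candidates = deployments_by_model.get(identifier, [])
    if candidates:
        return rng.choice(candidates)
    return None


deployments = {"my_model": ["my_model_for_ingest", "my_model_for_search"]}
print(pick_deployment_for_infer("my_model_for_search", deployments))
print(pick_deployment_for_infer("my_model", deployments))
```

Passing a seeded `random.Random` makes the random branch reproducible when experimenting with the sketch.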

docs/reference/ml/trained-models/apis/put-trained-models-aliases.asciidoc

Lines changed: 1 addition & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -35,8 +35,7 @@ An alias must be unique and refer to only a single trained model. However,
3535
you can have multiple aliases for each trained model.
3636

3737
API Restrictions:
38-
+
39-
--
38+
4039
* You are not allowed to update an alias such that it references a different
4140
trained model ID and the model uses a different type of {dfanalytics}. For example,
4241
this situation occurs if you have a trained model for
@@ -45,7 +44,6 @@ alias from one type of trained model to another.
4544
* You cannot update an alias from a `pytorch` model and a {dfanalytics} model.
4645
* You cannot update the alias from a deployed `pytorch` model to one
4746
not currently deployed.
48-
--
4947

5048
If you use this API to update an alias and there are very few input fields in
5149
common between the old and new trained models for the model alias, the API

docs/reference/ml/trained-models/apis/start-trained-model-deployment.asciidoc

Lines changed: 32 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -25,6 +25,11 @@ Currently only `pytorch` models are supported for deployment. Once deployed
2525
the model can be used by the <<inference-processor,{infer-cap} processor>>
2626
in an ingest pipeline or directly in the <<infer-trained-model>> API.
2727

28+
A model can be deployed multiple times by using deployment IDs. A deployment ID
29+
must be unique and should not match any other deployment ID or model ID, unless
30+
it is the same as the ID of the model being deployed. If `deployment_id` is not
31+
set, it defaults to the `model_id`.
32+
2833
Scaling inference performance can be achieved by setting the parameters
2934
`number_of_allocations` and `threads_per_allocation`.
3035

@@ -60,6 +65,11 @@ model. The default value is the size of the model as reported by the
6065
`model_size_bytes` field in the <<get-trained-models-stats>>. To disable the
6166
cache, `0b` can be provided.
6267

68+
`deployment_id`::
69+
(Optional, string)
70+
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=deployment-id]
71+
Defaults to `model_id`.
72+
6373
`number_of_allocations`::
6474
(Optional, integer)
6575
The total number of allocations this model is assigned across {ml} nodes.
@@ -150,3 +160,25 @@ The API returns the following results:
150160
}
151161
}
152162
----
163+
164+
165+
[[start-trained-model-deployment-deployment-id-example]]
166+
=== Using deployment IDs
167+
168+
The following example starts a new deployment for the `my_model` trained model
169+
with the ID `my_model_for_ingest`. The deployment ID can be used in {infer} API
170+
calls or in {infer} processors.
171+
172+
[source,console]
173+
--------------------------------------------------
174+
POST _ml/trained_models/my_model/deployment/_start?deployment_id=my_model_for_ingest
175+
--------------------------------------------------
176+
// TEST[skip:TBD]
177+
178+
The `my_model` trained model can be deployed again with a different ID:
179+
180+
[source,console]
181+
--------------------------------------------------
182+
POST _ml/trained_models/my_model/deployment/_start?deployment_id=my_model_for_search
183+
--------------------------------------------------
184+
// TEST[skip:TBD]
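
The text added to start-trained-model-deployment.asciidoc above describes two rules: `deployment_id` defaults to the `model_id`, and a deployment ID should not match any other deployment ID or model ID unless it equals the ID of the model being deployed. The Python sketch below restates those rules as a client-side pre-check and builds the request path used in the examples. All function names and the sets of existing IDs are hypothetical; Elasticsearch performs the authoritative validation, and this code sends no requests.

```python
from typing import Optional, Set
from urllib.parse import quote, urlencode


def resolve_deployment_id(model_id: str, deployment_id: Optional[str] = None) -> str:
    """Return the effective deployment ID, which defaults to the model ID."""
    return deployment_id if deployment_id else model_id


def is_valid_deployment_id(
    model_id: str,
    deployment_id: Optional[str],
    existing_model_ids: Set[str],
    existing_deployment_ids: Set[str],
) -> bool:
    """Check the uniqueness rule described in the documentation change."""
    effective = resolve_deployment_id(model_id, deployment_id)
    # The ID must not collide with a deployment that already exists.
    if effective in existing_deployment_ids:
        return False
    # It may match a model ID only if that is the model being deployed.
    if effective in existing_model_ids and effective != model_id:
        return False
    return True


def start_deployment_path(model_id: str, deployment_id: Optional[str] = None) -> str:
    """Build the _start request path, adding deployment_id only when given."""
    path = f"_ml/trained_models/{quote(model_id, safe='')}/deployment/_start"
    if deployment_id:
        path += "?" + urlencode({"deployment_id": deployment_id})
    return path


print(start_deployment_path("my_model", "my_model_for_ingest"))
```

The printed path matches the first `POST` request in the example above.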

docs/reference/ml/trained-models/apis/stop-trained-model-deployment.asciidoc

Lines changed: 13 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -11,7 +11,7 @@ Stops a trained model deployment.
1111
[[stop-trained-model-deployment-request]]
1212
== {api-request-title}
1313

14-
`POST _ml/trained_models/<model_id>/deployment/_stop`
14+
`POST _ml/trained_models/<deployment_id>/deployment/_stop`
1515

1616
[[stop-trained-model-deployment-prereq]]
1717
== {api-prereq-title}
@@ -27,9 +27,9 @@ Deployment is required only for trained models that have a PyTorch `model_type`.
2727
[[stop-trained-model-deployment-path-params]]
2828
== {api-path-parms-title}
2929

30-
`<model_id>`::
30+
`<deployment_id>`::
3131
(Required, string)
32-
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=model-id]
32+
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=deployment-id]
3333

3434

3535
[[stop-trained-model-deployment-query-params]]
@@ -40,9 +40,9 @@ include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=model-id]
4040
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=allow-no-match-deployments]
4141

4242
`force`::
43-
(Optional, Boolean) If true, the deployment is stopped even if it or one of its model aliases
44-
is referenced by ingest pipelines. You can't use these pipelines until you restart the model
45-
deployment.
43+
(Optional, Boolean) If true, the deployment is stopped even if it or one of its
44+
model aliases is referenced by ingest pipelines. You can't use these pipelines
45+
until you restart the model deployment.
4646

4747
////
4848
[role="child_attributes"]
@@ -55,7 +55,12 @@ deployment.
5555
== {api-response-codes-title}
5656
////
5757

58-
////
5958
[[stop-trained-model-deployment-example]]
6059
== {api-examples-title}
61-
////
60+
61+
The following example stops the `my_model_for_search` deployment:
62+
63+
[source,console]
64+
--------------------------------------------------
65+
POST _ml/trained_models/my_model_for_search/deployment/_stop
66+
--------------------------------------------------

docs/reference/ml/trained-models/apis/update-trained-model-deployment.asciidoc

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -14,7 +14,7 @@ beta::[]
1414
[[update-trained-model-deployment-request]]
1515
== {api-request-title}
1616

17-
`POST _ml/trained_models/<model_id>/deployment/_update`
17+
`POST _ml/trained_models/<deployment_id>/deployment/_update`
1818

1919

2020
[[update-trained-model-deployments-prereqs]]
@@ -32,9 +32,9 @@ You can either increase or decrease the number of allocations of such a deployme
3232
[[update-trained-model-deployments-path-parms]]
3333
== {api-path-parms-title}
3434

35-
`<model_id>`::
35+
`<deployment_id>`::
3636
(Required, string)
37-
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=model-id]
37+
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=deployment-id]
3838

3939
[[update-trained-model-deployment-request-body]]
4040
== {api-request-body-title}
