Merged
1 change: 1 addition & 0 deletions docs/reference/inference/service-elasticsearch.asciidoc
@@ -179,6 +179,7 @@ PUT _inference/text_embedding/my-e5-model
"min_number_of_allocations": 3,
"max_number_of_allocations": 10
},
+ "num_threads": 1,
"model_id": ".multilingual-e5-small"
}
}
3 changes: 2 additions & 1 deletion docs/reference/inference/service-elser.asciidoc
@@ -147,7 +147,8 @@ PUT _inference/sparse_embedding/my-elser-model
"enabled": true,
"min_number_of_allocations": 3,
"max_number_of_allocations": 10
- }
+ },
+ "num_threads": 1
}
}
------------------------------------------------------------
@@ -119,7 +119,8 @@ POST _ml/trained_models/elastic__distilbert-base-uncased-finetuned-conll03-engli
"enabled": true,
"min_number_of_allocations": 3,
"max_number_of_allocations": 10
- }
+ },
+ "num_threads": 1
}
--------------------------------------------------
// TEST[skip:TBD]
@@ -36,7 +36,11 @@ PUT _inference/sparse_embedding/my-elser-endpoint <1>
{
"service": "elser", <2>
"service_settings": {
- "num_allocations": 1,
+ "adaptive_allocations": { <3>
+ "enabled": true,
+ "min_number_of_allocations": 3,
+ "max_number_of_allocations": 10
+ },
"num_threads": 1
}
}
@@ -46,6 +50,8 @@ PUT _inference/sparse_embedding/my-elser-endpoint <1>
be used and ELSER creates sparse vectors. The `inference_id` is
`my-elser-endpoint`.
<2> The `elser` service is used in this example.
+ <3> This setting enables and configures {ml-docs}/ml-nlp-elser.html#elser-adaptive-allocations[adaptive allocations].
+ Adaptive allocations let ELSER automatically scale its resources up or down based on the current load on the process.

[NOTE]
====
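For readers who want to see the resulting request body in one piece, here is a minimal sketch in Python that assembles the `service_settings` used by the updated `my-elser-endpoint` example: `adaptive_allocations` in place of the fixed `num_allocations`, alongside `num_threads`. The helper name `build_elser_endpoint_body` is hypothetical, and the `min <= max` check is a local sanity check assumed for this sketch, not a rule taken from this change.

```python
import json


def build_elser_endpoint_body(min_allocations=3, max_allocations=10, num_threads=1):
    """Return a body for PUT _inference/sparse_embedding/<inference_id> (hypothetical helper)."""
    # Local sanity check assumed for this sketch; not a documented server-side rule.
    if min_allocations > max_allocations:
        raise ValueError(
            "min_number_of_allocations must not exceed max_number_of_allocations"
        )
    return {
        "service": "elser",
        "service_settings": {
            # Replaces the fixed "num_allocations" setting removed in the diff.
            "adaptive_allocations": {
                "enabled": True,
                "min_number_of_allocations": min_allocations,
                "max_number_of_allocations": max_allocations,
            },
            "num_threads": num_threads,
        },
    }


if __name__ == "__main__":
    # Print the body with the same values the documentation example uses.
    print(json.dumps(build_elser_endpoint_body(), indent=2))
```

The printed JSON can be pasted as the body of the `PUT _inference/sparse_embedding/my-elser-endpoint` request shown in the last file.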