Merged
12 changes: 4 additions & 8 deletions docs/reference/inference/inference-apis.asciidoc
@@ -16,10 +8,8 @@ models or if you want to use non-NLP models, use the
For the most up-to-date API details, refer to {api-es}/group/endpoint-inference[{infer-cap} APIs].
--

The {infer} APIs enable you to create {infer} endpoints and use {ml} models of
different providers - such as Amazon Bedrock, Anthropic, Azure AI Studio,
Cohere, Google AI, Mistral, OpenAI, or HuggingFace - as a service. Use
the following APIs to manage {infer} models and perform {infer}:
The {infer} APIs enable you to create {infer} endpoints and integrate with {ml} models from different services, such as Amazon Bedrock, Anthropic, Azure AI Studio, Cohere, Google AI, Mistral, OpenAI, or HuggingFace.
Use the following APIs to manage {infer} models and perform {infer}:

* <<delete-inference-api>>
* <<get-inference-api>>
@@ -37,10 +35,8 @@ An {infer} endpoint enables you to use the corresponding {ml} model without
manual deployment and apply it to your data at ingestion time through
<<semantic-search-semantic-text, semantic text>>.

Choose a model from your provider or use ELSER – a retrieval model trained by
Elastic –, then create an {infer} endpoint by the <<put-inference-api>>.
Now use <<semantic-search-semantic-text, semantic text>> to perform
<<semantic-search, semantic search>> on your data.
Choose a model from your service or use ELSER, a retrieval model trained by Elastic, then create an {infer} endpoint with the <<put-inference-api>>.
Now use <<semantic-search-semantic-text, semantic text>> to perform <<semantic-search, semantic search>> on your data.
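The endpoint-then-semantic-search flow described above can be sketched as the JSON bodies of the three requests involved, written here as Python dicts. The endpoint name `my-elser-endpoint`, the index name, and the field name `content` are illustrative assumptions, not values prescribed by the docs:

```python
# 1. PUT _inference/sparse_embedding/my-elser-endpoint
#    Create an inference endpoint backed by the ELSER service.
create_endpoint = {
    "service": "elser",
    "service_settings": {"num_allocations": 1, "num_threads": 1},
}

# 2. PUT my-index
#    Map a semantic_text field bound to that endpoint; ingestion through
#    this field runs inference automatically.
create_index = {
    "mappings": {
        "properties": {
            "content": {
                "type": "semantic_text",
                "inference_id": "my-elser-endpoint",
            }
        }
    }
}

# 3. GET my-index/_search
#    Query the semantic_text field with a semantic query.
search = {
    "query": {"semantic": {"field": "content", "query": "best pizza in town"}}
}

print(create_index["mappings"]["properties"]["content"]["type"])  # semantic_text
```

In practice these bodies are sent with an {es} client or the Console; the sketch only shows their shape.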

[discrete]
[[adaptive-allocations]]
16 changes: 8 additions & 8 deletions docs/reference/inference/put-inference.asciidoc
@@ -42,7 +42,7 @@ include::inference-shared.asciidoc[tag=inference-id]
include::inference-shared.asciidoc[tag=task-type]
+
--
Refer to the service list in the <<put-inference-api-desc,API description section>> for the available task types.
Refer to the integration list in the <<put-inference-api-desc,API description section>> for the available task types.
--


@@ -54,15 +54,15 @@ The create {infer} API enables you to create an {infer} endpoint and configure a

[IMPORTANT]
====
* When creating an inference endpoint, the associated machine learning model is automatically deployed if it is not already running.
* When creating an {infer} endpoint, the associated {ml} model is automatically deployed if it is not already running.
* After creating the endpoint, wait for the model deployment to complete before using it. You can verify the deployment status by using the <<get-trained-models-stats, Get trained model statistics>> API. In the response, look for `"state": "fully_allocated"` and ensure the `"allocation_count"` matches the `"target_allocation_count"`.
* Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.
====
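The deployment check described in the note above can be sketched as a small helper that parses the response of the Get trained model statistics API (`GET _ml/trained_models/<model_id>/_stats`). The sample response is abbreviated and illustrative, keeping only the `allocation_status` fields the check needs:

```python
def is_fully_allocated(stats: dict) -> bool:
    """Return True when every deployment in the stats response is
    fully allocated and has reached its target allocation count."""
    for model in stats.get("trained_model_stats", []):
        deployment = model.get("deployment_stats", {})
        alloc = deployment.get("allocation_status", {})
        if alloc.get("state") != "fully_allocated":
            return False
        if alloc.get("allocation_count") != alloc.get("target_allocation_count"):
            return False
    return True


# Abbreviated sample of a stats response for a ready deployment.
sample = {
    "trained_model_stats": [
        {
            "deployment_stats": {
                "allocation_status": {
                    "state": "fully_allocated",
                    "allocation_count": 2,
                    "target_allocation_count": 2,
                }
            }
        }
    ]
}

print(is_fully_allocated(sample))  # True
```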


The following services are available through the {infer} API.
You can find the available task types next to the service name.
Click the links to review the configuration details of the services:
The following integrations are available through the {infer} API.
You can find the available task types next to the integration name.
Click the links to review the configuration details of the integrations:

* <<infer-service-alibabacloud-ai-search,AlibabaCloud AI Search>> (`completion`, `rerank`, `sparse_embedding`, `text_embedding`)
* <<infer-service-amazon-bedrock,Amazon Bedrock>> (`completion`, `text_embedding`)
@@ -80,14 +80,14 @@ Click the links to review the configuration details of the services:
* <<infer-service-watsonx-ai>> (`text_embedding`)
* <<infer-service-jinaai,JinaAI>> (`text_embedding`, `rerank`)

The {es} and ELSER services run on a {ml} node in your {es} cluster. The rest of
the services connect to external providers.
The {es} and ELSER services run on a {ml} node in your {es} cluster.
The rest of the integrations connect to external services.

[discrete]
[[adaptive-allocations-put-inference]]
==== Adaptive allocations

Adaptive allocations allow inference services to dynamically adjust the number of model allocations based on the current load.
Adaptive allocations allow inference endpoints to dynamically adjust the number of model allocations based on the current load.
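As a minimal sketch, adaptive allocations are configured through an `adaptive_allocations` object inside `service_settings` when creating the endpoint; the endpoint name and the bounds below are illustrative assumptions:

```python
# Body of PUT _inference/sparse_embedding/my-endpoint (name illustrative),
# using the elasticsearch service with adaptive allocations enabled.
put_inference_body = {
    "service": "elasticsearch",
    "service_settings": {
        "model_id": ".elser_model_2",
        "num_threads": 1,
        "adaptive_allocations": {
            "enabled": True,
            # Allocations scale between these bounds based on load.
            "min_number_of_allocations": 1,
            "max_number_of_allocations": 8,
        },
    },
}
```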

When adaptive allocations are enabled:

@@ -1,5 +1,5 @@
[[infer-service-alibabacloud-ai-search]]
=== AlibabaCloud AI Search {infer} service
=== AlibabaCloud AI Search {infer} integration

.New API reference
[sidebar]
2 changes: 1 addition & 1 deletion docs/reference/inference/service-amazon-bedrock.asciidoc
@@ -1,5 +1,5 @@
[[infer-service-amazon-bedrock]]
=== Amazon Bedrock {infer} service
=== Amazon Bedrock {infer} integration

.New API reference
[sidebar]
2 changes: 1 addition & 1 deletion docs/reference/inference/service-anthropic.asciidoc
@@ -1,5 +1,5 @@
[[infer-service-anthropic]]
=== Anthropic {infer} service
=== Anthropic {infer} integration

.New API reference
[sidebar]
2 changes: 1 addition & 1 deletion docs/reference/inference/service-azure-ai-studio.asciidoc
@@ -1,5 +1,5 @@
[[infer-service-azure-ai-studio]]
=== Azure AI studio {infer} service
=== Azure AI studio {infer} integration

.New API reference
[sidebar]
2 changes: 1 addition & 1 deletion docs/reference/inference/service-azure-openai.asciidoc
@@ -1,5 +1,5 @@
[[infer-service-azure-openai]]
=== Azure OpenAI {infer} service
=== Azure OpenAI {infer} integration

.New API reference
[sidebar]
2 changes: 1 addition & 1 deletion docs/reference/inference/service-cohere.asciidoc
@@ -1,5 +1,5 @@
[[infer-service-cohere]]
=== Cohere {infer} service
=== Cohere {infer} integration

.New API reference
[sidebar]
2 changes: 1 addition & 1 deletion docs/reference/inference/service-elasticsearch.asciidoc
@@ -1,5 +1,5 @@
[[infer-service-elasticsearch]]
=== Elasticsearch {infer} service
=== Elasticsearch {infer} integration

.New API reference
[sidebar]
2 changes: 1 addition & 1 deletion docs/reference/inference/service-elser.asciidoc
@@ -1,5 +1,5 @@
[[infer-service-elser]]
=== ELSER {infer} service
=== ELSER {infer} integration

.New API reference
[sidebar]
2 changes: 1 addition & 1 deletion docs/reference/inference/service-google-ai-studio.asciidoc
@@ -1,5 +1,5 @@
[[infer-service-google-ai-studio]]
=== Google AI Studio {infer} service
=== Google AI Studio {infer} integration

.New API reference
[sidebar]
2 changes: 1 addition & 1 deletion docs/reference/inference/service-google-vertex-ai.asciidoc
@@ -1,5 +1,5 @@
[[infer-service-google-vertex-ai]]
=== Google Vertex AI {infer} service
=== Google Vertex AI {infer} integration

.New API reference
[sidebar]
2 changes: 1 addition & 1 deletion docs/reference/inference/service-hugging-face.asciidoc
@@ -1,5 +1,5 @@
[[infer-service-hugging-face]]
=== HuggingFace {infer} service
=== HuggingFace {infer} integration

.New API reference
[sidebar]
2 changes: 1 addition & 1 deletion docs/reference/inference/service-jinaai.asciidoc
@@ -1,5 +1,5 @@
[[infer-service-jinaai]]
=== JinaAI {infer} service
=== JinaAI {infer} integration

Creates an {infer} endpoint to perform an {infer} task with the `jinaai` service.

2 changes: 1 addition & 1 deletion docs/reference/inference/service-mistral.asciidoc
@@ -1,5 +1,5 @@
[[infer-service-mistral]]
=== Mistral {infer} service
=== Mistral {infer} integration

.New API reference
[sidebar]
2 changes: 1 addition & 1 deletion docs/reference/inference/service-openai.asciidoc
@@ -1,5 +1,5 @@
[[infer-service-openai]]
=== OpenAI {infer} service
=== OpenAI {infer} integration

.New API reference
[sidebar]
2 changes: 1 addition & 1 deletion docs/reference/inference/service-watsonx-ai.asciidoc
@@ -1,5 +1,5 @@
[[infer-service-watsonx-ai]]
=== Watsonx {infer} service
=== Watsonx {infer} integration

.New API reference
[sidebar]