docs/reference/inference/inference-apis.asciidoc
4 additions & 8 deletions
@@ -10,10 +10,8 @@ trained models. However, if you do not plan to use the {infer} APIs to use these
 models or if you want to use non-NLP models, use the
 <<ml-df-trained-models-apis>>.
 
-The {infer} APIs enable you to create {infer} endpoints and use {ml} models of
-different providers - such as Amazon Bedrock, Anthropic, Azure AI Studio,
-Cohere, Google AI, Mistral, OpenAI, or HuggingFace - as a service. Use
-the following APIs to manage {infer} models and perform {infer}:
+The {infer} APIs enable you to create {infer} endpoints and integrate with {ml} models of different services - such as Amazon Bedrock, Anthropic, Azure AI Studio, Cohere, Google AI, Mistral, OpenAI, or HuggingFace.
+Use the following APIs to manage {infer} models and perform {infer}:
 
 * <<delete-inference-api>>
 * <<get-inference-api>>
@@ -30,10 +28,8 @@ An {infer} endpoint enables you to use the corresponding {ml} model without
 manual deployment and apply it to your data at ingestion time through
 <<semantic-search-semantic-text, semantic text>>.
 
-Choose a model from your provider or use ELSER – a retrieval model trained by
-Elastic –, then create an {infer} endpoint by the <<put-inference-api>>.
-Now use <<semantic-search-semantic-text, semantic text>> to perform
-<<semantic-search, semantic search>> on your data.
+Choose a model from your service or use ELSER – a retrieval model trained by Elastic –, then create an {infer} endpoint by the <<put-inference-api>>.
+Now use <<semantic-search-semantic-text, semantic text>> to perform <<semantic-search, semantic search>> on your data.
-Refer to the service list in the <<put-inference-api-desc,API description section>> for the available task types.
+Refer to the integration list in the <<put-inference-api-desc,API description section>> for the available task types.
 --
 
 
@@ -48,15 +48,15 @@ The create {infer} API enables you to create an {infer} endpoint and configure a
 
 [IMPORTANT]
 ====
-* When creating an inference endpoint, the associated machine learning model is automatically deployed if it is not already running.
+* When creating an {infer} endpoint, the associated {ml} model is automatically deployed if it is not already running.
 * After creating the endpoint, wait for the model deployment to complete before using it. You can verify the deployment status by using the <<get-trained-models-stats, Get trained model statistics>> API. In the response, look for `"state": "fully_allocated"` and ensure the `"allocation_count"` matches the `"target_allocation_count"`.
 * Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.
 ====
 
 
-The following services are available through the {infer} API.
-You can find the available task types next to the service name.
-Click the links to review the configuration details of the services:
+The following integrations are available through the {infer} API.
+You can find the available task types next to the integration name.
+Click the links to review the configuration details of the integrations:
 
 * <<infer-service-alibabacloud-ai-search,AlibabaCloud AI Search>> (`completion`, `rerank`, `sparse_embedding`, `text_embedding`)
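
Note on the deployment check in the IMPORTANT block above: the condition it describes (`"state": "fully_allocated"` with `"allocation_count"` equal to `"target_allocation_count"`) could be sketched roughly as follows. This is a minimal illustration, not part of the change. It assumes the statistics response has already been parsed into a Python dict, and that the three fields sit under `trained_model_stats[].deployment_stats.allocation_status`; that nesting, the helper name `is_fully_allocated`, and the sample model ID are assumptions made for the example.

```python
from typing import Any


def is_fully_allocated(stats_response: dict[str, Any], model_id: str) -> bool:
    """Return True when the named model's deployment looks fully allocated.

    Hypothetical helper: the field nesting below is an assumption about the
    shape of the trained model statistics response.
    """
    for entry in stats_response.get("trained_model_stats", []):
        if entry.get("model_id") != model_id:
            continue
        # Assumed location of the allocation fields named in the docs.
        status = entry.get("deployment_stats", {}).get("allocation_status", {})
        count = status.get("allocation_count")
        target = status.get("target_allocation_count")
        return (
            status.get("state") == "fully_allocated"
            and count is not None
            and count == target
        )
    # Model not present in the response: treat as not deployed.
    return False


# Hypothetical response fragment, trimmed to the fields the check reads.
sample = {
    "trained_model_stats": [
        {
            "model_id": "my-elser-model",
            "deployment_stats": {
                "allocation_status": {
                    "state": "fully_allocated",
                    "allocation_count": 1,
                    "target_allocation_count": 1,
                }
            },
        }
    ]
}

print(is_fully_allocated(sample, "my-elser-model"))
```

A caller would typically poll the statistics API and run a check like this until it passes before sending requests to the new endpoint.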