
Commit e7e568b

[DOCS] Documents watsonx service of the Inference API (#115088)
Co-authored-by: Saikat Sarkar <[email protected]>
1 parent ade7f7c commit e7e568b

7 files changed: +129 −26 lines changed

docs/reference/inference/delete-inference.asciidoc

Lines changed: 3 additions & 6 deletions
@@ -6,12 +6,9 @@ experimental[]
 Deletes an {infer} endpoint.
 
-IMPORTANT: The {infer} APIs enable you to use certain services, such as built-in
-{ml} models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure, Google AI Studio, Google Vertex AI or
-Hugging Face. For built-in models and models uploaded through Eland, the {infer}
-APIs offer an alternative way to use and manage trained models. However, if you
-do not plan to use the {infer} APIs to use these models or if you want to use
-non-NLP models, use the <<ml-df-trained-models-apis>>.
+IMPORTANT: The {infer} APIs enable you to use certain services, such as built-in {ml} models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure, Google AI Studio, Google Vertex AI, Anthropic, Watsonx.ai, or Hugging Face.
+For built-in models and models uploaded through Eland, the {infer} APIs offer an alternative way to use and manage trained models.
+However, if you do not plan to use the {infer} APIs to use these models or if you want to use non-NLP models, use the <<ml-df-trained-models-apis>>.
 
 
 [discrete]

docs/reference/inference/get-inference.asciidoc

Lines changed: 3 additions & 6 deletions
@@ -6,12 +6,9 @@ experimental[]
 Retrieves {infer} endpoint information.
 
-IMPORTANT: The {infer} APIs enable you to use certain services, such as built-in
-{ml} models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure, Google AI Studio, Google Vertex AI or
-Hugging Face. For built-in models and models uploaded through Eland, the {infer}
-APIs offer an alternative way to use and manage trained models. However, if you
-do not plan to use the {infer} APIs to use these models or if you want to use
-non-NLP models, use the <<ml-df-trained-models-apis>>.
+IMPORTANT: The {infer} APIs enable you to use certain services, such as built-in {ml} models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure, Google AI Studio, Google Vertex AI, Anthropic, Watsonx.ai, or Hugging Face.
+For built-in models and models uploaded through Eland, the {infer} APIs offer an alternative way to use and manage trained models.
+However, if you do not plan to use the {infer} APIs to use these models or if you want to use non-NLP models, use the <<ml-df-trained-models-apis>>.
 
 
 [discrete]

docs/reference/inference/inference-apis.asciidoc

Lines changed: 1 addition & 0 deletions
@@ -54,3 +54,4 @@ include::service-google-vertex-ai.asciidoc[]
 include::service-hugging-face.asciidoc[]
 include::service-mistral.asciidoc[]
 include::service-openai.asciidoc[]
+include::service-watsonx-ai.asciidoc[]

docs/reference/inference/post-inference.asciidoc

Lines changed: 3 additions & 6 deletions
@@ -6,12 +6,9 @@ experimental[]
 Performs an inference task on an input text by using an {infer} endpoint.
 
-IMPORTANT: The {infer} APIs enable you to use certain services, such as built-in
-{ml} models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure, Google AI Studio, Google Vertex AI or
-Hugging Face. For built-in models and models uploaded through Eland, the {infer}
-APIs offer an alternative way to use and manage trained models. However, if you
-do not plan to use the {infer} APIs to use these models or if you want to use
-non-NLP models, use the <<ml-df-trained-models-apis>>.
+IMPORTANT: The {infer} APIs enable you to use certain services, such as built-in {ml} models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure, Google AI Studio, Google Vertex AI, Anthropic, Watsonx.ai, or Hugging Face.
+For built-in models and models uploaded through Eland, the {infer} APIs offer an alternative way to use and manage trained models.
+However, if you do not plan to use the {infer} APIs to use these models or if you want to use non-NLP models, use the <<ml-df-trained-models-apis>>.
 
 
 [discrete]

docs/reference/inference/put-inference.asciidoc

Lines changed: 3 additions & 7 deletions
@@ -8,13 +8,8 @@ Creates an {infer} endpoint to perform an {infer} task.
 [IMPORTANT]
 ====
-* The {infer} APIs enable you to use certain services, such as built-in
-{ml} models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Mistral,
-Azure OpenAI, Google AI Studio, Google Vertex AI, Anthropic or Hugging Face.
-* For built-in models and models uploaded through Eland, the {infer} APIs offer an
-alternative way to use and manage trained models. However, if you do not plan to
-use the {infer} APIs to use these models or if you want to use non-NLP models,
-use the <<ml-df-trained-models-apis>>.
+* The {infer} APIs enable you to use certain services, such as built-in {ml} models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Mistral, Azure OpenAI, Google AI Studio, Google Vertex AI, Anthropic, Watsonx.ai, or Hugging Face.
+* For built-in models and models uploaded through Eland, the {infer} APIs offer an alternative way to use and manage trained models. However, if you do not plan to use the {infer} APIs to use these models or if you want to use non-NLP models, use the <<ml-df-trained-models-apis>>.
 ====
 
 
@@ -71,6 +66,7 @@ Click the links to review the configuration details of the services:
 * <<infer-service-hugging-face,Hugging Face>> (`text_embedding`)
 * <<infer-service-mistral,Mistral>> (`text_embedding`)
 * <<infer-service-openai,OpenAI>> (`completion`, `text_embedding`)
+* <<infer-service-watsonx-ai>> (`text_embedding`)
 
 The {es} and ELSER services run on a {ml} node in your {es} cluster. The rest of
 the services connect to external providers.
docs/reference/inference/service-watsonx-ai.asciidoc (new file)

Lines changed: 115 additions & 0 deletions
@@ -0,0 +1,115 @@
[[infer-service-watsonx-ai]]
=== Watsonx {infer} service

Creates an {infer} endpoint to perform an {infer} task with the `watsonxai` service.

You need an https://cloud.ibm.com/docs/databases-for-elasticsearch?topic=databases-for-elasticsearch-provisioning&interface=api[IBM Cloud® Databases for Elasticsearch deployment] to use the `watsonxai` {infer} service.
You can provision one through the https://cloud.ibm.com/databases/databases-for-elasticsearch/create[IBM catalog], the https://cloud.ibm.com/docs/databases-cli-plugin?topic=databases-cli-plugin-cdb-reference[Cloud Databases CLI plug-in], the https://cloud.ibm.com/apidocs/cloud-databases-api[Cloud Databases API], or https://registry.terraform.io/providers/IBM-Cloud/ibm/latest/docs/resources/database[Terraform].


[discrete]
[[infer-service-watsonx-ai-api-request]]
==== {api-request-title}

`PUT /_inference/<task_type>/<inference_id>`

[discrete]
[[infer-service-watsonx-ai-api-path-params]]
==== {api-path-parms-title}

`<inference_id>`::
(Required, string)
include::inference-shared.asciidoc[tag=inference-id]

`<task_type>`::
(Required, string)
include::inference-shared.asciidoc[tag=task-type]
+
--
Available task types:

* `text_embedding`.
--

[discrete]
[[infer-service-watsonx-ai-api-request-body]]
==== {api-request-body-title}

`service`::
(Required, string)
The type of service supported for the specified task type. In this case,
`watsonxai`.

`service_settings`::
(Required, object)
include::inference-shared.asciidoc[tag=service-settings]
+
--
These settings are specific to the `watsonxai` service.
--

`api_key`:::
(Required, string)
A valid API key of your Watsonx account.
You can find your Watsonx API keys, or create a new one, on the https://cloud.ibm.com/iam/apikeys[API keys page].
+
--
include::inference-shared.asciidoc[tag=api-key-admonition]
--

`api_version`:::
(Required, string)
Version parameter that takes a version date in the format of `YYYY-MM-DD`.
For the active version date parameters, refer to the https://cloud.ibm.com/apidocs/watsonx-ai#active-version-dates[documentation].

`model_id`:::
(Required, string)
The name of the model to use for the {infer} task.
Refer to the IBM Embedding Models section in the https://www.ibm.com/products/watsonx-ai/foundation-models[Watsonx documentation] for the list of available text embedding models.

`url`:::
(Required, string)
The URL endpoint to use for the requests.

`project_id`:::
(Required, string)
The name of the project to use for the {infer} task.

`rate_limit`:::
(Optional, object)
By default, the `watsonxai` service sets the number of requests allowed per minute to `120`.
This helps to minimize the number of rate limit errors returned from Watsonx.
To modify this, set the `requests_per_minute` setting of this object in your service settings:
+
--
include::inference-shared.asciidoc[tag=request-per-minute-example]
--
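The shared include above renders the standard rate-limit example. For orientation only, this fragment is not part of the commit and assumes `rate_limit` follows the same shape as the other {infer} services, where it is nested inside `service_settings`:

[source,js]
------------------------------------------------------------
"service_settings": {
    ...,
    "rate_limit": {
        "requests_per_minute": 80
    }
}
------------------------------------------------------------
// NOTCONSOLE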


[discrete]
[[inference-example-watsonx-ai]]
==== Watsonx AI service example

The following example shows how to create an {infer} endpoint called `watsonx-embeddings` to perform a `text_embedding` task type.

[source,console]
------------------------------------------------------------
PUT _inference/text_embedding/watsonx-embeddings
{
    "service": "watsonxai",
    "service_settings": {
        "api_key": "<api_key>", <1>
        "url": "<url>", <2>
        "model_id": "ibm/slate-30m-english-rtrvr",
        "project_id": "<project_id>", <3>
        "api_version": "2024-03-14" <4>
    }
}

------------------------------------------------------------
// TEST[skip:TBD]
<1> A valid Watsonx API key.
You can find it on the https://cloud.ibm.com/iam/apikeys[API keys page] of your account.
<2> The {infer} endpoint URL you created on Watsonx.
<3> The ID of your IBM Cloud project.
<4> A valid API version parameter. You can find the active version date parameters in the https://cloud.ibm.com/apidocs/watsonx-ai#active-version-dates[Watsonx API documentation].
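Once created, the endpoint can be exercised through the {infer} API. The following sketch is not part of this commit; it assumes the `watsonx-embeddings` endpoint defined above and uses the standard `input` parameter of the {infer} API:

[source,console]
------------------------------------------------------------
POST _inference/text_embedding/watsonx-embeddings
{
  "input": "Elasticsearch is a distributed search and analytics engine."
}
------------------------------------------------------------
// TEST[skip:TBD]

The response contains the embedding generated by the configured Watsonx model for the input text.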

docs/reference/inference/update-inference.asciidoc

Lines changed: 1 addition & 1 deletion
@@ -6,7 +6,7 @@ experimental[]
 Updates an {infer} endpoint.
 
-IMPORTANT: The {infer} APIs enable you to use certain services, such as built-in {ml} models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure, Google AI Studio, Google Vertex AI or Hugging Face.
+IMPORTANT: The {infer} APIs enable you to use certain services, such as built-in {ml} models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure, Google AI Studio, Google Vertex AI, Anthropic, Watsonx.ai, or Hugging Face.
 For built-in models and models uploaded through Eland, the {infer} APIs offer an alternative way to use and manage trained models.
 However, if you do not plan to use the {infer} APIs to use these models or if you want to use non-NLP models, use the <<ml-df-trained-models-apis>>.
