9 changes: 3 additions & 6 deletions docs/reference/inference/delete-inference.asciidoc
@@ -6,12 +6,9 @@ experimental[]

Deletes an {infer} endpoint.

IMPORTANT: The {infer} APIs enable you to use certain services, such as built-in
{ml} models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure, Google AI Studio, Google Vertex AI or
Hugging Face. For built-in models and models uploaded through Eland, the {infer}
APIs offer an alternative way to use and manage trained models. However, if you
do not plan to use the {infer} APIs to use these models or if you want to use
non-NLP models, use the <<ml-df-trained-models-apis>>.
IMPORTANT: The {infer} APIs enable you to use certain services, such as built-in {ml} models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure, Google AI Studio, Google Vertex AI, Anthropic, Watsonx.ai, or Hugging Face.
For built-in models and models uploaded through Eland, the {infer} APIs offer an alternative way to use and manage trained models.
However, if you do not plan to use the {infer} APIs to use these models or if you want to use non-NLP models, use the <<ml-df-trained-models-apis>>.
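
For example, a minimal sketch: assuming an existing `text_embedding` endpoint named `watsonx-embeddings` (a hypothetical name used only for illustration), it can be deleted with a single request:

[source,console]
------------------------------------------------------------
DELETE /_inference/text_embedding/watsonx-embeddings
------------------------------------------------------------
// TEST[skip:TBD]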


[discrete]
9 changes: 3 additions & 6 deletions docs/reference/inference/get-inference.asciidoc
@@ -6,12 +6,9 @@ experimental[]

Retrieves {infer} endpoint information.

IMPORTANT: The {infer} APIs enable you to use certain services, such as built-in
{ml} models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure, Google AI Studio, Google Vertex AI or
Hugging Face. For built-in models and models uploaded through Eland, the {infer}
APIs offer an alternative way to use and manage trained models. However, if you
do not plan to use the {infer} APIs to use these models or if you want to use
non-NLP models, use the <<ml-df-trained-models-apis>>.
IMPORTANT: The {infer} APIs enable you to use certain services, such as built-in {ml} models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure, Google AI Studio, Google Vertex AI, Anthropic, Watsonx.ai, or Hugging Face.
For built-in models and models uploaded through Eland, the {infer} APIs offer an alternative way to use and manage trained models.
However, if you do not plan to use the {infer} APIs to use these models or if you want to use non-NLP models, use the <<ml-df-trained-models-apis>>.
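
For example, a minimal sketch (the endpoint name is hypothetical, and `_all` returns every configured endpoint):

[source,console]
------------------------------------------------------------
GET /_inference/text_embedding/watsonx-embeddings

GET /_inference/_all
------------------------------------------------------------
// TEST[skip:TBD]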


[discrete]
1 change: 1 addition & 0 deletions docs/reference/inference/inference-apis.asciidoc
@@ -54,3 +54,4 @@ include::service-google-vertex-ai.asciidoc[]
include::service-hugging-face.asciidoc[]
include::service-mistral.asciidoc[]
include::service-openai.asciidoc[]
include::service-watsonx-ai.asciidoc[]
9 changes: 3 additions & 6 deletions docs/reference/inference/post-inference.asciidoc
@@ -6,12 +6,9 @@ experimental[]

Performs an inference task on an input text by using an {infer} endpoint.

IMPORTANT: The {infer} APIs enable you to use certain services, such as built-in
{ml} models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure, Google AI Studio, Google Vertex AI or
Hugging Face. For built-in models and models uploaded through Eland, the {infer}
APIs offer an alternative way to use and manage trained models. However, if you
do not plan to use the {infer} APIs to use these models or if you want to use
non-NLP models, use the <<ml-df-trained-models-apis>>.
IMPORTANT: The {infer} APIs enable you to use certain services, such as built-in {ml} models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure, Google AI Studio, Google Vertex AI, Anthropic, Watsonx.ai, or Hugging Face.
For built-in models and models uploaded through Eland, the {infer} APIs offer an alternative way to use and manage trained models.
However, if you do not plan to use the {infer} APIs to use these models or if you want to use non-NLP models, use the <<ml-df-trained-models-apis>>.
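
For example, a minimal sketch, assuming a `text_embedding` endpoint named `my-embedding-endpoint` already exists (the endpoint name and input text are placeholders):

[source,console]
------------------------------------------------------------
POST /_inference/text_embedding/my-embedding-endpoint
{
    "input": "The quick brown fox jumps over the lazy dog."
}
------------------------------------------------------------
// TEST[skip:TBD]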


[discrete]
10 changes: 3 additions & 7 deletions docs/reference/inference/put-inference.asciidoc
@@ -8,13 +8,8 @@ Creates an {infer} endpoint to perform an {infer} task.

[IMPORTANT]
====
* The {infer} APIs enable you to use certain services, such as built-in
{ml} models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Mistral,
Azure OpenAI, Google AI Studio, Google Vertex AI, Anthropic or Hugging Face.
* For built-in models and models uploaded through Eland, the {infer} APIs offer an
alternative way to use and manage trained models. However, if you do not plan to
use the {infer} APIs to use these models or if you want to use non-NLP models,
use the <<ml-df-trained-models-apis>>.
* The {infer} APIs enable you to use certain services, such as built-in {ml} models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Mistral, Azure OpenAI, Google AI Studio, Google Vertex AI, Anthropic, Watsonx.ai, or Hugging Face.
* For built-in models and models uploaded through Eland, the {infer} APIs offer an alternative way to use and manage trained models. However, if you do not plan to use the {infer} APIs to use these models or if you want to use non-NLP models, use the <<ml-df-trained-models-apis>>.
====


@@ -71,6 +66,7 @@ Click the links to review the configuration details of the services:
* <<infer-service-hugging-face,Hugging Face>> (`text_embedding`)
* <<infer-service-mistral,Mistral>> (`text_embedding`)
* <<infer-service-openai,OpenAI>> (`completion`, `text_embedding`)
* <<infer-service-watsonx-ai,Watsonx>> (`text_embedding`)

The {es} and ELSER services run on a {ml} node in your {es} cluster. The rest of
the services connect to external providers.
115 changes: 115 additions & 0 deletions docs/reference/inference/service-watsonx-ai.asciidoc
@@ -0,0 +1,115 @@
[[infer-service-watsonx-ai]]
=== Watsonx {infer} service

Creates an {infer} endpoint to perform an {infer} task with the `watsonxai` service.

You need an https://cloud.ibm.com/docs/databases-for-elasticsearch?topic=databases-for-elasticsearch-provisioning&interface=api[IBM Cloud® Databases for Elasticsearch deployment] to use the `watsonxai` {infer} service.
You can provision one through the https://cloud.ibm.com/databases/databases-for-elasticsearch/create[IBM catalog], the https://cloud.ibm.com/docs/databases-cli-plugin?topic=databases-cli-plugin-cdb-reference[Cloud Databases CLI plug-in], the https://cloud.ibm.com/apidocs/cloud-databases-api[Cloud Databases API], or https://registry.terraform.io/providers/IBM-Cloud/ibm/latest/docs/resources/database[Terraform].


[discrete]
[[infer-service-watsonx-ai-api-request]]
==== {api-request-title}

`PUT /_inference/<task_type>/<inference_id>`

[discrete]
[[infer-service-watsonx-ai-api-path-params]]
==== {api-path-parms-title}

`<inference_id>`::
(Required, string)
include::inference-shared.asciidoc[tag=inference-id]

`<task_type>`::
(Required, string)
include::inference-shared.asciidoc[tag=task-type]
+
--
Available task types:

* `text_embedding`.
--

[discrete]
[[infer-service-watsonx-ai-api-request-body]]
==== {api-request-body-title}

`service`::
(Required, string)
The type of service supported for the specified task type. In this case,
`watsonxai`.

`service_settings`::
(Required, object)
include::inference-shared.asciidoc[tag=service-settings]
+
--
These settings are specific to the `watsonxai` service.
--

`api_key`:::
(Required, string)
A valid API key of your Watsonx account.
You can find your Watsonx API keys or create a new one on the https://cloud.ibm.com/iam/apikeys[API keys page].
+
--
include::inference-shared.asciidoc[tag=api-key-admonition]
--

`api_version`:::
(Required, string)
A version parameter that takes a version date in the format `YYYY-MM-DD`.
For the currently active version dates, refer to the https://cloud.ibm.com/apidocs/watsonx-ai#active-version-dates[Watsonx documentation].

`model_id`:::
(Required, string)
The name of the model to use for the {infer} task.
Refer to the IBM Embedding Models section in the https://www.ibm.com/products/watsonx-ai/foundation-models[Watsonx documentation] for the list of available text embedding models.

`url`:::
(Required, string)
The URL endpoint to use for the requests.

`project_id`:::
(Required, string)
The identifier of the IBM Cloud project to use for the {infer} task.

`rate_limit`:::
(Optional, object)
By default, the `watsonxai` service sets the number of requests allowed per minute to `120`.
This helps to minimize the number of rate limit errors returned from Watsonx.
To modify this, set the `requests_per_minute` setting of this object in your service settings:
+
--
include::inference-shared.asciidoc[tag=request-per-minute-example]
--


[discrete]
[[inference-example-watsonx-ai]]
==== Watsonx AI service example

The following example shows how to create an {infer} endpoint called `watsonx-embeddings` to perform a `text_embedding` task type.

[source,console]
------------------------------------------------------------
PUT _inference/text_embedding/watsonx-embeddings
{
    "service": "watsonxai",
    "service_settings": {
        "api_key": "<api_key>", <1>
        "url": "<url>", <2>
        "model_id": "ibm/slate-30m-english-rtrvr",
        "project_id": "<project_id>", <3>
        "api_version": "2024-03-14" <4>
    }
}

------------------------------------------------------------
// TEST[skip:TBD]
<1> A valid Watsonx API key.
You can find it on the https://cloud.ibm.com/iam/apikeys[API keys page of your account].
<2> The {infer} endpoint URL you created on Watsonx.
<3> The ID of your IBM Cloud project.
<4> A valid API version parameter. You can find the active version dates in the https://cloud.ibm.com/apidocs/watsonx-ai#active-version-dates[Watsonx documentation].
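
Once the endpoint is created, the following sketch shows how it could be used to generate embeddings (the input text is arbitrary):

[source,console]
------------------------------------------------------------
POST _inference/text_embedding/watsonx-embeddings
{
    "input": "What is the capital of France?"
}
------------------------------------------------------------
// TEST[skip:TBD]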
2 changes: 1 addition & 1 deletion docs/reference/inference/update-inference.asciidoc
@@ -6,7 +6,7 @@ experimental[]

Updates an {infer} endpoint.

IMPORTANT: The {infer} APIs enable you to use certain services, such as built-in {ml} models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure, Google AI Studio, Google Vertex AI or Hugging Face.
IMPORTANT: The {infer} APIs enable you to use certain services, such as built-in {ml} models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure, Google AI Studio, Google Vertex AI, Anthropic, Watsonx.ai, or Hugging Face.
For built-in models and models uploaded through Eland, the {infer} APIs offer an alternative way to use and manage trained models.
However, if you do not plan to use the {infer} APIs to use these models or if you want to use non-NLP models, use the <<ml-df-trained-models-apis>>.
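
As a sketch only (the `_update` path shown and the endpoint name are assumptions for illustration), rotating an API key could look like this:

[source,console]
------------------------------------------------------------
PUT _inference/text_embedding/watsonx-embeddings/_update
{
    "service_settings": {
        "api_key": "<new_api_key>"
    }
}
------------------------------------------------------------
// TEST[skip:TBD]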
