Commit befdcbf

Updates the Watsonx integration page with reranking feature (#623)
## [Preview](https://docs-v3-preview.elastic.dev/elastic/docs-content/pull/623/solutions/search/inference-api/watsonx-inference-integration)

This PR updates the Watsonx integration page to include the new 'rerank' feature.

Related issue: elastic/developer-docs-team#255
1 parent fdadcf7 commit befdcbf

1 file changed: +56 -11 lines changed

solutions/search/inference-api/watsonx-inference-integration.md

Lines changed: 56 additions & 11 deletions
@@ -33,9 +33,8 @@ You need an [IBM Cloud® Databases for Elasticsearch deployment](https://cloud.i
 
 Available task types:
 
-* `text_embedding`.
-
-
+* `text_embedding`,
+* `rerank`.
 
 ## {{api-request-body-title}} [infer-service-watsonx-ai-api-request-body]
 
@@ -50,9 +49,9 @@ You need an [IBM Cloud® Databases for Elasticsearch deployment](https://cloud.i
 `api_key`
 : (Required, string) A valid API key of your Watsonx account. You can find your Watsonx API keys or you can create a new one [on the API keys page](https://cloud.ibm.com/iam/apikeys).
 
-::::{important}
-You need to provide the API key only once, during the {{infer}} model creation. The [Get {{infer}} API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-inference-get) does not retrieve your API key. After creating the {{infer}} model, you cannot change the associated API key. If you want to use a different API key, delete the {{infer}} model and recreate it with the same name and the updated API key.
-::::
+::::{important}
+You need to provide the API key only once, during the {{infer}} model creation. The [Get {{infer}} API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-inference-get) does not retrieve your API key. After creating the {{infer}} model, you cannot change the associated API key. If you want to use a different API key, delete the {{infer}} model and recreate it with the same name and the updated API key.
+::::
 
 
 `api_version`
@@ -70,13 +69,28 @@ You need an [IBM Cloud® Databases for Elasticsearch deployment](https://cloud.i
 `rate_limit`
 : (Optional, object) By default, the `watsonxai` service sets the number of requests allowed per minute to `120`. This helps to minimize the number of rate limit errors returned from Watsonx. To modify this, set the `requests_per_minute` setting of this object in your service settings:
 
-```text
-"rate_limit": {
-  "requests_per_minute": <<number_of_requests>>
-}
-```
+```json
+"rate_limit": {
+  "requests_per_minute": <<number_of_requests>>
+}
+```
+
+`task_settings`
+: (Optional, object) Settings to configure the inference task.
+
+These settings are specific to the `<task_type>` you specified.
+
+::::{dropdown} `task_settings` for the `rerank` task type
+`truncate_input_tokens`
+: (Optional, integer) Specifies the maximum number of tokens per input document before truncation.
+
+`return_documents`
+: (Optional, boolean) Specify whether to return doc text within the results.
 
+`top_n`
+: (Optional, integer) The number of most relevant documents to return. Defaults to the number of input documents.
 
+::::
 
 ## Watsonx AI service example [inference-example-watsonx-ai]
 
@@ -101,4 +115,35 @@ PUT _inference/text_embedding/watsonx-embeddings
 3. The ID of your IBM Cloud project.
 4. A valid API version parameter. You can find the active version data parameters [here](https://cloud.ibm.com/apidocs/watsonx-ai#active-version-dates).
 
+The following example shows how to create an inference endpoint called `watsonx-rerank` to perform a `rerank` task type.
+
+```console
+
+PUT _inference/rerank/watsonx-rerank
+{
+  "service": "watsonxai",
+  "service_settings": {
+    "api_key": "<api_key>", <1>
+    "url": "<url>", <2>
+    "model_id": "cross-encoder/ms-marco-minilm-l-12-v2",
+    "project_id": "<project_id>", <3>
+    "api_version": "2024-05-02" <4>
+  },
+  "task_settings": {
+    "truncate_input_tokens": 50, <5>
+    "return_documents": true, <6>
+    "top_n": 3 <7>
+  }
+}
+```
+
+1. A valid Watsonx API key. You can find on the [API keys page of your account](https://cloud.ibm.com/iam/apikeys).
+2. The {{infer}} endpoint URL you created on Watsonx.
+3. The ID of your IBM Cloud project.
+4. A valid API version parameter. You can find the active version data parameters [here](https://cloud.ibm.com/apidocs/watsonx-ai#active-version-dates).
+5. The maximum number of tokens per document before truncation.
+6. Whether to return the document text in the results.
+7. The number of top relevant documents to return.
+
+
 
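
For readers who want to generate the `rerank` request body from the diff above in code, here is a minimal sketch in Python. It only assembles the JSON payload and does not send it anywhere. The helper name `build_rerank_endpoint_body` and its keyword arguments are hypothetical (they are not part of this commit or of any Elastic client); the field names (`service`, `service_settings`, `rate_limit`, `task_settings`, `truncate_input_tokens`, `return_documents`, `top_n`) are taken from the documented request body. Optional settings that are left as `None` are omitted, on the assumption that the service then applies its documented defaults.

```python
import json


def build_rerank_endpoint_body(api_key, url, project_id, api_version, model_id,
                               truncate_input_tokens=None, return_documents=None,
                               top_n=None, requests_per_minute=None):
    """Assemble a body for `PUT _inference/rerank/<endpoint_id>`.

    Hypothetical helper for illustration only; it builds a dict and
    performs no network request.
    """
    service_settings = {
        "api_key": api_key,
        "url": url,
        "model_id": model_id,
        "project_id": project_id,
        "api_version": api_version,
    }
    # `rate_limit` lives inside the service settings, per the docs text.
    if requests_per_minute is not None:
        service_settings["rate_limit"] = {"requests_per_minute": requests_per_minute}

    body = {"service": "watsonxai", "service_settings": service_settings}

    # Keep only the rerank task settings that were explicitly provided.
    optional_task_settings = {
        "truncate_input_tokens": truncate_input_tokens,
        "return_documents": return_documents,
        "top_n": top_n,
    }
    task_settings = {k: v for k, v in optional_task_settings.items() if v is not None}
    if task_settings:
        body["task_settings"] = task_settings
    return body


if __name__ == "__main__":
    # Placeholder values mirror the console example in the diff.
    example = build_rerank_endpoint_body(
        api_key="<api_key>",
        url="<url>",
        project_id="<project_id>",
        api_version="2024-05-02",
        model_id="cross-encoder/ms-marco-minilm-l-12-v2",
        truncate_input_tokens=50,
        return_documents=True,
        top_n=3,
    )
    print(json.dumps(example, indent=2))
```

The printed JSON could then be pasted into a console request such as `PUT _inference/rerank/watsonx-rerank`, with the placeholders replaced by real values.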
