| 1 | +[[infer-service-voyageai]] |
| 2 | +=== VoyageAI {infer} integration |
| 3 | + |
| 4 | +.New API reference |
| 5 | +[sidebar] |
| 6 | +-- |
| 7 | +For the most up-to-date API details, refer to {api-es}/group/endpoint-inference[{infer-cap} APIs]. |
| 8 | +-- |
| 9 | + |
| 10 | +Creates an {infer} endpoint to perform an {infer} task with the `voyageai` service. |
| 11 | + |
| 12 | + |
| 13 | +[discrete] |
| 14 | +[[infer-service-voyageai-api-request]] |
| 15 | +==== {api-request-title} |
| 16 | + |
| 17 | +`PUT /_inference/<task_type>/<inference_id>` |
| 18 | + |
| 19 | +[discrete] |
| 20 | +[[infer-service-voyageai-api-path-params]] |
| 21 | +==== {api-path-parms-title} |
| 22 | + |
| 23 | +`<inference_id>`:: |
| 24 | +(Required, string) |
| 25 | +include::inference-shared.asciidoc[tag=inference-id] |
| 26 | + |
| 27 | +`<task_type>`:: |
| 28 | +(Required, string) |
| 29 | +include::inference-shared.asciidoc[tag=task-type] |
| 30 | ++ |
| 31 | +-- |
| 32 | +Available task types: |
| 33 | + |
| 34 | +* `text_embedding`, |
| 35 | +* `rerank`. |
| 36 | +-- |
| 37 | + |
| 38 | +[discrete] |
| 39 | +[[infer-service-voyageai-api-request-body]] |
| 40 | +==== {api-request-body-title} |
| 41 | + |
| 42 | +`chunking_settings`:: |
| 43 | +(Optional, object) |
| 44 | +include::inference-shared.asciidoc[tag=chunking-settings] |
| 45 | + |
| 46 | +`max_chunk_size`::: |
| 47 | +(Optional, integer) |
| 48 | +include::inference-shared.asciidoc[tag=chunking-settings-max-chunking-size] |
| 49 | + |
| 50 | +`overlap`::: |
| 51 | +(Optional, integer) |
| 52 | +include::inference-shared.asciidoc[tag=chunking-settings-overlap] |
| 53 | + |
| 54 | +`sentence_overlap`::: |
| 55 | +(Optional, integer) |
| 56 | +include::inference-shared.asciidoc[tag=chunking-settings-sentence-overlap] |
| 57 | + |
| 58 | +`strategy`::: |
| 59 | +(Optional, string) |
| 60 | +include::inference-shared.asciidoc[tag=chunking-settings-strategy] |
| 61 | + |
| 62 | +`service`:: |
| 63 | +(Required, string) |
| 64 | +The type of service supported for the specified task type. In this case, |
| 65 | +`voyageai`. |
| 66 | + |
| 67 | +`service_settings`:: |
| 68 | +(Required, object) |
| 69 | +include::inference-shared.asciidoc[tag=service-settings] |
| 70 | ++ |
| 71 | +-- |
| 72 | +These settings are specific to the `voyageai` service. |
| 73 | +-- |
| 74 | + |
| 75 | +`dimensions`::: |
| 76 | +(Optional, integer) |
| 77 | +The number of dimensions the resulting output embeddings should have. |
| 78 | +This setting maps to `output_dimension` in the https://docs.voyageai.com/docs/embeddings[VoyageAI documentation]. |
| 79 | +Only for the `text_embedding` task type. |
| 80 | + |
| 81 | +`embedding_type`::: |
| 82 | +(Optional, string) |
| 83 | +The data type for the embeddings to be returned. |
| 84 | +This setting maps to `output_dtype` in the https://docs.voyageai.com/docs/embeddings[VoyageAI documentation]. |
| 85 | +Permitted values: `float`, `int8`, `bit`. |
| 86 | +`int8` is a synonym of `byte` in the VoyageAI documentation. |
| 87 | +`bit` is a synonym of `binary` in the VoyageAI documentation. |
| 88 | +Only for the `text_embedding` task type. |
| 89 | + |
| 90 | +`model_id`::: |
| 91 | +(Required, string) |
| 92 | +The name of the model to use for the {infer} task. |
| 93 | +Refer to the VoyageAI documentation for the list of available https://docs.voyageai.com/docs/embeddings[text embedding] and https://docs.voyageai.com/docs/reranker[rerank] models. |
| 94 | + |
| 95 | +`rate_limit`::: |
| 96 | +(Optional, object) |
| 97 | +This setting helps to minimize the number of rate limit errors returned from VoyageAI. |
| 98 | +The `voyageai` service sets a default number of requests allowed per minute depending on the task type. |
| 99 | +For both `text_embedding` and `rerank`, it is set to `2000`. |
| 100 | +To modify this, set the `requests_per_minute` setting of this object in your service settings: |
| 101 | ++ |
| 102 | +-- |
| 103 | +include::inference-shared.asciidoc[tag=request-per-minute-example] |
| 104 | + |
| 105 | +More information about VoyageAI rate limits can be found in the https://docs.voyageai.com/docs/rate-limits[VoyageAI rate limits documentation]. |
| 106 | +-- |
| 107 | + |
| 108 | +`task_settings`:: |
| 109 | +(Optional, object) |
| 110 | +include::inference-shared.asciidoc[tag=task-settings] |
| 111 | ++ |
| 112 | +.`task_settings` for the `text_embedding` task type |
| 113 | +[%collapsible%closed] |
| 114 | +===== |
| 115 | +`input_type`::: |
| 116 | +(Optional, string) |
| 117 | +Type of the input text. |
| 118 | +Permitted values: `ingest` (maps to `document` in the VoyageAI documentation), `search` (maps to `query` in the VoyageAI documentation). |
| 119 | + |
| 120 | +`truncation`::: |
| 121 | +(Optional, boolean) |
| 122 | +Whether to truncate the input texts to fit within the context length. |
| 123 | +Defaults to `false`. |
| 124 | +===== |
| 125 | ++ |
| 126 | +.`task_settings` for the `rerank` task type |
| 127 | +[%collapsible%closed] |
| 128 | +===== |
| 129 | +`return_documents`::: |
| 130 | +(Optional, boolean) |
| 131 | +Whether to return the source documents in the response. |
| 132 | +Defaults to `false`. |
| 133 | + |
| 134 | +`top_k`::: |
| 135 | +(Optional, integer) |
| 136 | +The number of most relevant documents to return. |
| 137 | +If not specified, the reranking results of all documents will be returned. |
| 138 | + |
| 139 | +`truncation`::: |
| 140 | +(Optional, boolean) |
| 141 | +Whether to truncate the input texts to fit within the context length. |
| 142 | +Defaults to `false`. |
| 143 | +===== |
| 144 | + |
| 145 | + |
| 146 | +[discrete] |
| 147 | +[[inference-example-voyageai]] |
| 148 | +==== VoyageAI service example |
| 149 | + |
| 150 | +The following example shows how to create an {infer} endpoint called `voyageai-embeddings` to perform a `text_embedding` task. |
| 151 | +The embeddings created by requests to this endpoint will have 512 dimensions. |
| 152 | + |
| 153 | +[source,console] |
| 154 | +------------------------------------------------------------ |
| 155 | +PUT _inference/text_embedding/voyageai-embeddings |
| 156 | +{ |
| 157 | +    "service": "voyageai", |
| 158 | +    "service_settings": { |
| 159 | +        "model_id": "voyage-3-large", |
| 160 | +        "dimensions": 512 |
| 161 | +    } |
| 162 | +} |
| 163 | +------------------------------------------------------------ |
| 164 | +// TEST[skip:TBD] |
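| | + |
| | +As a sketch of how the optional settings described above can be combined, the following request creates a `text_embedding` endpoint that returns `int8` embeddings, lowers the request rate, and treats inputs as documents to ingest. The endpoint name `voyageai-embeddings-int8` and all setting values are illustrative assumptions, not recommendations. |
| | + |
| | +[source,console] |
| | +------------------------------------------------------------ |
| | +PUT _inference/text_embedding/voyageai-embeddings-int8 |
| | +{ |
| | +    "service": "voyageai", |
| | +    "service_settings": { |
| | +        "model_id": "voyage-3-large", |
| | +        "dimensions": 512, |
| | +        "embedding_type": "int8", |
| | +        "rate_limit": { |
| | +            "requests_per_minute": 1000 |
| | +        } |
| | +    }, |
| | +    "task_settings": { |
| | +        "input_type": "ingest", |
| | +        "truncation": true |
| | +    } |
| | +} |
| | +------------------------------------------------------------ |
| | +// TEST[skip:TBD] |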
| 165 | + |
| 166 | +The next example shows how to create an {infer} endpoint called `voyageai-rerank` to perform a `rerank` task. |
| 167 | + |
| 168 | +[source,console] |
| 169 | +------------------------------------------------------------ |
| 170 | +PUT _inference/rerank/voyageai-rerank |
| 171 | +{ |
| 172 | +    "service": "voyageai", |
| 173 | +    "service_settings": { |
| 174 | +        "model_id": "rerank-2" |
| 175 | +    } |
| 176 | +} |
| 177 | +------------------------------------------------------------ |
| 178 | +// TEST[skip:TBD] |
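| | + |
| | +The `rerank` task settings described above can be set when the endpoint is created. The following sketch assumes a hypothetical endpoint name, `voyageai-rerank-top3`, and illustrative values: it returns only the three most relevant documents, includes their source text in the response, and truncates inputs that exceed the context length. |
| | + |
| | +[source,console] |
| | +------------------------------------------------------------ |
| | +PUT _inference/rerank/voyageai-rerank-top3 |
| | +{ |
| | +    "service": "voyageai", |
| | +    "service_settings": { |
| | +        "model_id": "rerank-2" |
| | +    }, |
| | +    "task_settings": { |
| | +        "top_k": 3, |
| | +        "return_documents": true, |
| | +        "truncation": true |
| | +    } |
| | +} |
| | +------------------------------------------------------------ |
| | +// TEST[skip:TBD] |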