You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
The Azure AI Model Inference API specifies a set of modalities and parameters that models can subscribe to. However, some models may have further capabilities that the ones the API indicates. On those cases, the API allows the developer to pass them as extra parameters in the payload.
138
138
139
-
By setting a header `extra-parameters: allow`, the API will attempt to pass any unknown parameter directly to the underlying model. If the model can handle that parameter, the request completes.
139
+
By setting a header `extra-parameters: pass-through`, the API will attempt to pass any unknown parameter directly to the underlying model. If the model can handle that parameter, the request completes.
140
140
141
141
The following example shows a request passing the parameter `safe_prompt` supported by Mistral-Large, which isn't specified in the Azure AI Model Inference API:
142
142
@@ -163,6 +163,7 @@ var messages = [
163
163
];
164
164
165
165
var response =awaitclient.path("/chat/completions").post({
166
+
"extra-parameters":"pass-through",
166
167
body: {
167
168
messages: messages,
168
169
safe_mode:true
@@ -178,7 +179,7 @@ __Request__
178
179
POST /chat/completions?api-version=2024-04-01-preview
179
180
Authorization: Bearer <bearer-token>
180
181
Content-Type: application/json
181
-
extra-parameters: allow
182
+
extra-parameters: pass-through
182
183
```
183
184
184
185
```JSON
@@ -203,7 +204,7 @@ extra-parameters: allow
203
204
---
204
205
205
206
> [!TIP]
206
-
> Alternatively, you can set `extra-parameters: drop` to drop any unknown parameter in the request. Use this capability in case you happen to be sending requests with extra parameters that you know the model won't support but you want the request to completes anyway. A typical example of this is indicating `seed` parameter.
207
+
> The default value for `extra-parameters` is `error` which returns an error if an extra parameter is indicated in the payload. Alternatively, you can set `extra-parameters: ignore` to drop any unknown parameter in the request. Use this capability in case you happen to be sending requests with extra parameters that you know the model won't support but you want the request to completes anyway. A typical example of this is indicating `seed` parameter.
207
208
208
209
### Models with disparate set of capabilities
209
210
@@ -426,3 +427,27 @@ __Response__
426
427
## Getting started
427
428
428
429
The Azure AI Model Inference API is currently supported in certain models deployed as [Serverless API endpoints](../how-to/deploy-models-serverless.md) and Managed Online Endpoints. Deploy any of the [supported models](#availability) and use the exact same code to consume their predictions.
430
+
431
+
# [Python](#tab/python)
432
+
433
+
The client library `azure-ai-inference` does inference, including chat completions, for AI models deployed by Azure AI Studio and Azure Machine Learning Studio. It supports Serverless API endpoints and Managed Compute endpoints (formerly known as Managed Online Endpoints).
434
+
435
+
Explore our [samples](https://github.com/Azure/azure-sdk-for-python/tree/main/sdk/ai/azure-ai-inference/samples) and read the [API reference documentation](https://aka.ms/azsdk/azure-ai-inference/python/reference) to get yourself started.
436
+
437
+
# [JavaScript](#tab/javascript)
438
+
439
+
The client library `@azure-rest/ai-inference` does inference, including chat completions, for AI models deployed by Azure AI Studio and Azure Machine Learning Studio. It supports Serverless API endpoints and Managed Compute endpoints (formerly known as Managed Online Endpoints).
440
+
441
+
Explore our [samples](https://github.com/Azure/azure-sdk-for-js/tree/main/sdk/ai/ai-inference-rest/samples) and read the [API reference documentation](https://aka.ms/AAp1kxa) to get yourself started.
442
+
443
+
# [REST](#tab/rest)
444
+
445
+
Explore the reference section of the Azure AI model inference API to see parameters and options to consume models, including chat completions models, deployed by Azure AI Studio and Azure Machine Learning Studio. It supports Serverless API endpoints and Managed Compute endpoints (formerly known as Managed Online Endpoints).
446
+
447
+
*[Get info](reference-model-inference-info.md): Returns the information about the model deployed under the endpoint.
448
+
*[Text embeddings](reference-model-inference-embeddings.md): Creates an embedding vector representing the input text.
449
+
*[Text completions](reference-model-inference-completions.md): Creates a completion for the provided prompt and parameters.
450
+
*[Chat completions](reference-model-inference-chat-completions.md): Creates a model response for the given chat conversation.
451
+
*[Image embeddings](reference-model-inference-images-embeddings.md): Creates an embedding vector representing the input text and image.
Copy file name to clipboardExpand all lines: articles/ai-studio/reference/reference-model-inference-chat-completions.md
+1-1Lines changed: 1 addition & 1 deletion
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -35,7 +35,7 @@ POST /chat/completions?api-version=2024-04-01-preview
35
35
36
36
| Name | Required | Type | Description |
37
37
| --- | --- | --- | --- |
38
-
| extra-parameters || string | The behavior of the API when extra parameters are indicated in the payload. Using `allow` makes the API to pass the parameter to the underlying model. Use this value when you want to pass parameters that you know the underlying model can support. Using `drop` makes the API to drop any unsupported parameter. Use this value when you need to use the same payload across different models, but one of the extra parameters may make a model to error out if not supported. Using `error` makes the API to reject any extra parameter in the payload. Only parameters specified in this API can be indicated, or a 400 error is returned. |
38
+
| extra-parameters || string | The behavior of the API when extra parameters are indicated in the payload. Using `pass-through` makes the API to pass the parameter to the underlying model. Use this value when you want to pass parameters that you know the underlying model can support. Using `ignore` makes the API to drop any unsupported parameter. Use this value when you need to use the same payload across different models, but one of the extra parameters may make a model to error out if not supported. Using `error` makes the API to reject any extra parameter in the payload. Only parameters specified in this API can be indicated, or a 400 error is returned. |
39
39
| azureml-model-deployment || string | Name of the deployment you want to route the request to. Supported for endpoints that support multiple deployments. |
Copy file name to clipboardExpand all lines: articles/ai-studio/reference/reference-model-inference-completions.md
+9-1Lines changed: 9 additions & 1 deletion
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -28,6 +28,14 @@ POST /completions?api-version=2024-04-01-preview
28
28
| --- | --- | --- | --- | --- |
29
29
| api-version | query | True | string | The version of the API in the format "YYYY-MM-DD" or "YYYY-MM-DD-preview". |
30
30
31
+
## Request Header
32
+
33
+
34
+
| Name | Required | Type | Description |
35
+
| --- | --- | --- | --- |
36
+
| extra-parameters || string | The behavior of the API when extra parameters are indicated in the payload. Using `pass-through` makes the API to pass the parameter to the underlying model. Use this value when you want to pass parameters that you know the underlying model can support. Using `ignore` makes the API to drop any unsupported parameter. Use this value when you need to use the same payload across different models, but one of the extra parameters may make a model to error out if not supported. Using `error` makes the API to reject any extra parameter in the payload. Only parameters specified in this API can be indicated, or a 400 error is returned. |
37
+
| azureml-model-deployment || string | Name of the deployment you want to route the request to. Supported for endpoints that support multiple deployments. |
38
+
31
39
32
40
## Request Body
33
41
@@ -285,4 +293,4 @@ The object type, which is always "list".
Copy file name to clipboardExpand all lines: articles/ai-studio/reference/reference-model-inference-embeddings.md
+1-1Lines changed: 1 addition & 1 deletion
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -35,7 +35,7 @@ POST /embeddings?api-version=2024-04-01-preview
35
35
36
36
| Name | Required | Type | Description |
37
37
| --- | --- | --- | --- |
38
-
| extra-parameters || string | The behavior of the API when extra parameters are indicated in the payload. Using `allow` makes the API to pass the parameter to the underlying model. Use this value when you want to pass parameters that you know the underlying model can support. Using `drop` makes the API to drop any unsupported parameter. Use this value when you need to use the same payload across different models, but one of the extra parameters may make a model to error out if not supported. Using `error` makes the API to reject any extra parameter in the payload. Only parameters specified in this API can be indicated, or a 400 error is returned. |
38
+
| extra-parameters || string | The behavior of the API when extra parameters are indicated in the payload. Using `pass-through` makes the API to pass the parameter to the underlying model. Use this value when you want to pass parameters that you know the underlying model can support. Using `ignore` makes the API to drop any unsupported parameter. Use this value when you need to use the same payload across different models, but one of the extra parameters may make a model to error out if not supported. Using `error` makes the API to reject any extra parameter in the payload. Only parameters specified in this API can be indicated, or a 400 error is returned. |
39
39
| azureml-model-deployment || string | Name of the deployment you want to route the request to. Supported for endpoints that support multiple deployments. |
Copy file name to clipboardExpand all lines: articles/ai-studio/reference/reference-model-inference-images-embeddings.md
+1-1Lines changed: 1 addition & 1 deletion
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -35,7 +35,7 @@ POST /images/embeddings?api-version=2024-04-01-preview
35
35
36
36
| Name | Required | Type | Description |
37
37
| --- | --- | --- | --- |
38
-
| extra-parameters || string | The behavior of the API when extra parameters are indicated in the payload. Using `allow` makes the API to pass the parameter to the underlying model. Use this value when you want to pass parameters that you know the underlying model can support. Using `drop` makes the API to drop any unsupported parameter. Use this value when you need to use the same payload across different models, but one of the extra parameters may make a model to error out if not supported. Using `error` makes the API to reject any extra parameter in the payload. Only parameters specified in this API can be indicated, or a 400 error is returned. |
38
+
| extra-parameters || string | The behavior of the API when extra parameters are indicated in the payload. Using `pass-through` makes the API to pass the parameter to the underlying model. Use this value when you want to pass parameters that you know the underlying model can support. Using `ignore` makes the API to drop any unsupported parameter. Use this value when you need to use the same payload across different models, but one of the extra parameters may make a model to error out if not supported. Using `error` makes the API to reject any extra parameter in the payload. Only parameters specified in this API can be indicated, or a 400 error is returned. |
39
39
| azureml-model-deployment || string | Name of the deployment you want to route the request to. Supported for endpoints that support multiple deployments. |
0 commit comments