> * [Jais](how-to-deploy-jais-models.md) family of models
> * [Jamba](how-to-deploy-models-jamba.md) family of models
> * [Phi-3](how-to-deploy-models-phi-3.md) family of models
Models deployed to [managed inference](concept-endpoints-online.md):
The API is compatible with Azure OpenAI model deployments.
> [!NOTE]
> The Azure AI model inference API is available in managed inference (Managed Online Endpoints) for __models deployed after June 24, 2024__. To take advantage of the API, redeploy your endpoint if the model was deployed before that date.
## Capabilities
The following section describes some of the capabilities the API exposes. For a full specification of the API, view the [reference section](reference-model-inference-info.md).
)
```
If you are using an endpoint with support for Microsoft Entra ID, you can create your client as follows:
```python
import os
from azure.ai.inference import ChatCompletionsClient
from azure.identity import DefaultAzureCredential

model = ChatCompletionsClient(
    endpoint=os.environ["AZUREAI_ENDPOINT_URL"],
    credential=DefaultAzureCredential(),
)
```
Explore our [samples](https://github.com/Azure/azure-sdk-for-python/tree/main/sdk/ai/azure-ai-inference/samples) and read the [API reference documentation](https://aka.ms/azsdk/azure-ai-inference/python/reference) to get yourself started.
# [JavaScript](#tab/javascript)
);
```
For endpoints with support for Microsoft Entra ID, create your client with a token credential such as `DefaultAzureCredential` from the `@azure/identity` package instead of a key credential.
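A minimal sketch, mirroring the Python example above and assuming the `@azure-rest/ai-inference` and `@azure/identity` packages are installed:

```javascript
import ModelClient from "@azure-rest/ai-inference";
import { DefaultAzureCredential } from "@azure/identity";

// DefaultAzureCredential picks up Microsoft Entra ID credentials from the
// environment (Azure CLI login, managed identity, etc.) instead of an API key.
const client = new ModelClient(
    process.env.AZUREAI_ENDPOINT_URL,
    new DefaultAzureCredential()
);
```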
Explore our [samples](https://github.com/Azure/azure-sdk-for-js/tree/main/sdk/ai/ai-inference-rest/samples) and read the [API reference documentation](https://aka.ms/AAp1kxa) to get yourself started.
# [REST](#tab/rest)
        "safe_mode": True
    }
)

print(response.choices[0].message.content)
```
> [!TIP]
> When using the Azure AI Inference SDK, passing extra parameters with `model_extras` automatically configures the request with `extra-parameters: pass-through` for you.
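As an illustrative, standard-library-only sketch (nothing is actually sent over the wire, and the values shown are assumptions based on the REST examples in this article), the request produced by `model_extras` looks roughly like this: the extras are merged into the JSON body, and the `extra-parameters` header is set to `pass-through`:

```python
import json

# Hypothetical illustration of the HTTP request shape: the extra parameter
# lands in the JSON body, while the "extra-parameters" header tells the
# endpoint to forward unknown parameters to the model instead of rejecting
# the request.
headers = {
    "Content-Type": "application/json",
    "extra-parameters": "pass-through",
}
body = {
    "messages": [
        {"role": "user", "content": "How many languages are in the world?"}
    ],
    "safe_mode": True,  # model-specific extra parameter
}

payload = json.dumps(body)
```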
# [JavaScript](#tab/javascript)
```javascript
var response = await client.path("/chat/completions").post({
```
> [!NOTE]
> The default value for `extra-parameters` is `error`, which returns an error if an extra parameter is indicated in the payload. Alternatively, you can set `extra-parameters: drop` to drop any unknown parameter in the request. Use this capability when you're sending requests with extra parameters that you know the model doesn't support but you want the request to complete anyway. A typical example is indicating the `seed` parameter.
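To make the header values concrete, here is a hedged, standard-library sketch of the behavior described above. It models how an endpoint might treat unknown payload keys; it is not actual service code, and the `KNOWN_PARAMS` set is purely illustrative:

```python
# Illustrative model of "extra-parameters" handling; not service code.
KNOWN_PARAMS = {"messages", "temperature", "max_tokens"}

def apply_extra_parameters_mode(payload: dict, mode: str = "error") -> dict:
    unknown = set(payload) - KNOWN_PARAMS
    if not unknown:
        return payload
    if mode == "error":         # default: reject requests with unknown keys
        raise ValueError(f"Unknown parameters: {sorted(unknown)}")
    if mode == "drop":          # silently discard unknown keys
        return {k: v for k, v in payload.items() if k in KNOWN_PARAMS}
    if mode == "pass-through":  # forward everything to the model
        return payload
    raise ValueError(f"Unsupported mode: {mode}")

request = {"messages": [], "seed": 42}
cleaned = apply_extra_parameters_mode(request, "drop")
```

With `drop`, the unsupported `seed` key is removed before the request reaches the model; with the default `error`, the same request would be rejected.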
### Models with disparate set of capabilities
The following example shows the response for a chat completion request indicating JSON output:
# [Python](#tab/python)
```python
import json
from azure.ai.inference.models import SystemMessage, UserMessage, ChatCompletionsResponseFormat
from azure.core.exceptions import HttpResponseError

try:
    response = model.complete(
```
The following example shows the response for a chat completion request that has triggered content safety:
```python
from azure.ai.inference.models import AssistantMessage, UserMessage, SystemMessage
from azure.core.exceptions import HttpResponseError
```