Commit 922ed7c

Merge pull request #280908 from santiagxf/santiagxf-patch-1
Update reference-model-inference-api.md
2 parents 7b5a61f + 6b29e7c commit 922ed7c

File tree

1 file changed: +43, -4 lines changed


articles/machine-learning/reference-model-inference-api.md

Lines changed: 43 additions & 4 deletions
@@ -45,6 +45,8 @@ Models deployed to [serverless API endpoints](how-to-deploy-models-serverless.md
 > * [Meta Llama 3 instruct](how-to-deploy-models-llama.md) family of models
 > * [Mistral-Small](how-to-deploy-models-mistral.md)
 > * [Mistral-Large](how-to-deploy-models-mistral.md)
+> * [Jais](deploy-jais-models.md) family of models
+> * [Jamba](how-to-deploy-models-jamba.md) family of models
 > * [Phi-3](how-to-deploy-models-phi-3.md) family of models

 Models deployed to [managed inference](concept-endpoints-online.md):
@@ -56,6 +58,9 @@ Models deployed to [managed inference](concept-endpoints-online.md):

 The API is compatible with Azure OpenAI model deployments.

+> [!NOTE]
+> The Azure AI model inference API is available in managed inference (Managed Online Endpoints) for __models deployed after June 24th, 2024__. To take advantage of the API, redeploy your endpoint if the model was deployed before that date.
+
 ## Capabilities

 The following section describes some of the capabilities the API exposes. For a full specification of the API, view the [reference section](reference-model-inference-info.md).
@@ -95,6 +100,19 @@ model = ChatCompletionsClient(
 )
 ```

+If you are using an endpoint with support for Microsoft Entra ID, you can create your client as follows:
+
+```python
+import os
+from azure.ai.inference import ChatCompletionsClient
+from azure.identity import DefaultAzureCredential
+
+model = ChatCompletionsClient(
+    endpoint=os.environ["AZUREAI_ENDPOINT_URL"],
+    credential=DefaultAzureCredential(),
+)
+```
+
 Explore our [samples](https://github.com/Azure/azure-sdk-for-python/tree/main/sdk/ai/azure-ai-inference/samples) and read the [API reference documentation](https://aka.ms/azsdk/azure-ai-inference/python/reference) to get yourself started.

 # [JavaScript](#tab/javascript)
@@ -118,6 +136,19 @@ const client = new ModelClient(
 );
 ```

+For endpoints with support for Microsoft Entra ID, you can create your client as follows:
+
+```javascript
+import ModelClient from "@azure-rest/ai-inference";
+import { isUnexpected } from "@azure-rest/ai-inference";
+import { DefaultAzureCredential } from "@azure/identity";
+
+const client = new ModelClient(
+    process.env.AZUREAI_ENDPOINT_URL,
+    new DefaultAzureCredential()
+);
+```
+
 Explore our [samples](https://github.com/Azure/azure-sdk-for-js/tree/main/sdk/ai/ai-inference-rest/samples) and read the [API reference documentation](https://aka.ms/AAp1kxa) to get yourself started.

 # [REST](#tab/rest)
@@ -153,8 +184,13 @@ response = model.complete(
         "safe_mode": True
     }
 )
+
+print(response.choices[0].message.content)
 ```

+> [!TIP]
+> When you use the Azure AI Inference SDK, passing `model_extras` automatically configures the request with `extra-parameters: pass-through` for you.
+
 # [JavaScript](#tab/javascript)

 ```javascript
@@ -170,6 +206,8 @@ var response = await client.path("/chat/completions").post({
         safe_mode: true
     }
 });
+
+console.log(response.body.choices[0].message.content)
 ```

 # [REST](#tab/rest)
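The `print` and `console.log` lines added in the hunks above read the first choice's message content from the completion. With a hypothetical raw JSON response body (the shape is assumed from those lines, as the REST tab would return it), the same extraction can be sketched as:

```python
import json

# Hypothetical chat-completions response body; the "choices" -> "message"
# -> "content" shape is assumed from the print/console.log lines above.
raw = '{"choices": [{"message": {"role": "assistant", "content": "Hello!"}}]}'
body = json.loads(raw)
content = body["choices"][0]["message"]["content"]
print(content)
```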
@@ -204,8 +242,8 @@ extra-parameters: pass-through

 ---

-> [!TIP]
-> The default value for `extra-parameters` is `error` which returns an error if an extra parameter is indicated in the payload. Alternatively, you can set `extra-parameters: ignore` to drop any unknown parameter in the request. Use this capability in case you happen to be sending requests with extra parameters that you know the model won't support but you want the request to completes anyway. A typical example of this is indicating `seed` parameter.
+> [!NOTE]
+> The default value for `extra-parameters` is `error`, which returns an error if an extra parameter is indicated in the payload. Alternatively, you can set `extra-parameters: drop` to drop any unknown parameter in the request. Use this capability when you send requests with extra parameters that you know the model won't support but want the request to complete anyway. A typical example is the `seed` parameter.

 ### Models with disparate set of capabilities

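The three `extra-parameters` modes in the note above (`error`, `drop`, `pass-through`) can be illustrated with a small sketch. This is an assumed model of the header's semantics for illustration only, not the service's actual implementation; the function name and parameter set are hypothetical:

```python
def apply_extra_parameters(payload: dict, known: set, mode: str = "error") -> dict:
    """Illustrative sketch of the `extra-parameters` header semantics."""
    unknown = {k for k in payload if k not in known}
    if mode == "error":
        # Default: reject the request if any unknown parameter is present.
        if unknown:
            raise ValueError(f"unknown parameters: {sorted(unknown)}")
        return payload
    if mode == "drop":
        # Silently discard unknown parameters before forwarding.
        return {k: v for k, v in payload.items() if k in known}
    # "pass-through": forward the payload unchanged, unknown parameters included.
    return payload

known = {"messages", "temperature"}
request = {"messages": [], "temperature": 0, "seed": 42}
```

For example, `apply_extra_parameters(request, known, "drop")` removes `seed`, while `"pass-through"` forwards it untouched.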
@@ -216,9 +254,9 @@ The following example shows the response for a chat completion request indicatin
 # [Python](#tab/python)

 ```python
-from azure.ai.inference.models import ChatCompletionsResponseFormat
-from azure.core.exceptions import HttpResponseError
 import json
+from azure.ai.inference.models import SystemMessage, UserMessage, ChatCompletionsResponseFormat
+from azure.core.exceptions import HttpResponseError

 try:
     response = model.complete(
@@ -332,6 +370,7 @@ The following example shows the response for a chat completion request that has

 ```python
 from azure.ai.inference.models import AssistantMessage, UserMessage, SystemMessage
+from azure.core.exceptions import HttpResponseError

 try:
     response = model.complete(
