* A chat completions model deployment. If you don't have one read [Add and configure models to Azure AI services](../../how-to/create-model-deployments.md) to add a chat completions model to your resource.
-
-* Install the [Azure AI inference package](https://aka.ms/azsdk/azure-ai-inference/python/reference) with the following command:
* If you are using Entra ID, you also need the following package:
+
* A chat completions model deployment. If you don't have one read [Add and configure models to Azure AI services](../../how-to/create-model-deployments.md) to add a chat completions model to your resource.
* A chat completions model deployment. If you don't have one read [Add and configure models to Azure AI services](../../how-to/create-model-deployments.md) to add a chat completions model to your resource.

-* Add the [Azure AI inference package](https://aka.ms/azsdk/azure-ai-inference/java/reference) to your project:
-
-```xml
-<dependency>
-<groupId>com.azure</groupId>
-<artifactId>azure-ai-inference</artifactId>
-<version>1.0.0-beta.1</version>
-</dependency>
-```
-
-* If you are using Entra ID, you also need the following package:
> Some models don't support system messages (`role="system"`). When you use the Azure AI model inference API, system messages are translated to user messages, which is the closest capability available. This translation is offered for convenience, but it's important for you to verify that the model is following the instructions in the system message with the right level of confidence.

The response is as follows, where you can see the model's usage statistics:

+```java
+System.out.printf("Model ID=%s is created at %s.%n", chatCompletions.getId(), chatCompletions.getCreated());
+for (ChatChoice choice : chatCompletions.getChoices()) {
Response: As of now, it's estimated that there are about 7,000 languages spoken around the world. However, this number can vary as some languages become extinct and new ones develop. It's also important to note that the number of speakers can greatly vary between languages, with some having millions of speakers and others only a few hundred.
Model: mistral-large-2407
@@ -93,7 +96,26 @@ By default, the completions API returns the entire generated content in a single
You can _stream_ the content to get it as it's being generated. Streaming content allows you to start processing the completion as content becomes available. This mode returns an object that streams back the response as [data-only server-sent events](https://html.spec.whatwg.org/multipage/server-sent-events.html#server-sent-events). Extract chunks from the delta field, rather than the message field.
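
As a rough illustration of the streaming format described above, the following Python sketch parses data-only server-sent events and concatenates the text found in each chunk's `delta` field. The sample events, the `[DONE]` terminator, and the helper name `collect_stream_text` are assumptions made for illustration and don't come from this article; a real client would read the lines from the HTTP response instead.

```python
import json
from typing import Iterable


def collect_stream_text(lines: Iterable[str]) -> str:
    """Concatenate delta content from data-only server-sent events."""
    parts = []
    for raw in lines:
        line = raw.strip()
        # Data-only events carry their payload on lines starting with "data:".
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        # Assumed end-of-stream sentinel; adjust if your service differs.
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            # Streaming chunks use "delta" rather than "message".
            delta = choice.get("delta") or {}
            content = delta.get("content")
            if content:
                parts.append(content)
    return "".join(parts)


# Hypothetical events, shaped like a chat completions stream.
sample_events = [
    'data: {"choices": [{"delta": {"role": "assistant", "content": ""}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "Hello"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": ", world"}}]}',
    "",
    "data: [DONE]",
]

print(collect_stream_text(sample_events))
```

Because each chunk is handled as it arrives, the same loop could print or forward partial text instead of collecting it in a list.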

-You can visualize how streaming generates content:
#### Explore more parameters supported by the inference client
@@ -141,29 +163,3 @@ The following example shows how to handle events when the model detects harmful

> [!TIP]
> To learn more about how you can configure and control Azure AI content safety settings, check the [Azure AI content safety documentation](https://aka.ms/azureaicontentsafety).
-
-## Use chat completions with images
-
-Some models can reason across text and images and generate text completions based on both kinds of input. In this section, you explore the capabilities of some models for vision in a chat fashion:
-
-> [!IMPORTANT]
-> Some models support only one image for each turn in the chat conversation and only the last image is retained in context. If you add multiple images, it results in an error.
-
-To see this capability, download an image and encode the information as a `base64` string. The resulting data should be inside of a [data URL](https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/Data_URLs):
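
The encoding step described above can be sketched in Python as follows. This is a minimal illustration that uses only the standard library; the helper name `to_data_url`, the file name `chart.jpg`, and the three placeholder bytes standing in for a downloaded image are assumptions, not part of the original article.

```python
import base64
import mimetypes


def to_data_url(image_bytes: bytes, filename: str) -> str:
    """Wrap raw image bytes in a base64 data URL."""
    # Guess the media type from the file extension, with a generic fallback.
    mime_type, _ = mimetypes.guess_type(filename)
    if mime_type is None:
        mime_type = "application/octet-stream"
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# Placeholder bytes (the start of a JPEG header) stand in for a real image.
sample = to_data_url(b"\xff\xd8\xff", "chart.jpg")
print(sample)
```

In practice, you would read the bytes from the downloaded file and pass the resulting string wherever the request expects an image URL.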
-
-Visualize the image:
-
-:::image type="content" source="../../../../ai-foundry/media/how-to/sdks/small-language-models-chart-example.jpg" alt-text="A chart displaying the relative capabilities between large language models and small language models." lightbox="../../../../ai-foundry/media/how-to/sdks/small-language-models-chart-example.jpg":::
-
-Now, create a chat completion request with the image:
-
-The response is as follows, where you can see the model's usage statistics:
-
-```console
-ASSISTANT: The chart illustrates that larger models tend to perform better in quality, as indicated by their size in billions of parameters. However, there are exceptions to this trend, such as Phi-3-medium and Phi-3-small, which outperform smaller models in quality. This suggests that while larger models generally have an advantage, there might be other factors at play that influence a model's performance.
* A chat completions model deployment. If you don't have one read [Add and configure models to Azure AI services](../../how-to/create-model-deployments.md) to add a chat completions model to your resource.
+
+* This example uses `mistral-large-2407`.

## Use chat completions

First, create the client to consume the model. The following code uses an endpoint URL and key that are stored in environment variables.

-
```python
import os
from azure.ai.inference import ChatCompletionsClient
```
articles/ai-foundry/model-inference/includes/use-chat-completions/rest.md
6 additions & 2 deletions
@@ -26,24 +26,28 @@ To use chat completion models in your application, you need:

* A chat completions model deployment. If you don't have one read [Add and configure models to Azure AI services](../../how-to/create-model-deployments.md) to add a chat completions model to your resource.

+* This example uses `mistral-large-2407`.
+
## Use chat completions

-To use chat completions API, use the route `/chat/completions` appended to the base URL along with your credential indicated in `api-key`. `Authorization` header is also supported with the format `Bearer <key>`.
+To use chat completions API, use the route `/chat/completions` appended to the base URL along with your credential indicated in `api-key`.

```http
POST https://<resource>.services.ai.azure.com/models/chat/completions?api-version=2024-05-01-preview
Content-Type: application/json
api-key: <key>
```

-If you have configured the resource with **Microsoft Entra ID** support, pass your token in the `Authorization` header:
+If you have configured the resource with **Microsoft Entra ID** support, pass your token in the `Authorization` header with the format `Bearer <token>`. Use scope `https://cognitiveservices.azure.com/.default`.

```http
POST https://<resource>.services.ai.azure.com/models/chat/completions?api-version=2024-05-01-preview
Content-Type: application/json
Authorization: Bearer <token>
```

+Using Microsoft Entra ID may require additional configuration in your resource to grant access. Learn how to [configure key-less authentication with Microsoft Entra ID](../../how-to/configure-entra-id.md).
+
### Create a chat completion request

The following example shows how you can create a basic chat completions request to the model.
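
The request body itself isn't visible in this hunk, so the following Python sketch shows one plausible way to assemble such a request using only the standard library. The route, API version, and header names come from the text above; the helper name `build_chat_request`, the resource name `my-resource`, the placeholder credential, and the message contents are hypothetical. The request is only constructed here, not sent.

```python
import json
import urllib.request

API_VERSION = "2024-05-01-preview"


def build_chat_request(
    resource: str, credential: str, use_entra_id: bool = False
) -> urllib.request.Request:
    """Build (but don't send) a POST request to the /chat/completions route."""
    url = (
        f"https://{resource}.services.ai.azure.com"
        f"/models/chat/completions?api-version={API_VERSION}"
    )
    headers = {"Content-Type": "application/json"}
    if use_entra_id:
        # Microsoft Entra ID: the token goes in the Authorization header.
        headers["Authorization"] = f"Bearer {credential}"
    else:
        # Key-based authentication: the key goes in the api-key header.
        headers["api-key"] = credential
    # Illustrative body; the message contents are assumptions.
    body = {
        "model": "mistral-large-2407",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "How many languages are in the world?"},
        ],
    }
    return urllib.request.Request(
        url, data=json.dumps(body).encode("utf-8"), headers=headers, method="POST"
    )


request = build_chat_request("my-resource", "my-key")
print(request.get_method(), request.full_url)
```

Passing the result to `urllib.request.urlopen` would send it, provided the resource name and credential are real.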
-This article explains how to use chat completions API with models deployed to Azure AI model inference in Azure AI services.
+This article explains how to use chat completions API with models supporting images or audio deployed to Azure AI model inference in Azure AI services.

## Prerequisites

@@ -28,6 +28,7 @@ To use chat completion models in your application, you need:
* A chat completions model deployment. If you don't have one read [Add and configure models to Azure AI services](../../how-to/create-model-deployments.md) to add a chat completions model to your resource.