-This article explains how to use chat completions API with multimodel models deployed to Azure AI model inference in Azure AI services. These multimodal models can accept combinations of text, images, and audio input.
+This article explains how to use chat completions API with _multimodal_ models deployed to Azure AI model inference in Azure AI services. In addition to text input, multimodal models can accept other input types, such as images and audio input.
## Prerequisites
@@ -62,6 +62,9 @@ client = new ChatCompletionsClient(
Some models can reason across text and images and generate text completions based on both kinds of input. In this section, you explore the capabilities of some models for vision in a chat fashion:
+> [!IMPORTANT]
+> Some models support only one image for each turn in the chat conversation and only the last image is retained in context. If you add multiple images, it results in an error.
+
To see this capability, download an image and encode the information as a `base64` string. The resulting data should be inside a [data URL](https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/Data_URLs):
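The encoding step above can be sketched with the Python standard library alone. This is a minimal helper, not part of the SDK; the bytes and MIME type passed in are placeholders for the downloaded image:

```python
import base64


def to_data_url(image_bytes: bytes, mime: str = "image/jpeg") -> str:
    """Encode raw image bytes as a base64 data URL."""
    encoded = base64.b64encode(image_bytes).decode("utf-8")
    return f"data:{mime};base64,{encoded}"


# Placeholder payload for illustration; in practice, pass the bytes
# of the image file you downloaded.
print(to_data_url(b"\xff\xd8\xff", "image/jpeg"))
```

The resulting string can then be supplied wherever the API expects an image URL.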
The model can read the content from an **accessible cloud location** by passing the URL as an input. The Python SDK doesn't provide a direct way to do it, but you can indicate the payload as follows:
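As a sketch of what such a payload can look like, the message below follows the OpenAI-compatible chat completions schema (a `text` part plus an `image_url` part); the URL is a placeholder, not a real resource:

```python
# Hypothetical payload pointing the model at an image by URL.
# The message shape follows the OpenAI-compatible chat completions
# schema; replace the placeholder URL with an accessible image.
payload = {
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image."},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/chart.png"},
                },
            ],
        }
    ],
}
```

Because only the last image in a turn is retained by some models, keep a single `image_url` part per user message.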
@@ -57,6 +57,9 @@ const client = new ModelClient(
Some models can reason across text and images and generate text completions based on both kinds of input. In this section, you explore the capabilities of some models for vision in a chat fashion.
+> [!IMPORTANT]
+> Some models support only one image for each turn in the chat conversation and only the last image is retained in context. If you add multiple images, it results in an error.
+
To see this capability, download an image and encode the information as a `base64` string. The resulting data should be inside a [data URL](https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/Data_URLs):
The model can read the content from an **accessible cloud location** by passing the URL as an input. The Python SDK doesn't provide a direct way to do it, but you can indicate the payload as follows: