You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Phi-4-mini-MM-instruct is a lightweight open multimodal foundation model that leverages the language, vision, and speech research and datasets used for Phi-3.5 and 4.0 models. The model processes text, image, and audio inputs, and generates text outputs. The model underwent an enhancement process, incorporating both supervised fine-tuning, and direct preference optimization to support precise instruction adherence and safety measures.
34
+
Phi-4-multimodal-instruct is a lightweight open multimodal foundation model that leverages the language, vision, and speech research and datasets used for Phi-3.5 and 4.0 models. The model processes text, image, and audio inputs, and generates text outputs. The model underwent an enhancement process, incorporating both supervised fine-tuning, and direct preference optimization to support precise instruction adherence and safety measures.
35
35
36
-
The Phi-4-mini-MM-instruct model comes in the following variant with a 128K token length.
36
+
The Phi-4-multimodal-instruct model comes in the following variant with a 128K token length.
> Phi-4-mini-MM-instruct, Phi-4-mini-instruct, and Phi-4 don't support system messages (`role="system"`). When you use the Azure AI model inference API, system messages are translated to user messages, which is the closest capability available. This translation is offered for convenience, but it's important for you to verify that the model is following the instructions in the system message with the right level of confidence.
191
+
> Phi-4-multimodal-instruct, Phi-4-mini-instruct, and Phi-4 don't support system messages (`role="system"`). When you use the Azure AI model inference API, system messages are translated to user messages, which is the closest capability available. This translation is offered for convenience, but it's important for you to verify that the model is following the instructions in the system message with the right level of confidence.
192
192
193
193
The response is as follows, where you can see the model's usage statistics:
Response: As of now, it's estimated that there are about 7,000 languages spoken around the world. However, this number can vary as some languages become extinct and new ones develop. It's also important to note that the number of speakers can greatly vary between languages, with some having millions of speakers and others only a few hundred.
207
-
Model: Phi-4-mini-MM-instruct
207
+
Model: Phi-4-multimodal-instruct
208
208
Usage:
209
209
Prompt tokens: 19
210
210
Total tokens: 91
@@ -356,16 +356,16 @@ except HttpResponseError as ex:
356
356
357
357
The Phi-4 family chat models include the following models:
Phi-4-mini-MM-instruct is a lightweight open multimodal foundation model that leverages the language, vision, and speech research and datasets used for Phi-3.5 and 4.0 models. The model processes text, image, and audio inputs, and generates text outputs. The model underwent an enhancement process, incorporating both supervised fine-tuning, and direct preference optimization to support precise instruction adherence and safety measures.
361
+
Phi-4-multimodal-instruct is a lightweight open multimodal foundation model that leverages the language, vision, and speech research and datasets used for Phi-3.5 and 4.0 models. The model processes text, image, and audio inputs, and generates text outputs. The model underwent an enhancement process, incorporating both supervised fine-tuning, and direct preference optimization to support precise instruction adherence and safety measures.
362
362
363
-
The Phi-4-mini-MM-instruct model comes in the following variant with a 128K token length.
363
+
The Phi-4-multimodal-instruct model comes in the following variant with a 128K token length.
@@ -515,7 +515,7 @@ var response = await client.path("/chat/completions").post({
515
515
```
516
516
517
517
> [!NOTE]
518
-
> Phi-4-mini-MM-instruct, Phi-4-mini-instruct, and Phi-4 don't support system messages (`role="system"`). When you use the Azure AI model inference API, system messages are translated to user messages, which is the closest capability available. This translation is offered for convenience, but it's important for you to verify that the model is following the instructions in the system message with the right level of confidence.
518
+
> Phi-4-multimodal-instruct, Phi-4-mini-instruct, and Phi-4 don't support system messages (`role="system"`). When you use the Azure AI model inference API, system messages are translated to user messages, which is the closest capability available. This translation is offered for convenience, but it's important for you to verify that the model is following the instructions in the system message with the right level of confidence.
519
519
520
520
The response is as follows, where you can see the model's usage statistics:
Response: As of now, it's estimated that there are about 7,000 languages spoken around the world. However, this number can vary as some languages become extinct and new ones develop. It's also important to note that the number of speakers can greatly vary between languages, with some having millions of speakers and others only a few hundred.
538
-
Model: Phi-4-mini-MM-instruct
538
+
Model: Phi-4-multimodal-instruct
539
539
Usage:
540
540
Prompt tokens: 19
541
541
Total tokens: 91
@@ -706,16 +706,16 @@ catch (error) {
706
706
707
707
The Phi-4 family chat models include the following models:
Phi-4-mini-MM-instruct is a lightweight open multimodal foundation model that leverages the language, vision, and speech research and datasets used for Phi-3.5 and 4.0 models. The model processes text, image, and audio inputs, and generates text outputs. The model underwent an enhancement process, incorporating both supervised fine-tuning, and direct preference optimization to support precise instruction adherence and safety measures.
711
+
Phi-4-multimodal-instruct is a lightweight open multimodal foundation model that leverages the language, vision, and speech research and datasets used for Phi-3.5 and 4.0 models. The model processes text, image, and audio inputs, and generates text outputs. The model underwent an enhancement process, incorporating both supervised fine-tuning, and direct preference optimization to support precise instruction adherence and safety measures.
712
712
713
-
The Phi-4-mini-MM-instruct model comes in the following variant with a 128K token length.
713
+
The Phi-4-multimodal-instruct model comes in the following variant with a 128K token length.
> Phi-4-mini-MM-instruct, Phi-4-mini-instruct, and Phi-4 don't support system messages (`role="system"`). When you use the Azure AI model inference API, system messages are translated to user messages, which is the closest capability available. This translation is offered for convenience, but it's important for you to verify that the model is following the instructions in the system message with the right level of confidence.
882
+
> Phi-4-multimodal-instruct, Phi-4-mini-instruct, and Phi-4 don't support system messages (`role="system"`). When you use the Azure AI model inference API, system messages are translated to user messages, which is the closest capability available. This translation is offered for convenience, but it's important for you to verify that the model is following the instructions in the system message with the right level of confidence.
883
883
884
884
The response is as follows, where you can see the model's usage statistics:
Response: As of now, it's estimated that there are about 7,000 languages spoken around the world. However, this number can vary as some languages become extinct and new ones develop. It's also important to note that the number of speakers can greatly vary between languages, with some having millions of speakers and others only a few hundred.
Phi-4-mini-MM-instruct is a lightweight open multimodal foundation model that leverages the language, vision, and speech research and datasets used for Phi-3.5 and 4.0models. The model processes text, image, and audio inputs, and generates text outputs. The model underwent an enhancement process, incorporating both supervised fine-tuning, and direct preference optimization to support precise instruction adherence and safety measures.
1073
+
Phi-4-multimodal-instruct is a lightweight open multimodal foundation model that leverages the language, vision, and speech research and datasets used for Phi-3.5 and 4.0models. The model processes text, image, and audio inputs, and generates text outputs. The model underwent an enhancement process, incorporating both supervised fine-tuning, and direct preference optimization to support precise instruction adherence and safety measures.
1074
1074
1075
-
The Phi-4-mini-MM-instruct model comes in the following variant with a 128K token length.
1075
+
The Phi-4-multimodal-instruct model comes in the following variant with a 128K token length.
@@ -1196,7 +1196,7 @@ The following example shows how you can create a basic chat completions request
1196
1196
```
1197
1197
1198
1198
> [!NOTE]
1199
-
> Phi-4-mini-MM-instruct, Phi-4-mini-instruct, and Phi-4 don't support system messages (`role="system"`). When you use the Azure AI model inference API, system messages are translated to user messages, which is the closest capability available. This translation is offered for convenience, but it's important for you to verify that the model is following the instructions in the system message with the right level of confidence.
1199
+
> Phi-4-multimodal-instruct, Phi-4-mini-instruct, and Phi-4 don't support system messages (`role="system"`). When you use the Azure AI model inference API, system messages are translated to user messages, which is the closest capability available. This translation is offered for convenience, but it's important for you to verify that the model is following the instructions in the system message with the right level of confidence.
1200
1200
1201
1201
The response is as follows, where you can see the model's usage statistics:
1202
1202
@@ -1206,7 +1206,7 @@ The response is as follows, where you can see the model's usage statistics:
1206
1206
"id": "0a1234b5de6789f01gh2i345j6789klm",
1207
1207
"object": "chat.completion",
1208
1208
"created": 1718726686,
1209
-
"model": "Phi-4-mini-MM-instruct",
1209
+
"model": "Phi-4-multimodal-instruct",
1210
1210
"choices": [
1211
1211
{
1212
1212
"index": 0,
@@ -1263,7 +1263,7 @@ You can visualize how streaming generates content:
1263
1263
"id": "23b54589eba14564ad8a2e6978775a39",
1264
1264
"object": "chat.completion.chunk",
1265
1265
"created": 1718726371,
1266
-
"model": "Phi-4-mini-MM-instruct",
1266
+
"model": "Phi-4-multimodal-instruct",
1267
1267
"choices": [
1268
1268
{
1269
1269
"index": 0,
@@ -1286,7 +1286,7 @@ The last message in the stream has `finish_reason` set, indicating the reason fo
1286
1286
"id": "23b54589eba14564ad8a2e6978775a39",
1287
1287
"object": "chat.completion.chunk",
1288
1288
"created": 1718726371,
1289
-
"model": "Phi-4-mini-MM-instruct",
1289
+
"model": "Phi-4-multimodal-instruct",
1290
1290
"choices": [
1291
1291
{
1292
1292
"index": 0,
@@ -1337,7 +1337,7 @@ Explore other parameters that you can specify in the inference client. For a ful
| Model | Offer Availability Region | Hub/Project Region for Deployment | Hub/Project Region for Fine tuning |
56
56
|---------|---------|---------|---------|
57
-
Phi-4 <br> Phi-4-mini-instruct <br> Phi-4-mini-MM-instruct | Not applicable | East US <br> East US 2 <br> North Central US <br> South Central US <br> Sweden Central <br> West US <br> West US 3 | Not available |
57
+
Phi-4 <br> Phi-4-mini-instruct <br> Phi-4-multimodal-instruct | Not applicable | East US <br> East US 2 <br> North Central US <br> South Central US <br> Sweden Central <br> West US <br> West US 3 | Not available |
58
58
Phi-3.5-vision-Instruct | Not applicable | East US <br> East US 2 <br> North Central US <br> South Central US <br> Sweden Central <br> West US <br> West US 3 | Not available |
59
59
Phi-3.5-MoE-Instruct | Not applicable | East US <br> East US 2 <br> North Central US <br> South Central US <br> Sweden Central <br> West US <br> West US 3 | East US 2 |
60
60
Phi-3.5-Mini-Instruct | Not applicable | East US <br> East US 2 <br> North Central US <br> South Central US <br> Sweden Central <br> West US <br> West US 3 | East US 2 | East US 2 |
0 commit comments