Changed file: articles/ai-studio/how-to/deploy-models-phi-4.md (96 additions, 28 deletions)
@@ -5,7 +5,7 @@ description: Learn how to use Phi-4 family chat models with Azure AI Foundry.
ms.service: azure-ai-foundry
manager: scottpolly
ms.topic: how-to
-ms.date: 01/09/2025
+ms.date: 02/25/2025
ms.reviewer: v-vkonjarla
reviewer: VindyaKonjarla
ms.author: mopeakande
@@ -27,18 +27,35 @@ The Phi-4 family of small language models (SLMs) is a collection of instruction-

## Phi-4 family chat models

+The Phi-4 family chat models include the following models:
+
+# [Phi-4-mini-instruct](#tab/phi-4-mini-instruct)
+
+Phi-4-mini-instruct is a lightweight open model built upon synthetic data and filtered publicly available websites - with a focus on high-quality, reasoning dense data. The model belongs to the Phi-4 model family and supports 128K token context length. The model underwent an enhancement process, incorporating both supervised fine-tuning and direct preference optimization to support precise instruction adherence and robust safety measures.
+
+The Phi-4-mini-instruct model comes in the following variant with a 128K token length.
Phi-4 is a state-of-the-art open model built upon a blend of synthetic datasets, data from filtered public domain websites, and acquired academic books and Q&A datasets. The goal of this approach was to ensure that small capable models were trained with data focused on high quality and advanced reasoning.

Phi-4 underwent a rigorous enhancement and alignment process, incorporating both supervised fine-tuning and direct preference optimization to ensure precise instruction adherence and robust safety measures.
+The Phi-4 model comes in the following variant with a 16K token length.

-The Phi-4 models come in the following variants with a 16K tokens length.

-
-You can learn more about the models in their respective model card:
+The following models are available:

* [Phi-4](https://aka.ms/azureai/landing/Phi-4)


+---
+

## Prerequisites

To use Phi-4 family chat models with Azure AI Foundry, you need the following prerequisites:
-> Phi-4 don't support system messages (`role="system"`). When you use the Azure AI model inference API, system messages are translated to user messages, which is the closest capability available. This translation is offered for convenience, but it's important for you to verify that the model is following the instructions in the system message with the right level of confidence.
+> Phi-4-mini-instruct and Phi-4 don't support system messages (`role="system"`). When you use the Azure AI model inference API, system messages are translated to user messages, which is the closest capability available. This translation is offered for convenience, but it's important for you to verify that the model is following the instructions in the system message with the right level of confidence.

The response is as follows, where you can see the model's usage statistics:
Response: As of now, it's estimated that there are about 7,000 languages spoken around the world. However, this number can vary as some languages become extinct and new ones develop. It's also important to note that the number of speakers can greatly vary between languages, with some having millions of speakers and others only a few hundred.
-Model: Phi-4
+Model: Phi-4-mini-instruct
Usage:
Prompt tokens: 19
Total tokens: 91
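The note and console output above belong to the article's Python (`azure-ai-inference`) section. As a rough illustration of the same flow, the sketch below sends only a user message (since these models don't accept `role="system"`) and prints usage statistics like those shown; the endpoint and key environment variable names are placeholders, not values taken from this diff.

```python
import os
from azure.ai.inference import ChatCompletionsClient
from azure.ai.inference.models import UserMessage
from azure.core.credentials import AzureKeyCredential

# Placeholder environment variable names; substitute your deployment's endpoint URL and key.
client = ChatCompletionsClient(
    endpoint=os.environ["AZURE_INFERENCE_ENDPOINT"],
    credential=AzureKeyCredential(os.environ["AZURE_INFERENCE_CREDENTIAL"]),
)

# Put any instructions in the user turn, since system messages are folded into user messages.
response = client.complete(
    messages=[UserMessage(content="How many languages are in the world?")],
)

print("Response:", response.choices[0].message.content)
print("Model:", response.model)
print("Usage:")
print("\tPrompt tokens:", response.usage.prompt_tokens)
print("\tCompletion tokens:", response.usage.completion_tokens)
print("\tTotal tokens:", response.usage.total_tokens)
```

Because instructions travel in the user message for these models, it's worth checking that the response actually honors them, as the note advises.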
@@ -322,18 +339,35 @@ except HttpResponseError as ex:

## Phi-4 family chat models

+The Phi-4 family chat models include the following models:
+
+# [Phi-4-mini-instruct](#tab/phi-4-mini-instruct)
+
+Phi-4-mini-instruct is a lightweight open model built upon synthetic data and filtered publicly available websites - with a focus on high-quality, reasoning dense data. The model belongs to the Phi-4 model family and supports 128K token context length. The model underwent an enhancement process, incorporating both supervised fine-tuning and direct preference optimization to support precise instruction adherence and robust safety measures.
+
+The Phi-4-mini-instruct model comes in the following variant with a 128K token length.
Phi-4 is a state-of-the-art open model built upon a blend of synthetic datasets, data from filtered public domain websites, and acquired academic books and Q&A datasets. The goal of this approach was to ensure that small capable models were trained with data focused on high quality and advanced reasoning.

Phi-4 underwent a rigorous enhancement and alignment process, incorporating both supervised fine-tuning and direct preference optimization to ensure precise instruction adherence and robust safety measures.
+The Phi-4 model comes in the following variant with a 16K token length.

-The Phi-4 models come in the following variants with a 16K tokens length.

-
-You can learn more about the models in their respective model card:
+The following models are available:

* [Phi-4](https://aka.ms/azureai/landing/Phi-4)


+---
+

## Prerequisites

To use Phi-4 family chat models with Azure AI Foundry, you need the following prerequisites:
@@ -454,7 +488,7 @@ var response = await client.path("/chat/completions").post({
```

> [!NOTE]
-> Phi-4 don't support system messages (`role="system"`). When you use the Azure AI model inference API, system messages are translated to user messages, which is the closest capability available. This translation is offered for convenience, but it's important for you to verify that the model is following the instructions in the system message with the right level of confidence.
+> Phi-4-mini-instruct and Phi-4 don't support system messages (`role="system"`). When you use the Azure AI model inference API, system messages are translated to user messages, which is the closest capability available. This translation is offered for convenience, but it's important for you to verify that the model is following the instructions in the system message with the right level of confidence.

The response is as follows, where you can see the model's usage statistics:
Response: As of now, it's estimated that there are about 7,000 languages spoken around the world. However, this number can vary as some languages become extinct and new ones develop. It's also important to note that the number of speakers can greatly vary between languages, with some having millions of speakers and others only a few hundred.
-Model: Phi-4
+Model: Phi-4-mini-instruct
Usage:
Prompt tokens: 19
Total tokens: 91
@@ -640,18 +674,35 @@ catch (error) {

## Phi-4 family chat models

+The Phi-4 family chat models include the following models:
+
+# [Phi-4-mini-instruct](#tab/phi-4-mini-instruct)
+
+Phi-4-mini-instruct is a lightweight open model built upon synthetic data and filtered publicly available websites - with a focus on high-quality, reasoning dense data. The model belongs to the Phi-4 model family and supports 128K token context length. The model underwent an enhancement process, incorporating both supervised fine-tuning and direct preference optimization to support precise instruction adherence and robust safety measures.
+
+The Phi-4-mini-instruct model comes in the following variant with a 128K token length.
Phi-4 is a state-of-the-art open model built upon a blend of synthetic datasets, data from filtered public domain websites, and acquired academic books and Q&A datasets. The goal of this approach was to ensure that small capable models were trained with data focused on high quality and advanced reasoning.

Phi-4 underwent a rigorous enhancement and alignment process, incorporating both supervised fine-tuning and direct preference optimization to ensure precise instruction adherence and robust safety measures.
+The Phi-4 model comes in the following variant with a 16K token length.

-The Phi-4 models come in the following variants with a 16K tokens length.

-
-You can learn more about the models in their respective model card:
+The following models are available:

* [Phi-4](https://aka.ms/azureai/landing/Phi-4)


+---
+

## Prerequisites

To use Phi-4 family chat models with Azure AI Foundry, you need the following prerequisites:
-> Phi-4 don't support system messages (`role="system"`). When you use the Azure AI model inference API, system messages are translated to user messages, which is the closest capability available. This translation is offered for convenience, but it's important for you to verify that the model is following the instructions in the system message with the right level of confidence.
+> Phi-4-mini-instruct and Phi-4 don't support system messages (`role="system"`). When you use the Azure AI model inference API, system messages are translated to user messages, which is the closest capability available. This translation is offered for convenience, but it's important for you to verify that the model is following the instructions in the system message with the right level of confidence.

The response is as follows, where you can see the model's usage statistics:
Response: As of now, it's estimated that there are about 7,000 languages spoken around the world. However, this number can vary as some languages become extinct and new ones develop. It's also important to note that the number of speakers can greatly vary between languages, with some having millions of speakers and others only a few hundred.

+The Phi-4 family chat models include the following models:
+
+# [Phi-4-mini-instruct](#tab/phi-4-mini-instruct)
+
+Phi-4-mini-instruct is a lightweight open model built upon synthetic data and filtered publicly available websites - with a focus on high-quality, reasoning dense data. The model belongs to the Phi-4 model family and supports 128K token context length. The model underwent an enhancement process, incorporating both supervised fine-tuning and direct preference optimization to support precise instruction adherence and robust safety measures.
+
+The Phi-4-mini-instruct model comes in the following variant with a 128K token length.
Phi-4 is a state-of-the-art open model built upon a blend of synthetic datasets, data from filtered public domain websites, and acquired academic books and Q&A datasets. The goal of this approach was to ensure that small capable models were trained with data focused on high quality and advanced reasoning.

Phi-4 underwent a rigorous enhancement and alignment process, incorporating both supervised fine-tuning and direct preference optimization to ensure precise instruction adherence and robust safety measures.
+The Phi-4 model comes in the following variant with a 16K token length.

-The Phi-4 models come in the following variants with a 16K tokens length.

-
-You can learn more about the models in their respective model card:
+The following models are available:

* [Phi-4](https://aka.ms/azureai/landing/Phi-4)


+---
+

## Prerequisites

To use Phi-4 family chat models with Azure AI Foundry, you need the following prerequisites:
@@ -1045,7 +1113,7 @@ The response is as follows:

```json
{
-"model_name": "Phi-4",
+"model_name": "Phi-4-mini-instruct",
"model_type": "chat-completions",
"model_provider_name": "Microsoft"
}
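This hunk only renames the model in the sample output of the model info route. For context, here's a minimal sketch of retrieving those same fields with the Python `azure-ai-inference` client; the environment variable names below are placeholders.

```python
import os
from azure.ai.inference import ChatCompletionsClient
from azure.core.credentials import AzureKeyCredential

# Placeholder environment variable names for the deployment endpoint and key.
client = ChatCompletionsClient(
    endpoint=os.environ["AZURE_INFERENCE_ENDPOINT"],
    credential=AzureKeyCredential(os.environ["AZURE_INFERENCE_CREDENTIAL"]),
)

# Returns metadata about the deployed model, matching the JSON fields shown above.
info = client.get_model_info()
print("Model name:", info.model_name)                    # for example, Phi-4-mini-instruct
print("Model type:", info.model_type)                    # chat-completions
print("Model provider name:", info.model_provider_name)  # Microsoft
```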
@@ -1071,7 +1139,7 @@ The following example shows how you can create a basic chat completions request
```

> [!NOTE]
-> Phi-4 don't support system messages (`role="system"`). When you use the Azure AI model inference API, system messages are translated to user messages, which is the closest capability available. This translation is offered for convenience, but it's important for you to verify that the model is following the instructions in the system message with the right level of confidence.
+> Phi-4-mini-instruct and Phi-4 don't support system messages (`role="system"`). When you use the Azure AI model inference API, system messages are translated to user messages, which is the closest capability available. This translation is offered for convenience, but it's important for you to verify that the model is following the instructions in the system message with the right level of confidence.

The response is as follows, where you can see the model's usage statistics:

@@ -1081,7 +1149,7 @@ The response is as follows, where you can see the model's usage statistics:
"id": "0a1234b5de6789f01gh2i345j6789klm",
"object": "chat.completion",
"created": 1718726686,
-"model": "Phi-4",
+"model": "Phi-4-mini-instruct",
"choices": [
{
"index": 0,
@@ -1138,7 +1206,7 @@ You can visualize how streaming generates content:
"id": "23b54589eba14564ad8a2e6978775a39",
"object": "chat.completion.chunk",
"created": 1718726371,
-"model": "Phi-4",
+"model": "Phi-4-mini-instruct",
"choices": [
{
"index": 0,
@@ -1161,7 +1229,7 @@ The last message in the stream has `finish_reason` set, indicating the reason fo
"id": "23b54589eba14564ad8a2e6978775a39",
"object": "chat.completion.chunk",
"created": 1718726371,
-"model": "Phi-4",
+"model": "Phi-4-mini-instruct",
"choices": [
{
"index": 0,
@@ -1212,7 +1280,7 @@ Explore other parameters that you can specify in the inference client. For a ful

| Model | Offer Availability Region | Hub/Project Region for Deployment | Hub/Project Region for Fine tuning |
|---------|---------|---------|---------|
-| Phi-4 | Not applicable | East US <br> East US 2 <br> North Central US <br> South Central US <br> Sweden Central <br> West US <br> West US 3 | Not available |
+| Phi-4 <br> Phi-4-mini-instruct | Not applicable | East US <br> East US 2 <br> North Central US <br> South Central US <br> Sweden Central <br> West US <br> West US 3 | Not available |
| Phi-3.5-vision-Instruct | Not applicable | East US <br> East US 2 <br> North Central US <br> South Central US <br> Sweden Central <br> West US <br> West US 3 | Not available |
| Phi-3.5-MoE-Instruct | Not applicable | East US <br> East US 2 <br> North Central US <br> South Central US <br> Sweden Central <br> West US <br> West US 3 | East US 2 |
| Phi-3.5-Mini-Instruct | Not applicable | East US <br> East US 2 <br> North Central US <br> South Central US <br> Sweden Central <br> West US <br> West US 3 | East US 2 |