
Commit e59c942

Merge pull request #5646 from MicrosoftDocs/main
6/20/2025 AM Publish
2 parents: fc0b900 + 155bbb1


15 files changed: +418 -361 lines changed


articles/ai-foundry/model-inference/concepts/models.md

Lines changed: 0 additions & 1 deletion
@@ -180,7 +180,6 @@ Microsoft models include various model groups such as MAI models, Phi models, he

 | Model | Type | Tier | Capabilities |
 | ------ | ---- | --- | ------------ |
-| [MAI-DS-R1](https://ai.azure.com/explore/models/MAI-DS-R1/version/1/registry/azureml) | chat-completion <br /> [(with reasoning content)](../how-to/use-chat-reasoning.md) | Global standard | - **Input:** text (163,840 tokens) <br /> - **Output:** (163,840 tokens) <br /> - **Languages:** `en` and `zh` <br /> - **Tool calling:** No <br /> - **Response formats:** Text. |
 | [Phi-4-mini-instruct](https://ai.azure.com/explore/models/Phi-4-mini-instruct/version/1/registry/azureml) | chat-completion | Global standard | - **Input:** text (131,072 tokens) <br /> - **Output:** (4,096 tokens) <br /> - **Languages:** `ar`, `zh`, `cs`, `da`, `nl`, `en`, `fi`, `fr`, `de`, `he`, `hu`, `it`, `ja`, `ko`, `no`, `pl`, `pt`, `ru`, `es`, `sv`, `th`, `tr`, and `uk` <br /> - **Tool calling:** No <br /> - **Response formats:** Text |
 | [Phi-4-multimodal-instruct](https://ai.azure.com/explore/models/Phi-4-multimodal-instruct/version/1/registry/azureml) | chat-completion | Global standard | - **Input:** text, images, and audio (131,072 tokens) <br /> - **Output:** (4,096 tokens) <br /> - **Languages:** `ar`, `zh`, `cs`, `da`, `nl`, `en`, `fi`, `fr`, `de`, `he`, `hu`, `it`, `ja`, `ko`, `no`, `pl`, `pt`, `ru`, `es`, `sv`, `th`, `tr`, and `uk` <br /> - **Tool calling:** No <br /> - **Response formats:** Text |
 | [Phi-4](https://ai.azure.com/explore/models/Phi-4/version/2/registry/azureml) | chat-completion | Global standard | - **Input:** text (16,384 tokens) <br /> - **Output:** (16,384 tokens) <br /> - **Languages:** `en`, `ar`, `bn`, `cs`, `da`, `de`, `el`, `es`, `fa`, `fi`, `fr`, `gu`, `ha`, `he`, `hi`, `hu`, `id`, `it`, `ja`, `jv`, `kn`, `ko`, `ml`, `mr`, `nl`, `no`, `or`, `pa`, `pl`, `ps`, `pt`, `ro`, `ru`, `sv`, `sw`, `ta`, `te`, `th`, `tl`, `tr`, `uk`, `ur`, `vi`, `yo`, and `zh` <br /> - **Tool calling:** No <br /> - **Response formats:** Text |

articles/ai-foundry/model-inference/includes/use-chat-reasoning/about-reasoning.md

Lines changed: 2 additions & 2 deletions
@@ -9,7 +9,7 @@ author: santiagxf

 ## Reasoning models

-Reasoning models can reach higher levels of performance in domains like math, coding, science, strategy, and logistics. The way these models produces outputs is by explicitly using chain of thought to explore all possible paths before generating an answer. They verify their answers as they produce them which helps them to arrive to better more accurate conclusions. This means that reasoning models may require less context in prompting in order to produce effective results.
+Reasoning models can reach higher levels of performance in domains like math, coding, science, strategy, and logistics. The way these models produce outputs is by explicitly using chain of thought to explore all possible paths before generating an answer. They verify their answers as they produce them which helps them to arrive to better more accurate conclusions. This means that reasoning models may require less context in prompting in order to produce effective results.

 Such way of scaling model's performance is referred as *inference compute time* as it trades performance against higher latency and cost. It contrasts to other approaches that scale through *training compute time*.

@@ -19,4 +19,4 @@ Reasoning models then produce two types of outputs:
 > * Reasoning completions
 > * Output completions

-Both of these completions count towards content generated from the model and hence, towards the token limits and costs associated with the model. Some models may output the reasoning content, like `DeepSeek-R1`. Some others, like `o1`, only outputs the output piece of the completions.
+Both of these completions count towards content generated from the model and hence, towards the token limits and costs associated with the model. Some models may output the reasoning content, like `DeepSeek-R1`. Some others, like `o1`, only outputs the output piece of the completions.
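For context on the change above: because reasoning completions are billed like any other output tokens, callers often want to separate them from the final answer. A minimal sketch, assuming the `azure-ai-inference` Python package and a `DeepSeek-R1` deployment that wraps its chain of thought in `<think>` tags; the endpoint and key environment variable names are placeholders.

```python
import os
import re

from azure.ai.inference import ChatCompletionsClient
from azure.ai.inference.models import UserMessage
from azure.core.credentials import AzureKeyCredential

# Placeholder environment variables for the endpoint and key.
client = ChatCompletionsClient(
    endpoint=os.environ["AZURE_INFERENCE_ENDPOINT"],
    credential=AzureKeyCredential(os.environ["AZURE_INFERENCE_CREDENTIAL"]),
)

response = client.complete(
    model="DeepSeek-R1",
    messages=[UserMessage(content="How many languages are in the world?")],
)

content = response.choices[0].message.content

# DeepSeek-R1 interleaves its reasoning inside <think>...</think>; split it
# from the final answer so each part can be handled (and counted) separately.
match = re.search(r"<think>(.*?)</think>", content, flags=re.DOTALL)
reasoning = match.group(1).strip() if match else ""
answer = re.sub(r"<think>.*?</think>", "", content, flags=re.DOTALL).strip()

print("Reasoning content:", reasoning[:200])
print("Answer:", answer)
# Both pieces are part of the completion, so both count here.
print("Total completion tokens:", response.usage.completion_tokens)
```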

articles/ai-services/openai/how-to/fine-tuning-direct-preference-optimization.md

Lines changed: 2 additions & 2 deletions
@@ -51,7 +51,7 @@ Training datasets must be in `jsonl` format:

 ## Direct preference optimization model support

-- `gpt-4o-2024-08-06` supports direct preference optimization in its respective fine-tuning regions. Latest region availability is updated in the [models page](../concepts/models.md#fine-tuning-models)
+- `gpt-4o-2024-08-06`,`gpt-4.1-2025-04-14`,`gpt-4.1-mini-2025-04-14` supports direct preference optimization in its respective fine-tuning regions. Latest region availability is updated in the [models page](../concepts/models.md#fine-tuning-models)

 Users can use preference fine tuning with base models as well as models that have already been fine-tuned using supervised fine-tuning as long as they are of a supported model/version.

@@ -70,4 +70,4 @@ Users can use preference fine tuning with base models as well as models that hav

 - Explore the fine-tuning capabilities in the [Azure OpenAI fine-tuning tutorial](../tutorials/fine-tune.md).
 - Review fine-tuning [model regional availability](../concepts/models.md#fine-tuning-models)
-- Learn more about [Azure OpenAI quotas](../quotas-limits.md)
+- Learn more about [Azure OpenAI quotas](../quotas-limits.md)
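For context on the training data this article refers to: direct preference optimization expects `jsonl` lines that pair a prompt with a preferred and a non-preferred response. A minimal sketch of writing one such line, assuming the field layout (`input`, `preferred_output`, `non_preferred_output`) described in the fine-tuning docs; the example content is illustrative only.

```python
import json

# One preference pair per jsonl line: a prompt plus a preferred and a
# non-preferred assistant response (illustrative content only).
example = {
    "input": {
        "messages": [
            {"role": "user", "content": "Summarize the plot of Hamlet in one sentence."}
        ]
    },
    "preferred_output": [
        {
            "role": "assistant",
            "content": "A Danish prince feigns madness while avenging his father's murder, and nearly everyone dies.",
        }
    ],
    "non_preferred_output": [
        {"role": "assistant", "content": "Hamlet is a play by Shakespeare."}
    ],
}

with open("dpo_training.jsonl", "w", encoding="utf-8") as f:
    f.write(json.dumps(example) + "\n")
```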

articles/ai-services/openai/includes/global-batch-limits.md

Lines changed: 6 additions & 0 deletions
@@ -24,19 +24,25 @@ The table shows the batch quota limit. Quota values for global batch are represe

 |Model|Enterprise agreement|Default| Monthly credit card based subscriptions | MSDN subscriptions | Azure for Students, Free Trials |
 |---|---|---|---|---|---|
+| `gpt-4.1`| 5 B | 200 M | 50 M | 90 K | N/A |
+| `gpt-4.1 mini` | 15B | 1B | 50M | 90k | N/A |
+| `gpt-4.1-nano` | 15 B | 1 B | 50 M | 90 K | N/A |
 | `gpt-4o` | 5 B | 200 M | 50 M | 90 K | N/A|
 | `gpt-4o-mini` | 15 B | 1 B | 50 M | 90 K | N/A |
 | `gpt-4-turbo` | 300 M | 80 M | 40 M | 90 K | N/A |
 | `gpt-4` | 150 M | 30 M | 5 M | 100 K | N/A |
 | `gpt-35-turbo` | 10 B | 1 B | 100 M | 2 M | 50 K |
 | `o3-mini`| 15 B | 1 B | 50 M | 90 K | N/A |
+| `o4-mini` | 15 B | 1 B | 50 M | 90 K | N/A |

 B = billion | M = million | K = thousand

 ### Data zone batch

 |Model|Enterprise agreement|Default| Monthly credit card based subscriptions | MSDN subscriptions | Azure for Students, Free Trials |
 |---|---|---|---|---|---|
+| `gpt-4.1` | 500 M | 30 M | 30 M | 90 K | N/A|
+| `gpt-4.1-mini` | 1.5 B | 100 M | 50 M | 90 K | N/A |
 | `gpt-4o` | 500 M | 30 M | 30 M | 90 K | N/A|
 | `gpt-4o-mini` | 1.5 B | 100 M | 50 M | 90 K | N/A |
 | `o3-mini` | 1.5 B | 100 M | 50 M | 90 K | N/A |
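For context on how these limits are consumed: global batch quota is expressed in enqueued tokens, so the size of the `jsonl` input file you submit is what counts against the per-model limits above. A minimal sketch of enqueuing a batch job, assuming the `openai` Python package's `AzureOpenAI` client; the deployment name `gpt-4.1`, the API version, the `/chat/completions` URL form, and the environment variable names are placeholder assumptions.

```python
import json
import os

from openai import AzureOpenAI

# Placeholder endpoint, key, and API version for an Azure OpenAI resource.
client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2025-03-01-preview",
)

# Each line of the input file is one request; the tokens it enqueues count
# against the per-model batch quota shown in the table above.
tasks = [
    {
        "custom_id": f"task-{i}",
        "method": "POST",
        "url": "/chat/completions",
        "body": {
            "model": "gpt-4.1",  # deployment name (placeholder)
            "messages": [{"role": "user", "content": f"Summarize item {i}."}],
        },
    }
    for i in range(3)
]

with open("batch_input.jsonl", "w", encoding="utf-8") as f:
    for task in tasks:
        f.write(json.dumps(task) + "\n")

# Upload the file and create the batch job with a 24-hour completion window.
batch_file = client.files.create(file=open("batch_input.jsonl", "rb"), purpose="batch")
batch = client.batches.create(
    input_file_id=batch_file.id,
    endpoint="/chat/completions",
    completion_window="24h",
)
print(batch.id, batch.status)
```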

articles/ai-services/openai/quotas-limits.md

Lines changed: 4 additions & 0 deletions
@@ -128,11 +128,15 @@ The following sections provide you with a quick guide to the default quotas and

 | Model|Tier| Quota Limit in tokens per minute (TPM) | Requests per minute |
 |---|---|:---:|:---:|
+| `codex-mini`| Enterprise agreement | 10 K | 10 M |
+| `o3-pro` | Enterprise agreement | 160 | 1.6 M |
 | `o4-mini` | Enterprise agreement | 10 M | 10 K |
 | `o3` | Enterprise agreement | 10 M | 10 K |
 | `o3-mini` | Enterprise agreement | 50 M | 5 K |
 | `o1` & `o1-preview` | Enterprise agreement | 30 M | 5 K |
 | `o1-mini`| Enterprise agreement | 50 M | 5 K |
+| `codex-mini`| Default | 1 K | 1 M |
+| `o3-pro` | Default | 1.6 K | 16 M |
 | `o4-mini` | Default | 1 M | 1 K |
 | `o3` | Default | 1 M | 1 K |
 | `o3-mini` | Default | 5 M | 500 |

articles/ai-services/openai/whats-new.md

Lines changed: 1 addition & 1 deletion
@@ -22,7 +22,7 @@ This article provides a summary of the latest releases and major documentation u

 ### codex-mini & o3-pro models released

-- `codex-mini and `o3-pro` are now available. To learn more, see the [getting started with reasoning models page](./how-to/reasoning.md)
+- `codex-mini` and `o3-pro` are now available. To learn more, see the [getting started with reasoning models page](./how-to/reasoning.md)

 ## May 2025
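For context on the entry above: a minimal sketch of calling one of the newly released models, assuming the `openai` Python package's `AzureOpenAI` client and that `o3-pro` is exposed through the Responses API; the deployment name, API version, and environment variable names are placeholder assumptions.

```python
import os

from openai import AzureOpenAI

# Placeholder endpoint, key, and API version for an Azure OpenAI resource.
client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2025-04-01-preview",
)

# Call the o3-pro deployment through the Responses API and print the text output.
response = client.responses.create(
    model="o3-pro",  # deployment name (placeholder)
    input="Outline a test plan for a currency-conversion function.",
)
print(response.output_text)
```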

articles/machine-learning/includes/machine-learning-sdk-v1.md

Lines changed: 1 addition & 1 deletion
@@ -6,4 +6,4 @@ ms.date: 04/22/2022
 ms.author: sgilley
 ---

-**APPLIES TO:** :::image type="icon" source="../media/yes.png" border="false"::: [Python SDK azureml **v1**](/python/api/overview/azure/ml/?view=azure-ml-py&preserve-view=true)
+**APPLIES TO:** :::image type="icon" source="../media/yes.png" border="false"::: [Azure Machine Learning SDK v1 for Python](/python/api/overview/azure/ml/?view=azure-ml-py&preserve-view=true)
