Merge pull request #6665 from mrbullwinkle/mrb_08_19_2025_quota_stuff

prmerger-automator[bot] · web-flow · commit ef9ca358d8b3 · 2025-08-21T00:38:30.000Z
[Azure OpenAI] Quota updates
diff --git a/articles/ai-foundry/openai/quotas-limits.md b/articles/ai-foundry/openai/quotas-limits.md
@@ -4,7 +4,7 @@ description: This article features detailed descriptions and best practices on t
 author: mrbullwinkle
 ms.author: mbullwin
 manager: nitinme
-ms.date: 08/07/2025
+ms.date: 08/19/2025
 ms.service: azure-ai-openai
 ms.topic: conceptual
 ms.custom:
@@ -80,6 +80,13 @@ The following section provides you with a quick guide to the default quotas and
 | gpt-5-nano  | 5 M             | 150 M              | 2 M                | 50 M                  |
 | gpt-5-chat  | 1 M             | 5 M                | N/A              |    N/A                  |
 
+| Model       | Global Default<br>Requests per minute (RPM)  | Global Enterprise and MCA-E <br>Requests per minute (RPM)  | Data Zone Default <br>Requests per minute (RPM)  | Data Zone Enterprise and MCA-E <br>Requests per minute (RPM) |
+|-------------|----------------------------------------------|------------------------------------------------------------|--------------------------------------------------|--------------------------------------------------------------|
+| gpt-5       | 10 K                                         | 100 K                                                      | 3 K                                              | 30 K                   |
+| gpt-5-mini  | 1 K                                          | 10 K                                                       | 300                                              | 3 K                   |
+| gpt-5-nano  | 5 K                                          | 150 K                                                      | 2 K                                              | 50 K                  |
+| gpt-5-chat  | 1 K                                          | 5 K                                                        | N/A                                              | N/A                  |
+
 
 [!INCLUDE [Quota](./includes/global-batch-limits.md)]
 
@@ -206,7 +213,7 @@ The following section provides you with a quick guide to the default quotas and
 | Model|Tier| Quota limit in tokens per minute | Requests per minute |
 |---|---|:---:|:---:|
 |`gpt-4o`|Enterprise and MCA-E | 30M | 180K |
-|`gpt-4o-mini` | Enterprise and MCA-E | 50M | 300K |
+|`gpt-4o-mini` | Enterprise and MCA-E | 150M | 1.5M |
 |`gpt-4o` |Default | 450K | 2.7K |
 |`gpt-4o-mini` | Default | 2M | 12K  |
 
@@ -276,11 +283,15 @@ The usage limit determines the level of usage above which customers might see la
 
 If your Azure subscription is linked to certain [offer types](https://azure.microsoft.com/support/legal/offer-details/), your maximum quota values are lower than the values indicated in the previous tables.
 
+- GPT-5 reasoning model quota is 20K TPM and 200 RPM for all offer types that do not have access to MCA-E or default quota. GPT-5-chat quota is 50K TPM and 50 RPM.
+
+- Some offer types are restricted to only Global Standard deployments in the East US 2 and Sweden Central regions.
+
 |Tier| Quota limit in tokens per minute |
 |---|:---|
 |`Azure for Students` | 1K (all models) <br>Exception o-series, GPT-4.1, and GPT 4.5 Preview: 0|
 | `MSDN` | GPT-4o-mini: 200K <br> GPT 3.5 Turbo Series: 200K <br> GPT-4 series: 50K <br>computer-use-preview: 8K <br> gpt-4o-realtime-preview: 1K <br> o-series: 0 <br> GPT 4.5 Preview: 0 <br> GPT-4.1: 50K <br> GPT-4.1-nano: 200K  |
-|`Standard` | GPT-4o-mini: 200K <br> GPT 3.5 Turbo Series: 200K <br> GPT-4 series: 50K <br>computer-use-preview: 30K <br> o-series: 0 <br> GPT 4.5 Preview: 0  <br> GPT-4.1: 50K <br> GPT-4.1-nano: 200K  |
+|`Standard` & `Pay-as-you-go` | GPT-4o-mini: 200K <br> GPT 3.5 Turbo Series: 200K <br> GPT-4 series: 50K <br>computer-use-preview: 30K <br> o-series: 0 <br> GPT 4.5 Preview: 0  <br> GPT-4.1: 50K <br> GPT-4.1-nano: 200K  |
 | `Azure_MS-AZR-0111P`  <br> `Azure_MS-AZR-0035P` <br> `Azure_MS-AZR-0025P` <br> `Azure_MS-AZR-0052P` <br>| GPT-4o-mini: 200K <br> GPT 3.5 Turbo Series: 200K <br> GPT-4 series: 50K |
 | `CSP Integration Sandbox` <sup>*</sup> | All models: 0 |
 | `Lightweight trial`<br>`Free trials`<br>`Azure Pass`  | All models: 0 |