Commit 82ac718

Merge pull request #5995 from mrbullwinkle/mrb_07_11_2025_quota_enterprise_update
[Azure OpenAI] Quota updates
2 parents ea982c6 + a7a0825 commit 82ac718

2 files changed: +42 -32 lines changed

articles/ai-foundry/openai/includes/global-batch-limits.md

Lines changed: 2 additions & 2 deletions
Original file line number | Diff line number | Diff line change
@@ -22,7 +22,7 @@ The table shows the batch quota limit. Quota values for global batch are represe
2222

2323
### Global batch
2424

25-
|Model|Enterprise agreement|Default| Monthly credit card based subscriptions | MSDN subscriptions | Azure for Students, Free Trials |
25+
|Model|Enterprise & MCA-E|Default| Monthly credit card based subscriptions | MSDN subscriptions | Azure for Students, Free Trials |
2626
|---|---|---|---|---|---|
2727
| `gpt-4.1`| 5 B | 200 M | 50 M | 90 K | N/A |
2828
| `gpt-4.1 mini` | 15 B | 1 B | 50 M | 90 K | N/A |
@@ -39,7 +39,7 @@ B = billion | M = million | K = thousand
3939

4040
### Data zone batch
4141

42-
|Model|Enterprise agreement|Default| Monthly credit card based subscriptions | MSDN subscriptions | Azure for Students, Free Trials |
42+
|Model|Enterprise & MCA-E|Default| Monthly credit card based subscriptions | MSDN subscriptions | Azure for Students, Free Trials |
4343
|---|---|---|---|---|---|
4444
| `gpt-4.1` | 500 M | 30 M | 30 M | 90 K | N/A|
4545
| `gpt-4.1-mini` | 1.5 B | 100 M | 50 M | 90 K | N/A |
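
The batch tables abbreviate quota values using the legend B = billion, M = million, K = thousand. To compare these limits programmatically, the abbreviations first have to be expanded into plain integers. The following is a minimal sketch of one way to do that; `parse_quota` is a hypothetical helper written for illustration and is not part of any Azure SDK or of this commit.

```python
import re
from decimal import Decimal
from typing import Optional

# Multipliers taken from the table legend: B = billion, M = million, K = thousand.
_MULTIPLIERS = {"": 1, "K": 1_000, "M": 1_000_000, "B": 1_000_000_000}

# A number (optionally with a decimal part), optional whitespace, optional suffix.
_PATTERN = re.compile(r"(\d+(?:\.\d+)?)\s*([KMB]?)", re.IGNORECASE)


def parse_quota(cell: str) -> Optional[int]:
    """Expand a table cell such as '5 B' or '1.5 B' into an integer token count.

    Returns None for 'N/A' cells and raises ValueError for anything else
    that does not match the expected shape.
    """
    text = cell.strip()
    if text.upper() == "N/A":
        return None
    match = _PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"unrecognized quota value: {cell!r}")
    number, suffix = match.groups()
    # Decimal avoids floating-point rounding on values like '1.5'.
    return int(Decimal(number) * _MULTIPLIERS[suffix.upper()])


if __name__ == "__main__":
    for cell in ["5 B", "200 M", "90 K", "1.5 B", "N/A"]:
        print(cell, "->", parse_quota(cell))
```

The suffix match is case-insensitive and tolerates a missing space, so cells written as `15B` or `90k` parse the same way as `15 B` or `90 K`.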

articles/ai-foundry/openai/quotas-limits.md

Lines changed: 40 additions & 30 deletions
Original file line numberDiff line numberDiff line change
@@ -4,7 +4,7 @@ description: Quick reference, detailed description, and best practices on the qu
44
author: mrbullwinkle
55
ms.author: mbullwin
66
manager: nitinme
7-
ms.date: 07/02/2025
7+
ms.date: 07/11/2025
88
ms.service: azure-ai-openai
99
ms.topic: conceptual
1010
ms.custom:
@@ -17,6 +17,16 @@ ms.custom:
1717

1818
This article contains a quick reference and a detailed description of the quotas and limits for Azure OpenAI.
1919

20+
**Scope of quota:**
21+
22+
- Quotas and limits are not enforced at the tenant level.
23+
- Instead, the highest level of quota restriction is scoped at the Azure subscription level.
24+
25+
**Regional quota allocation:**
26+
27+
- Tokens per minute (TPM) and requests per minute (RPM) limits are defined **per region, per subscription, and per model/deployment type**.
28+
- For example, if the `gpt-4.1` global standard model is listed with a quota of **5 million TPM and 5,000 RPM**, then **each region** where that [model/deployment type is available](./concepts/models.md) has its own dedicated quota pool of that size for **each of your Azure subscriptions**. Within a single Azure subscription, you can therefore use more total TPM/RPM quota for a given model/deployment type by spreading resources and model deployments across multiple regions.
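
The regional arithmetic described above can be sketched as follows. This is an illustrative calculation only: `total_quota_across_regions` is a hypothetical helper, the region names are placeholders, and whether a given model/deployment type is actually available in those regions is an assumption that must be checked against the model availability page.

```python
from typing import Dict, Iterable


def total_quota_across_regions(
    per_region_tpm: int,
    per_region_rpm: int,
    regions: Iterable[str],
) -> Dict[str, int]:
    """Sum the per-region quota pools for one subscription and one model/deployment type.

    Each distinct region contributes its own full pool, so the aggregate is
    simply the per-region limit multiplied by the number of distinct regions.
    """
    distinct_regions = set(regions)  # a region counts once, however many resources it hosts
    count = len(distinct_regions)
    return {"tpm": per_region_tpm * count, "rpm": per_region_rpm * count}


if __name__ == "__main__":
    # 5 million TPM and 5,000 RPM per region, as in the gpt-4.1 example above.
    # The three region names are assumptions chosen for illustration.
    totals = total_quota_across_regions(
        5_000_000, 5_000, ["eastus", "westus", "swedencentral"]
    )
    print(totals)  # {'tpm': 15000000, 'rpm': 15000}
```

Note that the aggregate applies per subscription: a second subscription would have its own separate set of regional pools.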
29+
2030
## Quotas and limits reference
2131

2232
The following sections provide you with a quick guide to the default quotas and limits that apply to Azure OpenAI:
@@ -70,29 +80,29 @@ The following sections provide you with a quick guide to the default quotas and
7080

7181
| Model|Tier| Quota Limit in tokens per minute (TPM) | Requests per minute |
7282
|---|---|:---:|:---:|
73-
| `gpt-4.5` | Enterprise Tier | 200 K | 200 |
83+
| `gpt-4.5` | Enterprise & MCA-E | 200 K | 200 |
7484
| `gpt-4.5` | Default | 150 K | 150 |
7585

7686
### GPT-4.1 series global standard
7787

7888
| Model|Tier| Quota Limit in tokens per minute (TPM) | Requests per minute |
7989
|---|---|:---:|:---:|
80-
| `gpt-4.1` (2025-04-14) | Enterprise Tier | 5 M | 5 K |
90+
| `gpt-4.1` (2025-04-14) | Enterprise & MCA-E | 5 M | 5 K |
8191
| `gpt-4.1` (2025-04-14) | Default | 1 M | 1 K |
82-
| `gpt-4.1-nano` (2025-04-14) | Enterprise Tier | 150 M | 150 K |
92+
| `gpt-4.1-nano` (2025-04-14) | Enterprise & MCA-E | 150 M | 150 K |
8393
| `gpt-4.1-nano` (2025-04-14) | Default | 5 M | 5 K |
84-
| `gpt-4.1-mini` (2025-04-14) | Enterprise Tier | 150 M | 150 K |
94+
| `gpt-4.1-mini` (2025-04-14) | Enterprise & MCA-E | 150 M | 150 K |
8595
| `gpt-4.1-mini` (2025-04-14) | Default | 5 M | 5 K |
8696

8797
### GPT-4.1 series data zone standard
8898

8999
| Model|Tier| Quota Limit in tokens per minute (TPM) | Requests per minute |
90100
|---|---|:---:|:---:|
91-
| `gpt-4.1` (2025-04-14) | Enterprise Tier | 2 M | 2 K |
101+
| `gpt-4.1` (2025-04-14) | Enterprise & MCA-E | 2 M | 2 K |
92102
| `gpt-4.1` (2025-04-14) | Default | 300 K | 300 |
93-
| `gpt-4.1-nano` (2025-04-14) | Enterprise Tier | 50 M | 50 K |
103+
| `gpt-4.1-nano` (2025-04-14) | Enterprise & MCA-E | 50 M | 50 K |
94104
| `gpt-4.1-nano` (2025-04-14) | Default | 2 M | 2 K |
95-
| `gpt-4.1-mini` (2025-04-14) | Enterprise Tier | 50 M | 50 K |
105+
| `gpt-4.1-mini` (2025-04-14) | Enterprise & MCA-E | 50 M | 50 K |
96106
| `gpt-4.1-mini` (2025-04-14) | Default | 2 M | 2 K |
97107

98108
### GPT-4 Turbo
@@ -101,21 +111,21 @@ The following sections provide you with a quick guide to the default quotas and
101111

102112
| Model|Tier| Quota Limit in tokens per minute (TPM) | Requests per minute |
103113
|---|---|:---:|:---:|
104-
|`gpt-4` (turbo-2024-04-09) | Enterprise agreement | 2 M | 12 K |
114+
|`gpt-4` (turbo-2024-04-09) | Enterprise & MCA-E | 2 M | 12 K |
105115
|`gpt-4` (turbo-2024-04-09) | Default | 450 K | 2.7 K |
106116

107117
## model-router rate limits
108118

109119
| Model|Tier| Quota Limit in tokens per minute (TPM) | Requests per minute |
110120
|---|---|:---:|:---:|
111-
| `model-router` (2025-05-19) | Enterprise Tier | 10 M | 10 K |
121+
| `model-router` (2025-05-19) | Enterprise & MCA-E | 10 M | 10 K |
112122
| `model-router` (2025-05-19) | Default | 1 M | 1 K |
113123

114124
## computer-use-preview global standard rate limits
115125

116126
| Model|Tier| Quota Limit in tokens per minute (TPM) | Requests per minute |
117127
|---|---|:---:|:---:|
118-
| `computer-use-preview`| Enterprise Tier | 30 M | 300 K |
128+
| `computer-use-preview`| Enterprise & MCA-E | 30 M | 300 K |
119129
| `computer-use-preview`| Default | 450 K | 4.5 K |
120130

121131
## o-series rate limits
@@ -139,13 +149,13 @@ The following sections provide you with a quick guide to the default quotas and
139149

140150
| Model |Tier | Quota Limit in tokens per minute (TPM) | Requests per minute |
141151
|--------------------|------------------------|:--------------------------------------:|:---: |
142-
| `codex-mini` | Enterprise agreement | 10 M | 10 K |
143-
| `o3-pro` | Enterprise agreement | 16 M | 1.6 K |
144-
| `o4-mini` | Enterprise agreement | 10 M | 10 K |
145-
| `o3` | Enterprise agreement | 10 M | 10 K |
146-
| `o3-mini` | Enterprise agreement | 50 M | 5 K |
147-
| `o1` & `o1-preview`| Enterprise agreement | 30 M | 5 K |
148-
| `o1-mini` | Enterprise agreement | 50 M | 5 K |
152+
| `codex-mini` | Enterprise & MCA-E | 10 M | 10 K |
153+
| `o3-pro` | Enterprise & MCA-E | 16 M | 1.6 K |
154+
| `o4-mini` | Enterprise & MCA-E | 10 M | 10 K |
155+
| `o3` | Enterprise & MCA-E | 10 M | 10 K |
156+
| `o3-mini` | Enterprise & MCA-E | 50 M | 5 K |
157+
| `o1` & `o1-preview`| Enterprise & MCA-E | 30 M | 5 K |
158+
| `o1-mini` | Enterprise & MCA-E | 50 M | 5 K |
149159
| `codex-mini` | Default | 1 M | 1 K |
150160
| `o3-pro` | Default | 1.6 M | 160 |
151161
| `o4-mini` | Default | 1 M | 1 K |
@@ -158,17 +168,17 @@ The following sections provide you with a quick guide to the default quotas and
158168

159169
| Model|Tier| Quota Limit in tokens per minute (TPM) | Requests per minute |
160170
|---|---|:---:|:---:|
161-
| `o3-mini` | Enterprise agreement | 20 M | 2 K |
171+
| `o3-mini` | Enterprise & MCA-E | 20 M | 2 K |
162172
| `o3-mini` | Default | 2 M | 200 |
163-
| `o1` | Enterprise agreement | 6 M | 1 K |
173+
| `o1` | Enterprise & MCA-E | 6 M | 1 K |
164174
| `o1` | Default | 600 K | 100 |
165175

166176
### o1-preview & o1-mini standard
167177

168178
| Model|Tier| Quota Limit in tokens per minute (TPM) | Requests per minute |
169179
|---|---|:---:|:---:|
170-
| `o1-preview` | Enterprise agreement | 600 K | 100 |
171-
| `o1-mini`| Enterprise agreement | 1 M | 100 |
180+
| `o1-preview` | Enterprise & MCA-E | 600 K | 100 |
181+
| `o1-mini`| Enterprise & MCA-E | 1 M | 100 |
172182
| `o1-preview` | Default | 300 K | 50 |
173183
| `o1-mini`| Default | 500 K | 50 |
174184

@@ -180,8 +190,8 @@ The following sections provide you with a quick guide to the default quotas and
180190

181191
| Model|Tier| Quota Limit in tokens per minute (TPM) | Requests per minute |
182192
|---|---|:---:|:---:|
183-
|`gpt-4o`|Enterprise agreement | 30 M | 180 K |
184-
|`gpt-4o-mini` | Enterprise agreement | 50 M | 300 K |
193+
|`gpt-4o`|Enterprise & MCA-E | 30 M | 180 K |
194+
|`gpt-4o-mini` | Enterprise & MCA-E | 50 M | 300 K |
185195
|`gpt-4o` |Default | 450 K | 2.7 K |
186196
|`gpt-4o-mini` | Default | 2 M | 12 K |
187197

@@ -191,8 +201,8 @@ M = million | K = thousand
191201

192202
| Model|Tier| Quota Limit in tokens per minute (TPM) | Requests per minute |
193203
|---|---|:---:|:---:|
194-
|`gpt-4o`|Enterprise agreement | 10 M | 60 K |
195-
|`gpt-4o-mini` | Enterprise agreement | 20 M | 120 K |
204+
|`gpt-4o`|Enterprise & MCA-E | 10 M | 60 K |
205+
|`gpt-4o-mini` | Enterprise & MCA-E | 20 M | 120 K |
196206
|`gpt-4o` |Default | 300 K | 1.8 K |
197207
|`gpt-4o-mini` | Default | 1 M | 6 K |
198208

@@ -203,8 +213,8 @@ M = million | K = thousand
203213

204214
| Model|Tier| Quota Limit in tokens per minute (TPM) | Requests per minute |
205215
|---|---|:---:|:---:|
206-
|`gpt-4o`|Enterprise agreement | 1 M | 6 K |
207-
|`gpt-4o-mini` | Enterprise agreement | 2 M | 12 K |
216+
|`gpt-4o`|Enterprise & MCA-E | 1 M | 6 K |
217+
|`gpt-4o-mini` | Enterprise & MCA-E | 2 M | 12 K |
208218
|`gpt-4o`|Default | 150 K | 900 |
209219
|`gpt-4o-mini` | Default | 450 K | 2.7 K |
210220

@@ -229,7 +239,7 @@ M = million | K = thousand
229239

230240
| Model|Tier| Quota Limit in tokens per minute (TPM) | Requests per minute |
231241
|---|---|:---:|:---:|
232-
|`gpt-image-1`|Enterprise agreement | N/A | 20 |
242+
|`gpt-image-1`|Enterprise & MCA-E | N/A | 20 |
233243
|`gpt-image-1` |Default | N/A | 6 |
234244

235245

@@ -317,7 +327,7 @@ az rest --method GET --uri "https://management.azure.com/subscriptions/{sub-id}?
317327

318328
| Quota allocation/Offer type | Subscription quota ID |
319329
|:---|:----|
320-
| Enterprise | `EnterpriseAgreement_2014-09-01` |
330+
| Enterprise & MCA-E | `EnterpriseAgreement_2014-09-01` |
321331
| Pay-as-you-go | `PayAsYouGo_2014-09-01`|
322332
| MSDN | `MSDN_2014-09-01` |
323333
| CSP Integration Sandbox | `CSPDEVTEST_2018-05-01` |
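
Once a subscription's quota ID is known, mapping it back to a quota allocation/offer type is a simple lookup. The sketch below hard-codes only the four rows visible in this hunk, so the full table in the article may list more; `offer_type_for` is a hypothetical helper name, not an Azure API.

```python
# Quota ID -> quota allocation/offer type, copied from the rows shown in this hunk.
# The article's full table may contain additional entries not listed here.
QUOTA_ID_TO_OFFER = {
    "EnterpriseAgreement_2014-09-01": "Enterprise & MCA-E",
    "PayAsYouGo_2014-09-01": "Pay-as-you-go",
    "MSDN_2014-09-01": "MSDN",
    "CSPDEVTEST_2018-05-01": "CSP Integration Sandbox",
}


def offer_type_for(quota_id: str) -> str:
    """Return the offer type for a subscription quota ID, or 'unknown' if unlisted."""
    return QUOTA_ID_TO_OFFER.get(quota_id.strip(), "unknown")


if __name__ == "__main__":
    print(offer_type_for("EnterpriseAgreement_2014-09-01"))  # Enterprise & MCA-E
    print(offer_type_for("SomeOtherOffer_2020-01-01"))       # unknown
```

Returning `"unknown"` rather than raising keeps the lookup safe for quota IDs that fall outside the rows shown here.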
