articles/ai-foundry/openai/quotas-limits.md
Lines changed: 31 additions & 31 deletions
@@ -4,7 +4,7 @@ description: Quick reference, detailed description, and best practices on the qu
 author: mrbullwinkle
 ms.author: mbullwin
 manager: nitinme
-ms.date: 07/02/2025
+ms.date: 07/11/2025
 ms.service: azure-ai-openai
 ms.topic: conceptual
 ms.custom:
@@ -15,7 +15,7 @@ ms.custom:
 
 # Azure OpenAI in Azure AI Foundry Models quotas and limits
 
-This article contains a quick reference and a detailed description of the quotas and limits for Azure OpenAI.
+This article contains a quick reference and a detailed description of the quotas and limits for Azure OpenAI. Quota is not restricted at the tenant level. At its highest level, quota is scoped per individual Azure subscription. Tokens per minute (TPM) and Requests per minute (RPM) quota limits for each model and deployment type are set per region. For example, if `gpt-4.1` global standard has 5 million TPM and 5,000 RPM, each region where the [model/deployment type is available](./concepts/models.md) can use up that amount of quota for an individual subscription. Quota is not shared cross region.
 
 ## Quotas and limits reference
 
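Note on the added paragraph: the scoping rule it describes (quota held per subscription, per region, per model and deployment type, and not shared across regions) can be modelled as a lookup keyed on those four values. The sketch below is illustrative only. `QuotaKey`, `QuotaLimit`, `build_quota_table`, the subscription ID, the deployment type string, and the region names are all hypothetical and do not correspond to any Azure API; the limit values are the ones from the paragraph's `gpt-4.1` global standard example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuotaKey:
    """Hypothetical key: the four values the paragraph says quota is scoped by."""
    subscription_id: str
    region: str
    model: str
    deployment_type: str


@dataclass(frozen=True)
class QuotaLimit:
    tokens_per_minute: int
    requests_per_minute: int


def build_quota_table(subscription_id, regions, model, deployment_type, limit):
    """Give every region its own independent copy of the limit (nothing is pooled)."""
    return {
        QuotaKey(subscription_id, region, model, deployment_type): limit
        for region in regions
    }


# Values from the paragraph's example: 5 million TPM and 5,000 RPM.
limit = QuotaLimit(tokens_per_minute=5_000_000, requests_per_minute=5_000)

# Placeholder subscription ID and region names, chosen only for illustration.
table = build_quota_table(
    "sub-0001", ["region-a", "region-b"], "gpt-4.1", "global-standard", limit
)

# Each region is looked up separately; usage in one does not draw down the other.
region_a = table[QuotaKey("sub-0001", "region-a", "gpt-4.1", "global-standard")]
print(region_a.tokens_per_minute)
```

Under this reading, a subscription deploying in two regions has two separate 5 million TPM allowances rather than one shared pool.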
@@ -70,29 +70,29 @@ The following sections provide you with a quick guide to the default quotas and
 
 | Model|Tier| Quota Limit in tokens per minute (TPM) | Requests per minute |
 |---|---|:---:|:---:|
-|`gpt-4.5`| Enterprise Tier| 200 K | 200 |
+|`gpt-4.5`| Enterprise & MCA-E| 200 K | 200 |
 |`gpt-4.5`| Default | 150 K | 150 |
 
 ### GPT-4.1 series global standard
 
 | Model|Tier| Quota Limit in tokens per minute (TPM) | Requests per minute |
 |---|---|:---:|:---:|
-|`gpt-4.1` (2025-04-14) | Enterprise Tier| 5 M | 5 K |
+|`gpt-4.1` (2025-04-14) | Enterprise & MCA-E| 5 M | 5 K |
 |`gpt-4.1` (2025-04-14) | Default | 1 M | 1 K |
-|`gpt-4.1-nano` (2025-04-14) | Enterprise Tier| 150 M | 150 K |
+|`gpt-4.1-nano` (2025-04-14) | Enterprise & MCA-E| 150 M | 150 K |
 |`gpt-4.1-nano` (2025-04-14) | Default | 5 M | 5 K |
-|`gpt-4.1-mini` (2025-04-14) | Enterprise Tier| 150 M | 150 K |
+|`gpt-4.1-mini` (2025-04-14) | Enterprise & MCA-E| 150 M | 150 K |
 |`gpt-4.1-mini` (2025-04-14) | Default | 5 M | 5 K |
 
 ### GPT-4.1 series data zone standard
 
 | Model|Tier| Quota Limit in tokens per minute (TPM) | Requests per minute |
 |---|---|:---:|:---:|
-|`gpt-4.1` (2025-04-14) | Enterprise Tier| 2 M | 2 K |
+|`gpt-4.1` (2025-04-14) | Enterprise & MCA-E| 2 M | 2 K |
 |`gpt-4.1` (2025-04-14) | Default | 300 K | 300 |
-|`gpt-4.1-nano` (2025-04-14) | Enterprise Tier| 50 M | 50 K |
+|`gpt-4.1-nano` (2025-04-14) | Enterprise & MCA-E| 50 M | 50 K |
 |`gpt-4.1-nano` (2025-04-14) | Default | 2 M | 2 K |
-|`gpt-4.1-mini` (2025-04-14) | Enterprise Tier| 50 M | 50 K |
+|`gpt-4.1-mini` (2025-04-14) | Enterprise & MCA-E| 50 M | 50 K |
 |`gpt-4.1-mini` (2025-04-14) | Default | 2 M | 2 K |
 
 ### GPT-4 Turbo
@@ -101,21 +101,21 @@ The following sections provide you with a quick guide to the default quotas and
 
 | Model|Tier| Quota Limit in tokens per minute (TPM) | Requests per minute |
 |---|---|:---:|:---:|
-|`gpt-4` (turbo-2024-04-09) | Enterprise agreement| 2 M | 12 K |
+|`gpt-4` (turbo-2024-04-09) | Enterprise & MCA-E| 2 M | 12 K |
 |`gpt-4` (turbo-2024-04-09) | Default | 450 K | 2.7 K |
 
 ## model-router rate limits
 
 | Model|Tier| Quota Limit in tokens per minute (TPM) | Requests per minute |
 |---|---|:---:|:---:|
-|`model-router` (2025-05-19) | Enterprise Tier| 10 M | 10 K |
+|`model-router` (2025-05-19) | Enterprise & MCA-E| 10 M | 10 K |
 |`model-router` (2025-05-19) | Default | 1 M | 1 K |
 
 ## computer-use-preview global standard rate limits
 
 | Model|Tier| Quota Limit in tokens per minute (TPM) | Requests per minute |
 |---|---|:---:|:---:|
-|`computer-use-preview`| Enterprise Tier| 30 M | 300 K |
+|`computer-use-preview`| Enterprise & MCA-E| 30 M | 300 K |
 |`computer-use-preview`| Default | 450 K | 4.5 K |
 
 ## o-series rate limits
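Note on the tables in the hunk above: the TPM and RPM cells use `K` and `M` suffixes, so comparing rows is easier after converting them to integers. The sketch below parses the cells and computes requests per minute per 1,000 tokens per minute for three of the Default rows shown in this diff. `parse_limit` and `rpm_per_1k_tpm` are hypothetical helper names, and the resulting ratios are plain arithmetic on the listed values, not a documented rule.

```python
from decimal import Decimal

# Multipliers for the suffixes used in the table cells.
_SUFFIX = {"K": 1_000, "M": 1_000_000}


def parse_limit(cell: str) -> int:
    """Convert a table cell such as '2.7 K', '5 M', or '200' to an integer."""
    parts = cell.split()
    value = Decimal(parts[0])  # Decimal avoids float rounding on values like 2.7
    if len(parts) == 2:
        value *= _SUFFIX[parts[1]]
    return int(value)


def rpm_per_1k_tpm(tpm_cell: str, rpm_cell: str) -> float:
    """Requests per minute allowed for every 1,000 tokens per minute of quota."""
    return parse_limit(rpm_cell) * 1_000 / parse_limit(tpm_cell)


# Default-tier rows copied from the tables in this diff.
rows = [
    ("gpt-4 (turbo-2024-04-09)", "450 K", "2.7 K"),
    ("model-router (2025-05-19)", "1 M", "1 K"),
    ("computer-use-preview", "450 K", "4.5 K"),
]

for model, tpm_cell, rpm_cell in rows:
    print(model, parse_limit(tpm_cell), parse_limit(rpm_cell),
          rpm_per_1k_tpm(tpm_cell, rpm_cell))
```

For these three rows the computed ratios are 6, 1, and 10 respectively, which shows that the RPM column cannot be assumed to be TPM divided by 1,000 for every model.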
@@ -139,13 +139,13 @@ The following sections provide you with a quick guide to the default quotas and
 
 | Model |Tier | Quota Limit in tokens per minute (TPM) | Requests per minute |