Commit 50016bb

Merge pull request #284893 from mrbullwinkle/mrb_08_16_2024_global_standard_update
[Azure OpenAI] Quota update
2 parents 8f58bb4 + a8280a0 commit 50016bb

1 file changed: +7 -6 lines changed
articles/ai-services/openai/quotas-limits.md

Lines changed: 7 additions & 6 deletions
@@ -10,7 +10,7 @@ ms.custom:
 - ignite-2023
 - references_regions
 ms.topic: conceptual
-ms.date: 08/14/2024
+ms.date: 08/16/2024
 ms.author: mbullwin
 ---
 
@@ -50,27 +50,28 @@ The following sections provide you with a quick guide to the default quotas and
 | GPT-4 `vision-preview` & GPT-4 `turbo-2024-04-09` default max tokens | 16 <br><br> Increase the `max_tokens` parameter value to avoid truncated responses. GPT-4o max tokens defaults to 4096. |
 | Max number of custom headers in API requests<sup>1</sup> | 10 |
 
-<sup>1</sup> Our current APIs allow up to 10 custom headers, which are passed through the pipeline, and returned. We have noticed some customers now exceed this header count resulting in HTTP 431 errors. There is no solution for this error, other than to reduce header volume. **In future API versions we will no longer pass through custom headers**. We recommend customers not depend on custom headers in future system architectures.
-
+<sup>1</sup> Our current APIs allow up to 10 custom headers, which are passed through the pipeline, and returned. We have noticed some customers now exceed this header count resulting in HTTP 431 errors. There is no solution for this error, other than to reduce header volume. **In future API versions we will no longer pass through custom headers**. We recommend customers not depend on custom headers in future system architectures.
 
 ## Regional quota limits
 
 [!INCLUDE [Quota](./includes/model-matrix/quota.md)]
 
 [!INCLUDE [Quota](./includes/global-batch-limits.md)]
 
-## gpt-4o rate limits
+## gpt-4o & GPT-4 Turbo rate limits
 
-`gpt-4o` and `gpt-4o-mini` have rate limit tiers with higher limits for certain customer types.
+`gpt-4o` and `gpt-4o-mini`, and `gpt-4` (`turbo-2024-04-09`) have rate limit tiers with higher limits for certain customer types.
 
-### gpt-4o global standard
+### gpt-4o & GPT-4 Turbo global standard
 
 | Model|Tier| Quota Limit in tokens per minute (TPM) | Requests per minute |
 |---|---|:---:|:---:|
 |`gpt-4o`|Enterprise agreement | 30 M | 180 K |
 |`gpt-4o-mini` | Enterprise agreement | 50 M | 300 K |
+|`gpt-4` (turbo-2024-04-09) | Enterprise agreement | 2 M | 12 K |
 |`gpt-4o` |Default | 450 K | 2.7 K |
 |`gpt-4o-mini` | Default | 2 M | 12 K |
+|`gpt-4` (turbo-2024-04-09) | Default | 450 K | 2.7 K |
 
 M = million | K = thousand
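Reviewer note: as a quick sanity check on the two added table rows, the sketch below transcribes the six rows of the updated global standard table and computes the requests-per-minute to tokens-per-minute ratio for each. The dictionary name, the key format, and both helper functions are hypothetical and exist only for this illustration; the numbers are taken from the table in the diff, with M and K expanded.

```python
# Limits transcribed from the updated global standard table in this diff.
# Keys are (model, tier); values are (tokens per minute, requests per minute).
# The structure and names here are illustrative, not part of any Azure SDK.
GLOBAL_STANDARD_LIMITS = {
    ("gpt-4o", "Enterprise agreement"): (30_000_000, 180_000),
    ("gpt-4o-mini", "Enterprise agreement"): (50_000_000, 300_000),
    ("gpt-4 (turbo-2024-04-09)", "Enterprise agreement"): (2_000_000, 12_000),
    ("gpt-4o", "Default"): (450_000, 2_700),
    ("gpt-4o-mini", "Default"): (2_000_000, 12_000),
    ("gpt-4 (turbo-2024-04-09)", "Default"): (450_000, 2_700),
}


def rpm_per_1000_tpm(tpm: int, rpm: int) -> float:
    """Return how many requests per minute the row allows per 1,000 TPM."""
    return rpm * 1000 / tpm


def fits_within_quota(model: str, tier: str, tpm_needed: int, rpm_needed: int) -> bool:
    """Check a planned workload against one row of the table."""
    tpm_limit, rpm_limit = GLOBAL_STANDARD_LIMITS[(model, tier)]
    return tpm_needed <= tpm_limit and rpm_needed <= rpm_limit


if __name__ == "__main__":
    for (model, tier), (tpm, rpm) in GLOBAL_STANDARD_LIMITS.items():
        print(f"{model:28s} {tier:22s} {rpm_per_1000_tpm(tpm, rpm):.1f} RPM per 1K TPM")
```

Every row, including the two new `gpt-4` (turbo-2024-04-09) rows, works out to the same ratio, so the added values look internally consistent with the existing ones.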
