Commit 9dd3633

Merge pull request #271202 from mrbullwinkle/mrb_04_04_2024_quota
[Azure OpenAI] quota clarification
2 parents 00fb7d0 + 3f47e60 commit 9dd3633

1 file changed: +5 -1 lines changed

articles/ai-services/openai/includes/model-matrix/quota.md

Lines changed: 5 additions & 1 deletion
@@ -10,6 +10,8 @@ ms.date: 03/13/2024
 
 The default quota for models varies by model and region. Default quota limits are subject to change.
 
+Quota for standard deployments is described in terms of [Tokens-Per-Minute (TPM)](../../how-to/quota.md).
+
 | Region | GPT-4 | GPT-4-32K | GPT-4-Turbo | GPT-4-Turbo-V | GPT-35-Turbo | GPT-35-Turbo-Instruct | Text-Embedding-Ada-002 | text-embedding-3-small | text-embedding-3-large | Babbage-002 | Babbage-002 - finetune | Davinci-002 | Davinci-002 - finetune | GPT-35-Turbo - finetune | GPT-35-Turbo-1106 - finetune | GPT-35-Turbo-0125 - finetune |
 |:-----------------|:-------:|:-----------:|:-------------:|:---------------:|:--------------:|:-----------------------:|:------------------------:|:------------------------:|:------------------------:|:-------------:|:------------------------:|:-------------:|:------------------------:|:-------------------------:|:------------------------------:|:-------------------------------|
 | australiaeast | 40 K | 80 K | 80 K | 30 K | 300 K | - | 350 K | - | - | - | - | - | - | - | - | - |
@@ -28,4 +30,6 @@ The default quota for models varies by model and region. Default quota limits ar
 | switzerlandnorth | 40 K | 80 K | - | 30 K | 300 K | - | 350 K | - | - | - | - | - | - | - | - | - |
 | uksouth | - | - | 80 K | - | 240 K | - | 350 K | - | - | - | - | - | - | - | - | - |
 | westeurope | - | - | - | - | 240 K | - | 240 K | - | - | - | - | - | - | - | - | - |
-| westus | - | - | 80 K | 30 K | 300 K | - | 350 K | - | - | - | - | - | - | - | - | - |
+| westus | - | - | 80 K | 30 K | 300 K | - | 350 K | - | - | - | - | - | - | - | - | - |
+
+1 K = 1000 Tokens-Per-Minute (TPM). The relationship between TPM and Requests Per Minute (RPM) is [currently defined as 6 RPM per 1000 TPM](../../how-to/quota.md#understanding-rate-limits).
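
For reviewers who want to sanity-check the new footnote, the arithmetic can be sketched roughly as follows. This is only an illustration: the helper names `parse_quota` and `tpm_to_rpm` are hypothetical, the 6 RPM per 1000 TPM ratio is taken from the added line and is described there as "currently defined", and treating `-` as "no default quota listed" is an assumption about the table.

```python
from typing import Optional

# Assumption: ratio copied from the added footnote; it may change over time.
RPM_PER_1000_TPM = 6


def parse_quota(cell: str) -> Optional[int]:
    """Convert a table cell such as '80 K' to Tokens-Per-Minute.

    Returns None for '-', assumed here to mean no default quota is listed.
    """
    cell = cell.strip()
    if cell == "-":
        return None
    value, unit = cell.split()
    if unit != "K":
        raise ValueError(f"unexpected unit in quota cell: {cell!r}")
    return int(value) * 1000  # 1 K = 1000 TPM


def tpm_to_rpm(tpm: int) -> int:
    """Derive Requests-Per-Minute from TPM using the stated ratio."""
    return tpm * RPM_PER_1000_TPM // 1000


if __name__ == "__main__":
    # westus / GPT-4-Turbo is listed as "80 K" in the table above.
    tpm = parse_quota("80 K")
    print(tpm, tpm_to_rpm(tpm))
```

Under these assumptions, the `80 K` entry corresponds to 80,000 TPM and 480 RPM.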
