
Commit efc9fd4

Merge pull request #7151 from msakande/PTU-update-for-gpt5-mini
Ptu update for gpt5 mini
2 parents 9f9ee73 + f6fa8c4 commit efc9fd4

3 files changed: 38 additions & 38 deletions

articles/ai-foundry/openai/how-to/provisioned-throughput-onboarding.md

Lines changed: 8 additions & 8 deletions
@@ -83,14 +83,14 @@ For example, for gpt-5 1 output token counts as 8 input tokens towards your util
 > [!NOTE]
 > gpt-4.1, gpt-4.1-mini and gpt-4.1-nano don't support long context (requests estimated at larger than 128k prompt tokens).
 
-|Topic| **gpt-5** | **gpt-4.1** | **gpt-4.1-mini** | **gpt-4.1-nano** | **o3** | **o4-mini** |
-| --- | --- | --- | --- | --- | --- | --- |
-|Global & data zone provisioned minimum deployment| 15 | 15|15| 15 | 15 | 15 |
-|Global & data zone provisioned scale increment| 5 | 5|5| 5 | 5 | 5 |
-|Regional provisioned minimum deployment| 50 | 50|25| 25 |50 |25|
-|Regional provisioned scale increment| 50 | 50|25| 25 | 50 | 25|
-|Input TPM per PTU| 4,750 | 3,000|14,900| 59,400 | 3,000 | 5,400 |
-|Latency Target Value| 99% > 50 Tokens Per Second\* | 99% > 80 Tokens Per Second\* | 99% > 90 Tokens Per Second\*| 99% > 100 Tokens Per Second\* | 99% > 80 Tokens Per Second\* | 99% > 90 Tokens Per Second\* |
+|Topic| **gpt-5** | **gpt-5-mini** | **gpt-4.1** | **gpt-4.1-mini** | **gpt-4.1-nano** | **o3** | **o4-mini** |
+| --- | --- | --- | --- | --- | --- | --- | --- |
+|Global & data zone provisioned minimum deployment| 15 | 15 | 15|15| 15 | 15 | 15 |
+|Global & data zone provisioned scale increment| 5 | 5 | 5|5| 5 | 5 | 5 |
+|Regional provisioned minimum deployment| 50 | 25 | 50|25| 25 |50 |25|
+|Regional provisioned scale increment| 50 | 25 | 50|25| 25 | 50 | 25|
+|Input TPM per PTU| 4,750 | 23,750 | 3,000|14,900| 59,400 | 3,000 | 5,400 |
+|Latency Target Value| 99% > 50 Tokens Per Second\* | 99% > 80 Tokens Per Second\* | 99% > 80 Tokens Per Second\* | 99% > 90 Tokens Per Second\*| 99% > 100 Tokens Per Second\* | 99% > 80 Tokens Per Second\* | 99% > 90 Tokens Per Second\* |
 
 \* Calculated as p50 request latency on a per 5 minute basis.

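As a rough illustration of how the gpt-5-mini column added in this hunk might be used for sizing, the sketch below rounds a target input TPM up to a deployment size. The numbers (minimum, increment, input TPM per PTU) are copied from the table above; the names `PtuLimits` and `ptus_for_input_tpm` are hypothetical, and the rule that valid sizes are "minimum plus a whole number of increments" is an assumption, not something this diff states. Output-token weighting is ignored.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class PtuLimits:
    """Sizing parameters for one model and deployment type (hypothetical helper)."""
    minimum: int            # minimum deployment, in PTUs
    increment: int          # scale increment, in PTUs
    input_tpm_per_ptu: int  # input tokens per minute per PTU


# Values from the gpt-5-mini column in the table above.
GPT5_MINI_GLOBAL = PtuLimits(minimum=15, increment=5, input_tpm_per_ptu=23_750)
GPT5_MINI_REGIONAL = PtuLimits(minimum=25, increment=25, input_tpm_per_ptu=23_750)


def ptus_for_input_tpm(target_input_tpm: int, limits: PtuLimits) -> int:
    """Return the smallest size covering target_input_tpm.

    Assumption: valid sizes are minimum + k * increment for k >= 0.
    Only input tokens are counted; output tokens are not weighted here.
    """
    raw = math.ceil(target_input_tpm / limits.input_tpm_per_ptu)
    if raw <= limits.minimum:
        return limits.minimum
    steps = math.ceil((raw - limits.minimum) / limits.increment)
    return limits.minimum + steps * limits.increment
```

For example, under these assumptions `ptus_for_input_tpm(500_000, GPT5_MINI_GLOBAL)` needs 22 PTUs of raw capacity and rounds up to the next valid size.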
articles/ai-foundry/openai/includes/model-matrix/provisioned-global.md

Lines changed: 3 additions & 3 deletions
@@ -14,8 +14,8 @@ ms.date: 09/04/2025
 | australiaeast |||||||||||||
 | brazilsouth |||||||||||||
 | canadaeast |||||||||||||
-| centralindia | - ||||||||||||
-| eastasia | - ||||||||||||
+| centralindia | ||||||||||||
+| eastasia | ||||||||||||
 | eastus |||||||||||||
 | eastus2 |||||||||||||
 | francecentral |||||||||||||
@@ -24,7 +24,7 @@ ms.date: 09/04/2025
 | japaneast |||||||||||||
 | koreacentral |||||||||||||
 | northcentralus |||||||||||||
-| northeurope | - ||||||||||||
+| northeurope | ||||||||||||
 | norwayeast |||||||||||||
 | polandcentral |||||||||||||
 | southafricanorth |||||||||||||
