Commit 6db1143

reorder columns
1 parent fb7d962 commit 6db1143

1 file changed: +5 -5 lines changed

articles/ai-foundry/openai/how-to/provisioned-throughput-onboarding.md

Lines changed: 5 additions & 5 deletions
@@ -82,14 +82,14 @@ For example, for gpt-5 1 output token counts as 8 input tokens towards your util
 > [!NOTE]
 > gpt-4.1, gpt-4.1-mini and gpt-4.1-nano don't support long context (requests estimated at larger than 128k prompt tokens).

-|Topic| **gpt-5-mini** | **gpt-5** | **gpt-4.1** | **gpt-4.1-mini** | **gpt-4.1-nano** | **o3** | **o4-mini** |
+|Topic| **gpt-5** | **gpt-5-mini** | **gpt-4.1** | **gpt-4.1-mini** | **gpt-4.1-nano** | **o3** | **o4-mini** |
 | --- | --- | --- | --- | --- | --- | --- | --- |
 |Global & data zone provisioned minimum deployment| 15 | 15 | 15|15| 15 | 15 | 15 |
 |Global & data zone provisioned scale increment| 5 | 5 | 5|5| 5 | 5 | 5 |
-|Regional provisioned minimum deployment| 25 | 50 | 50|25| 25 |50 |25|
-|Regional provisioned scale increment| 25 | 50 | 50|25| 25 | 50 | 25|
-|Input TPM per PTU| 23,750 | 4,750 | 3,000|14,900| 59,400 | 3,000 | 5,400 |
-|Latency Target Value| 99% > 80 Tokens Per Second\* | 99% > 50 Tokens Per Second\* | 99% > 80 Tokens Per Second\* | 99% > 90 Tokens Per Second\*| 99% > 100 Tokens Per Second\* | 99% > 80 Tokens Per Second\* | 99% > 90 Tokens Per Second\* |
+|Regional provisioned minimum deployment| 50 | 25 | 50|25| 25 |50 |25|
+|Regional provisioned scale increment| 50 | 25 | 50|25| 25 | 50 | 25|
+|Input TPM per PTU| 4,750 | 23,750 | 3,000|14,900| 59,400 | 3,000 | 5,400 |
+|Latency Target Value| 99% > 50 Tokens Per Second\* | 99% > 80 Tokens Per Second\* | 99% > 80 Tokens Per Second\* | 99% > 90 Tokens Per Second\*| 99% > 100 Tokens Per Second\* | 99% > 80 Tokens Per Second\* | 99% > 90 Tokens Per Second\* |

 \* Calculated as p50 request latency on a per 5 minute basis.

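Because the commit message says the change only reorders columns, a reviewer may want to confirm that each model keeps the same values before and after. The following is a minimal sketch of such a check, not part of the commit: the helper names `cells` and `per_model` are hypothetical, and the rows are copied from three of the changed lines in the diff above.

```python
# Headers and rows copied from the removed (-) and added (+) lines of the diff.
OLD_HEADER = "|Topic| **gpt-5-mini** | **gpt-5** | **gpt-4.1** | **gpt-4.1-mini** | **gpt-4.1-nano** | **o3** | **o4-mini** |"
NEW_HEADER = "|Topic| **gpt-5** | **gpt-5-mini** | **gpt-4.1** | **gpt-4.1-mini** | **gpt-4.1-nano** | **o3** | **o4-mini** |"

OLD_ROWS = [
    "|Regional provisioned minimum deployment| 25 | 50 | 50|25| 25 |50 |25|",
    "|Regional provisioned scale increment| 25 | 50 | 50|25| 25 | 50 | 25|",
    "|Input TPM per PTU| 23,750 | 4,750 | 3,000|14,900| 59,400 | 3,000 | 5,400 |",
]
NEW_ROWS = [
    "|Regional provisioned minimum deployment| 50 | 25 | 50|25| 25 |50 |25|",
    "|Regional provisioned scale increment| 50 | 25 | 50|25| 25 | 50 | 25|",
    "|Input TPM per PTU| 4,750 | 23,750 | 3,000|14,900| 59,400 | 3,000 | 5,400 |",
]


def cells(line):
    """Split one Markdown table row into whitespace-trimmed cells."""
    return [c.strip() for c in line.strip().strip("|").split("|")]


def per_model(header, rows):
    """Map each row topic to a {model name: value} dictionary."""
    # Drop the bold markers around model names in the header row.
    models = [c.strip("*") for c in cells(header)[1:]]
    table = {}
    for row in rows:
        topic, *values = cells(row)
        table[topic] = dict(zip(models, values))
    return table


old = per_model(OLD_HEADER, OLD_ROWS)
new = per_model(NEW_HEADER, NEW_ROWS)

# Collect every (topic, model) pair whose value differs between versions.
mismatches = [
    (topic, model)
    for topic in old
    for model in old[topic]
    if old[topic][model] != new[topic][model]
]
print(mismatches)
```

An empty list means the header swap and the data swaps agree for the rows checked, so gpt-5 and gpt-5-mini each keep their own figures. The Latency Target Value row can be added to both lists in the same way.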