
Commit 8241956

Merge pull request #1724 from sydneemayers/docs-editor/provisioned-throughput-1732579346
Update provisioned note for global deployments support
Parents: 1db3cb3 + 7aee45e

File tree: 1 file changed (+5, −2 lines)


articles/ai-services/openai/concepts/provisioned-throughput.md

Lines changed: 5 additions & 2 deletions
```diff
@@ -47,15 +47,18 @@ To help with simplifying the sizing effort, the following table outlines the TPM
 | --- | --- | --- |
 |Global provisioned minimum deployment|15|15|
 |Global provisioned scale increment|5|5|
-| Regional provisioned minimum deployment | 50 | 25|
+|Regional provisioned minimum deployment | 50 | 25|
 |Regional provisioned scale increment|50|25|
 |Max Input TPM per PTU | 2,500 | 37,000 |
 |Max Output TPM per PTU| 833|12,333|
-| Latency Target Value |25 Tokens Per Second|33 Tokens Per Second|
+|Latency Target Value |25 Tokens Per Second|33 Tokens Per Second|
 
 For a full list see the [AOAI Studio calculator](https://oai.azure.com/portal/calculator).
 
 
+> [!NOTE]
+> Global provisioned deployments are only supported for gpt-4o, 2024-08-06 and gpt-4o-mini, 2024-07-18 models at this time. For more information on model availability, review the [models documentation](./models.md).
+
 ## Key concepts
 
 ### Deployment types
```
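To illustrate how the table's sizing values fit together, here is a hypothetical sizing sketch (the helper name, workload numbers, and rounding rule are my own assumptions, not from the doc; the official [AOAI Studio calculator](https://oai.azure.com/portal/calculator) is the authoritative tool). It assumes a deployment must be at least the minimum size and grow in whole scale increments, and that PTUs are driven by whichever of input or output TPM is the larger demand:

```python
import math

def required_ptus(input_tpm, output_tpm, max_input_tpm_per_ptu,
                  max_output_tpm_per_ptu, min_deployment, scale_increment):
    """Hypothetical sizing sketch: PTUs needed for a workload, rounded up
    to the deployment minimum plus whole scale increments."""
    # PTUs are bounded by whichever direction (input or output) needs more.
    raw = max(math.ceil(input_tpm / max_input_tpm_per_ptu),
              math.ceil(output_tpm / max_output_tpm_per_ptu))
    if raw <= min_deployment:
        return min_deployment
    extra = raw - min_deployment
    return min_deployment + math.ceil(extra / scale_increment) * scale_increment

# Example: global deployment using the table's 37,000 input / 12,333 output
# TPM-per-PTU column, 15-PTU minimum, 5-PTU scale increment.
print(required_ptus(600_000, 60_000, 37_000, 12_333, 15, 5))  # → 20
```

Small workloads land on the minimum (e.g. 10,000 input TPM still requires 15 PTUs under these assumptions), which is why the minimum-deployment and scale-increment rows matter as much as the per-PTU throughput rows.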
