Commit bc0bca6

updating PTU
1 parent 9d4598b commit bc0bca6

3 files changed, +41 -34 lines changed

articles/ai-services/openai/concepts/provisioned-throughput.md

Lines changed: 1 addition & 1 deletion
@@ -3,7 +3,7 @@ title: Azure OpenAI Service provisioned throughput
 description: Learn about provisioned throughput and Azure OpenAI.
 ms.service: azure-ai-openai
 ms.topic: conceptual
-ms.date: 03/31/2025
+ms.date: 04/30/2025
 manager: nitinme
 author: aahill #ChrisHMSFT
 ms.author: aahi #chrhoder

articles/ai-services/openai/how-to/provisioned-throughput-onboarding.md

Lines changed: 11 additions & 4 deletions
@@ -72,13 +72,20 @@ Customers that require long-term usage of provisioned, data zoned provisioned, a
 > Charges for deployments on a deleted resource will continue until the resource is purged. To prevent this, delete a resource’s deployment before deleting the resource. For more information, see [Recover or purge deleted Azure AI services resources](../../recover-purge-resources.md).

 ## How much throughput per PTU you get for each model
-The amount of throughput (measured in tokens per minute or TPM) a deployment gets per PTU is a function of the input and output tokens in a given minute.

-Generating output tokens requires more processing than input tokens. For the models specified in the table below, 1 output token counts as 3 input tokens towards your TPM-per-PTU limit. The service dynamically balances the input & output costs, so users do not have to set specific input and output limits. This approach means your deployment is resilient to fluctuations in the workload.

-To help with simplifying the sizing effort, the following table outlines the TPM-per-PTU for the specified models. To understand the impact of output tokens on the TPM-per-PTU limit, use the 3 input token to 1 output token ratio.
+To understand how much throughput (TPM) you get, keep the following in mind:
+
+* The amount of throughput (measured in tokens per minute or TPM) a deployment gets per PTU is a function of the input and output tokens in a given minute.
+* Generating output tokens requires more processing than input tokens. Provisioned-Managed matches the Standard offering for gpt-4.1 models and later.
+* 1 output token now counts as 4 input tokens towards your TPM-per-PTU limit.
+* In the standard offering, 1 output token is 4 times as expensive as an input token. See the [pricing page for details](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/).
+
+To help simplify the sizing effort, the following table outlines the TPM-per-PTU for the specified models.
+To understand the impact of output tokens on the TPM-per-PTU limit, use the 4 input token to 1 output token ratio.
+
+For a detailed understanding of how different ratios of input and output tokens impact the throughput your workload needs, see the [Azure OpenAI capacity calculator](https://ai.azure.com/resource/calculator). The table also shows the [Service Level Agreement (SLA)](https://www.microsoft.com/licensing/docs/view/Service-Level-Agreements-SLA-for-Online-Services?lang=1) Latency Target Values per model.

-For a detailed understanding of how different ratios of input and output tokens impact the throughput your workload needs, see the [Azure OpenAI capacity calculator](https://ai.azure.com/resource/calculator). The table also shows Service Level Agreement (SLA) Latency Target Values per model. For more information about the SLA for Azure OpenAI Service, see the [Service Level Agreements (SLA) for Online Services page](https://www.microsoft.com/licensing/docs/view/Service-Level-Agreements-SLA-for-Online-Services?lang=1)


 |Topic| **gpt-4o** | **gpt-4o-mini** | **o1**|
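
As a rough illustration of the 4:1 weighting introduced in the provisioned-throughput-onboarding.md change, the sketch below converts a workload's input and output tokens per minute into input-equivalent TPM and estimates a PTU count from it. The function names and the `tpm_per_ptu` value of 2,500 are hypothetical placeholders, not figures from the table. The sketch also ignores minimum deployment sizes and any other service-side accounting, so the Azure OpenAI capacity calculator remains the reference for real sizing.

```python
import math

# From the changed text: 1 output token counts as 4 input tokens
# towards the TPM-per-PTU limit.
OUTPUT_TO_INPUT_RATIO = 4


def weighted_tpm(input_tpm: int, output_tpm: int,
                 ratio: int = OUTPUT_TO_INPUT_RATIO) -> int:
    """Return the input-equivalent tokens per minute for a workload."""
    return input_tpm + ratio * output_tpm


def estimate_ptus(input_tpm: int, output_tpm: int, tpm_per_ptu: int) -> int:
    """Estimate PTUs by dividing weighted TPM by a per-model TPM-per-PTU figure.

    tpm_per_ptu is an assumption supplied by the caller; look up the real
    value for your model in the TPM-per-PTU table.
    """
    return math.ceil(weighted_tpm(input_tpm, output_tpm) / tpm_per_ptu)


if __name__ == "__main__":
    # Hypothetical workload: 20,000 input and 5,000 output tokens per minute.
    print(weighted_tpm(20_000, 5_000))          # 20,000 + 4 * 5,000 = 40000
    print(estimate_ptus(20_000, 5_000, 2_500))  # 40,000 / 2,500 = 16
```

Under the previous 3:1 ratio the same workload would weigh 35,000 input-equivalent TPM (`weighted_tpm(20_000, 5_000, ratio=3)`), so output-heavy workloads are the ones most affected by the change.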

articles/ai-services/openai/includes/model-matrix/provisioned-global.md

Lines changed: 29 additions & 29 deletions
@@ -6,34 +6,34 @@ manager: nitinme
 ms.service: azure-ai-openai
 ms.topic: include
 ms.custom: references_regions
-ms.date: 03/31/2025
+ms.date: 04/30/2025
 ---

-| **Region** | **o3-mini**, **2025-01-31** | **o1**, **2024-12-17** | **gpt-4o**, **2024-05-13** | **gpt-4o**, **2024-08-06** | **gpt-4o**, **2024-11-20** | **gpt-4o-mini**, **2024-07-18** |
-|:-------------------|:---------------------------:|:----------------------:|:--------------------------:|:--------------------------:|:--------------------------:|:-------------------------------:|
-| australiaeast |||||||
-| brazilsouth |||||||
-| canadaeast |||||||
-| eastus |||||||
-| eastus2 |||||||
-| francecentral |||||||
-| germanywestcentral |||||||
-| italynorth |||||||
-| japaneast |||||||
-| koreacentral |||||||
-| northcentralus |||||||
-| norwayeast |||||||
-| polandcentral |||||||
-| southafricanorth |||||||
-| southcentralus |||||||
-| southeastasia |||||||
-| southindia |||||||
-| spaincentral |||||||
-| swedencentral |||||||
-| switzerlandnorth |||||||
-| switzerlandwest |||||||
-| uaenorth |||||||
-| uksouth |||||||
-| westeurope |||||||
-| westus |||||||
-| westus3 |||||||
+| **Region** | **o3-mini**, **2025-01-31** | **o1**, **2024-12-17** | **gpt-4o**, **2024-05-13** | **gpt-4o**, **2024-08-06** | **gpt-4o**, **2024-11-20** | **gpt-4o-mini**, **2024-07-18** | **gpt-4.1**, **2025-04-14** |
+|:-------------------|:---------------------------:|:----------------------:|:--------------------------:|:--------------------------:|:--------------------------:|:-------------------------------:|:----------:|
+| australiaeast ||||||| |
+| brazilsouth ||||||| |
+| canadaeast ||||||| |
+| eastus ||||||| |
+| eastus2 ||||||| |
+| francecentral ||||||| |
+| germanywestcentral ||||||| |
+| italynorth ||||||| |
+| japaneast ||||||| |
+| koreacentral ||||||| |
+| northcentralus ||||||| |
+| norwayeast ||||||| |
+| polandcentral ||||||| |
+| southafricanorth ||||||| |
+| southcentralus ||||||| |
+| southeastasia ||||||| |
+| southindia ||||||| |
+| spaincentral ||||||| |
+| swedencentral ||||||||
+| switzerlandnorth ||||||| |
+| switzerlandwest ||||||| |
+| uaenorth ||||||| |
+| uksouth ||||||| |
+| westeurope ||||||| |
+| westus ||||||| |
+| westus3 ||||||| |
