Commit 0d0de79

updating tokens
1 parent 19aee62 commit 0d0de79

1 file changed

+3
-2
lines changed

articles/ai-services/openai/how-to/provisioned-throughput-onboarding.md

Lines changed: 3 additions & 2 deletions
@@ -3,7 +3,7 @@ title: Understanding costs associated with provisioned throughput units (PTU)
 description: Learn about provisioned throughput costs and billing in Azure AI Foundry.
 ms.service: azure-ai-openai
 ms.topic: conceptual
-ms.date: 05/28/2025
+ms.date: 06/13/2025
 manager: nitinme
 author: aahill
 ms.author: aahi
@@ -84,8 +84,9 @@ For example, for `gpt-4.1:2025-04-14`, 1 output token counts as 4 input tokens t
 |Regional provisioned minimum deployment|25| 50|25| 25 |50 | 25|25|50|25| NA|NA|
 |Regional provisioned scale increment|25| 50|25| 25 | 50 | 25|50|50|25|NA|NA|
 |Input TPM per PTU|5,400 | 3,000|14,900| 59,400 | 600 | 2,500|230|2,500|37,000|4,000|4,000|
-|Latency Target Value| 66 Tokens Per Second | 40 Tokens Per Second|50 Tokens Per Second| 60 Tokens Per Second | 40 Tokens Per Second | 66 Tokens Per Second |25 Tokens Per Second|25 Tokens Per Second|33 Tokens Per Second|50 Tokens Per Second|50 Tokens Per Second|
+|Latency Target Value| 99% > 66 Tokens Per Second\* | 99% > 40 Tokens Per Second\* | 99% > 50 Tokens Per Second\*| 99% > 60 Tokens Per Second\* | 99% > 40 Tokens Per Second\* | 99% > 66 Tokens Per Second\* | 99% > 25 Tokens Per Second\* | 99% > 25 Tokens Per Second\* | 99% > 33 Tokens Per Second\* | 50 Tokens Per Second|50 Tokens Per Second|
 
+\* Calculated as the average request latency on a per-minute basis across the month.
 
 For a full list, see the [Azure AI Foundry calculator](https://ai.azure.com/resource/calculator).
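
The new footnote describes the latency target as an average taken per minute across the month, and the table cells read "99% > N Tokens Per Second". The diff does not spell out how these two statements combine, so the following Python sketch shows only one possible reading: average the per-request generation speed within each minute, then report the share of minutes whose average exceeds the target. The function name, the `(minute_index, tokens_per_second)` sample layout, and the sample values are hypothetical and not taken from the Azure documentation.

```python
from collections import defaultdict
from statistics import mean


def share_of_minutes_meeting_target(samples, target_tps):
    """Return the fraction of minutes whose average speed exceeds target_tps.

    samples: iterable of (minute_index, tokens_per_second) pairs, one per request.
    This layout is an assumption made for illustration only.
    """
    per_minute = defaultdict(list)
    for minute, tps in samples:
        per_minute[minute].append(tps)
    if not per_minute:
        raise ValueError("no samples provided")

    # Average request speed within each minute, as the footnote describes.
    minute_averages = [mean(values) for values in per_minute.values()]

    # Count minutes that strictly exceed the target, mirroring "> N" in the table.
    meeting = sum(1 for avg in minute_averages if avg > target_tps)
    return meeting / len(minute_averages)


# Made-up samples covering four minutes.
samples = [(0, 70.0), (0, 68.0), (1, 72.0), (2, 60.0), (2, 64.0), (3, 80.0)]
share = share_of_minutes_meeting_target(samples, target_tps=66)
print(f"{share:.0%} of minutes exceeded 66 tokens per second")
```

Under this reading, a deployment would meet a "99% > 66 Tokens Per Second" target when the returned share is at least 0.99 over the month. The sample data above falls well short of that, since minute 2 averages 62 tokens per second.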
