You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: articles/ai-services/openai/concepts/gov-provisioned.md
+2-2Lines changed: 2 additions & 2 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -6,7 +6,7 @@ author: challenp
6
6
ms.service: azure-ai-openai
7
7
ms.topic: how-to
8
8
ms.custom: references_regions, azuregovernment
9
-
ms.date: 5/1/2025
9
+
ms.date: 05/30/2025
10
10
recommendations: false
11
11
---
12
12
@@ -84,7 +84,7 @@ In addition to the updates for the hourly payment model, new [Azure Reservations
84
84
85
85
#### Supported models on commitment payment model:
86
86
87
-
Only the following list of Azure OpenAI models are supported in Commitments. For onboarding any other models that aren't in the list below, or any newer models on provisioned throughput offering, refer to the [Azure OpenAI provisioned onboarding guide](../how-to/provisioned-throughput-onboarding.md) and [Azure Reservations for Azure OpenAI provisioned deployments](../how-to/provisioned-throughput-onboarding.md#azure-reservations-for-azure-openai-provisioned-deployments)
87
+
Only the following list of Azure OpenAI models are supported in Commitments. For onboarding any other models that aren't in the list below, or any newer models on provisioned throughput offering, refer to the [Azure OpenAI provisioned onboarding guide](../how-to/provisioned-throughput-onboarding.md) and [Azure Reservations for Azure OpenAI provisioned deployments](../how-to/provisioned-throughput-onboarding.md#azure-reservations-for-azure-ai-foundry-provisioned-throughput)
Copy file name to clipboardExpand all lines: articles/ai-services/openai/concepts/provisioned-migration.md
+4-4Lines changed: 4 additions & 4 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -7,7 +7,7 @@ ms.service: azure-ai-openai
7
7
ms.custom:
8
8
- ignite-2024
9
9
ms.topic: how-to
10
-
ms.date: 03/26/2025
10
+
ms.date: 05/30/2025
11
11
author: aahill
12
12
ms.author: aahi
13
13
recommendations: false
@@ -44,7 +44,7 @@ This article is intended for existing users of the provisioned throughput offeri
44
44
| Default provisioned-managed quota in many regions | Get started quickly in new regions without having to first request quota. |
45
45
| Flexible choice of payment model for existing provisioned customers | Customers with commitments can stay on the commitment model until the end of life of the currently supported models, and can choose to migrate existing commitments to hourly/reservations via managed process. We recommend migrating to hourly/ reservations to take advantage of term discounts and to work with the latest models. |
46
46
| Supports latest model generations | The latest models are available only on hourly/ reservations in provisioned offering. |
47
-
| Differentiated pricing | Greater flexibility and control of pricing and performance. In December 2024, we introduced differentiated hourly pricing across [global provisioned](../how-to/deployment-types.md#global-provisioned), [data zone provisioned](../how-to/deployment-types.md#data-zone-provisioned), and [provisioned](../how-to/deployment-types.md#provisioned) deployment types with the option to purchase [Azure Reservations](#new-azure-reservations-for-global-and-data-zone-provisioned-deployments) to support additional discounts. For more information on the hourly price for each provisioned deployment type, see the [Pricing details](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/) page. |
47
+
| Differentiated pricing | Greater flexibility and control of pricing and performance. In December 2024, we introduced differentiated hourly pricing across [global provisioned](../how-to/deployment-types.md#global-provisioned), [data zone provisioned](../how-to/deployment-types.md#data-zone-provisioned), and [regional provisioned](../how-to/deployment-types.md#regional-provisioned) deployment types with the option to purchase [Azure Reservations](#new-azure-reservations-for-global-and-data-zone-provisioned-deployments) to support additional discounts. For more information on the hourly price for each provisioned deployment type, see the [Pricing details](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/) page. |
48
48
49
49
## Usability improvement details
50
50
@@ -81,7 +81,7 @@ We also recommend that customers using commitments now create their deployments
81
81
See the following links for more information. The guidance for reservations and commitments is the same:
@@ -112,7 +112,7 @@ In addition to the updates for the hourly payment model, in December 2024 new [A
112
112
- Commitments can't be canceled or altered during their term, except to add new PTUs.
113
113
114
114
#### Supported models on commitment payment model:
115
-
Only the following list of Azure OpenAI models are supported in Commitments. For onboarding any other models that aren't in the list below, or any newer models on provisioned throughput offering, refer to the [Azure OpenAI provisioned onboarding guide](../how-to/provisioned-throughput-onboarding.md) and [Azure Reservations for Azure OpenAI provisioned deployments](../how-to/provisioned-throughput-onboarding.md#azure-reservations-for-azure-openai-provisioned-deployments)
115
+
Only the following list of Azure OpenAI models are supported in Commitments. For onboarding any other models that aren't in the list below, or any newer models on provisioned throughput offering, refer to the [Azure OpenAI provisioned onboarding guide](../how-to/provisioned-throughput-onboarding.md) and [Azure Reservations for Azure OpenAI provisioned deployments](../how-to/provisioned-throughput-onboarding.md#azure-reservations-for-azure-ai-foundry-provisioned-throughput)
Copy file name to clipboardExpand all lines: articles/ai-services/openai/concepts/provisioned-throughput.md
+2-2Lines changed: 2 additions & 2 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -3,7 +3,7 @@ title: Azure OpenAI in Azure AI Foundry Models provisioned throughput
3
3
description: Learn about provisioned throughput and Azure OpenAI.
4
4
ms.service: azure-ai-openai
5
5
ms.topic: conceptual
6
-
ms.date: 04/30/2025
6
+
ms.date: 05/30/2025
7
7
manager: nitinme
8
8
author: aahill #ChrisHMSFT
9
9
ms.author: aahi #chrhoder
@@ -23,7 +23,7 @@ The provisioned throughput offering is a model deployment type that allows you t
23
23
24
24
> [!TIP]
25
25
> * You can take advantage of additional cost savings when you buy [Microsoft Azure OpenAI in Azure AI Foundry Models reservations](/azure/cost-management-billing/reservations/azure-openai#buy-a-microsoft-azure-openai-service-reservation).
26
-
> * Provisioned throughput is available as the following deployment types: [global provisioned](../how-to/deployment-types.md#global-provisioned), [data zone provisioned](../how-to/deployment-types.md#data-zone-provisioned) and [standard provisioned](../how-to/deployment-types.md#provisioned).
26
+
> * Provisioned throughput is available as the following deployment types: [global provisioned](../how-to/deployment-types.md#global-provisioned), [data zone provisioned](../how-to/deployment-types.md#data-zone-provisioned) and [regional provisioned](../how-to/deployment-types.md#regional-provisioned).
27
27
28
28
<!--
29
29
Throughput is defined in terms of provisioned throughput units (PTU) which is a normalized way of representing the throughput for your deployment. Each model-version pair requires different amounts of PTU to deploy, and provides different amounts of throughput per PTU.
@@ -387,7 +387,7 @@ Azure OpenAI fine-tuning supports the following deployment types.
387
387
|GPT-4o-finetune|North Central US, Sweden Central|
388
388
|GPT-4o-mini-finetune|North Central US, Sweden Central|
389
389
390
-
[Provisioned throughput](./deployment-types.md#provisioned) fine-tuned deployments offer [predictable performance](../concepts/provisioned-throughput.md) for latency-sensitive agents and applications. They use the same regional provisioned throughput (PTU) capacity as base models, so if you already have regional PTU quota you can deploy your fine-tuned model in support regions.
390
+
[Provisioned throughput](./deployment-types.md#regional-provisioned) fine-tuned deployments offer [predictable performance](../concepts/provisioned-throughput.md) for latency-sensitive agents and applications. They use the same regional provisioned throughput (PTU) capacity as base models, so if you already have regional PTU quota you can deploy your fine-tuned model in support regions.
Copy file name to clipboardExpand all lines: articles/ai-services/openai/how-to/provisioned-throughput-onboarding.md
+4-4Lines changed: 4 additions & 4 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -23,7 +23,7 @@ Provisioned throughput units (PTUs) are generic units of model processing capaci
23
23
24
24
## Understanding provisioned throughput billing
25
25
26
-
Azure AI Foundry [Regional Provisioned Throughput](../how-to/deployment-types.md#provisioned), [Data Zone Provisioned Throughput](../how-to/deployment-types.md#data-zone-provisioned), and [Global Provisioned Throughput](../how-to/deployment-types.md#global-provisioned) are purchased on-demand at an hourly basis based on the number of deployed PTUs, with substantial term discount available via the purchase of [Azure Reservations](#azure-reservations-for-azure-openai-provisioned-deployments).
26
+
Azure AI Foundry [Regional Provisioned Throughput](../how-to/deployment-types.md#regional-provisioned), [Data Zone Provisioned Throughput](../how-to/deployment-types.md#data-zone-provisioned), and [Global Provisioned Throughput](../how-to/deployment-types.md#global-provisioned) are purchased on-demand at an hourly basis based on the number of deployed PTUs, with substantial term discount available via the purchase of [Azure Reservations](#azure-reservations-for-azure-ai-foundry-provisioned-throughput).
27
27
28
28
The hourly model is useful for short-term deployment needs, such as validating new models or acquiring capacity for a hackathon. However, the discounts provided by the Azure Reservation for Azure AI Foundry Regional Provisioned, Data Zone Provisioned, and Global Provisioned are considerable and most customers with consistent long-term usage will find a reserved model to be a better value proposition.
29
29
@@ -37,11 +37,11 @@ Unlike the Tokens Per Minute (TPM) quota used by other Azure AI Foundry offering
37
37
38
38
:::image type="content" source="../media/provisioned/model-independent-quota.png" alt-text="Diagram of model independent quota with one pool of PTUs available to multiple Azure OpenAI models." lightbox="../media/provisioned/model-independent-quota.png":::
39
39
40
-
Quota for provisioned deployments shows up in Azure AI Foundry as the following deployment types: [global provisioned](../how-to/deployment-types.md#global-provisioned), [data zone provisioned](../how-to/deployment-types.md#data-zone-provisioned) and [regional provisioned](../how-to/deployment-types.md#provisioned).
40
+
Quota for provisioned deployments shows up in Azure AI Foundry as the following deployment types: [global provisioned](../how-to/deployment-types.md#global-provisioned), [data zone provisioned](../how-to/deployment-types.md#data-zone-provisioned) and [regional provisioned](../how-to/deployment-types.md#regional-provisioned).
41
41
42
42
|deployment type |Quota name |
43
43
|---------|---------|
44
-
|[Regional Provisioned](../how-to/deployment-types.md#provisioned)| Regional Provisioned Throughput Unit |
44
+
|[Regional Provisioned](../how-to/deployment-types.md#regional-provisioned)| Regional Provisioned Throughput Unit |
45
45
|[Global Provisioned](../how-to/deployment-types.md#global-provisioned)| Global Provisioned Throughput Unit |
46
46
|[Data zone Provisioned](../how-to/deployment-types.md#data-zone-provisioned)| Data Zone Provisioned Throughput Unit |
47
47
@@ -59,7 +59,7 @@ If the deployment size is changed, the costs of the deployment will adjust to ma
59
59
60
60
Paying for regional provisioned, data zone provisioned, and global provisioned deployments on an hourly basis is ideal for short-term deployment scenarios. For example: Quality and performance benchmarking of new models, or temporarily increasing PTU capacity to cover an event such as a hackathon.
61
61
62
-
Customers that require long-term usage of regional provisioned, data zone provisioned, and global provisioned deployments, however, might pay significantly less per month by purchasing a term discount via [Azure Reservations](#azure-reservations-for-azure-openai-provisioned-deployments) as discussed later in the article.
62
+
Customers that require long-term usage of regional provisioned, data zone provisioned, and global provisioned deployments, however, might pay significantly less per month by purchasing a term discount via [Azure Reservations](#azure-reservations-for-azure-ai-foundry-provisioned-throughput) as discussed later in the article.
63
63
64
64
> [!IMPORTANT]
65
65
> It's not recommended to scale production deployments according to incoming traffic and pay for them purely on an hourly basis. There are two reasons for this:
0 commit comments