Commit b11b945

committed
fixes
1 parent 8e93a3a commit b11b945

File tree

1 file changed

+53
-40
lines changed

articles/ai-services/openai/concepts/provisioned-migration.md

Lines changed: 53 additions & 40 deletions
Original file line number | Diff line number | Diff line change
@@ -14,7 +14,7 @@ recommendations: false
1414

1515
# Azure OpenAI provisioned August 2024 update
1616

17-
In mid-August, 2024, Microsoft is launching improvements to its Provisioned Throughput offering that address customer feedback on usability and operational agility, and that open new payment options and deployment scenarios.
17+
In mid-August 2024, Microsoft launched improvements to its Provisioned Throughput offering that address customer feedback on usability and operational agility, and that open new payment options and deployment scenarios.
1818

1919
This article is intended for existing users of the provisioned throughput offering. New customers should refer to the [Azure OpenAI provisioned onboarding guide](../how-to/provisioned-throughput-onboarding.md).
2020

@@ -23,123 +23,136 @@ This article is intended for existing users of the provisioned throughput offeri
2323
The capabilities below are rolling out for the Provisioned Managed offering.
2424

2525
> [!IMPORTANT]
26-
> The changes in this article do not apply to the older *Provisioned Classic (PTU-C)* offering. They only affect the Provisioned (also known as the Provisioned Managed) offering.
26+
> The changes in this article do not apply to the older *"Provisioned Classic (PTU-C)"* offering. They only affect the Provisioned (also known as the Provisioned Managed) offering.
2727
2828
### Usability improvements
2929

3030
|Feature | Benefit|
3131
|---|---|
32-
|Model-independent quota | A single quota limit covering all models/versions reduces quota administration and accelerates experimentation with new models |
33-
|Self-service quota requests | Request quota increases without engaging the sales team – many will be autoapproved |
34-
|Default provisioned-managed quota in many regions | Get started quickly without having to first request quota |
35-
|Transparent information on real-time capacity availability + New deployment flow | Reduced negotiation around availability accelerates time-to-market |
32+
|Model-independent quota | A single quota limit covering all models/versions reduces quota administration and accelerates experimentation with new models. |
33+
|Self-service quota requests | Request quota increases without engaging the sales team – many can be autoapproved. |
34+
|Default provisioned-managed quota in many regions | Get started quickly without having to first request quota. |
35+
|Transparent information on real-time capacity availability + New deployment flow | Reduced negotiation around availability accelerates time-to-market. |
3636

3737
### New hourly/reservation commercial model
3838

3939
|Feature | Benefit|
4040
|---|---|
41-
|Hourly, uncommitted usage | Hourly payment option without a required commitment enables short-term deployment scenarios |
41+
|Hourly, uncommitted usage | Hourly payment option without a required commitment enables short-term deployment scenarios. |
4242
|Term discounts via Azure Reservations | Azure reservations provide substantial discounts over the hourly rate for one-month and one-year terms, and provide flexible scopes that minimize the administration associated with today’s resource-bound commitments.|
43-
| Default provisioned-managed quota in many regions | Get started quickly in new regions without having to first request quota |
43+
| Default provisioned-managed quota in many regions | Get started quickly in new regions without having to first request quota. |
4444
| Flexible choice of payment model for existing provisioned customers | Customers with commitments can stay on the commitment model at least through the end of 2024, and can choose to migrate existing commitments to hourly/reservations via a self-service or managed process. |
45-
| Supports latest model generations | The hourly/reservation model will be required to deploy models released after June 28, 2024. |
45+
| Supports latest model generations | The hourly/reservation model is required to deploy models released after August 1, 2024. |
4646

47-
### Usability improvement details
47+
## Usability improvement details
4848

49-
Provisioned quota granularity is changing from model-specific to model-independent. Rather than each model and version within subscription and region having its own quota limit, there will be a single quota item per subscription and region that limits the total number of PTUs that can be deployed across all supported models and versions.
49+
Provisioned quota granularity is changing from model-specific to model-independent. Rather than each model and version within subscription and region having its own quota limit, there is a single quota item per subscription and region that limits the total number of PTUs that can be deployed across all supported models and versions.
5050

51-
Starting August 12, 2024, existing customers will have their current, model-specific quota converted to model-independent. This will happen automatically and be complete by August 14, 2024. No quota will be lost in the transition. Existing quota limits will be summed and assigned to a new model-independent quota item.
51+
## Model-independent quota
52+
53+
As of August 12, 2024, existing customers' current, model-specific quota has been converted to model-independent quota. The conversion is automatic, and no quota is lost in the transition. Existing quota limits are summed and assigned to a new model-independent quota item.
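
The consolidation described above is a simple sum per subscription and region. The following is a minimal sketch of that arithmetic, not the service's actual implementation; the subscription names, regions, models, and PTU values are hypothetical placeholders chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical model-specific quota items (illustrative values only):
# (subscription, region, model, version) -> PTU limit
old_quota = {
    ("sub-1", "eastus", "gpt-4", "0613"): 100,
    ("sub-1", "eastus", "gpt-35-turbo", "1106"): 200,
    ("sub-1", "westus", "gpt-4", "0613"): 50,
}


def consolidate(model_specific):
    """Sum model-specific limits into one quota item per subscription and region."""
    totals = defaultdict(int)
    for (subscription, region, _model, _version), ptus in model_specific.items():
        totals[(subscription, region)] += ptus
    return dict(totals)


# One model-independent quota item per (subscription, region).
new_quota = consolidate(old_quota)
print(new_quota)
```

Because the limits are summed rather than replaced, the total number of PTUs available in each subscription and region is unchanged by the conversion.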
5254

5355
:::image type="content" source="../media/provisioned/consolidation.png" alt-text="Diagram showing quota consolidation." lightbox="../media/provisioned/consolidation.png":::
5456

55-
The new model-independent quota will show up as a quota item named **Provisioned Managed Throughput Unit**, with model and version no longer included in the name. In the Studio Quota pane, expanding the quota item will still show all of the deployments that contribute to the quota item.
57+
The new model-independent quota shows up as a quota item named **Provisioned Managed Throughput Unit**, with model and version no longer included in the name. In the Studio Quota pane, expanding the quota item still shows all of the deployments that contribute to the quota item.
5658

5759
### Default quota
5860

59-
New and existing subscriptions will be assigned a small amount of provisioned quota in many regions. This allows customers to start using those regions without having to first request quota.
61+
New and existing subscriptions are assigned a small amount of provisioned quota in many regions. This allows customers to start using those regions without having to first request quota.
6062

61-
For existing customers, if the region already contains a quota assignment, the quota limit won't be changed for the region. For example, it will not be automatically increased by the new default amount.
63+
For existing customers, if the region already contains a quota assignment, the quota limit isn't changed for the region. For example, it isn't automatically increased by the new default amount.
6264

6365
### Self-service quota requests
6466

65-
Customers will no longer obtain quota by contacting their sales teams. Instead, they'll use the self-service quota request form and specify the PTU-Managed quota type. The form is accessible from a link to the right of the quota item. The target is to respond to all quota requests within two business days.
67+
Customers no longer obtain quota by contacting their sales teams. Instead, they use the self-service quota request form and specify the PTU-Managed quota type. The form is accessible from a link to the right of the quota item. The target is to respond to all quota requests within two business days.
6668

67-
The Quota screenshot below shows model-independent quota being used by deployments of different types, as well as the link for requesting additional quota.
69+
The quota screenshot below shows model-independent quota being used by deployments of different types, as well as the link for requesting additional quota.
6870

6971
:::image type="content" source="../media/provisioned/quota-request-type.png" alt-text="Screenshot of new request type UI for Azure OpenAI provisioned for requesting more quota." lightbox="../media/provisioned/quota-request-type.png":::
7072

73+
## Quota as a limit
74+
75+
Prior to the August update, Azure OpenAI Provisioned was only available to a few customers, and quota was allocated to maximize the ability for them to deploy and use it. With these changes, the process of acquiring quota is simplified for all users, and there is a greater likelihood of running into service capacity limitations when deployments are attempted. A new API and Studio experience are available to help users find regions where the subscription has quota and the service has capacity to support deployments of a desired model.
76+
77+
We also recommend that customers using commitments now create their deployments prior to creating or expanding commitments to cover them. This guarantees that capacity is available prior to creating a commitment and prevents over-purchase of the commitment. To support this, the restriction that prevented deployments from being created larger than their commitments has been removed. This new approach to quota, capacity availability and commitments matches what is provided under the hourly/reservation model, and the guidance to deploy before purchasing a commitment (or reservation, for the hourly model) is the same for both.
78+
79+
See the following links for more information. The guidance for reservations and commitments is the same:
80+
81+
* [Capacity Transparency](#self-service-migration)
82+
* [Sizing reservations](../how-to/provisioned-throughput-onboarding.md#important-sizing-azure-openai-provisioned-reservations)
7183

7284
## New hourly reservation payment model
7385

7486
> [!NOTE]
75-
> The following discussion of payment models does not apply to the older Provisioned Classic (PTU-C) offering. They only affect the Provisioned (aka Provisioned Managed) offering. Provisioned Classic will continue to be governed by the monthly commitment payment model, unchanged from today.
87+
> The following discussion of payment models does not apply to the older "Provisioned Classic (PTU-C)" offering. It only affects the Provisioned (also known as Provisioned Managed) offering. Provisioned Classic continues to be governed by the monthly commitment payment model, unchanged from today.
7688
77-
Microsoft has introduced a new Hourly/reservation payment model for provisioned deployments. This is in addition to the current **Commitment** payment model, which will continue to be supported at least through the end of 2024.
89+
Microsoft has introduced a new "Hourly/reservation" payment model for provisioned deployments. This is in addition to the current **Commitment** payment model, which will continue to be supported at least through the end of 2024.
7890

79-
### Commitment payment mode (current model)
91+
### Commitment payment model
8092

81-
- Regional, monthly commitment is required to use provisioned (longer terms available contractually)
93+
- Regional, monthly commitment is required to use provisioned (longer terms available contractually).
8294

8395
- Commitments are bound to Azure OpenAI resources, making moving deployments across resources difficult.
8496

8597
- Commitments can't be canceled or altered during their term, except to add new PTUs.
8698

87-
- Supports models released prior to June 29, 2024.
99+
- Supports models released prior to August 1, 2024.
88100

89101
### Hourly reservation payment model
90102

91-
- Payment model aligned with Azure standards for other products.
103+
- The payment model is aligned with Azure standards for other products.
92104

93105
- Hourly usage is supported, without commitment.
94106

95107
- One month and one year term discounts can be purchased as regional Azure Reservations.
96108

97109
- Reservations can be flexibly scoped to cover multiple subscriptions, and the scope can be changed mid-term.
98110

99-
- Supports all models, both old and new
111+
- Supports all models, both old and new.
100112

101113
> [!IMPORTANT]
102-
> **Models released after July 28, 2024 require the use of the Hourly/Reservation payment model.** They are not deployable on Azure OpenAI resources that have active commitments. To deploy models released after July 28, existing customers must either:
103-
> - Create deployments on new Azure OpenAI resources without commitments.
114+
> **Models released after August 1, 2024 require the use of the Hourly/Reservation payment model.** They are not deployable on Azure OpenAI resources that have active commitments. To deploy models released after August 1, existing customers must either:
115+
> - Create deployments on Azure OpenAI resources without commitments.
104116
> - Migrate an existing resource off its commitments.
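
The deployment rule in the note above can be expressed as a small check. This is an illustrative sketch only: the function name and parameters are hypothetical, and the text does not say how a model released exactly on August 1, 2024 is treated, so the sketch assumes it does not count as "after".

```python
from datetime import date

# Cutoff taken from the note above. Treating a release on the cutoff date itself
# as "not after" is an assumption of this sketch.
CUTOFF = date(2024, 8, 1)


def can_deploy(model_release: date, resource_has_active_commitment: bool) -> bool:
    """Return True if a model may be deployed on the given resource.

    Models released after the cutoff require the Hourly/Reservation payment
    model, so they cannot go on a resource with an active commitment.
    """
    requires_hourly = model_release > CUTOFF
    return not (requires_hourly and resource_has_active_commitment)


# A newer model is blocked on a committed resource but allowed elsewhere.
print(can_deploy(date(2024, 9, 1), resource_has_active_commitment=True))
print(can_deploy(date(2024, 9, 1), resource_has_active_commitment=False))
```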
105117
106118

107119
## Hourly reservation model details
108120

109-
Details on the hourly/reservation model can be found in the [Azure OpenAI Provisioned Onboarding Guide](../how-to/provisioned-throughput-onboarding.md)
121+
Details on the hourly/reservation model can be found in the [Azure OpenAI Provisioned Onboarding Guide](../how-to/provisioned-throughput-onboarding.md).
110122

111123
### Commitment and hourly reservation coexistence
112124

113-
Customers that have commitments today aren't required to use the hourly/reservation model. They can continue to use existing commitments, purchase new commitments, and manage commitments as they do today.
125+
Customers that have commitments aren't required to use the hourly/reservation model. They can continue to use existing commitments, purchase new commitments, and manage commitments as they do currently.
114126

115127
A customer can also decide to use both payment models in the same subscription/region. In this case, **the payment model for a deployment depends on the resource to which it is attached.**
116128

117-
**Deployments on resources with active commitments will follow the commitment payment model.**
129+
**Deployments on resources with active commitments follow the commitment payment model.**
118130

119-
- The monthly commitment purchase will cover the deployed PTUs.
131+
- The monthly commitment purchase covers the deployed PTUs.
120132

121-
- Hourly overage charges will be generated if the deployed PTUs ever become greater than the committed PTUs.
133+
- Hourly overage charges are generated if the deployed PTUs ever become greater than the committed PTUs.
122134

123-
- All existing discounts attached to the monthly commitment SKU will continue to apply.
135+
- All existing discounts attached to the monthly commitment SKU continue to apply.
124136

125137
- **Azure Reservations DO NOT apply additional discounts on top of the monthly commitment SKU**; however, they do apply discounts to any overages (this behavior is new).
126138

127-
- The **Manage Commitments** page in Studio will be used to purchase and manage commitments.
139+
- The **Manage Commitments** page in Studio is used to purchase and manage commitments.
128140

129-
Deployments on resources without commitments (or only expired commitments) will follow the Hourly/Reservation payment model.
130-
- Deployments will generate hourly charges under the new Hourly/Reservation SKU and meter.
141+
Deployments on resources without commitments (or only expired commitments) follow the Hourly/Reservation payment model.
142+
- Deployments generate hourly charges under the new Hourly/Reservation SKU and meter.
131143
- Azure Reservations can be purchased to discount the PTUs for deployments.
132144
- Reservations are purchased and managed from the Reservation blade of the Azure portal (not within Studio).
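
As a rough illustration of the coexistence rules above, the payment model for a deployment follows from whether its resource has an active commitment. The classes and field names below are hypothetical, and treating a commitment as active through its expiration date is an assumption of this sketch.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class Commitment:
    expires: date  # hypothetical field: last day the commitment is active


@dataclass
class Resource:
    commitments: List[Commitment] = field(default_factory=list)


def payment_model(resource: Resource, today: date) -> str:
    """Pick the payment model for deployments attached to a resource."""
    # Resources with no commitments, or only expired ones, are billed hourly.
    has_active = any(c.expires >= today for c in resource.commitments)
    return "commitment" if has_active else "hourly/reservation"


today = date(2024, 9, 15)
committed = Resource([Commitment(expires=date(2024, 9, 30))])
lapsed = Resource([Commitment(expires=date(2024, 8, 31))])
print(payment_model(committed, today), payment_model(lapsed, today))
```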
133145

134-
If a deployment is on a resource that has a commitment, and that commitment expires. The deployment will automatically shift to be billed
146+
If a deployment is on a resource that has a commitment and that commitment expires, the deployment automatically shifts to being billed under the Hourly/Reservation payment model.
135147

136148
### Changes to the existing payment mode
137149

138-
Customers that have commitments today can continue to use them at least through the end of 2024. This includes purchasing new PTUs on new or existing commitments and managing commitment renewal behaviors. However, the changes on July 29, 2024 will change these aspects of commitment operation.
150+
Customers that have commitments today can continue to use them at least through the end of 2024. This includes purchasing new PTUs on new or existing commitments and managing commitment renewal behaviors. However, the August update has changed certain aspects of commitment operation.
139151

140-
- Only models released as provisioned July 28, 2024 or before can be deployed on a resource with a commitment.
152+
- Only models released as provisioned before August 1, 2024 can be deployed on a resource with a commitment.
141153

142-
- Overage charges will be emitted against the hourly SKU used for the hourly/reservations model, allowing the overage charges to be discounted by an Azure Reservation if one exists.
154+
- If the deployed PTUs under a commitment exceed the committed PTUs, the hourly overage charges will be emitted against the same hourly meter as used for the new hourly/reservation payment model. This allows the overage charges to be discounted via an Azure Reservation.
155+
- It is possible to deploy more PTUs than are committed on the resource. This supports the ability to guarantee capacity availability prior to increasing the commitment size to cover it.
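
The overage behavior in the list above reduces to a small calculation: only the PTUs deployed beyond the committed amount are billed hourly. The sketch below is illustrative; the function names are hypothetical and the hourly rate is a placeholder parameter, not a published price.

```python
def overage_ptus(deployed: int, committed: int) -> int:
    """PTUs deployed beyond the commitment; zero when within the commitment."""
    return max(0, deployed - committed)


def hourly_overage_charge(deployed: int, committed: int,
                          hourly_rate_per_ptu: float, hours: float = 1.0) -> float:
    """Charge for overage PTUs at a caller-supplied (placeholder) hourly rate."""
    return overage_ptus(deployed, committed) * hourly_rate_per_ptu * hours


# 300 PTUs deployed against a 200 PTU commitment leaves 100 PTUs of overage.
print(overage_ptus(300, 200))
print(hourly_overage_charge(300, 200, hourly_rate_per_ptu=2.0, hours=3))
```

Because these overage charges are emitted against the same hourly meter as the hourly/reservation model, an Azure Reservation can discount them; the sketch does not model that discount.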
143156

144157
## Migrating existing resources off commitments
145158

@@ -171,7 +184,7 @@ An alternative approach to self-service migration is to switch the reservation p
171184
* There will be a short period of double-billing or hourly charges during the switchover from committed to hourly/reservation billing.
172185

173186
> [!IMPORTANT]
174-
> Both self-service approaches will generate some additional charges as the payment mode is switched from Committed to Hourly/Reservation. These are characteristics of the migration approaches and customers will not be credited for these charges. Customers may choose to use the managed migration approach described below to avoid them.
187+
> Both self-service approaches generate some additional charges as the payment mode is switched from Committed to Hourly/Reservation. These are characteristics of the migration approaches and customers aren't credited for these charges. Customers may choose to use the managed migration approach described below to avoid them.
175188
176189
### Managed migration
177190
