articles/ai-services/openai/concepts/provisioned-migration.md
Lines changed: 53 additions & 40 deletions
Original file line number
Diff line number
Diff line change
@@ -14,7 +14,7 @@ recommendations: false
14
14
15
15
# Azure OpenAI provisioned August 2024 update
16
16
17
-
In mid-August, 2024, Microsoft is launching improvements to its Provisioned Throughput offering that address customer feedback on usability and operational agility, and that open new payment options and deployment scenarios.
17
+
In mid-August 2024, Microsoft launched improvements to its Provisioned Throughput offering that address customer feedback on usability and operational agility, and that open new payment options and deployment scenarios.
18
18
19
19
This article is intended for existing users of the provisioned throughput offering. New customers should refer to the [Azure OpenAI provisioned onboarding guide](../how-to/provisioned-throughput-onboarding.md).
20
20
@@ -23,123 +23,136 @@ This article is intended for existing users of the provisioned throughput offeri
23
23
The capabilities below are rolling out for the Provisioned Managed offering.
24
24
25
25
> [!IMPORTANT]
26
-
> The changes in this article do not apply to the older *“Provisioned Classic (PTU-C)”* offering. They only affect the Provisioned (also known as the Provisioned Managed) offering.
26
+
> The changes in this article do not apply to the older *"Provisioned Classic (PTU-C)"* offering. They only affect the Provisioned (also known as the Provisioned Managed) offering.
27
27
28
28
### Usability improvements
29
29
30
30
|Feature | Benefit|
31
31
|---|---|
32
-
|Model-independent quota | A single quota limit covering all models/versions reduces quota administration and accelerates experimentation with new models |
33
-
|Self-service quota requests | Request quota increases without engaging the sales team – many will be autoapproved |
34
-
|Default provisioned-managed quota in many regions | Get started quickly without having to first request quota |
35
-
|Transparent information on real-time capacity availability + New deployment flow | Reduced negotiation around availability accelerates time-to-market |
32
+
|Model-independent quota | A single quota limit covering all models/versions reduces quota administration and accelerates experimentation with new models.|
33
+
|Self-service quota requests | Request quota increases without engaging the sales team – many can be autoapproved.|
34
+
|Default provisioned-managed quota in many regions | Get started quickly without having to first request quota.|
35
+
|Transparent information on real-time capacity availability + New deployment flow | Reduced negotiation around availability accelerates time-to-market.|
36
36
37
37
### New hourly/reservation commercial model
38
38
39
39
|Feature | Benefit|
40
40
|---|---|
41
-
|Hourly, uncommitted usage | Hourly payment option without a required commitment enables short-term deployment scenarios |
41
+
|Hourly, uncommitted usage | Hourly payment option without a required commitment enables short-term deployment scenarios.|
42
42
|Term discounts via Azure Reservations | Azure Reservations provide substantial discounts over the hourly rate for one-month and one-year terms, and provide flexible scopes that minimize the administration associated with today’s resource-bound commitments.|
43
-
| Default provisioned-managed quota in many regions | Get started quickly in new regions without having to first request quota |
43
+
| Default provisioned-managed quota in many regions | Get started quickly in new regions without having to first request quota.|
44
44
| Flexible choice of payment model for existing provisioned customers | Customers with commitments can stay on the commitment model at least through the end of 2024, and can choose to migrate existing commitments to hourly/reservations via a self-service or managed process. |
45
-
| Supports latest model generations | The hourly/reservation model will be required to deploy models released after June 28, 2024. |
45
+
| Supports latest model generations | The hourly/reservation model is required to deploy models released after August 1, 2024. |
46
46
47
-
###Usability improvement details
47
+
## Usability improvement details
48
48
49
-
Provisioned quota granularity is changing from model-specific to model-independent. Rather than each model and version within subscription and region having its own quota limit, there will be a single quota item per subscription and region that limits the total number of PTUs that can be deployed across all supported models and versions.
49
+
Provisioned quota granularity is changing from model-specific to model-independent. Rather than each model and version within a subscription and region having its own quota limit, there is a single quota item per subscription and region that limits the total number of PTUs that can be deployed across all supported models and versions.
50
50
51
-
Starting August 12, 2024, existing customers will have their current, model-specific quota converted to model-independent. This will happen automatically and be complete by August 14, 2024. No quota will be lost in the transition. Existing quota limits will be summed and assigned to a new model-independent quota item.
51
+
## Model-independent quota
52
+
53
+
Starting August 12, 2024, existing customers' current, model-specific quota is converted to model-independent quota. This happens automatically. No quota is lost in the transition. Existing quota limits are summed and assigned to a new model-independent quota item.
The new model-independent quota will show up as a quota item named **Provisioned Managed Throughput Unit**, with model and version no longer included in the name. In the Studio Quota pane, expanding the quota item will still show all of the deployments that contribute to the quota item.
57
+
The new model-independent quota shows up as a quota item named **Provisioned Managed Throughput Unit**, with model and version no longer included in the name. In the Studio Quota pane, expanding the quota item still shows all of the deployments that contribute to the quota item.
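
The summing rule described above can be illustrated with a minimal sketch. The subscription IDs, region names, model names, and PTU numbers below are hypothetical, and the function is only an illustration of the arithmetic, not the service's actual conversion logic.

```python
from collections import defaultdict


def convert_to_model_independent(model_quotas):
    """Sum model-specific PTU limits into one limit per (subscription, region).

    model_quotas maps (subscription, region, model, version) -> PTU limit.
    All keys and values used here are illustrative assumptions.
    """
    totals = defaultdict(int)
    for (subscription, region, _model, _version), ptus in model_quotas.items():
        # Model and version are dropped; only subscription and region remain.
        totals[(subscription, region)] += ptus
    return dict(totals)


# Hypothetical model-specific quota before the conversion.
example_quotas = {
    ("sub-1", "region-a", "model-x", "v1"): 100,
    ("sub-1", "region-a", "model-y", "v2"): 50,
    ("sub-1", "region-b", "model-x", "v1"): 200,
}

converted = convert_to_model_independent(example_quotas)
```

In this sketch, the two model-specific limits in `region-a` collapse into a single item, so no quota is lost and any supported model can draw on the combined limit.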
56
58
57
59
### Default quota
58
60
59
-
New and existing subscriptions will be assigned a small amount of provisioned quota in many regions. This allows customers to start using those regions without having to first request quota.
61
+
New and existing subscriptions are assigned a small amount of provisioned quota in many regions. This allows customers to start using those regions without having to first request quota.
60
62
61
-
For existing customers, if the region already contains a quota assignment, the quota limit won't be changed for the region. For example, it will not be automatically increased by the new default amount.
63
+
For existing customers, if the region already contains a quota assignment, the quota limit isn't changed for the region. For example, it isn't automatically increased by the new default amount.
62
64
63
65
### Self-service quota requests
64
66
65
-
Customers will no longer obtain quota by contacting their sales teams. Instead, they'll use the self-service quota request form and specify the PTU-Managed quota type. The form is accessible from a link to the right of the quota item. The target is to respond to all quota requests within two business days.
67
+
Customers no longer obtain quota by contacting their sales teams. Instead, they use the self-service quota request form and specify the PTU-Managed quota type. The form is accessible from a link to the right of the quota item. The target is to respond to all quota requests within two business days.
66
68
67
-
The Quota screenshot below shows model-independent quota being used by deployments of different types, as well as the link for requesting additional quota.
69
+
The quota screenshot below shows model-independent quota being used by deployments of different types, as well as the link for requesting additional quota.
68
70
69
71
:::image type="content" source="../media/provisioned/quota-request-type.png" alt-text="Screenshot of new request type UI for Azure OpenAI provisioned for requesting more quota." lightbox="../media/provisioned/quota-request-type.png":::
70
72
73
+
## Quota as a limit
74
+
75
+
Prior to the August update, Azure OpenAI Provisioned was only available to a few customers, and quota was allocated to maximize the ability for them to deploy and use it. With these changes, the process of acquiring quota is simplified for all users, and there is a greater likelihood of running into service capacity limitations when deployments are attempted. A new API and Studio experience are available to help users find regions where the subscription has quota and the service has capacity to support deployments of a desired model.
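
The distinction between quota (a per-subscription limit) and capacity (what the service can currently host) can be sketched as a simple filter. This example does not call the API mentioned above; the dictionaries, region names, and numbers are hypothetical stand-ins for whatever that API or the Studio experience reports.

```python
def deployable_regions(desired_ptus, quota_limit, quota_used, capacity_available):
    """Return regions that have both enough unused quota and enough capacity.

    quota_limit, quota_used, and capacity_available map region -> PTUs.
    These inputs are assumptions made for illustration only.
    """
    regions = []
    for region, limit in quota_limit.items():
        remaining_quota = limit - quota_used.get(region, 0)
        capacity = capacity_available.get(region, 0)
        # A deployment needs quota AND capacity; either one alone is not enough.
        if remaining_quota >= desired_ptus and capacity >= desired_ptus:
            regions.append(region)
    return sorted(regions)


# Hypothetical inputs: region-a has quota but no capacity, region-c has neither.
candidates = deployable_regions(
    desired_ptus=100,
    quota_limit={"region-a": 300, "region-b": 300, "region-c": 50},
    quota_used={"region-a": 100, "region-b": 150},
    capacity_available={"region-a": 0, "region-b": 500, "region-c": 500},
)
```

The point of the sketch is that having quota in a region does not by itself mean a deployment will succeed there.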
76
+
77
+
We also recommend that customers using commitments now create their deployments prior to creating or expanding commitments to cover them. This guarantees that capacity is available prior to creating a commitment and prevents over-purchase of the commitment. To support this, the restriction that prevented deployments from being created larger than their commitments has been removed. This new approach to quota, capacity availability and commitments matches what is provided under the hourly/reservation model, and the guidance to deploy before purchasing a commitment (or reservation, for the hourly model) is the same for both.
78
+
79
+
See the following links for more information. The guidance for reservations and commitments is the same:
> The following discussion of payment models does not apply to the older “Provisioned Classic (PTU-C)” offering. They only affect the Provisioned (aka Provisioned Managed) offering. Provisioned Classic will continue to be governed by the monthly commitment payment model, unchanged from today.
87
+
> The following discussion of payment models does not apply to the older "Provisioned Classic (PTU-C)" offering. It applies only to the Provisioned (also known as Provisioned Managed) offering. Provisioned Classic continues to be governed by the monthly commitment payment model, unchanged from today.
76
88
77
-
Microsoft has introduced a new “Hourly/reservation” payment model for provisioned deployments. This is in addition to the current **Commitment** payment model, which will continue to be supported at least through the end of 2024.
89
+
Microsoft has introduced a new "Hourly/reservation" payment model for provisioned deployments. This is in addition to the current **Commitment** payment model, which will continue to be supported at least through the end of 2024.
78
90
79
-
### Commitment payment mode (current model)
91
+
### Commitment payment model
80
92
81
-
- Regional, monthly commitment is required to use provisioned (longer terms available contractually)
93
+
- Regional, monthly commitment is required to use provisioned (longer terms available contractually).
82
94
83
95
- Commitments are bound to Azure OpenAI resources, making moving deployments across resources difficult.
84
96
85
97
- Commitments can't be canceled or altered during their term, except to add new PTUs.
86
98
87
-
- Supports models released prior to June 29, 2024.
99
+
- Supports models released prior to August 1, 2024.
88
100
89
101
### Hourly reservation payment model
90
102
91
-
-Payment model aligned with Azure standards for other products.
103
+
- The payment model is aligned with Azure standards for other products.
92
104
93
105
- Hourly usage is supported, without commitment.
94
106
95
107
- One month and one year term discounts can be purchased as regional Azure Reservations.
96
108
97
109
- Reservations can be flexibly scoped to cover multiple subscriptions, and the scope can be changed mid-term.
98
110
99
-
- Supports all models, both old and new
111
+
- Supports all models, both old and new.
100
112
101
113
> [!IMPORTANT]
102
-
> **Models released after July 28, 2024 require the use of the Hourly/Reservation payment model.** They are not deployable on Azure OpenAI resources that have active commitments. To deploy models released after July 28, exiting customers must either:
103
-
> - Create deployments on new Azure OpenAI resources without commitments.
114
+
> **Models released after August 1, 2024 require the use of the Hourly/Reservation payment model.** They are not deployable on Azure OpenAI resources that have active commitments. To deploy models released after August 1, existing customers must either:
115
+
> - Create deployments on Azure OpenAI resources without commitments.
104
116
> - Migrate an existing resource off its commitments.
105
117
106
118
107
119
## Hourly reservation model details
108
120
109
-
Details on the hourly/reservation model can be found in the [Azure OpenAI Provisioned Onboarding Guide](../how-to/provisioned-throughput-onboarding.md)
121
+
Details on the hourly/reservation model can be found in the [Azure OpenAI Provisioned Onboarding Guide](../how-to/provisioned-throughput-onboarding.md).
110
122
111
123
### Commitment and hourly reservation coexistence
112
124
113
-
Customers that have commitments today aren't required to use the hourly/reservation model. They can continue to use existing commitments, purchase new commitments, and manage commitments as they do today.
125
+
Customers that have commitments aren't required to use the hourly/reservation model. They can continue to use existing commitments, purchase new commitments, and manage commitments as they do currently.
114
126
115
127
A customer can also decide to use both payment models in the same subscription/region. In this case, **the payment model for a deployment depends on the resource to which it is attached.**
116
128
117
-
**Deployments on resources with active commitments will follow the commitment payment model.**
129
+
**Deployments on resources with active commitments follow the commitment payment model.**
118
130
119
-
- The monthly commitment purchase will cover the deployed PTUs.
131
+
- The monthly commitment purchase covers the deployed PTUs.
120
132
121
-
- Hourly overage charges will be generated if the deployed PTUs ever become greater than the committed PTUs.
133
+
- Hourly overage charges are generated if the deployed PTUs ever become greater than the committed PTUs.
122
134
123
-
- All existing discounts attached to the monthly commitment SKU will continue to apply.
135
+
- All existing discounts attached to the monthly commitment SKU continue to apply.
124
136
125
137
- **Azure Reservations DO NOT apply additional discounts on top of the monthly commitment SKU**; however, they do apply discounts to any overages (this behavior is new).
126
138
127
-
- The **Manage Commitments** page in Studio will be used to purchase and manage commitments.
139
+
- The **Manage Commitments** page in Studio is used to purchase and manage commitments.
128
140
129
-
Deployments on resources without commitments (or only expired commitments) will follow the Hourly/Reservation payment model.
130
-
- Deployments will generate hourly charges under the new Hourly/Reservation SKU and meter.
141
+
Deployments on resources without commitments (or only expired commitments) follow the Hourly/Reservation payment model.
142
+
- Deployments generate hourly charges under the new Hourly/Reservation SKU and meter.
131
143
- Azure Reservations can be purchased to discount the PTUs for deployments.
132
144
- Reservations are purchased and managed from the Reservation blade of the Azure portal (not within Studio).
133
145
134
-
If a deployment is on a resource that has a commitment, and that commitment expires. The deployment will automatically shift to be billed
146
+
If a deployment is on a resource that has a commitment, and that commitment expires, the deployment automatically shifts to being billed under the Hourly/Reservation payment model.
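
The coexistence rules above can be summarized in a short sketch: the payment model follows the resource, and overage applies only when deployed PTUs exceed committed PTUs. The `Resource` class, field names, and numbers are hypothetical and exist only to illustrate the rules; they are not part of any Azure SDK.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    """Hypothetical view of an Azure OpenAI resource for billing purposes."""
    deployed_ptus: int
    committed_ptus: int = 0
    commitment_active: bool = False  # False also covers expired commitments


def billing_summary(resource):
    """Classify a resource's payment model and split its PTUs by meter."""
    if resource.commitment_active:
        # PTUs above the commitment are billed hourly as overage.
        overage = max(0, resource.deployed_ptus - resource.committed_ptus)
        return {
            "payment_model": "commitment",
            "commitment_ptus": resource.committed_ptus,
            "hourly_ptus": overage,
        }
    # No active commitment: every deployed PTU is billed on the hourly meter,
    # where an Azure Reservation (if one exists) can discount it.
    return {
        "payment_model": "hourly/reservation",
        "commitment_ptus": 0,
        "hourly_ptus": resource.deployed_ptus,
    }


with_commitment = billing_summary(Resource(300, committed_ptus=200, commitment_active=True))
after_expiry = billing_summary(Resource(300, committed_ptus=200, commitment_active=False))
```

In the first case, 100 PTUs are treated as hourly overage; in the second, the same deployment is billed entirely under the Hourly/Reservation model because the commitment is no longer active.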
135
147
136
148
### Changes to the existing payment model
137
149
138
-
Customers that have commitments today can continue to use them at least through the end of 2024. This includes purchasing new PTUs on new or existing commitments and managing commitment renewal behaviors. However, the changes on July 29, 2024 will change these aspects of commitment operation.
150
+
Customers that have commitments today can continue to use them at least through the end of 2024. This includes purchasing new PTUs on new or existing commitments and managing commitment renewal behaviors. However, the August update has changed certain aspects of commitment operation.
139
151
140
-
- Only models released as provisioned July 28, 2024 or before can be deployed on a resource with a commitment.
152
+
- Only models released as provisioned prior to August 1, 2024 can be deployed on a resource with a commitment.
141
153
142
-
- Overage charges will be emitted against the hourly SKU used for the hourly/reservations model, allowing the overage charges to be discounted by an Azure Reservation if one exists.
154
+
- If the deployed PTUs under a commitment exceed the committed PTUs, the hourly overage charges will be emitted against the same hourly meter as used for the new hourly/reservation payment model. This allows the overage charges to be discounted via an Azure Reservation.
155
+
- It is possible to deploy more PTUs than are committed on the resource. This supports the ability to guarantee capacity availability prior to increasing the commitment size to cover it.
143
156
144
157
## Migrating existing resources off commitments
145
158
@@ -171,7 +184,7 @@ An alternative approach to self-service migration is to switch the reservation p
171
184
* There will be a short period of double-billing or hourly charges during the switchover from committed to hourly/reservation billing.
172
185
173
186
> [!IMPORTANT]
174
-
> Both self-service approaches will generate some additional charges as the payment mode is switched from Committed to Hourly/Reservation. These are characteristics of the migration approaches and customers will not be credited for these charges. Customers may choose to use the managed migration approach described below to avoid them.
187
+
> Both self-service approaches generate some additional charges as the payment mode is switched from Committed to Hourly/Reservation. These are characteristics of the migration approaches and customers aren't credited for these charges. Customers may choose to use the managed migration approach described below to avoid them.