learn-pr/azure/optimize-spend-performance-azure-openai-service-provisioned-reservations/includes/choose-purchase-azure-openai-service-provisioned-reservation.md
Lines changed: 14 additions & 13 deletions
@@ -61,10 +61,13 @@ You can also choose to replace this reservation with a new reservation purchase.
61
61
62
62
To purchase an Azure OpenAI Service provisioned reservation, choose an Azure region, add the Azure OpenAI SKU to your cart, and then choose the quantity of PTUs that you want to purchase.
63
63
64
+
> [!NOTE]
65
+
> Reservations for Global, Data Zone, and Regional deployments aren't interchangeable. You need to purchase a separate reservation for each deployment type.
66
+
64
67
To buy a reservation, you must have the owner role or the reservation purchaser role on an Azure subscription that's one of the following types:
65
68
66
69
- Enterprise (MS-AZR-0017P or MS-AZR-0148P)
67
-
- Pay-As-You-Go (MS-AZR-0003P or MS-AZR-0023P)
70
+
- Pay-as-you-go (MS-AZR-0003P or MS-AZR-0023P)
68
71
- Microsoft Customer Agreement.
69
72
70
73
Cloud solution providers (CSP) can use the Azure portal or [Partner Center](/partner-center/azure-reservations) to purchase Azure reservations. CSP partners can buy reservations on behalf of their customers in Partner Center when authorized by those customers. For more information, refer to [Buy Microsoft Azure reservations on behalf of your customers](/partner-center/azure-reservations-buying). Alternatively, once the partner has given permission to the end customer and the customer has the reservation purchaser role, the customer can purchase reservations in the Azure portal.
@@ -115,17 +118,18 @@ To buy an Azure OpenAI reservation, perform the following steps:
115
118
116
119
- **Management group.** Applies the reservation discount to the matching resource in the list of subscriptions that are a part of both the management group and billing scope. The management group scope applies to all subscriptions throughout the entire management group hierarchy. To buy a reservation for a management group, you must have at least read permission on the management group and be a reservation owner or reservation purchaser on the billing subscription.
117
120
118
-
5. Select a region to choose an Azure region that the reservation covers, and then select **Add to cart**.
121
+
5. Select the Azure region that the reservation covers.
122
+
6. Select the product that covers your deployment type (Global, Data Zone, or Regional), and then select **Add to cart**.
119
123
120
124
:::image type="content" source="../media/5-select-product-you-want-to-purchase-small.png" alt-text="A screenshot of the Select the product you want to purchase dialog box." border="true" lightbox="../media/5-select-product-you-want-to-purchase.png":::
121
125
122
-
6. In the cart, choose the quantity of PTUs that you want to purchase. For example, a quantity of 64 would cover up to 64 deployed PTUs every hour.
126
+
7. In the cart, choose the quantity of PTUs that you want to purchase. For example, a quantity of 64 would cover up to 64 deployed PTUs every hour.
123
127
124
-
7. Select **Next: Review + Buy** and review your purchase choices and their prices.
128
+
8. Select **Next: Review + Buy** and review your purchase choices and their prices.
125
129
126
-
8. Select **Buy now**.
130
+
9. Select **Buy now**.
127
131
128
-
9. After purchase, you can select **View this Reservation** to review your purchase status.
132
+
10. After purchase, you can select **View this Reservation** to review your purchase status.
129
133
130
134
## How reservation discounts apply to Azure OpenAI
131
135
@@ -143,13 +147,10 @@ The Azure OpenAI reservation application is based on an hourly comparison betwee
143
147
144
148
The following examples illustrate how the Azure OpenAI reservation discount applies, depending on the deployments.
145
149
146
-
- **A reservation is the same size as the deployed units.** For example, you purchase 100 PTUs on a reservation and you deploy 100 PTUs. In this example, you only pay the reservation price.
147
-
148
-
- **A reservation is larger than your deployed units.** For example, you purchase 300 PTUs on a reservation and you only deploy 100 PTUs. In this example, the reservation discount is applied to 100 PTUs. The remaining 200 PTUs in the reservation will go unused and won't carry forward to future billing periods.
149
-
150
-
- **A reservation is smaller than the deployed units.** For example, you purchase 200 PTUs on a reservation and you deploy 600 PTUs. In this example, the reservation discount is applied to the 200 PTUs that were used. The remaining 400 PTUs are charged at the hourly rate.
151
-
152
-
- **A reservation is the same size as the total of two deployments.** For example, you purchase 200 PTUs on a reservation and you have two deployments of 100 PTUs each. In this example, the discount is applied to the sum of deployed units.
150
+
- **A regional reservation that's exactly the same size as the regional deployed units.** For example, you purchase 100 PTUs on a regional reservation and you deploy 100 regional PTUs. In this example, you only pay the reservation price.
151
+
- **A global reservation that's larger than your global deployed units.** For example, you purchase 300 PTUs on a global reservation and you only deploy 100 global PTUs. In this example, the global reservation discount is applied to 100 global PTUs. The remaining 200 PTUs in the global reservation will go unused and won't carry forward to future billing periods.
152
+
- **A data zone reservation that's smaller than the data zone deployed units.** For example, you purchase 200 PTUs on a data zone reservation and you deploy 600 data zone PTUs. In this example, the data zone reservation discount is applied to the 200 data zone PTUs that were used. The remaining 400 data zone PTUs are charged at the pay-as-you-go rate.
153
+
- **A regional reservation that's the same size as the total of two regional deployments.** For example, you purchase 200 regional PTUs on a reservation and you have two deployments of 100 regional PTUs each. In this example, the discount is applied to the sum of deployed units.
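
The four examples above all follow the same hourly arithmetic, which can be sketched in a few lines of Python. This is only an illustrative model, not Azure's billing implementation: the names `HourlyCoverage` and `hourly_coverage` are hypothetical, and the sketch assumes the reservation and all listed deployments share the same deployment type (Global, Data Zone, or Regional) and region, because reservations for different deployment types aren't interchangeable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HourlyCoverage:
    """Result of comparing reserved and deployed PTUs for one hour (hypothetical type)."""
    covered_ptus: int   # deployed PTUs covered by the reservation price
    unused_ptus: int    # reserved PTUs with nothing to cover; they don't carry forward
    overage_ptus: int   # deployed PTUs charged at the hourly (pay-as-you-go) rate


def hourly_coverage(reserved_ptus: int, deployments: list[int]) -> HourlyCoverage:
    """Compare one reservation against the sum of matching deployments for one hour.

    Assumption: every deployment in the list matches the reservation's
    deployment type and region.
    """
    if reserved_ptus < 0 or any(d < 0 for d in deployments):
        raise ValueError("PTU quantities must be non-negative")
    deployed = sum(deployments)
    covered = min(reserved_ptus, deployed)
    return HourlyCoverage(
        covered_ptus=covered,
        unused_ptus=reserved_ptus - covered,
        overage_ptus=deployed - covered,
    )


# The data zone example: 200 reserved PTUs against 600 deployed PTUs.
print(hourly_coverage(200, [600]))
```

Summing the deployments before comparing reflects the last example, where two 100-PTU deployments are jointly covered by a single 200-PTU reservation.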
learn-pr/azure/optimize-spend-performance-azure-openai-service-provisioned-reservations/includes/estimate-request-deploy-provisioned-throughput-units.md
Lines changed: 3 additions & 0 deletions
@@ -58,6 +58,9 @@ For example, the following screenshot displays a quota limit of 500 PTUs in West
58
58
59
59
By default, PTU quota is available in many regions. If additional quota is required, customers can request it by using the **Request Quota** link next to the **Provisioned Managed Throughput Unit** quota item in Azure OpenAI Foundry. The form allows customers to request an increase in the PTU quota for a specified region. After the request is approved, customers will receive an email at the included address, typically within two business days.
60
60
61
+
> [!NOTE]
62
+
> You must specify quota for the deployment type that you plan to use. Global PTU, Data Zone PTU, and Regional PTU each have separate quota.
63
+
61
64
## Creating a provisioned deployment - capacity is available
62
65
63
66
You can create PTUs by using Azure OpenAI resource objects within Azure. You must have an Azure OpenAI resource in each region where you intend to create a deployment. Use the Azure portal to create a resource in a region with an available quota, if required. Note that Azure OpenAI resources can support multiple types of Azure OpenAI deployments at the same time. It isn't necessary to dedicate new resources for your provisioned deployments.
learn-pr/azure/optimize-spend-performance-azure-openai-service-provisioned-reservations/includes/manage-monitor-provisioned-reservations.md
Lines changed: 25 additions & 0 deletions
@@ -86,6 +86,31 @@ If you discover that your organization's reservations are underused, you can eit
86
86
> [!NOTE]
87
87
> The total canceled commitment can't exceed (US dollar) \\$50,000 in a 12-month rolling window. Azure won't process any refund that exceeds the \\$50,000 limit in a 12-month window for either a billing profile or EA enrollment.
88
88
89
+
## Make optimizations with exchanges or scope changes
90
+
91
+
If you find that your organization's reservations are being underused, you have several ways to act.
92
+
93
+
### Exchange a reservation
94
+
95
+
If your reservation is consistently underutilized, or if you need to move your Azure OpenAI Service workloads to a new deployment, consider exchanging the reservation to align with your business needs.
96
+
97
+
- For Azure OpenAI Service provisioned reservations, you can change the region, term, deployment type, and payment option when you make an exchange.
98
+
- For example, you can exchange a quantity of 50 Global provisioned reservations with a one-year term in US West for the following:
99
+
- **New Deployment**: 50 Data Zone provisioned reservations with a one-year term in US West
100
+
- **New Region**: 50 Global provisioned reservations with a one-year term in Sweden Central
101
+
- **New Deployment and Region**: 50 Data Zone provisioned reservations with a one-year term in Sweden Central
102
+
- **New Term**: 50 Global provisioned reservations with a three-year term in US West
103
+
- When you exchange a reservation, the prorated reservation amount is refunded, and you're charged fully for the new purchase. The prorated reservation amount is the daily prorated residual value of the reservation being returned.
104
+
- The new reservation's lifetime commitment should be equal to or greater than the returned reservation's remaining commitment. For example, if a three-year reservation that costs USD 100 per month is exchanged after the 18th payment, the new reservation's lifetime commitment should be USD 1,800 or more (paid monthly or upfront).
105
+
- There are no fees or penalties for exchanges.
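
The lifetime-commitment rule in the list above is simple arithmetic, and a short Python sketch can make the USD 1,800 figure easier to verify. The function names `remaining_commitment` and `exchange_meets_commitment` are hypothetical, and the sketch assumes a reservation that's paid in equal monthly installments. It models only the commitment comparison, not the prorated refund.

```python
def remaining_commitment(monthly_price: float, term_months: int, payments_made: int) -> float:
    """Return the unpaid commitment on a monthly-billed reservation (illustrative only)."""
    if not 0 <= payments_made <= term_months:
        raise ValueError("payments_made must be between 0 and term_months")
    return monthly_price * (term_months - payments_made)


def exchange_meets_commitment(new_lifetime_commitment: float, returned_remaining: float) -> bool:
    """The new reservation's lifetime commitment must be at least the returned remainder."""
    return new_lifetime_commitment >= returned_remaining


# Three-year term (36 months) at USD 100 per month, exchanged after the 18th payment.
remaining = remaining_commitment(100, 36, 18)
print(remaining)                                   # 1800
print(exchange_meets_commitment(1800, remaining))  # True
print(exchange_meets_commitment(1200, remaining))  # False
```

With 18 of 36 payments made, 18 payments of USD 100 remain, which gives the USD 1,800 minimum stated in the example.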
106
+
107
+
### Change the reservation scope
108
+
109
+
If you scope your reservation to a single subscription or resource group, then it’s possible that another subscription or resource group has matching resources that can benefit from the reservation. Consider one of the following two actions:
110
+
111
+
- Change the reservation scope to shared scope.
112
+
- Split the reservation into smaller chunks and assign them individually to scopes that have utilization for matching resources.
113
+
89
114
## Analyze and report after reservation purchase
90
115
91
116
You can create several reports to help you analyze and perform reporting after buying a reservation. When it comes to cost reporting, it is important to understand two concepts:
learn-pr/azure/optimize-spend-performance-azure-openai-service-provisioned-reservations/includes/summary.md
Lines changed: 7 additions & 17 deletions
@@ -4,29 +4,19 @@ In this module, you learned how to efficiently allocate and use resources needed
4
4
5
5
You implemented the following process to resolve Contoso's requirements:
6
6
7
-
1. Choose to use [standard or provisioned deployments](/azure/ai-services/openai/how-to/deployment-types):
7
+
1. Choose to use [standard or provisioned deployments](/azure/ai-services/openai/how-to/deployment-types): Contoso decides to use provisioned.
8
8
9
-
Contoso decides to use provisioned.
9
+
2. Choose to use [global, data zone, or regional deployments](/azure/ai-services/openai/how-to/deployment-types): Contoso decides to use data zone.
10
10
11
-
2. Use [capacity calculator to forecast PTU](/azure/ai-services/openai/how-to/provisioned-throughput-onboarding) usage based off estimated configuration and usage in specific region:
11
+
3. Use the [capacity calculator to forecast PTU](/azure/ai-services/openai/how-to/provisioned-throughput-onboarding) usage based on the estimated configuration and usage in a specific region: Contoso estimates they'll need 100 PTUs of GPT-4o in US West.
12
12
13
-
Contoso estimates they'll need 100 PTUs: GPT -4os in US West.
13
+
4. [Check PTU quota](/azure/ai-services/openai/how-to/provisioned-get-started) based on the chosen region in Azure OpenAI Foundry: Contoso confirms 100 PTUs of GPT-4o are available in US West.
14
14
15
-
3. [Check PTU quota](/azure/ai-services/openai/how-to/provisioned-get-started) based on chosen region in Azure OpenAI Foundry:
15
+
5. Create an [Azure OpenAI resource](/azure/ai-services/openai/how-to/provisioned-get-started) in the chosen region: Contoso creates an Azure OpenAI resource in US West.
16
16
17
-
Contoso confirms 100 PTUs of PGT -4os are available in US West.
17
+
6. [Create a provisioned deployment](/azure/ai-services/openai/how-to/provisioned-get-started) in a region where quota is available: Contoso creates a deployment in US West using PTU hourly.
18
18
19
-
4. Create an [Azure OpenAI resource](/azure/ai-services/openai/how-to/provisioned-get-started) in chosen region:
20
-
21
-
Contoso creates an Azure OpenAI resource in US West.
22
-
23
-
5. [Create provisioned deployment](/azure/ai-services/openai/how-to/provisioned-get-started) in region where quota is available:
24
-
25
-
Contoso creates a deployment in US West using PTU hourly.
26
-
27
-
6. [Purchase provisioned reservations to](/azure/cost-management-billing/reservations/azure-openai) cover PTU hourly usage in specific region:
28
-
29
-
Contoso purchases 100 PTU reservations (monthly or yearly) in US west to get savings on 100 PTU hourly deployments.
19
+
7. [Purchase provisioned reservations](/azure/cost-management-billing/reservations/azure-openai) to cover PTU hourly usage in a specific region: Contoso purchases 100 PTU reservations (monthly or yearly) in US West to get savings on 100 PTU hourly deployments.
learn-pr/azure/optimize-spend-performance-azure-openai-service-provisioned-reservations/includes/understand-azure-openai-deployment-models.md
Lines changed: 7 additions & 0 deletions
@@ -30,6 +30,13 @@ As part of your solution design, you need to make two key decisions:
30
30
31
31
**Data zone provisioned**: Data zone provisioned deployments are available in the same Azure OpenAI resource as all other Azure OpenAI deployment types but allow you to leverage Azure global infrastructure to dynamically route traffic to the data center within the Microsoft specified data zone with the best availability for each request. Data zone provisioned deployments provide reserved model processing capacity for high and predictable throughput using Azure infrastructure within the Microsoft specified data zone.
32
32
33
+
**Regional Standard**: Standard deployments provide a pay-per-token billing model on the chosen model. It's the fastest way to get started because you only pay for what you consume during service usage. Specifically, for Azure OpenAI Service, the Standard deployment type lets you pay only for tokens processed. Standard deployments are optimized for low-to-medium-volume workloads with high burstiness. Customers with high, consistent volume may experience more significant latency variability.
34
+
35
+
**Regional Provisioned**: Provisioned deployments allow you to specify the amount of throughput required in your Azure OpenAI Service deployment. The service then allocates the necessary model processing capacity and ensures it's ready for you. Throughput is defined in terms of provisioned throughput units or PTUs, which is a normalized way of representing the throughput for your deployment. This deployment type ensures consistent throughput and minimal latency variance for scalable solutions.
36
+
37
+
> [!NOTE]
38
+
> All deployments can perform the same inference operations. However, the billing, scale, and performance are substantially different.
39
+
33
40
## What does the provisioned deployment type provide?
learn-pr/azure/optimize-spend-performance-azure-openai-service-provisioned-reservations/knowledge-check.yml
Lines changed: 4 additions & 4 deletions
@@ -143,11 +143,11 @@ quiz:
143
143
- content: "Can you exchange Azure OpenAI Service provisioned reservations?"
144
144
choices:
145
145
- content: "Yes"
146
-
isCorrect: false
147
-
explanation: "Incorrect. You can't exchange Azure OpenAI Service provisioned reservations."
148
-
- content: "No"
149
146
isCorrect: true
150
-
explanation: "Correct. You can't exchange Azure OpenAI Service provisioned reservations."
147
+
explanation: "Correct. You can exchange Azure OpenAI Service provisioned reservations."
148
+
- content: "No"
149
+
isCorrect: false
150
+
explanation: "Incorrect. You can exchange Azure OpenAI Service provisioned reservations."
151
151
- content: "You need to know the monetary value of reservation consumption by an Azure subscription. Which cost data should you look for?"