articles/ai-services/openai/concepts/provisioned-throughput.md
5 additions & 5 deletions
@@ -3,7 +3,7 @@ title: Azure OpenAI Service provisioned throughput
 description: Learn about provisioned throughput and Azure OpenAI.
 ms.service: azure-ai-openai
 ms.topic: conceptual
-ms.date: 07/32/2024
+ms.date: 07/23/2024
 manager: nitinme
 author: mrbullwinkle #ChrisHMSFT
 ms.author: mbullwin #chrhoder
@@ -79,7 +79,7 @@ Provisioned quota is granted on a per subscription/region basis, and unlike Stan
 
 The new quota shows up in the AI Studio and Azure OpenAI Studio as a quota item named **Provisioned Managed Throughput Unit**. In the Studio Quota pane, expanding the quota item will show the deployments contributing to usage of the quota.
 
-:::image type="content" source="../media/provisioned/quota.png" alt-text="Screenshot of new quota UI for Azure OpenAI provisioned." lightbox="../media/provisioned/quota.png":::
+:::image type="content" source="../media/provisioned/quota.png" alt-text="Screenshot of quota UI for Azure OpenAI provisioned." lightbox="../media/provisioned/quota.png":::
 
 ## Capacity transparency and quota definitions
 
@@ -96,17 +96,17 @@ To assist users to find the capacity needed for their deployments, customers wil
 
 In AI Studio and Azure OpenAI Studio, the deployment experience will identify when a region lacks the capacity to support the desired model, version and number of PTUs, and will direct the user to a select an alternative region when needed.
 
-:::image type="content" source="../media/provisioned/check-capacity.png" alt-text="Screenshot of new quota UI for Azure OpenAI provisioned." lightbox="./media/provisioned/check-capacity.png":::
+:::image type="content" source="../media/provisioned/check-capacity.png" alt-text="Screenshot of the check capacity experience for quota for Azure OpenAI provisioned." lightbox="../media/provisioned/check-capacity.png":::
 
 Details on the new deployment experience can be found in the updated Azure OpenAI [provisioned onboarding guide](../how-to/provisioned-throughput-onboarding.md).
 
-The new [model capacities API](/rest/api/aiservices/accountmanagement/model-capacities/list?view=rest-aiservices-accountmanagement-2024-04-01-preview&tabs=HTTP) can also be used to programmatically identify the maximum sized deployment of a specified model that can be created in each region based on the availability of both quota in the subscription and service capacity in the region.
+The new [model capacities API](/rest/api/aiservices/accountmanagement/model-capacities/list?view=rest-aiservices-accountmanagement-2024-04-01-preview&tabs=HTTP&preserve-view=true) can also be used to programmatically identify the maximum sized deployment of a specified model that can be created in each region based on the availability of both quota in the subscription and service capacity in the region.
 
 If an acceptable region isn't available to support the desire model, version and/or PTUs, customers can also try the following steps:
 
 - Attempt the deployment with a smaller number of PTUs.
 - Attempt the deployment at a different time. Capacity availability changes dynamically based on customer demand and more capacity may become available later.
-- Ensure that quota is available in all acceptable regions. The [model capacities API](/rest/api/aiservices/accountmanagement/model-capacities/list?view=rest-aiservices-accountmanagement-2024-04-01-preview&tabs=HTTP) and Studio experience consider quota availability in returning alternative regions for creating a deployment.
+- Ensure that quota is available in all acceptable regions. The [model capacities API](/rest/api/aiservices/accountmanagement/model-capacities/list?view=rest-aiservices-accountmanagement-2024-04-01-preview&tabs=HTTP&preserve-view=true) and Studio experience consider quota availability in returning alternative regions for creating a deployment.
 
 ### Determining the number of PTUs needed for a workload
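
A side note on the model capacities API linked in the changed lines above: the article text says it can be used to find the largest deployment of a model that can be created in each region. A minimal sketch of that lookup, run over an already-parsed response, might look like the following. The response shape (a `value` list whose items carry `location`, `properties.skuName`, and `properties.availableCapacity`), the `ProvisionedManaged` SKU name, the sample data, and the helper name `regions_with_capacity` are all assumptions for illustration and do not come from this diff. Check them against the linked API reference before relying on them.

```python
from typing import Any


def regions_with_capacity(
    response: dict[str, Any],
    required_ptus: int,
    sku_name: str = "ProvisionedManaged",  # assumed SKU name, not from the diff
) -> list[tuple[str, int]]:
    """Return (region, available capacity) pairs that can fit the deployment.

    Assumes `response` is the parsed JSON body of a model capacities list
    call, shaped as {"value": [{"location": ..., "properties": {...}}]}.
    """
    matches = []
    for item in response.get("value", []):
        props = item.get("properties", {})
        if props.get("skuName") != sku_name:
            continue  # skip other deployment types
        available = int(props.get("availableCapacity", 0))
        if available >= required_ptus:
            matches.append((item.get("location", ""), available))
    # Largest available capacity first, region name as a tie-breaker.
    return sorted(matches, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical sample data, not real service output.
sample = {
    "value": [
        {"location": "eastus", "properties": {"skuName": "ProvisionedManaged", "availableCapacity": 300}},
        {"location": "westus", "properties": {"skuName": "ProvisionedManaged", "availableCapacity": 50}},
        {"location": "swedencentral", "properties": {"skuName": "ProvisionedManaged", "availableCapacity": 100}},
        {"location": "eastus2", "properties": {"skuName": "Standard", "availableCapacity": 1000}},
    ]
}

candidates = regions_with_capacity(sample, required_ptus=100)
```

An empty result corresponds to the fallback steps listed in the article text: retry with fewer PTUs, retry later, or confirm that quota exists in every acceptable region.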