articles/ai-services/openai/how-to/quota.md
50 additions & 4 deletions
@@ -8,7 +8,7 @@ manager: nitinme
 ms.service: cognitive-services
 ms.subservice: openai
 ms.topic: how-to
-ms.date: 07/31/2023
+ms.date: 08/01/2023
 ms.author: mbullwin
 ---
 
@@ -19,7 +19,8 @@ Quota provides the flexibility to actively manage the allocation of rate limits
 ## Prerequisites
 
 > [!IMPORTANT]
-> Quota requires the **Cognitive Services Usages Reader** role. This role provides the minimal access necessary to view quota usage across an Azure subscription. This role can be found in the Azure portal under **Subscriptions** > **Access control (IAM)** > **Add role assignment** > search for **Cognitive Services Usages Reader**.
+> Viewing quota and deploying models requires the **Cognitive Services Usages Reader** role. This role provides the minimal access necessary to view quota usage across an Azure subscription. This role can be found in the Azure portal under **Subscriptions** > **Access control (IAM)** > **Add role assignment** > search for **Cognitive Services Usages Reader**.
+> This role **must be applied at the subscription level**; it does not exist at the resource level. If you do not wish to use this role, the subscription **Reader** role provides equivalent access, but it also grants read access beyond the scope of what is needed for quota and model deployment.
 
 ## Introduction to quota
 
@@ -106,10 +107,12 @@ To minimize issues related to rate limits, it's a good idea to use the following
 
 ## Automate deployment
 
-This section contains brief example templates to help get you started programmatically managing quota, and deploying resources. With the introduction of quota you must use API version `2023-05-01` for resource management related activities. This API version is only for managing your resources, and does not impact the API version used for inferencing calls like completions, chat completions, embedding, image generation etc.
+This section contains brief example templates to help get you started programmatically creating deployments that use quota to set TPM rate limits. With the introduction of quota you must use API version `2023-05-01` for resource management related activities. This API version is only for managing your resources, and does not impact the API version used for inferencing calls like completions, chat completions, embedding, image generation etc.
 
 # [REST](#tab/rest)
 
+### Deployment
+
 ```http
 PUT https://management.azure.com/subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.CognitiveServices/accounts/{accountName}/deployments/{deploymentName}?api-version=2023-05-01
 ```
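As a companion to the Deployment request in this hunk, here is a minimal Python sketch that fills in the path parameters of the `PUT` URL. The helper name `deployment_url` and the sample argument values are hypothetical and only for illustration. The sketch builds the URL only; it does not send a request and does not show the request body, which is outside this part of the diff.

```python
from urllib.parse import quote

MANAGEMENT_ENDPOINT = "https://management.azure.com"
API_VERSION = "2023-05-01"  # resource-management API version named in the article


def deployment_url(subscription_id, resource_group, account_name, deployment_name):
    """Build the management URL for one deployment (hypothetical helper)."""
    parts = [
        "subscriptions", subscription_id,
        "resourceGroups", resource_group,
        "providers", "Microsoft.CognitiveServices",
        "accounts", account_name,
        "deployments", deployment_name,
    ]
    # Percent-encode each segment so unusual characters cannot break the path.
    path = "/".join(quote(part, safe="") for part in parts)
    return f"{MANAGEMENT_ENDPOINT}/{path}?api-version={API_VERSION}"


# Placeholder values, not real resources.
print(deployment_url(
    "00000000-0000-0000-0000-000000000000",
    "my-resource-group",
    "my-openai-account",
    "my-deployment",
))
```

Keeping the API version in one constant makes it harder to confuse the resource-management version with the version used for inferencing calls.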
@@ -149,6 +152,33 @@ curl -X PUT https://management.azure.com/subscriptions/00000000-0000-0000-0000-0
 > [!NOTE]
 > There are multiple ways to generate an authorization token. The easiest method for initial testing is to launch the Cloud Shell from the [Azure portal](https://portal.azure.com). Then run [`az account get-access-token`](/cli/azure/account?view=azure-cli-latest#az-account-get-access-token&preserve-view=true). You can use this token as your temporary authorization token for API testing.
 
+### Usage
+
+To query your quota usage in a given region, for a specific subscription:
+
+```http
+GET https://management.azure.com/subscriptions/{subscriptionId}/providers/Microsoft.CognitiveServices/locations/{location}/usages?api-version=2023-05-01
+```
+**Path parameters**
+
+| Parameter | Type | Required? | Description |
+|--|--|--|--|
+|```subscriptionId```| string | Required | Subscription ID for the associated subscription. |
+|```location```| string | Required | Location to view usage for, for example `eastus`. |
+|```api-version```| string | Required |The API version to use for this operation. This follows the YYYY-MM-DD format. |
+curl -X GET https://management.azure.com/subscriptions/00000000-0000-0000-0000-000000000000/providers/Microsoft.CognitiveServices/locations/eastus/usages?api-version=2023-05-01 \
+-H "Content-Type: application/json" \
+-H 'Authorization: Bearer YOUR_AUTH_TOKEN'
+```
+
 # [Azure CLI](#tab/cli)
 
 Install the [Azure CLI](/cli/azure/install-azure-cli). Quota requires `Azure CLI version 2.51.0`. If you already have Azure CLI installed locally run `az upgrade` to update to the latest version.
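The Usage `GET` request added in this hunk can also be prepared from Python. The sketch below mirrors the curl example using only the standard library. The function name `build_usage_request` is hypothetical, and the token is the same placeholder used in the curl example. The request is constructed but not sent.

```python
import urllib.request

API_VERSION = "2023-05-01"  # resource-management API version named in the article


def build_usage_request(subscription_id, location, token):
    """Prepare (but do not send) the quota usage query (hypothetical helper)."""
    url = (
        "https://management.azure.com"
        f"/subscriptions/{subscription_id}"
        "/providers/Microsoft.CognitiveServices"
        f"/locations/{location}/usages?api-version={API_VERSION}"
    )
    return urllib.request.Request(
        url,
        method="GET",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )


# Placeholder subscription ID and token, matching the curl example.
request = build_usage_request(
    "00000000-0000-0000-0000-000000000000", "eastus", "YOUR_AUTH_TOKEN"
)
print(request.get_method(), request.full_url)
```

To actually run the query, pass the prepared object to `urllib.request.urlopen(request)` with a valid token, for example one obtained from `az account get-access-token`.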
@@ -184,7 +214,23 @@ By setting sku-capacity to 10 in the command below this deployment will be set w
For more details, consult the [full Azure CLI reference documentation](https://learn.microsoft.com/en-us]/cli/azure/cognitiveservices/account/deployment?view=azure-cli-latest)
+### Usage
+
+To [query your quota usage](/cli/azure/cognitiveservices/usage?view=azure-cli-latest) in a given region, for a specific subscription:
+
+```azurecli
+az cognitiveservices usage list --location
+```
+
+### Example
+
+```azurecli
+az cognitiveservices usage list -l eastus
+```
+
+This command runs in the context of the currently active subscription for Azure CLI. Use `az account set --subscription` to [modify the active subscription](/cli/azure/manage-azure-subscriptions-azure-cli#change-the-active-subscription).
+
+For more details on `az cognitiveservices account` and `az cognitiveservices usage`, consult the [Azure CLI reference documentation](/cli/azure/cognitiveservices/account/deployment?view=azure-cli-latest).
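Once the usage list has been retrieved, a small script can turn it into remaining headroom per quota. The Python sketch below is illustrative only. It assumes each usage entry carries `name.value`, `currentValue`, and `limit` fields; that shape is an assumption and is not confirmed by this diff. The sample entry, including its quota name and numbers, is invented for the example and is not real output.

```python
import json


def remaining_quota(usage_items):
    """Map each quota name to limit minus current value.

    Assumes entries shaped like
    {"name": {"value": ...}, "currentValue": ..., "limit": ...}.
    """
    remaining = {}
    for item in usage_items:
        name = item["name"]["value"]
        remaining[name] = item["limit"] - item["currentValue"]
    return remaining


# Invented sample data for illustration; not real command output.
sample = json.loads("""
[
  {"name": {"value": "example-quota-name"}, "currentValue": 120, "limit": 240}
]
""")
print(remaining_quota(sample))
```

If the real output uses different field names, only the two lookups inside the loop need to change.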