Commit 70cd302

update
1 parent 259873e commit 70cd302

File tree

1 file changed

+179
-0
lines changed
  • articles/ai-services/openai/how-to

articles/ai-services/openai/how-to/quota.md

@@ -104,6 +104,185 @@ To minimize issues related to rate limits, it's a good idea to use the following
- Avoid sharp changes in the workload. Increase the workload gradually.
- Test different load increase patterns.

## Automate deployment

This section contains brief example templates to help you get started with programmatically managing quota and deploying resources. With the introduction of quota, you must use API version `2023-05-01` for resource management activities. This API version applies only to managing your resources. It doesn't affect the API version used for inferencing calls such as completions, chat completions, embeddings, and image generation.

# [REST](#tab/rest)

```http
PUT https://management.azure.com/subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.CognitiveServices/accounts/{accountName}/deployments/{deploymentName}?api-version=2023-05-01
```

**Path parameters**

| Parameter | Type | Required? | Description |
|--|--|--|--|
| ```accountName``` | string | Required | The name of your Azure OpenAI Resource. |
| ```deploymentName``` | string | Required | The deployment name you chose when you deployed an existing model or the name you would like a new model deployment to have. |
| ```resourceGroupName``` | string | Required | The name of the associated resource group for this model deployment. |
| ```subscriptionId``` | string | Required | Subscription ID for the associated subscription. |
| ```api-version``` | string | Required | The API version to use for this operation. This follows the YYYY-MM-DD format. |

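As a minimal sketch of how the path parameters above combine into the request URL, the following Python fills in the template shown at the top of this tab. The function name `build_deployment_url` and the constants are hypothetical helpers introduced for illustration; only the URL shape and the `2023-05-01` API version come from this article.

```python
from urllib.parse import quote

# Values taken from the URL template in this article.
MANAGEMENT_ENDPOINT = "https://management.azure.com"
API_VERSION = "2023-05-01"


def build_deployment_url(subscription_id, resource_group_name, account_name,
                         deployment_name, api_version=API_VERSION):
    """Hypothetical helper: fill the deployment URL template from its parameters."""
    segments = [
        "subscriptions", subscription_id,
        "resourceGroups", resource_group_name,
        "providers", "Microsoft.CognitiveServices",
        "accounts", account_name,
        "deployments", deployment_name,
    ]
    # Percent-encode each segment so an unexpected character can't change the path.
    path = "/".join(quote(segment, safe="") for segment in segments)
    return f"{MANAGEMENT_ENDPOINT}/{path}?api-version={quote(api_version, safe='')}"


url = build_deployment_url(
    "00000000-0000-0000-0000-000000000000",
    "resource-group-temp",
    "docs-openai-test-001",
    "gpt-35-turbo-test-deployment",
)
print(url)
```

Keeping the API version as a default argument makes it easy to see that it's a query parameter on the management call, separate from whatever version your inferencing calls use.
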
**Supported versions**

- `2023-05-01` [Swagger spec](https://github.com/Azure/azure-rest-api-specs/blob/1e71ad94aeb8843559d59d863c895770560d7c93/specification/cognitiveservices/resource-manager/Microsoft.CognitiveServices/stable/2023-05-01/cognitiveservices.json)

**Request body**

This is only a subset of the available request body parameters. For the full list of parameters, see the [REST API reference documentation](https://learn.microsoft.com/en-us/rest/api/cognitiveservices/accountmanagement/deployments/create-or-update?tabs=HTTP).

|Parameter|Type| Description |
|--|--|--|
|sku | Sku | The resource model definition representing SKU.|
|capacity|integer|The amount of [quota](../how-to/quota.md) you're assigning to this deployment. It's set inside `sku`. A value of 1 equals 1,000 tokens per minute (TPM). A value of 10 equals 10,000 TPM.|

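The relationship between `capacity` and TPM is simple arithmetic, so a short sketch may help when you generate request bodies from a TPM target. The helper names `capacity_for_tpm` and `build_deployment_body` are hypothetical. Rejecting TPM values that aren't a multiple of 1,000 is a design choice made here because `capacity` is an integer, not documented service behavior.

```python
import json

# From the table above: a capacity of 1 equals 1,000 tokens per minute (TPM).
TOKENS_PER_CAPACITY_UNIT = 1_000


def capacity_for_tpm(tokens_per_minute):
    """Hypothetical helper: convert a TPM target into the integer capacity value."""
    if tokens_per_minute <= 0 or tokens_per_minute % TOKENS_PER_CAPACITY_UNIT != 0:
        raise ValueError("TPM must be a positive multiple of 1,000")
    return tokens_per_minute // TOKENS_PER_CAPACITY_UNIT


def build_deployment_body(tokens_per_minute, model_name, model_version, sku_name="Standard"):
    """Hypothetical helper: build the subset of the request body shown in this article."""
    return {
        "sku": {"name": sku_name, "capacity": capacity_for_tpm(tokens_per_minute)},
        "properties": {
            "model": {"format": "OpenAI", "name": model_name, "version": model_version}
        },
    }


body = build_deployment_body(10_000, "gpt-35-turbo", "0613")
print(json.dumps(body, indent=2))
```
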
#### Example request

```bash
# Quote the URL so the shell doesn't try to interpret the "?" in the query string.
curl -X PUT "https://management.azure.com/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/resource-group-temp/providers/Microsoft.CognitiveServices/accounts/docs-openai-test-001/deployments/gpt-35-turbo-test-deployment?api-version=2023-05-01" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_AUTH_TOKEN" \
  -d '{"sku":{"name":"Standard","capacity":10},"properties": {"model": {"format": "OpenAI","name": "gpt-35-turbo","version": "0613"}}}'
```

> [!NOTE]
> There are multiple ways to generate an authorization token. The easiest method for initial testing is to launch the Cloud Shell from the [Azure portal](https://portal.azure.com). Then run [`az account get-access-token`](/cli/azure/account?view=azure-cli-latest#az-account-get-access-token&preserve-view=true). You can use this token as your temporary authorization token for API testing.

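If you'd rather script the request than paste a token into curl, the following is one possible sketch using only the Python standard library. It assumes the Azure CLI is installed and signed in. The helper names `get_access_token` and `build_put_request` are hypothetical, and the request is only constructed here, not sent.

```python
import json
import subprocess
import urllib.request


def get_access_token():
    """Hypothetical helper: ask the Azure CLI for a token (requires a signed-in `az`)."""
    result = subprocess.run(
        ["az", "account", "get-access-token", "--query", "accessToken", "--output", "tsv"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def build_put_request(url, body, token):
    """Hypothetical helper: build the same PUT request as the curl example."""
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/json", "Authorization": f"Bearer {token}"},
    )


# Same placeholder values as the example request in this article.
url = ("https://management.azure.com/subscriptions/00000000-0000-0000-0000-000000000000"
       "/resourceGroups/resource-group-temp/providers/Microsoft.CognitiveServices"
       "/accounts/docs-openai-test-001/deployments/gpt-35-turbo-test-deployment"
       "?api-version=2023-05-01")
body = {
    "sku": {"name": "Standard", "capacity": 10},
    "properties": {"model": {"format": "OpenAI", "name": "gpt-35-turbo", "version": "0613"}},
}

# Swap the placeholder for get_access_token() when running against a real subscription.
request = build_put_request(url, body, "YOUR_AUTH_TOKEN")
print(request.get_method(), request.full_url)
```

To actually send it, pass `request` to `urllib.request.urlopen` and handle `urllib.error.HTTPError` for non-success responses.
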
# [Azure Resource Manager](#tab/arm)

```json
//
// This Azure Resource Manager template shows how to use the new schema introduced in the 2023-05-01 API version to
// create deployments that set the model version and the TPM limits for standard deployments.
//
{
  "type": "Microsoft.CognitiveServices/accounts/deployments",
  "apiVersion": "2023-05-01",
  "name": "arm-je-aoai-test-resource/arm-je-std-deployment", // Update reference to parent Azure OpenAI resource
  "dependsOn": [
    "[resourceId('Microsoft.CognitiveServices/accounts', 'arm-je-aoai-test-resource')]" // Update reference to parent Azure OpenAI resource
  ],
  "sku": {
    "name": "Standard",
    "capacity": 10 // The deployment will be created with a 10K TPM limit
  },
  "properties": {
    "model": {
      "format": "OpenAI",
      "name": "gpt-35-turbo",
      "version": "0613" // Version 0613 of gpt-35-turbo will be used
    }
  }
}
```

For more details, consult the [full Azure Resource Manager reference documentation](/azure/templates/microsoft.cognitiveservices/accounts/deployments?pivots=deployment-language-arm-template).

# [Bicep](#tab/bicep)

```bicep
//
// This Bicep template shows how to use the new schema introduced in the 2023-05-01 API version to
// create deployments that set the model version and the TPM limits for standard deployments.
//
resource arm_je_std_deployment 'Microsoft.CognitiveServices/accounts/deployments@2023-05-01' = {
  parent: arm_je_aoai_resource // Replace this with a reference to the parent Azure OpenAI resource
  name: 'arm-je-std-deployment'
  sku: {
    name: 'Standard'
    capacity: 10 // The deployment will be created with a 10K TPM limit
  }
  properties: {
    model: {
      format: 'OpenAI'
      name: 'gpt-35-turbo'
      version: '0613' // gpt-35-turbo version 0613 will be used
    }
  }
}
```

For more details, consult the [full Bicep reference documentation](/azure/templates/microsoft.cognitiveservices/accounts/deployments?pivots=deployment-language-bicep).

# [Terraform](#tab/terraform)

```terraform
# This Terraform template shows how to use the new schema introduced in the 2023-05-01 API version to
# create deployments that set the model version and the TPM limits for standard deployments.
#
# The new schema is not yet available in the AzureRM provider (target v4.0), so this template uses the AzAPI
# provider, which provides a Terraform-compatible interface to the underlying ARM structures.
#
# For more details on these providers:
#   AzureRM: https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs
#   AzAPI:   https://registry.terraform.io/providers/azure/azapi/latest/docs
#

terraform {
  required_providers {
    azapi   = { source = "Azure/azapi" }
    azurerm = { source = "hashicorp/azurerm" }
  }
}

provider "azapi" {
  # Insert auth info here as necessary
}

provider "azurerm" {
  # Insert auth info here as necessary
  features {
  }
}

#
# To create a complete example, AzureRM is used to create a new resource group and Azure OpenAI Resource
#
resource "azurerm_resource_group" "TERRAFORM-AOAI-TEST-GROUP" {
  name     = "TERRAFORM-AOAI-TEST-GROUP"
  location = "canadaeast"
}

resource "azurerm_cognitive_account" "TERRAFORM-AOAI-TEST-ACCOUNT" {
  name                  = "terraform-aoai-test-account"
  location              = "canadaeast"
  resource_group_name   = azurerm_resource_group.TERRAFORM-AOAI-TEST-GROUP.name
  kind                  = "OpenAI"
  sku_name              = "S0"
  custom_subdomain_name = "terraform-test-account-"
}

#
# AzAPI is used to create the deployment so that the TPM limit and model versions can be set
#
resource "azapi_resource" "TERRAFORM-AOAI-STD-DEPLOYMENT" {
  type      = "Microsoft.CognitiveServices/accounts/deployments@2023-05-01"
  name      = "TERRAFORM-AOAI-STD-DEPLOYMENT"
  parent_id = azurerm_cognitive_account.TERRAFORM-AOAI-TEST-ACCOUNT.id

  body = jsonencode({
    sku = { # The sku object specifies the deployment type and limit in 2023-05-01
      name     = "Standard",
      capacity = 10 # This deployment will be set with a 10K TPM limit
    },
    properties = {
      model = {
        format  = "OpenAI",
        name    = "gpt-35-turbo",
        version = "0613" # Deploy gpt-35-turbo version 0613
      }
    }
  })
}
```

For more details, consult the [full Terraform reference documentation](/azure/templates/microsoft.cognitiveservices/accounts/deployments?pivots=deployment-language-terraform).

---

## Resource deletion

When you try to delete an Azure OpenAI resource from the Azure portal while deployments are still present, deletion is blocked until the associated deployments are deleted. Deleting the deployments first allows quota allocations to be freed so they can be used on new deployments.

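The ordering described above (deployments first, then the resource) can be sketched as a small planning function for scripted cleanup. This is an illustration only: `deletion_plan` is a hypothetical helper, and it assumes that `DELETE` is issued against the same management paths and `2023-05-01` API version used for creating deployments, which you should confirm against the REST API reference before relying on it.

```python
API_VERSION = "2023-05-01"
MANAGEMENT_ENDPOINT = "https://management.azure.com"


def deletion_plan(subscription_id, resource_group_name, account_name, deployment_names):
    """Hypothetical helper: list (method, URL) pairs with deployments before the account."""
    account_url = (
        f"{MANAGEMENT_ENDPOINT}/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group_name}"
        f"/providers/Microsoft.CognitiveServices/accounts/{account_name}"
    )
    # Deployments go first so their quota allocations are released.
    plan = [
        ("DELETE", f"{account_url}/deployments/{name}?api-version={API_VERSION}")
        for name in deployment_names
    ]
    # The parent resource is deleted last (assumed endpoint, see the note above).
    plan.append(("DELETE", f"{account_url}?api-version={API_VERSION}"))
    return plan


plan = deletion_plan(
    "00000000-0000-0000-0000-000000000000",
    "resource-group-temp",
    "docs-openai-test-001",
    ["gpt-35-turbo-test-deployment"],
)
for method, target in plan:
    print(method, target)
```
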