
Commit 46b6dca

Merge pull request #245710 from mrbullwinkle/mrb_07_20_2023_quota_rbac
[Azure AI] [Azure OpenAI] update info on RBAC prereqs
2 parents: 363b8b8 + e5d9430

File tree

  • articles/ai-services/openai/how-to

1 file changed: +6 −1 lines changed

articles/ai-services/openai/how-to/quota.md

Lines changed: 6 additions & 1 deletion
```diff
@@ -8,14 +8,19 @@ manager: nitinme
 ms.service: cognitive-services
 ms.subservice: openai
 ms.topic: how-to
-ms.date: 07/18/2023
+ms.date: 07/20/2023
 ms.author: mbullwin
 ---
 
 # Manage Azure OpenAI Service quota
 
 Quota provides the flexibility to actively manage the allocation of rate limits across the deployments within your subscription. This article walks through the process of managing your Azure OpenAI quota.
 
+## Prerequisites
+
+> [!IMPORTANT]
+> Quota requires the **Cognitive Services Usages Reader** role. This role provides the minimal access necessary to view quota usage across an Azure subscription. This role can be found in the Azure portal under **Subscriptions** > **Access control (IAM)** > **Add role assignment** > search for **Cognitive Services Usages Reader**.
+
 ## Introduction to quota
 
 Azure OpenAI's quota feature enables assignment of rate limits to your deployments, up to a global limit called your "quota." Quota is assigned to your subscription on a per-region, per-model basis in units of **Tokens-per-Minute (TPM)**. When you onboard a subscription to Azure OpenAI, you'll receive default quota for most available models. Then, you'll assign TPM to each deployment as it is created, and the available quota for that model will be reduced by that amount. You can continue to create deployments and assign them TPM until you reach your quota limit. Once that happens, you can only create new deployments of that model by reducing the TPM assigned to other deployments of the same model (thus freeing TPM for use), or by requesting and being approved for a model quota increase in the desired region.
```
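The quota accounting the introduction describes can be sketched as a toy model (a hypothetical `QuotaPool` class for illustration, not the Azure API): each deployment draws TPM from a per-region, per-model pool, and a new deployment can only be created if enough unassigned TPM remains.

```python
# Illustrative model of Azure OpenAI quota accounting (not the real
# Azure API). Quota is tracked per (region, model) in Tokens-per-Minute.

class QuotaPool:
    def __init__(self, limit_tpm: int):
        self.limit_tpm = limit_tpm             # total quota for this region/model
        self.deployments: dict[str, int] = {}  # deployment name -> assigned TPM

    @property
    def available_tpm(self) -> int:
        return self.limit_tpm - sum(self.deployments.values())

    def create_deployment(self, name: str, tpm: int) -> bool:
        """Assign TPM to a new deployment if enough quota remains."""
        if tpm > self.available_tpm:
            return False  # free TPM elsewhere or request a quota increase
        self.deployments[name] = tpm
        return True

    def delete_deployment(self, name: str) -> None:
        """Deleting a deployment returns its TPM to the pool."""
        self.deployments.pop(name, None)

pool = QuotaPool(limit_tpm=240_000)                # example quota limit
assert pool.create_deployment("prod", 120_000)
assert pool.create_deployment("dev", 100_000)
assert not pool.create_deployment("test", 40_000)  # only 20,000 TPM left
pool.delete_deployment("dev")                      # frees 100,000 TPM
assert pool.create_deployment("test", 40_000)
print(pool.available_tpm)                          # 80000
```

Reducing TPM on (or deleting) an existing deployment immediately frees quota for new deployments of the same model in the same region, which is the workaround the article suggests before filing a quota increase request.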

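Besides the portal steps in the added **Prerequisites** note, the same role can be granted with the Azure CLI's `az role assignment create` command; a minimal sketch, assuming placeholder values for the subscription ID and user (replace both with your own):

```shell
# Grant the Cognitive Services Usages Reader role at subscription scope.
# The assignee and subscription ID below are placeholders.
az role assignment create \
  --assignee "user@example.com" \
  --role "Cognitive Services Usages Reader" \
  --scope "/subscriptions/00000000-0000-0000-0000-000000000000"
```

Scoping the assignment to the subscription matches the note above, since quota usage is viewed across the whole subscription.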