Merge pull request #203839 from dlepow/quota

v-stsavell · web-flow · commit d989443f64e2 · 2022-07-07T12:05:35.000-05:00
[APIM] Note for quota policy
diff --git a/articles/api-management/api-management-access-restriction-policies.md b/articles/api-management/api-management-access-restriction-policies.md
@@ -186,7 +186,7 @@ If `identity-type=jwt` is configured, a JWT token is required to be validated. T
 | authorization-id | The authorization resource identifier. | Yes |   |
 | context-variable-name | The name of the context variable to receive the [`Authorization` object](#authorization-object). | Yes |   |
 | identity-type | Type of identity to be checked against the authorization access policy. <br> - `managed`: managed identity of the API Management service. <br> - `jwt`: JWT bearer token specified in the `identity` attribute. | No | managed |
-| identity | An Azure AD JWT bearer token to be checked against the authorization permissions. Ignored for `identity-type` other than `jwt`. <br><br>Expected claims: <br> - audience: https://azure-api.net/authorization-manager <br> - `oid`: Permission object id <br> - `tid`: Permission tenant id | No |   |
+| identity | An Azure AD JWT bearer token to be checked against the authorization permissions. Ignored for `identity-type` other than `jwt`. <br><br>Expected claims: <br> - audience: https://azure-api.net/authorization-manager <br> - `oid`: Permission object ID <br> - `tid`: Permission tenant ID | No |   |
 | ignore-error | Boolean. If acquiring the authorization context results in an error (for example, the authorization resource is not found or is in an error state): <br> - `true`: the context variable is assigned a value of null. <br> - `false`: return `500` | No | false |
 
 ### Authorization object
@@ -225,8 +225,7 @@ To understand the difference between rate limits and quotas, [see Rate limits an
 > * This policy can be used only once per policy document.
 > * [Policy expressions](api-management-policy-expressions.md) cannot be used in any of the policy attributes for this policy.
 
-> [!CAUTION]
-> Due to the distributed nature of throttling architecture, rate limiting is never completely accurate. The difference between configured and the real number of allowed requests varyies based on request volume and rate, backend latency, and other factors.
+[!INCLUDE [api-management-rate-limit-accuracy](../../includes/api-management-rate-limit-accuracy.md)]
 
 [!INCLUDE [api-management-policy-generic-alert](../../includes/api-management-policy-generic-alert.md)]
 
@@ -301,8 +300,7 @@ To understand the difference between rate limits and quotas, [see Rate limits an
 
 For more information and examples of this policy, see [Advanced request throttling with Azure API Management](./api-management-sample-flexible-throttling.md).
 
-> [!CAUTION]
-> Due to the distributed nature of throttling architecture, rate limiting is never completely accurate. The difference between configured and the real number of allowed requests vary based on request volume and rate, backend latency, and other factors.
+[!INCLUDE [api-management-rate-limit-accuracy](../../includes/api-management-rate-limit-accuracy.md)]
 
 [!INCLUDE [api-management-policy-form-alert](../../includes/api-management-policy-form-alert.md)]
 
@@ -430,6 +428,8 @@ To understand the difference between rate limits and quotas, [see Rate limits an
 > * This policy can be used only once per policy document.
 > * [Policy expressions](api-management-policy-expressions.md) cannot be used in any of the policy attributes for this policy.
 
+[!INCLUDE [api-management-quota-accuracy](../../includes/api-management-quota-accuracy.md)]
+
 [!INCLUDE [api-management-policy-generic-alert](../../includes/api-management-policy-generic-alert.md)]
 
 ### Policy statement
@@ -491,6 +491,9 @@ For more information and examples of this policy, see [Advanced request throttli
 
 To understand the difference between rate limits and quotas, [see Rate limits and quotas.](./api-management-sample-flexible-throttling.md#rate-limits-and-quotas)
 
+[!INCLUDE [api-management-quota-accuracy](../../includes/api-management-quota-accuracy.md)]
+
+
 [!INCLUDE [api-management-policy-form-alert](../../includes/api-management-policy-form-alert.md)]
 
 
diff --git a/articles/api-management/api-management-sample-flexible-throttling.md b/articles/api-management/api-management-sample-flexible-throttling.md
@@ -25,13 +25,16 @@ Rate limits and quotas are used for different purposes.
 ### Rate limits
 Rate limits are usually used to protect against short and intense volume bursts. For example, if you know your backend service has a bottleneck at its database with a high call volume, you could set a `rate-limit-by-key` policy to not allow high call volume by using this setting.
 
+[!INCLUDE [api-management-rate-limit-accuracy](../../includes/api-management-rate-limit-accuracy.md)]
+
+
 ### Quotas
 Quotas are usually used for controlling call rates over a longer period of time. For example, they can set the total number of calls that a particular subscriber can make within a given month. For monetizing your API, quotas can also be set differently for tier-based subscriptions. For example, a Basic tier subscription might be able to make no more than 10,000 calls a month but a Premium tier could go up to 100,000,000 calls each month.
 
 Within Azure API Management, rate limits are typically propagated faster across the nodes to protect against spikes. In contrast, usage quota information is used over a longer term and hence its implementation is different.
 
-> [!CAUTION]
-> Due to the distributed nature of throttling architecture, rate limiting is never completely accurate. The difference between the configured and the real number of allowed requests vary based on request volume and rate, backend latency, and other factors.
+[!INCLUDE [api-management-quota-accuracy](../../includes/api-management-quota-accuracy.md)]
+
 
 ## Product-based throttling
 Rate throttling capabilities that are scoped to a particular subscription are useful for the API provider to apply limits on the developers who have signed up to use their API. However, it does not help, for example, in throttling individual end users of the API. It is possible for a single user of the developer's application to consume the entire quota and then prevent other customers of the developer from being able to use the application. Also, several customers who might generate a high volume of requests may limit access to occasional users.
diff --git a/includes/api-management-quota-accuracy.md b/includes/api-management-quota-accuracy.md
@@ -0,0 +1,9 @@
+---
+author: dlepow
+ms.service: api-management
+ms.topic: include
+ms.date: 07/05/2022
+ms.author: danlep
+---
+> [!NOTE]
+> When underlying compute resources restart in the service platform, API Management may continue to handle requests for a short period after a quota is reached.
diff --git a/includes/api-management-rate-limit-accuracy.md b/includes/api-management-rate-limit-accuracy.md
@@ -0,0 +1,9 @@
+---
+author: dlepow
+ms.service: api-management
+ms.topic: include
+ms.date: 07/05/2022
+ms.author: danlep
+---
+> [!CAUTION]
+> Due to the distributed nature of throttling architecture, rate limiting is never completely accurate. The difference between the configured and the actual number of allowed requests varies based on request volume and rate, backend latency, and other factors.