Commit 65a0faf

AOAI policies - g/w support
1 parent e23a811 commit 65a0faf

6 files changed: +12 −12 lines

articles/api-management/api-management-gateways-overview.md

Lines changed: 1 addition & 1 deletion

@@ -131,7 +131,7 @@ Managed and self-hosted gateways support all available [policies](api-management
 
 <sup>1</sup> Configured policies that aren't supported by the self-hosted gateway are skipped during policy execution.<br/>
 <sup>2</sup> The quota by key policy isn't available in the v2 tiers.<br/>
-<sup>3</sup> The rate limit by key and quota by key policies aren't available in the Consumption tier.<br/>
+<sup>3</sup> The rate limit by key, quota by key, and Azure OpenAI token limit policies aren't available in the Consumption tier.<br/>
 <sup>4</sup> [!INCLUDE [api-management-self-hosted-gateway-rate-limit](../../includes/api-management-self-hosted-gateway-rate-limit.md)] [Learn more](how-to-self-hosted-gateway-on-kubernetes-in-production.md#request-throttling)
 
articles/api-management/azure-openai-emit-token-metric-policy.md

Lines changed: 2 additions & 2 deletions

@@ -6,7 +6,7 @@ author: dlepow
 
 ms.service: api-management
 ms.topic: article
-ms.date: 05/10/2024
+ms.date: 06/25/2024
 ms.author: danlep
 ms.custom:
   - build-2024

@@ -73,7 +73,7 @@ The `azure-openai-emit-token-metric` policy sends metrics to Application Insight
 
 - [**Policy sections:**](./api-management-howto-policies.md#sections) inbound
 - [**Policy scopes:**](./api-management-howto-policies.md#scopes) global, workspace, product, API, operation
-- [**Gateways:**](api-management-gateways-overview.md) classic, v2
+- [**Gateways:**](api-management-gateways-overview.md) classic, v2, consumption, self-hosted
 
 ### Usage notes
 
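For context, the `azure-openai-emit-token-metric` policy whose gateway support this diff broadens is an inbound policy configured in APIM policy XML roughly as follows (a sketch, not taken from this commit; the namespace and dimension choices are illustrative):

```xml
<policies>
    <inbound>
        <!-- Emit token-consumption metrics to Application Insights.
             The namespace and dimensions below are illustrative choices. -->
        <azure-openai-emit-token-metric namespace="openai-metrics">
            <dimension name="API ID" />
            <dimension name="Subscription ID" value="@(context.Subscription.Id)" />
        </azure-openai-emit-token-metric>
    </inbound>
</policies>
```

Per the change above, this configuration is now documented as supported on the classic, v2, consumption, and self-hosted gateways.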
articles/api-management/azure-openai-enable-semantic-caching.md

Lines changed: 2 additions & 2 deletions

@@ -6,13 +6,13 @@ ms.service: api-management
 ms.custom:
   - build-2024
 ms.topic: how-to
-ms.date: 05/13/2024
+ms.date: 06/25/2024
 ms.author: danlep
 ---
 
 # Enable semantic caching for Azure OpenAI APIs in Azure API Management
 
-[!INCLUDE [api-management-availability-premium-dev-standard-basic-standardv2-basicv2](../../includes/api-management-availability-premium-dev-standard-basic-standardv2-basicv2.md)]
+[!INCLUDE [api-management-availability-all-tiers](../../includes/api-management-availability-all-tiers.md)]
 
 Enable semantic caching of responses to Azure OpenAI API requests to reduce bandwidth and processing requirements imposed on the backend APIs and lower latency perceived by API consumers. With semantic caching, you can return cached responses for identical prompts and also for prompts that are similar in meaning, even if the text isn't the same. For background, see [Tutorial: Use Azure Cache for Redis as a semantic cache](../azure-cache-for-redis/cache-tutorial-semantic-cache.md).
 
articles/api-management/azure-openai-semantic-cache-lookup-policy.md

Lines changed: 3 additions & 3 deletions

@@ -8,13 +8,13 @@ ms.service: api-management
 ms.custom:
   - build-2024
 ms.topic: article
-ms.date: 05/10/2024
+ms.date: 06/25/2024
 ms.author: danlep
 ---
 
 # Get cached responses of Azure OpenAI API requests
 
-[!INCLUDE [api-management-availability-premium-dev-standard-basic-standardv2-basicv2](../../includes/api-management-availability-premium-dev-standard-basic-standardv2-basicv2.md)]
+[!INCLUDE [api-management-availability-all-tiers](../../includes/api-management-availability-all-tiers.md)]
 
 Use the `azure-openai-semantic-cache-lookup` policy to perform cache lookup of responses to Azure OpenAI Chat Completion API and Completion API requests from a configured external cache, based on vector proximity of the prompt to previous requests and a specified similarity score threshold. Response caching reduces bandwidth and processing requirements imposed on the backend Azure OpenAI API and lowers latency perceived by API consumers.
 

@@ -59,7 +59,7 @@ Use the `azure-openai-semantic-cache-lookup` policy to perform cache lookup of r
 
 - [**Policy sections:**](./api-management-howto-policies.md#sections) inbound
 - [**Policy scopes:**](./api-management-howto-policies.md#scopes) global, workspace, product, API, operation
-- [**Gateways:**](api-management-gateways-overview.md) classic, v2
+- [**Gateways:**](api-management-gateways-overview.md) classic, v2, consumption, self-hosted
 
 ### Usage notes
 
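The inbound policy this diff updates is typically configured along these lines (a sketch; the threshold value is illustrative and `embeddings-backend` is a hypothetical backend name, not something defined in this commit):

```xml
<policies>
    <inbound>
        <!-- Serve a cached completion when a prior prompt falls within the
             similarity threshold; partitions the cache per subscription. -->
        <azure-openai-semantic-cache-lookup
            score-threshold="0.05"
            embeddings-backend-id="embeddings-backend"
            embeddings-backend-auth="system-assigned">
            <vary-by>@(context.Subscription.Id)</vary-by>
        </azure-openai-semantic-cache-lookup>
    </inbound>
</policies>
```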
articles/api-management/azure-openai-semantic-cache-store-policy.md

Lines changed: 2 additions & 2 deletions

@@ -14,7 +14,7 @@ ms.author: danlep
 
 # Cache responses to Azure OpenAI API requests
 
-[!INCLUDE [api-management-availability-premium-dev-standard-basic-standardv2-basicv2](../../includes/api-management-availability-premium-dev-standard-basic-standardv2-basicv2.md)]
+[!INCLUDE [api-management-availability-all-tiers](../../includes/api-management-availability-all-tiers.md)]
 
 The `azure-openai-semantic-cache-store` policy caches responses to Azure OpenAI Chat Completion API and Completion API requests to a configured external cache. Response caching reduces bandwidth and processing requirements imposed on the backend Azure OpenAI API and lowers latency perceived by API consumers.
 

@@ -43,7 +43,7 @@ The `azure-openai-semantic-cache-store` policy caches responses to Azure OpenAI
 
 - [**Policy sections:**](./api-management-howto-policies.md#sections) outbound
 - [**Policy scopes:**](./api-management-howto-policies.md#scopes) global, workspace, product, API, operation
-- [**Gateways:**](api-management-gateways-overview.md) classic, v2
+- [**Gateways:**](api-management-gateways-overview.md) classic, v2, consumption, self-hosted
 
 ### Usage notes
 
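As an outbound policy, `azure-openai-semantic-cache-store` is the counterpart to the inbound lookup policy; a minimal configuration looks roughly like this (a sketch; the cache duration is an illustrative value):

```xml
<policies>
    <outbound>
        <!-- Cache the completion response for 60 seconds so that
             a matching semantic-cache lookup can serve it. -->
        <azure-openai-semantic-cache-store duration="60" />
    </outbound>
</policies>
```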
articles/api-management/azure-openai-token-limit-policy.md

Lines changed: 2 additions & 2 deletions

@@ -8,7 +8,7 @@ ms.service: api-management
 ms.custom:
   - build-2024
 ms.topic: article
-ms.date: 05/10/2024
+ms.date: 06/25/2024
 ms.author: danlep
 ---
 

@@ -66,7 +66,7 @@ For more information, see [Azure OpenAI Service models](../ai-services/openai/co
 
 - [**Policy sections:**](./api-management-howto-policies.md#sections) inbound
 - [**Policy scopes:**](./api-management-howto-policies.md#scopes) global, workspace, product, API, operation
-- [**Gateways:**](api-management-gateways-overview.md) classic, v2
+- [**Gateways:**](api-management-gateways-overview.md) classic, v2, self-hosted
 
 ### Usage notes
 
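Note that, consistent with the gateways-overview change in this commit, the token limit policy's gateway list gains self-hosted but not consumption. A typical configuration is sketched below (attribute values and the header name are illustrative, not taken from this commit):

```xml
<policies>
    <inbound>
        <!-- Cap Azure OpenAI token consumption per subscription;
             the per-minute limit here is an illustrative value. -->
        <azure-openai-token-limit
            counter-key="@(context.Subscription.Id)"
            tokens-per-minute="5000"
            estimate-prompt-tokens="true"
            remaining-tokens-header-name="x-remaining-tokens" />
    </inbound>
</policies>
```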