Commit 65a0faf

AOAI policies - g/w support
1 parent e23a811 commit 65a0faf

6 files changed: +12 −12 lines

articles/api-management/api-management-gateways-overview.md

Lines changed: 1 addition & 1 deletion

@@ -131,7 +131,7 @@ Managed and self-hosted gateways support all available [policies](api-management
 
 <sup>1</sup> Configured policies that aren't supported by the self-hosted gateway are skipped during policy execution.<br/>
 <sup>2</sup> The quota by key policy isn't available in the v2 tiers.<br/>
-<sup>3</sup> The rate limit by key and quota by key policies aren't available in the Consumption tier.<br/>
+<sup>3</sup> The rate limit by key, quota by key, and Azure OpenAI token limit policies aren't available in the Consumption tier.<br/>
 <sup>4</sup> [!INCLUDE [api-management-self-hosted-gateway-rate-limit](../../includes/api-management-self-hosted-gateway-rate-limit.md)] [Learn more](how-to-self-hosted-gateway-on-kubernetes-in-production.md#request-throttling)
 
articles/api-management/azure-openai-emit-token-metric-policy.md

Lines changed: 2 additions & 2 deletions

@@ -6,7 +6,7 @@ author: dlepow
 
 ms.service: api-management
 ms.topic: article
-ms.date: 05/10/2024
+ms.date: 06/25/2024
 ms.author: danlep
 ms.custom:
   - build-2024

@@ -73,7 +73,7 @@ The `azure-openai-emit-token-metric` policy sends metrics to Application Insight
 
 - [**Policy sections:**](./api-management-howto-policies.md#sections) inbound
 - [**Policy scopes:**](./api-management-howto-policies.md#scopes) global, workspace, product, API, operation
-- [**Gateways:**](api-management-gateways-overview.md) classic, v2
+- [**Gateways:**](api-management-gateways-overview.md) classic, v2, consumption, self-hosted
 
 ### Usage notes
 
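For context, the `azure-openai-emit-token-metric` policy whose gateway support this diff broadens is an inbound policy configured in APIM policy XML roughly as follows (a sketch, not taken from this commit; the namespace and dimension choices are illustrative):

```xml
<policies>
    <inbound>
        <!-- Emit token-consumption metrics to Application Insights.
             The namespace and dimensions below are illustrative choices. -->
        <azure-openai-emit-token-metric namespace="openai-metrics">
            <dimension name="API ID" />
            <dimension name="Subscription ID" value="@(context.Subscription.Id)" />
        </azure-openai-emit-token-metric>
    </inbound>
</policies>
```

Per the change above, this configuration is now documented as supported on the classic, v2, consumption, and self-hosted gateways.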
articles/api-management/azure-openai-enable-semantic-caching.md

Lines changed: 2 additions & 2 deletions

@@ -6,13 +6,13 @@ ms.service: api-management
 ms.custom:
   - build-2024
 ms.topic: how-to
-ms.date: 05/13/2024
+ms.date: 06/25/2024
 ms.author: danlep
 ---
 
 # Enable semantic caching for Azure OpenAI APIs in Azure API Management
 
-[!INCLUDE [api-management-availability-premium-dev-standard-basic-standardv2-basicv2](../../includes/api-management-availability-premium-dev-standard-basic-standardv2-basicv2.md)]
+[!INCLUDE [api-management-availability-all-tiers](../../includes/api-management-availability-all-tiers.md)]
 
 Enable semantic caching of responses to Azure OpenAI API requests to reduce bandwidth and processing requirements imposed on the backend APIs and lower latency perceived by API consumers. With semantic caching, you can return cached responses for identical prompts and also for prompts that are similar in meaning, even if the text isn't the same. For background, see [Tutorial: Use Azure Cache for Redis as a semantic cache](../azure-cache-for-redis/cache-tutorial-semantic-cache.md).
 
articles/api-management/azure-openai-semantic-cache-lookup-policy.md

Lines changed: 3 additions & 3 deletions

@@ -8,13 +8,13 @@ ms.service: api-management
 ms.custom:
   - build-2024
 ms.topic: article
-ms.date: 05/10/2024
+ms.date: 06/25/2024
 ms.author: danlep
 ---
 
 # Get cached responses of Azure OpenAI API requests
 
-[!INCLUDE [api-management-availability-premium-dev-standard-basic-standardv2-basicv2](../../includes/api-management-availability-premium-dev-standard-basic-standardv2-basicv2.md)]
+[!INCLUDE [api-management-availability-all-tiers](../../includes/api-management-availability-all-tiers.md)]
 
 Use the `azure-openai-semantic-cache-lookup` policy to perform cache lookup of responses to Azure OpenAI Chat Completion API and Completion API requests from a configured external cache, based on vector proximity of the prompt to previous requests and a specified similarity score threshold. Response caching reduces bandwidth and processing requirements imposed on the backend Azure OpenAI API and lowers latency perceived by API consumers.
 

@@ -59,7 +59,7 @@ Use the `azure-openai-semantic-cache-lookup` policy to perform cache lookup of r
 
 - [**Policy sections:**](./api-management-howto-policies.md#sections) inbound
 - [**Policy scopes:**](./api-management-howto-policies.md#scopes) global, workspace, product, API, operation
-- [**Gateways:**](api-management-gateways-overview.md) classic, v2
+- [**Gateways:**](api-management-gateways-overview.md) classic, v2, consumption, self-hosted
 
 ### Usage notes
 
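The inbound policy this diff updates is typically configured along these lines (a sketch; the threshold value is illustrative and `embeddings-backend` is a hypothetical backend name, not something defined in this commit):

```xml
<policies>
    <inbound>
        <!-- Serve a cached completion when a prior prompt falls within the
             similarity threshold; partitions the cache per subscription. -->
        <azure-openai-semantic-cache-lookup
            score-threshold="0.05"
            embeddings-backend-id="embeddings-backend"
            embeddings-backend-auth="system-assigned">
            <vary-by>@(context.Subscription.Id)</vary-by>
        </azure-openai-semantic-cache-lookup>
    </inbound>
</policies>
```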
articles/api-management/azure-openai-semantic-cache-store-policy.md

Lines changed: 2 additions & 2 deletions

@@ -14,7 +14,7 @@ ms.author: danlep
 
 # Cache responses to Azure OpenAI API requests
 
-[!INCLUDE [api-management-availability-premium-dev-standard-basic-standardv2-basicv2](../../includes/api-management-availability-premium-dev-standard-basic-standardv2-basicv2.md)]
+[!INCLUDE [api-management-availability-all-tiers](../../includes/api-management-availability-all-tiers.md)]
 
 The `azure-openai-semantic-cache-store` policy caches responses to Azure OpenAI Chat Completion API and Completion API requests to a configured external cache. Response caching reduces bandwidth and processing requirements imposed on the backend Azure OpenAI API and lowers latency perceived by API consumers.
 

@@ -43,7 +43,7 @@ The `azure-openai-semantic-cache-store` policy caches responses to Azure OpenAI
 
 - [**Policy sections:**](./api-management-howto-policies.md#sections) outbound
 - [**Policy scopes:**](./api-management-howto-policies.md#scopes) global, workspace, product, API, operation
-- [**Gateways:**](api-management-gateways-overview.md) classic, v2
+- [**Gateways:**](api-management-gateways-overview.md) classic, v2, consumption, self-hosted
 
 ### Usage notes
 
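As an outbound policy, `azure-openai-semantic-cache-store` is the counterpart to the inbound lookup policy; a minimal configuration looks roughly like this (a sketch; the cache duration is an illustrative value):

```xml
<policies>
    <outbound>
        <!-- Cache the completion response for 60 seconds so that
             a matching semantic-cache lookup can serve it. -->
        <azure-openai-semantic-cache-store duration="60" />
    </outbound>
</policies>
```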
articles/api-management/azure-openai-token-limit-policy.md

Lines changed: 2 additions & 2 deletions

@@ -8,7 +8,7 @@ ms.service: api-management
 ms.custom:
   - build-2024
 ms.topic: article
-ms.date: 05/10/2024
+ms.date: 06/25/2024
 ms.author: danlep
 ---
 

@@ -66,7 +66,7 @@ For more information, see [Azure OpenAI Service models](../ai-services/openai/co
 
 - [**Policy sections:**](./api-management-howto-policies.md#sections) inbound
 - [**Policy scopes:**](./api-management-howto-policies.md#scopes) global, workspace, product, API, operation
-- [**Gateways:**](api-management-gateways-overview.md) classic, v2
+- [**Gateways:**](api-management-gateways-overview.md) classic, v2, self-hosted
 
 ### Usage notes
 
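Note that, consistent with the gateways-overview change in this commit, the token limit policy's gateway list gains self-hosted but not consumption. A typical configuration is sketched below (attribute values and the header name are illustrative, not taken from this commit):

```xml
<policies>
    <inbound>
        <!-- Cap Azure OpenAI token consumption per subscription;
             the per-minute limit here is an illustrative value. -->
        <azure-openai-token-limit
            counter-key="@(context.Subscription.Id)"
            tokens-per-minute="5000"
            estimate-prompt-tokens="true"
            remaining-tokens-header-name="x-remaining-tokens" />
    </inbound>
</policies>
```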