Commit e23a811

[APIM] Semantic caching - tier support
1 parent 508c73b commit e23a811

3 files changed: +8 -6 lines changed

articles/api-management/azure-openai-enable-semantic-caching.md

Lines changed: 2 additions & 0 deletions
@@ -12,6 +12,8 @@ ms.author: danlep
 
 # Enable semantic caching for Azure OpenAI APIs in Azure API Management
 
+[!INCLUDE [api-management-availability-premium-dev-standard-basic-standardv2-basicv2](../../includes/api-management-availability-premium-dev-standard-basic-standardv2-basicv2.md)]
+
 Enable semantic caching of responses to Azure OpenAI API requests to reduce bandwidth and processing requirements imposed on the backend APIs and lower latency perceived by API consumers. With semantic caching, you can return cached responses for identical prompts and also for prompts that are similar in meaning, even if the text isn't the same. For background, see [Tutorial: Use Azure Cache for Redis as a semantic cache](../azure-cache-for-redis/cache-tutorial-semantic-cache.md).
 
 ## Prerequisites

articles/api-management/azure-openai-semantic-cache-lookup-policy.md

Lines changed: 2 additions & 2 deletions
@@ -14,7 +14,7 @@ ms.author: danlep
 
 # Get cached responses of Azure OpenAI API requests
 
-[!INCLUDE [api-management-availability-basicv2-standardv2](../../includes/api-management-availability-basicv2-standardv2.md)]
+[!INCLUDE [api-management-availability-premium-dev-standard-basic-standardv2-basicv2](../../includes/api-management-availability-premium-dev-standard-basic-standardv2-basicv2.md)]
 
 Use the `azure-openai-semantic-cache-lookup` policy to perform cache lookup of responses to Azure OpenAI Chat Completion API and Completion API requests from a configured external cache, based on vector proximity of the prompt to previous requests and a specified similarity score threshold. Response caching reduces bandwidth and processing requirements imposed on the backend Azure OpenAI API and lowers latency perceived by API consumers.
 
@@ -59,7 +59,7 @@ Use the `azure-openai-semantic-cache-lookup` policy to perform cache lookup of r
 
 - [**Policy sections:**](./api-management-howto-policies.md#sections) inbound
 - [**Policy scopes:**](./api-management-howto-policies.md#scopes) global, workspace, product, API, operation
-- [**Gateways:**](api-management-gateways-overview.md) v2
+- [**Gateways:**](api-management-gateways-overview.md) classic, v2
 
 ### Usage notes
 
articles/api-management/azure-openai-semantic-cache-store-policy.md

Lines changed: 4 additions & 4 deletions
@@ -1,5 +1,5 @@
 ---
-title: Azure API Management policy reference - azure-openai-sematic-cache-store
+title: Azure API Management policy reference - azure-openai-semantic-cache-store
 description: Reference for the azure-openai-semantic-cache-store policy available for use in Azure API Management. Provides policy usage, settings, and examples.
 services: api-management
 author: dlepow
@@ -8,13 +8,13 @@ ms.service: api-management
 ms.custom:
 - build-2024
 ms.topic: article
-ms.date: 05/10/2024
+ms.date: 06/25/2024
 ms.author: danlep
 ---
 
 # Cache responses to Azure OpenAI API requests
 
-[!INCLUDE [api-management-availability-basicv2-standardv2](../../includes/api-management-availability-basicv2-standardv2.md)]
+[!INCLUDE [api-management-availability-premium-dev-standard-basic-standardv2-basicv2](../../includes/api-management-availability-premium-dev-standard-basic-standardv2-basicv2.md)]
 
 The `azure-openai-semantic-cache-store` policy caches responses to Azure OpenAI Chat Completion API and Completion API requests to a configured external cache. Response caching reduces bandwidth and processing requirements imposed on the backend Azure OpenAI API and lowers latency perceived by API consumers.
 
@@ -43,7 +43,7 @@ The `azure-openai-semantic-cache-store` policy caches responses to Azure OpenAI
 
 - [**Policy sections:**](./api-management-howto-policies.md#sections) outbound
 - [**Policy scopes:**](./api-management-howto-policies.md#scopes) global, workspace, product, API, operation
-- [**Gateways:**](api-management-gateways-overview.md) v2
+- [**Gateways:**](api-management-gateways-overview.md) classic, v2
 
 ### Usage notes
 
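The changed articles describe looking up cached responses "based on vector proximity of the prompt to previous requests and a specified similarity score threshold." A minimal Python sketch of that idea follows. It is an illustration only, not how API Management implements the policies: the `SemanticCache` class, its method names, the in-memory list, and the reading of the threshold as a maximum cosine distance are all assumptions made for this example, and the embeddings are hand-written vectors rather than output from an embeddings model.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class SemanticCache:
    """Hypothetical in-memory cache keyed by prompt embeddings."""

    def __init__(self, score_threshold):
        # Assumption: the threshold is the largest cosine distance
        # (1 - similarity) still counted as a cache hit.
        self.score_threshold = score_threshold
        self._entries = []  # list of (embedding, response) pairs

    def store(self, embedding, response):
        """Remember a response under the embedding of its prompt."""
        self._entries.append((list(embedding), response))

    def lookup(self, embedding):
        """Return the closest cached response within the threshold, else None."""
        best_response = None
        best_distance = None
        for cached_embedding, response in self._entries:
            distance = 1.0 - cosine_similarity(embedding, cached_embedding)
            if best_distance is None or distance < best_distance:
                best_distance = distance
                best_response = response
        if best_distance is not None and best_distance <= self.score_threshold:
            return best_response
        return None
```

In this sketch, a prompt whose embedding is nearly parallel to a stored one returns the stored response, while an unrelated prompt returns `None` and would be forwarded to the backend.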