
Commit c6aa28e

gitName committed: wip
1 parent 6b4289a commit c6aa28e

File tree: 4 files changed (+9, -6 lines)


articles/api-management/azure-openai-api-from-specification.md

Lines changed: 3 additions & 2 deletions
@@ -5,7 +5,7 @@ ms.service: azure-api-management
 author: dlepow
 ms.author: danlep
 ms.topic: how-to
-ms.date: 04/01/2025
+ms.date: 04/30/2025
 ms.collection: ce-skilling-ai-copilot
 ms.custom: template-how-to, build-2024
 ---
@@ -65,11 +65,12 @@ To import an Azure OpenAI API to API Management:

     For example, if your API Management gateway endpoint is `https://contoso.azure-api.net`, set a **Base URL** similar to `https://contoso.azure-api.net/my-openai-api/openai`.
 1. Optionally select one or more products to associate with the API. Select **Next**.
-1. On the **Policies** tab, optionally enable policies to monitor and manage Azure OpenAI API token consumption. You can also set or edit policies later.
+1. On the **Policies** tab, optionally enable policies to help monitor and manage Azure OpenAI API token consumption and cache responses. You can also set or edit policies later.

     If selected, enter settings or accept defaults that define the following policies (see linked articles for prerequisites and configuration details):
     * [Manage token consumption](azure-openai-token-limit-policy.md)
     * [Track token usage](azure-openai-emit-token-metric-policy.md)
+    * [Enable semantic caching of responses](azure-openai-enable-semantic-caching.md)

     Select **Review + Create**.
 1. After settings are validated, select **Create**.
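Taken together, the options on the **Policies** tab correspond to policy definitions along the lines of the following sketch, assembled from the linked policy references. The counter key, per-minute limit, metric namespace, and dimension below are illustrative assumptions, not values the wizard is guaranteed to emit:

```xml
<policies>
    <inbound>
        <base />
        <!-- Sketch: cap token consumption, keyed per subscription (placeholder limit) -->
        <azure-openai-token-limit
            counter-key="@(context.Subscription.Id)"
            tokens-per-minute="5000"
            estimate-prompt-tokens="true" />
        <!-- Sketch: emit a token-usage metric with a custom dimension -->
        <azure-openai-emit-token-metric namespace="openai">
            <dimension name="Subscription ID" value="@(context.Subscription.Id)" />
        </azure-openai-emit-token-metric>
    </inbound>
    <outbound>
        <base />
    </outbound>
</policies>
```

Because `estimate-prompt-tokens` is `true`, the gateway estimates prompt tokens before forwarding the request, so over-limit calls can be rejected without consuming backend capacity.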

articles/api-management/azure-openai-enable-semantic-caching.md

Lines changed: 1 addition & 1 deletion
@@ -18,7 +18,7 @@ ms.collection: ce-skilling-ai-copilot
 Enable semantic caching of responses to Azure OpenAI API requests to reduce bandwidth and processing requirements imposed on the backend APIs and lower latency perceived by API consumers. With semantic caching, you can return cached responses for identical prompts and also for prompts that are similar in meaning, even if the text isn't the same. For background, see [Tutorial: Use Azure Cache for Redis as a semantic cache](../redis/tutorial-semantic-cache.md).

 > [!NOTE]
-> The configuration steps in this article enable semantic caching for Azure OpenAI APIs. These steps can be generalized to enable semantic caching for corresponding large language model (LLM) APIs available through the [Azure AI Model Inference API](/azure/ai-studio/reference/reference-model-inference-api).
+> The configuration steps in this article enable semantic caching for Azure OpenAI APIs. These steps can be generalized to enable semantic caching for corresponding large language model (LLM) APIs available through the [Azure AI Model Inference API](/azure/ai-studio/reference/reference-model-inference-api) or for OpenAI-compatible models served through third-party inference providers.

 ## Prerequisites
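As a rough sketch of the semantic-caching setup the edited article describes, the two policies pair up across the inbound and outbound sections. The backend ID, score threshold, and cache duration here are illustrative assumptions; see the policy references for prerequisites and the full attribute lists:

```xml
<policies>
    <inbound>
        <base />
        <!-- Sketch: check the external cache for a semantically similar prompt.
             "embeddings-backend" is a placeholder backend for the embeddings deployment. -->
        <azure-openai-semantic-cache-lookup
            score-threshold="0.05"
            embeddings-backend-id="embeddings-backend"
            embeddings-backend-auth="system-assigned" />
    </inbound>
    <outbound>
        <base />
        <!-- Sketch: store the returned completion for reuse (duration in seconds) -->
        <azure-openai-semantic-cache-store duration="60" />
    </outbound>
</policies>
```

Tune `score-threshold` carefully: a threshold that admits looser matches raises the cache hit rate but risks returning a cached answer for a prompt that only loosely matches.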

articles/api-management/genai-gateway-capabilities.md

Lines changed: 5 additions & 3 deletions
@@ -7,7 +7,7 @@ author: dlepow
 ms.service: azure-api-management
 ms.collection: ce-skilling-ai-copilot
 ms.topic: concept-article
-ms.date: 04/29/2025
+ms.date: 04/29/2025
 ms.author: danlep
 ---

@@ -36,7 +36,7 @@ The rest of this article describes how Azure API Management can help you address

 ## Import Azure OpenAI Service resource as an API

-[Import an API from an Azure OpenAI Service endpoint](azure-openai-api-from-specification.md) to Azure API management using a single-click experience. API Management streamlines the onboarding process by automatically importing the OpenAPI schema for the Azure OpenAI API and sets up authentication to the Azure OpenAI endpoint using managed identity, removing the need for manual configuration. Within the same user-friendly experience, you can preconfigure policies for [token limits](#token-limit-policy) and [emitting token metrics](#emit-token-metric-policy).
+[Import an API from an Azure OpenAI Service endpoint](azure-openai-api-from-specification.md) to Azure API Management using a single-click experience. API Management streamlines onboarding by automatically importing the OpenAPI schema for the Azure OpenAI API and setting up authentication to the Azure OpenAI endpoint using managed identity, removing the need for manual configuration. Within the same user-friendly experience, you can preconfigure policies for [token limits](#token-limit-policy), [emitting token metrics](#emit-token-metric-policy), and [semantic caching](#semantic-caching-policy).

 :::image type="content" source="media/azure-openai-api-from-specification/azure-openai-api.png" alt-text="Screenshot of Azure OpenAI API tile in the portal.":::

@@ -99,15 +99,17 @@ Configure [Azure OpenAI semantic caching](azure-openai-enable-semantic-caching.m

 :::image type="content" source="media/genai-gateway-capabilities/semantic-caching.png" alt-text="Diagram of semantic caching in API Management.":::

-In API Management, enable semantic caching by using Azure Redis Enterprise or another [external cache](api-management-howto-cache-external.md) compatible with RediSearch and onboarded to Azure API Management. By using the Azure OpenAI Service Embeddings API, the [azure-openai-semantic-cache-store](azure-openai-semantic-cache-store-policy.md) and [azure-openai-semantic-cache-lookup](azure-openai-semantic-cache-lookup-policy.md) policies store and retrieve semantically similar prompt completions from the cache. This approach ensures completions reuse, resulting in reduced token consumption and improved response performance.
+In API Management, enable semantic caching by using Azure Redis Enterprise, Azure Managed Redis, or another [external cache](api-management-howto-cache-external.md) compatible with RediSearch and onboarded to Azure API Management. By using the Azure OpenAI Service Embeddings API, the [azure-openai-semantic-cache-store](azure-openai-semantic-cache-store-policy.md) and [azure-openai-semantic-cache-lookup](azure-openai-semantic-cache-lookup-policy.md) policies store and retrieve semantically similar prompt completions from the cache. This approach enables reuse of completions, resulting in reduced token consumption and improved response performance.

 > [!TIP]
 > To enable semantic caching for other LLM APIs, API Management provides the equivalent [llm-semantic-cache-store-policy](llm-semantic-cache-store-policy.md) and [llm-semantic-cache-lookup-policy](llm-semantic-cache-lookup-policy.md) policies.


 ## Content safety policy

+To help safeguard users from harmful, offensive, or misleading content, you can automatically moderate all incoming requests to an LLM API by configuring the [llm-content-safety](llm-content-safety-policy.md) policy. The policy enforces content safety checks on LLM prompts by first transmitting them to the [Azure AI Content Safety](/azure/ai-services/content-safety/overview) service before sending them to the backend LLM API.

+:::image type="content" source="media/genai-gateway-capabilities/content-safety.png" alt-text="Diagram of moderating prompts by Azure AI Content Safety in an API Management policy.":::

 ## Labs and samples
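For illustration, here is a minimal `llm-content-safety` fragment consistent with the content safety paragraph added above, assuming a backend entity named `content-safety-backend` that points to an Azure AI Content Safety resource (the categories and thresholds are placeholders; see the linked policy reference for the full schema):

```xml
<policies>
    <inbound>
        <base />
        <!-- Sketch: screen prompts with Azure AI Content Safety before they reach the LLM.
             Requests that violate a category threshold are blocked at the gateway. -->
        <llm-content-safety backend-id="content-safety-backend" shield-prompt="true">
            <categories output-type="EightSeverityLevels">
                <category name="Hate" threshold="4" />
                <category name="Violence" threshold="4" />
            </categories>
        </llm-content-safety>
    </inbound>
    <outbound>
        <base />
    </outbound>
</policies>
```

Setting `shield-prompt="true"` additionally runs prompt-shield analysis to detect jailbreak-style attacks before the request is forwarded.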

(Binary file changed: 76.5 KB; preview not shown)
