Commit 17789aa

[APIM] AI gateway branding
1 parent 1cb1695 commit 17789aa

6 files changed, +20 -20 lines changed

articles/api-management/TOC.yml

Lines changed: 1 addition & 1 deletion
@@ -217,7 +217,7 @@
  href: grpc-api.md
  - name: Azure OpenAI and LLM APIs
  items:
- - name: GenAI gateway capabilities in API Management
+ - name: AI gateway capabilities in API Management
  href: genai-gateway-capabilities.md
  - name: Import Azure OpenAI API
  href: azure-openai-api-from-specification.md

articles/api-management/api-management-gateways-overview.md

Lines changed: 1 addition & 1 deletion
@@ -201,4 +201,4 @@ Lear more about:
  - [API Management in a Hybrid and multicloud World](https://aka.ms/hybrid-and-multi-cloud-api-management)
  - [Capacity metric](api-management-capacity.md) for scaling decisions
  - [Observability capabilities](observability.md) in API Management
- - [GenAI gateway capabilities](genai-gateway-capabilities.md) in API Management
+ - [AI gateway capabilities](genai-gateway-capabilities.md) in API Management

articles/api-management/api-management-key-concepts.md

Lines changed: 1 addition & 1 deletion
@@ -146,7 +146,7 @@ API Management integrates with many complementary Azure services to create enter
  **More information**:
  * [Basic enterprise integration](/azure/architecture/reference-architectures/enterprise-integration/basic-enterprise-integration?toc=%2Fazure%2Fapi-management%2Ftoc.json&bc=/azure/api-management/breadcrumb/toc.json)
  * [Landing zone accelerator](/azure/cloud-adoption-framework/scenarios/app-platform/api-management/landing-zone-accelerator?toc=%2Fazure%2Fapi-management%2Ftoc.json&bc=/azure/api-management/breadcrumb/toc.json)
- * [GenAI gateway capabilities in API Management](genai-gateway-capabilities.md)
+ * [AI gateway capabilities in API Management](genai-gateway-capabilities.md)
  * [Synchronize APIs to API Center from API Management](../api-center/synchronize-api-management-apis.md?toc=%2Fazure%2Fapi-management%2Ftoc.json&bc=/azure/api-management/breadcrumb/toc.json)

  ## Key concepts

articles/api-management/azure-openai-enable-semantic-caching.md

Lines changed: 1 addition & 1 deletion
@@ -156,4 +156,4 @@ For example, if the cache was used, the **Output** section includes entries simi

  * [Caching policies](api-management-policies.md#caching)
  * [Azure Cache for Redis](../azure-cache-for-redis/cache-overview.md)
- * [GenAI gateway capabilities](genai-gateway-capabilities.md) in Azure API Management
+ * [AI gateway capabilities](genai-gateway-capabilities.md) in Azure API Management

articles/api-management/genai-gateway-capabilities.md

Lines changed: 15 additions & 15 deletions
@@ -1,5 +1,5 @@
  ---
- title: GenAI gateway capabilities in Azure API Management
+ title: AI gateway capabilities in Azure API Management
  description: Learn about Azure API Management's policies and features to manage generative AI APIs, such as token rate limiting, load balancing, and semantic caching.
  services: api-management
  author: dlepow
@@ -11,15 +11,15 @@ ms.date: 02/05/2025
  ms.author: danlep
  ---

- # Overview of generative AI gateway capabilities in Azure API Management
+ # Overview of AI gateway capabilities in Azure API Management

  [!INCLUDE [api-management-availability-all-tiers](../../includes/api-management-availability-all-tiers.md)]

- This article introduces capabilities in Azure API Management to help you manage generative AI APIs, such as those provided by [Azure OpenAI Service](/azure/ai-services/openai/overview). Azure API Management provides a range of policies, metrics, and other features to enhance security, performance, and reliability for the APIs serving your intelligent apps. Collectively, these features are called *generative AI (GenAI) gateway capabilities* for your generative AI APIs.
+ This article introduces capabilities in Azure API Management to help you manage generative AI APIs, such as those provided by [Azure OpenAI Service](/azure/ai-services/openai/overview). Azure API Management provides a range of policies, metrics, and other features to enhance security, performance, and reliability for the APIs serving your intelligent apps. Collectively, these features are called *AI gateway capabilities* for your generative AI APIs.

  > [!NOTE]
- > * This article focuses on capabilities to manage APIs exposed by Azure OpenAI Service. Many of the GenAI gateway capabilities apply to other large language model (LLM) APIs, including those available through [Azure AI Model Inference API](/azure/ai-studio/reference/reference-model-inference-api).
- > * Generative AI gateway capabilities are features of API Management's existing API gateway, not a separate API gateway. For more information on API Management, see [Azure API Management overview](api-management-key-concepts.md).
+ > * This article focuses on capabilities to manage APIs exposed by Azure OpenAI Service. Many of the AI gateway capabilities apply to other large language model (LLM) APIs, including those available through [Azure AI Model Inference API](/azure/ai-studio/reference/reference-model-inference-api).
+ > * AI gateway capabilities are features of API Management's existing API gateway, not a separate API gateway. For more information on API Management, see [Azure API Management overview](api-management-key-concepts.md).

  ## Challenges in managing generative AI APIs

@@ -44,7 +44,7 @@ The rest of this article describes how Azure API Management can help you address

  Configure the [Azure OpenAI token limit policy](azure-openai-token-limit-policy.md) to manage and enforce limits per API consumer based on the usage of Azure OpenAI Service tokens. With this policy you can set a rate limit, expressed in tokens-per-minute (TPM). You can also set a token quota over a specified period, such as hourly, daily, weekly, monthly, or yearly.

- :::image type="content" source="media/genai-gateway-capabilities/token-rate-limiting.png" alt-text="Diagram of limiting Azure OpenAI Service tokens in API Management.":::
+ :::image type="content" source="media/AI-gateway-capabilities/token-rate-limiting.png" alt-text="Diagram of limiting Azure OpenAI Service tokens in API Management.":::

  This policy provides flexibility to assign token-based limits on any counter key, such as subscription key, originating IP address, or an arbitrary key defined through a policy expression. The policy also enables precalculation of prompt tokens on the Azure API Management side, minimizing unnecessary requests to the Azure OpenAI Service backend if the prompt already exceeds the limit.

@@ -64,7 +64,7 @@ The following basic example demonstrates how to set a TPM limit of 500 per subsc

  The [Azure OpenAI emit token metric](azure-openai-emit-token-metric-policy.md) policy sends metrics to Application Insights about consumption of LLM tokens through Azure OpenAI Service APIs. The policy helps provide an overview of the utilization of Azure OpenAI Service models across multiple applications or API consumers. This policy could be useful for chargeback scenarios, monitoring, and capacity planning.

- :::image type="content" source="media/genai-gateway-capabilities/emit-token-metrics.png" alt-text="Diagram of emitting Azure OpenAI Service token metrics using API Management.":::
+ :::image type="content" source="media/AI-gateway-capabilities/emit-token-metrics.png" alt-text="Diagram of emitting Azure OpenAI Service token metrics using API Management.":::

  This policy captures prompt, completions, and total token usage metrics and sends them to an Application Insights namespace of your choice. Moreover, you can configure or select from predefined dimensions to split token usage metrics, so you can analyze metrics by subscription ID, IP address, or a custom dimension of your choice.

@@ -87,17 +87,17 @@ One of the challenges when building intelligent applications is to ensure that t

  The backend [load balancer](backends.md#backends-in-api-management) supports round-robin, weighted, and priority-based load balancing, giving you flexibility to define a load distribution strategy that meets your specific requirements. For example, define priorities within the load balancer configuration to ensure optimal utilization of specific Azure OpenAI endpoints, particularly those purchased as PTUs.

- :::image type="content" source="media/genai-gateway-capabilities/backend-load-balancing.png" alt-text="Diagram of using backend load balancing in API Management.":::
+ :::image type="content" source="media/AI-gateway-capabilities/backend-load-balancing.png" alt-text="Diagram of using backend load balancing in API Management.":::

  The backend [circuit breaker](backends.md#circuit-breaker) features dynamic trip duration, applying values from the Retry-After header provided by the backend. This ensures precise and timely recovery of the backends, maximizing the utilization of your priority backends.

- :::image type="content" source="media/genai-gateway-capabilities/backend-circuit-breaker.png" alt-text="Diagram of using backend circuit breaker in API Management.":::
+ :::image type="content" source="media/AI-gateway-capabilities/backend-circuit-breaker.png" alt-text="Diagram of using backend circuit breaker in API Management.":::

  ## Semantic caching policy

  Configure [Azure OpenAI semantic caching](azure-openai-enable-semantic-caching.md) policies to optimize token use by storing completions for similar prompts.

- :::image type="content" source="media/genai-gateway-capabilities/semantic-caching.png" alt-text="Diagram of semantic caching in API Management.":::
+ :::image type="content" source="media/AI-gateway-capabilities/semantic-caching.png" alt-text="Diagram of semantic caching in API Management.":::

  In API Management, enable semantic caching by using Azure Redis Enterprise or another [external cache](api-management-howto-cache-external.md) compatible with RediSearch and onboarded to Azure API Management. By using the Azure OpenAI Service Embeddings API, the [azure-openai-semantic-cache-store](azure-openai-semantic-cache-store-policy.md) and [azure-openai-semantic-cache-lookup](azure-openai-semantic-cache-lookup-policy.md) policies store and retrieve semantically similar prompt completions from the cache. This approach ensures completions reuse, resulting in reduced token consumption and improved response performance.

@@ -107,20 +107,20 @@ In API Management, enable semantic caching by using Azure Redis Enterprise or an

  ## Labs and samples

- * [Labs for the GenAI gateway capabilities of Azure API Management](https://github.com/Azure-Samples/AI-Gateway)
- * [Azure API Management (APIM) - Azure OpenAI Sample (Node.js)](https://github.com/Azure-Samples/genai-gateway-apim)
+ * [Labs for the AI gateway capabilities of Azure API Management](https://github.com/Azure-Samples/AI-Gateway)
+ * [Azure API Management (APIM) - Azure OpenAI Sample (Node.js)](https://github.com/Azure-Samples/AI-gateway-apim)
  * [Python sample code for using Azure OpenAI with API Management](https://github.com/Azure-Samples/openai-apim-lb/blob/main/docs/sample-code.md)

  ## Architecture and design considerations

- * [GenAI gateway reference architecture using API Management](/ai/playbook/technology-guidance/generative-ai/dev-starters/genai-gateway/reference-architectures/apim-based)
+ * [AI gateway reference architecture using API Management](/ai/playbook/technology-guidance/generative-ai/dev-starters/AI-gateway/reference-architectures/apim-based)
  * [AI hub gateway landing zone accelerator](https://github.com/Azure-Samples/ai-hub-gateway-solution-accelerator)
- * [Designing and implementing a gateway solution with Azure OpenAI resources](/ai/playbook/technology-guidance/generative-ai/dev-starters/genai-gateway/)
+ * [Designing and implementing a gateway solution with Azure OpenAI resources](/ai/playbook/technology-guidance/generative-ai/dev-starters/AI-gateway/)
  * [Use a gateway in front of multiple Azure OpenAI deployments or instances](/azure/architecture/ai-ml/guide/azure-openai-gateway-multi-backend)

  ## Related content

- * [Blog: Introducing GenAI capabilities in Azure API Management](https://techcommunity.microsoft.com/t5/azure-integration-services-blog/introducing-genai-gateway-capabilities-in-azure-api-management/ba-p/4146525)
+ * [Blog: Introducing AI capabilities in Azure API Management](https://techcommunity.microsoft.com/t5/azure-integration-services-blog/introducing-AI-gateway-capabilities-in-azure-api-management/ba-p/4146525)
  * [Blog: Integrating Azure Content Safety with API Management for Azure OpenAI Endpoints](https://techcommunity.microsoft.com/t5/fasttrack-for-azure/integrating-azure-content-safety-with-api-management-for-azure/ba-p/4202505)
  * [Training: Manage your generative AI APIs with Azure API Management](/training/modules/api-management)
  * [Smart load balancing for OpenAI endpoints and Azure API Management](https://techcommunity.microsoft.com/t5/fasttrack-for-azure/smart-load-balancing-for-openai-endpoints-and-azure-api/ba-p/3991616)

articles/api-management/index.yml

Lines changed: 1 addition & 1 deletion
@@ -116,7 +116,7 @@ landingContent:
  url: v2-service-tiers-overview.md
  - text: Workspaces
  url: workspaces-overview.md
- - text: GenAI gateway capabilities
+ - text: AI gateway capabilities
  url: genai-gateway-capabilities.md
  # Card
  - title: Related services
