6 changes: 5 additions & 1 deletion blog/en/blog/2025/02/24/apisix-ai-gateway-features.md
@@ -20,7 +20,7 @@ image: https://static.api7.ai/uploads/2025/03/07/Qs4WrU0I_apisix-ai-gateway.webp

## Introduction: The Rise of AI Agents and the Evolution of AI Gateway

In recent years, AI agents such as AutoGPT, Chatbots, and AI Assistants have seen rapid development. These applications rely heavily on API calls to large language models (LLMs), which has brought about challenges considering high concurrency, cost control, and security.
In recent years, AI agents such as AutoGPT, Chatbots, and AI Assistants have seen rapid development. These applications rely heavily on API calls to large language models (LLMs), which have brought challenges around high concurrency, cost control, and security.

Traditional API gateways primarily serve Web APIs and microservices and are not optimized for the unique needs of AI applications. This has led to the emergence of the AI gateway concept. An AI gateway needs to provide enhanced capabilities in the following areas:

@@ -60,6 +60,8 @@ Users can flexibly allocate traffic weights among different DeepSeek providers b

These capabilities enable AI applications to adapt flexibly to different LLMs, improve reliability, and reduce API calling costs.

![AI Proxy](https://static.api7.ai/uploads/2025/08/01/TmTsNypy_ai-proxy-multi-workflow.webp)
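
As a rough illustration of the weighted-traffic idea above, the sketch below shows an `ai-proxy-multi` route that splits chat traffic 70/30 between two DeepSeek-compatible backends. The route URI, instance names, endpoint, and the `${...}` key placeholders are illustrative, and the exact field names should be verified against the `ai-proxy-multi` plugin reference for the APISIX version in use.

```json
{
  "uri": "/chat",
  "plugins": {
    "ai-proxy-multi": {
      "instances": [
        {
          "name": "deepseek-official",
          "provider": "deepseek",
          "weight": 7,
          "auth": {
            "header": { "Authorization": "Bearer ${DEEPSEEK_API_KEY}" }
          },
          "options": { "model": "deepseek-chat" }
        },
        {
          "name": "deepseek-alt-provider",
          "provider": "openai-compatible",
          "weight": 3,
          "auth": {
            "header": { "Authorization": "Bearer ${ALT_PROVIDER_API_KEY}" }
          },
          "options": { "model": "deepseek-chat" },
          "override": {
            "endpoint": "https://alt-provider.example.com/v1/chat/completions"
          }
        }
      ]
    }
  }
}
```

With a configuration of this shape, roughly 70% of requests go to the first instance and 30% to the second, so shifting load between providers becomes a matter of adjusting weights rather than changing application code.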

## AI Security Protection: Ensuring Safe and Compliant Use of AI

AI APIs may involve sensitive data, misleading information, and potential misuse. Therefore, an AI gateway needs to provide security at multiple levels.
@@ -99,6 +101,8 @@ Through Apache APISIX, enterprises can achieve fine-grained management of token

## Smart Routing: Dynamic Traffic Management for AI APIs

![Smart Routing](https://static.api7.ai/uploads/2025/04/28/bzziWsxs_smart-routing.webp)

During AI API calls, different tasks may require different LLMs. For example:

- Code generation requests → sent to GPT-4 or DeepSeek.
7 changes: 6 additions & 1 deletion blog/en/blog/2025/03/06/what-is-an-ai-gateway.md
@@ -92,7 +92,7 @@ As AI systems become integral to business operations, ensuring their reliability

To address these challenges, the concept of an AI gateway has emerged. An AI gateway extends the functionalities of a traditional API gateway by incorporating features specifically designed for AI applications and LLM scenarios. It serves as a unified endpoint for connecting AI infrastructure and services, providing comprehensive control, security, and observability of AI traffic between applications and models.

![API7 AI gateway architecture](https://static.api7.ai/uploads/2025/03/06/iCGmdwUZ_api7-ai-gateway.webp)
![APISIX AI gateway architecture](https://static.api7.ai/uploads/2025/08/01/KvjMKKx2_apisix-ai-gateway-architecture.webp)

### Core Features of an AI Gateway

@@ -104,6 +104,8 @@ An effective AI gateway encompasses several key functionalities:
- **Prompt Protection**: Ensures that prompts sent to LLMs do not contain sensitive or inappropriate content, safeguarding against unintended data exposure.
- **Content Moderation**: Monitors and filters responses from AI models to prevent the dissemination of harmful or non-compliant information.

![Security Workflow](https://static.api7.ai/uploads/2025/08/01/unlrtuQl_ai-gateway-security-feature.webp)

#### 2. Observability

- **Usage Tracking**: Monitors token consumption and provides insights into how AI services are utilized, aiding in cost management and capacity planning.
@@ -119,6 +121,9 @@ An effective AI gateway encompasses several key functionalities:
#### 4. Reliability

- **Multi-LLM Load Balancing**: Distributes requests across multiple AI models to optimize performance and prevent overloading.

![AI Proxy](https://static.api7.ai/uploads/2025/08/01/TmTsNypy_ai-proxy-multi-workflow.webp)

- **Retry and Fallback Mechanisms**: Implements strategies to handle AI service failures gracefully, ensuring uninterrupted user experiences.
- **Traffic Prioritization**: Routes high-priority requests to the most reliable AI services while deferring less critical tasks.

@@ -105,6 +105,8 @@ To connect AI agents with external data and APIs, the **[Model Context Protocol
2. **Gateway Routing**: The AI gateway validates permissions, injects API keys, and routes the request to relevant services.
3. **Response Synthesis**: The gateway aggregates API responses (e.g., weather data + CRM contacts) and feeds them back to the AI model.

![How MCP Works](https://static.api7.ai/uploads/2025/08/01/zHkQ4hM0_how-mcp-works.webp)
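
Under the hood, MCP messages are JSON-RPC 2.0, so the tool invocation the gateway routes in step 2 looks roughly like the sketch below; the `get_weather` tool name and its arguments are hypothetical and depend on the MCP server being called.

```json
{
  "jsonrpc": "2.0",
  "id": 42,
  "method": "tools/call",
  "params": {
    "name": "get_weather",
    "arguments": { "city": "New York" }
  }
}
```

The tool's output comes back in the JSON-RPC result as a `content` array, which the gateway can combine with results from other tools (such as the CRM lookup) before handing everything back to the model.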

**Example**: A user asks, "Email our top client in NYC about today's weather." The AI gateway uses MCP to:

- Fetch the top client from Salesforce.
4 changes: 4 additions & 0 deletions blog/en/blog/2025/03/24/6-essential-ai-gateway-use-cases.md
@@ -34,6 +34,8 @@ With this foundation in place, let's explore the six common application scenario

Modern enterprises increasingly rely on diverse AI models to address varied business needs, from customer-facing chatbots to internal document analysis. However, managing multiple vendors (e.g., OpenAI, Anthropic, Mistral) and deployment environments (cloud, on-prem, hybrid) introduces operational chaos.

![Centralized AI Service Management](https://static.api7.ai/uploads/2025/08/01/vwfP6Mwx_centralized-ai-gateway.webp)

Enterprises adopt specialized models for specific tasks:

- **GPT-4**: High-quality text generation for customer support.
@@ -79,6 +81,8 @@ AI services, particularly those based on large language models, can incur signif
- **Budget Enforcement**: Setting spending limits for different teams or applications
- **Caching Strategies**: Reducing redundant calls by storing frequent responses

![Cost Optimization and Rate Limiting](https://static.api7.ai/uploads/2025/08/01/D0JOkr1h_cost-optimization-and-rate-limiting.webp)

For instance, a customer service application might cache common questions about password resets or refund processes, significantly reducing the number of model invocations needed.
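
For the budget-enforcement side, a minimal sketch of a token-based quota is shown below, assuming APISIX's `ai-rate-limiting` plugin attached to the same route as the AI proxy plugin (omitted here). The limit values and route URI are illustrative, and the option names should be confirmed against the plugin documentation for the release in use.

```json
{
  "uri": "/chat",
  "plugins": {
    "ai-rate-limiting": {
      "limit": 100000,
      "time_window": 3600,
      "limit_strategy": "total_tokens",
      "rejected_code": 429,
      "rejected_msg": "Hourly token budget exhausted"
    }
  }
}
```

A quota of this shape caps a route at roughly 100,000 tokens per hour; separate routes or consumers can carry different limits to reflect different team budgets.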

As AI adoption continues to accelerate, we can expect AI gateways to evolve with even more sophisticated cost management capabilities:
4 changes: 4 additions & 0 deletions blog/en/blog/2025/04/08/introducing-apisix-ai-gateway.md
@@ -41,6 +41,8 @@ The [`ai-proxy-multi`](https://apisix.apache.org/docs/apisix/plugins/ai-proxy-mu

Additionally, the plugin supports logging LLM request information in the access log, such as token usage, model, time to first response, and more.

![AI Proxy](https://static.api7.ai/uploads/2025/08/01/TmTsNypy_ai-proxy-multi-workflow.webp)

**Example: Load Balancing**:

The following example demonstrates how to configure two models for load balancing, forwarding 80% of the traffic to one instance and 20% to another.
@@ -281,6 +283,8 @@ The [`ai-prompt-template`](https://apisix.apache.org/docs/apisix/plugins/ai-prom

The [`ai-prompt-guard`](https://apisix.apache.org/docs/apisix/plugins/ai-prompt-guard/) plugin protects your large language model (LLM) endpoints by inspecting and validating incoming prompt messages. It checks the request content against user-defined allow and deny patterns, ensuring only approved input is forwarded to the upstream LLM. Depending on its configuration, the plugin can check either the latest message or the entire conversation history and can be set to inspect prompts from all roles or only from the end user.

![ai-prompt-guard](https://static.api7.ai/uploads/2025/08/01/6Dl4AQGL_ai-prompt-guard-workflow.webp)
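
A minimal sketch of such a configuration, checking only the latest end-user message against allow and deny patterns, might look as follows; the route URI and regular expressions are illustrative, and the option names should be confirmed against the plugin documentation linked above.

```json
{
  "uri": "/chat",
  "plugins": {
    "ai-prompt-guard": {
      "match_all_roles": false,
      "match_all_conversation_history": false,
      "allow_patterns": ["(?i)order|invoice|shipping"],
      "deny_patterns": [
        "(?i)credit\\s*card\\s*number",
        "(?i)ignore\\s+previous\\s+instructions"
      ]
    }
  }
}
```

Requests whose prompts match a deny pattern (or fail to match any allow pattern) are rejected at the gateway before they ever reach the upstream LLM.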

### Content Moderation

#### 8. ai-aws-content-moderation