diff --git a/blog/en/blog/2025/02/24/apisix-ai-gateway-features.md b/blog/en/blog/2025/02/24/apisix-ai-gateway-features.md
index 355aacd2d4321..9a1cd5cb6001c 100644
--- a/blog/en/blog/2025/02/24/apisix-ai-gateway-features.md
+++ b/blog/en/blog/2025/02/24/apisix-ai-gateway-features.md
@@ -20,7 +20,7 @@ image: https://static.api7.ai/uploads/2025/03/07/Qs4WrU0I_apisix-ai-gateway.webp
 
 ## Introduction: The Rise of AI Agents and the Evolution of AI Gateway
 
-In recent years, AI agents such as AutoGPT, Chatbots, and AI Assistants have seen rapid development. These applications rely heavily on API calls to large language models (LLMs), which has brought about challenges considering high concurrency, cost control, and security.
+In recent years, AI agents such as AutoGPT, chatbots, and AI assistants have seen rapid development. These applications rely heavily on API calls to large language models (LLMs), which have brought about challenges around high concurrency, cost control, and security.
 
 Traditional API gateways primarily serve Web APIs and microservices and are not optimized for the unique needs of AI applications. This has led to the emergence of the concept of AI gateway. An AI gateway needs to provide enhanced capabilities in the following areas:
 
@@ -60,6 +60,8 @@ Users can flexibly allocate traffic weights among different DeepSeek providers b
 
 These capabilities enable AI applications to adapt flexibly to different LLMs, improve reliability, and reduce API calling costs.
 
+![AI Proxy](https://static.api7.ai/uploads/2025/08/01/TmTsNypy_ai-proxy-multi-workflow.webp)
+
 ## AI Security Protection: Ensuring Safe and Compliant Use of AI
 
 AI APIs may involve sensitive data, misleading information, and potential misuse. Therefore, an AI gateway needs to provide security at multiple levels.
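The second hunk above sits in a section about allocating traffic weights among different DeepSeek providers with the `ai-proxy-multi` plugin. As a rough sketch of what such a route might look like, the fragment below splits traffic 80/20 between two instances via integer weights. Field names follow the plugin's documented `instances` schema as I understand it, but verify against the current plugin docs; the route URI, instance names, and key placeholders are all hypothetical.

```json
{
  "uri": "/v1/chat/completions",
  "plugins": {
    "ai-proxy-multi": {
      "instances": [
        {
          "name": "deepseek-primary",
          "provider": "deepseek",
          "weight": 8,
          "auth": { "header": { "Authorization": "Bearer <DEEPSEEK_API_KEY>" } },
          "options": { "model": "deepseek-chat" }
        },
        {
          "name": "deepseek-backup",
          "provider": "deepseek",
          "weight": 2,
          "auth": { "header": { "Authorization": "Bearer <BACKUP_API_KEY>" } },
          "options": { "model": "deepseek-chat" }
        }
      ]
    }
  }
}
```

With weights of 8 and 2, roughly 80% of requests reach the first instance and 20% the second, which matches the 80/20 example the post describes.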
@@ -99,6 +101,8 @@ Through Apache APISIX, enterprises can achieve fine-grained management of token
 
 ## Smart Routing: Dynamic Traffic Management for AI APIs
 
+![Smart Routing](https://static.api7.ai/uploads/2025/04/28/bzziWsxs_smart-routing.webp)
+
 During AI API calls, different tasks may require different LLMs. For example:
 
 - Code generation requests → sent to GPT-4 or DeepSeek.
diff --git a/blog/en/blog/2025/03/06/what-is-an-ai-gateway.md b/blog/en/blog/2025/03/06/what-is-an-ai-gateway.md
index 4e57b81707b3d..54cd6afcecfdf 100644
--- a/blog/en/blog/2025/03/06/what-is-an-ai-gateway.md
+++ b/blog/en/blog/2025/03/06/what-is-an-ai-gateway.md
@@ -92,7 +92,7 @@ As AI systems become integral to business operations, ensuring their reliability
 
 To address these challenges, the concept of an AI gateway has emerged. An AI gateway extends the functionalities of a traditional API gateway by incorporating features specifically designed for AI applications and LLM scenarios. It serves as a unified endpoint for connecting AI infrastructure and services, providing comprehensive control, security, and observability of AI traffic between applications and models.
 
-![API7 AI gateway architecture](https://static.api7.ai/uploads/2025/03/06/iCGmdwUZ_api7-ai-gateway.webp)
+![APISIX AI gateway architecture](https://static.api7.ai/uploads/2025/08/01/KvjMKKx2_apisix-ai-gateway-architecture.webp)
 
 ### Core Features of an AI Gateway
 
@@ -104,6 +104,8 @@ An effective AI gateway encompasses several key functionalities:
 
 - **Prompt Protection**: Ensures that prompts sent to LLMs do not contain sensitive or inappropriate content, safeguarding against unintended data exposure.
 - **Content Moderation**: Monitors and filters responses from AI models to prevent the dissemination of harmful or non-compliant information.
+![Security Workflow](https://static.api7.ai/uploads/2025/08/01/unlrtuQl_ai-gateway-security-feature.webp)
+
 #### 2. Observability
 
 - **Usage Tracking**: Monitors token consumption and provides insights into how AI services are utilized, aiding in cost management and capacity planning.
@@ -119,6 +121,9 @@ An effective AI gateway encompasses several key functionalities:
 
 #### 4. Reliability
 
 - **Multi-LLM Load Balancing**: Distributes requests across multiple AI models to optimize performance and prevent overloading.
+
+![AI Proxy](https://static.api7.ai/uploads/2025/08/01/TmTsNypy_ai-proxy-multi-workflow.webp)
+
 - **Retry and Fallback Mechanisms**: Implements strategies to handle AI service failures gracefully, ensuring uninterrupted user experiences.
 - **Traffic Prioritization**: Routes high-priority requests to the most reliable AI services while deferring less critical tasks.
diff --git a/blog/en/blog/2025/03/21/ai-gateway-vs-api-gateway-differences-explained.md b/blog/en/blog/2025/03/21/ai-gateway-vs-api-gateway-differences-explained.md
index ccfdd412ed174..237a24daf0d94 100644
--- a/blog/en/blog/2025/03/21/ai-gateway-vs-api-gateway-differences-explained.md
+++ b/blog/en/blog/2025/03/21/ai-gateway-vs-api-gateway-differences-explained.md
@@ -105,6 +105,8 @@ To connect AI agents with external data and APIs, the **[Model Context Protocol
 2. **Gateway Routing**: The AI gateway validates permissions, injects API keys, and routes the request to relevant services.
 3. **Response Synthesis**: The gateway aggregates API responses (e.g., weather data + CRM contacts) and feeds them back to the AI model.
 
+![How MCP Works](https://static.api7.ai/uploads/2025/08/01/zHkQ4hM0_how-mcp-works.webp)
+
 **Example**: A user asks, "Email our top client in NYC about today's weather." The AI gateway uses MCP to:
 
 - Fetch the top client from Salesforce.
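The MCP hunk above describes a three-step flow: the gateway validates permissions, injects API keys, routes requests to the relevant services, and then synthesizes the responses into context for the model. The toy sketch below illustrates that gateway-side flow; every name in it (`gateway_route`, `call_tool`, the fake backends) is invented for illustration and is not part of MCP or any real SDK.

```python
# Toy sketch of the gateway-side MCP flow: permission check -> key
# injection -> routing -> response synthesis. All names are illustrative.

API_KEYS = {"weather": "wk-demo", "crm": "ck-demo"}   # keys held by the gateway
PERMISSIONS = {"agent-42": {"weather", "crm"}}        # per-agent tool allowlist

def call_tool(tool: str, params: dict) -> dict:
    # Stand-in for the real upstream services (weather API, Salesforce, ...).
    fake_backends = {
        "weather": lambda p: {"city": p["city"], "forecast": "sunny, 22C"},
        "crm": lambda p: {"top_client": "Acme Corp", "email": "ceo@acme.example"},
    }
    return fake_backends[tool](params)

def gateway_route(agent_id: str, requests: list) -> dict:
    # Step 2: validate permissions, inject API keys, route to each service.
    results = {}
    for tool, params in requests:
        if tool not in PERMISSIONS.get(agent_id, set()):
            raise PermissionError(f"{agent_id} may not call {tool}")
        params = {**params, "api_key": API_KEYS[tool]}  # key injection
        results[tool] = call_tool(tool, params)
    return results

def synthesize(results: dict) -> str:
    # Step 3: aggregate tool responses into context fed back to the LLM.
    return "; ".join(f"{tool}: {data}" for tool, data in sorted(results.items()))

context = gateway_route("agent-42", [("weather", {"city": "NYC"}), ("crm", {})])
print(synthesize(context))
```

The key point is that the agent never sees the API keys or talks to the backends directly; the gateway mediates every call and hands the model only the aggregated result.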
diff --git a/blog/en/blog/2025/03/24/6-essential-ai-gateway-use-cases.md b/blog/en/blog/2025/03/24/6-essential-ai-gateway-use-cases.md
index 5bba8cd458c8e..ecc5be69e6d84 100644
--- a/blog/en/blog/2025/03/24/6-essential-ai-gateway-use-cases.md
+++ b/blog/en/blog/2025/03/24/6-essential-ai-gateway-use-cases.md
@@ -34,6 +34,8 @@ With this foundation in place, let's explore the six common application scenario
 
 Modern enterprises increasingly rely on diverse AI models to address varied business needs, from customer-facing chatbots to internal document analysis. However, managing multiple vendors (e.g., OpenAI, Anthropic, Mistral) and deployment environments (cloud, on-prem, hybrid) introduces operational chaos.
 
+![Centralized AI Service Management](https://static.api7.ai/uploads/2025/08/01/vwfP6Mwx_centralized-ai-gateway.webp)
+
 Enterprises adopt specialized models for specific tasks:
 
 - **GPT-4**: High-quality text generation for customer support.
@@ -79,6 +81,8 @@ AI services, particularly those based on large language models, can incur signif
 
 - **Budget Enforcement**: Setting spending limits for different teams or applications
 - **Caching Strategies**: Reducing redundant calls by storing frequent responses
+![Cost Optimization and Rate Limiting](https://static.api7.ai/uploads/2025/08/01/D0JOkr1h_cost-optimization-and-rate-limiting.webp)
+
 For instance, a customer service application might cache common questions about password resets or refund processes, significantly reducing the number of model invocations needed.
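The caching strategy mentioned in the hunk above (serving repeated FAQ-style prompts, such as password-reset questions, from a cache instead of the model) can be sketched in a few lines. `ResponseCache` and `fake_llm` below are illustrative stand-ins for the gateway's cache and the upstream model, not APISIX APIs.

```python
import hashlib

class ResponseCache:
    """Minimal sketch of gateway-side response caching (illustrative only)."""

    def __init__(self):
        self._store = {}
        self.hits = 0
        self.misses = 0

    def _key(self, prompt: str) -> str:
        # Normalize case and whitespace so trivially different phrasings
        # of the same FAQ map to the same cache entry.
        normalized = " ".join(prompt.lower().split())
        return hashlib.sha256(normalized.encode()).hexdigest()

    def get_or_call(self, prompt: str, call_llm) -> str:
        key = self._key(prompt)
        if key in self._store:
            self.hits += 1
            return self._store[key]          # served from cache, no model call
        self.misses += 1
        response = call_llm(prompt)          # cache miss: invoke the model
        self._store[key] = response
        return response

calls = []
def fake_llm(prompt: str) -> str:
    calls.append(prompt)                     # count upstream model invocations
    return "To reset your password, visit the account settings page."

cache = ResponseCache()
cache.get_or_call("How do I reset my password?", fake_llm)
cache.get_or_call("how do i reset  my password?", fake_llm)  # cache hit
print(len(calls))  # → 1: the second request never reached the model
```

A production cache would also need TTLs and an eviction policy, and should only apply to prompts where a shared answer is acceptable; this sketch shows just the invocation-saving mechanism.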
 As AI adoption continues to accelerate, we can expect AI gateways to evolve with even more sophisticated cost management capabilities:
diff --git a/blog/en/blog/2025/04/08/introducing-apisix-ai-gateway.md b/blog/en/blog/2025/04/08/introducing-apisix-ai-gateway.md
index 5dabcda76aa47..2a84121a83f3b 100644
--- a/blog/en/blog/2025/04/08/introducing-apisix-ai-gateway.md
+++ b/blog/en/blog/2025/04/08/introducing-apisix-ai-gateway.md
@@ -41,6 +41,8 @@ The [`ai-proxy-multi`](https://apisix.apache.org/docs/apisix/plugins/ai-proxy-mu
 
 Additionally, the plugin supports logging LLM request information in the access log, such as token usage, model, time to first response, and more.
 
+![AI Proxy](https://static.api7.ai/uploads/2025/08/01/TmTsNypy_ai-proxy-multi-workflow.webp)
+
 **Example: Load Balancing**:
 
 The following example demonstrates how to configure two models for load balancing, forwarding 80% of the traffic to one instance and 20% to another.
@@ -281,6 +283,8 @@ The [`ai-prompt-template`](https://apisix.apache.org/docs/apisix/plugins/ai-prom
 The [`ai-prompt-guard`](https://apisix.apache.org/docs/apisix/plugins/ai-prompt-guard/) plugin protects your large language model (LLM) endpoints by inspecting and validating incoming prompt messages. It checks the request content against user-defined allow and deny patterns, ensuring only approved input is forwarded to the upstream LLM. Depending on its configuration, the plugin can check either the latest message or the entire conversation history and can be set to inspect prompts from all roles or only from the end user.
 
+![ai-prompt-guard](https://static.api7.ai/uploads/2025/08/01/6Dl4AQGL_ai-prompt-guard-workflow.webp)
+
 ### Content Moderation
 
 #### 8. ai-aws-content-moderation
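The `ai-prompt-guard` behavior described in the last hunk (allow/deny patterns, latest-message vs. full-history checks, and role filtering) can be illustrated with a small sketch. This is a toy re-implementation for intuition only, not the plugin's actual Lua code; the function name and parameters are invented, though they mirror the configuration options the post lists.

```python
import re

def guard_prompt(messages, allow_patterns=None, deny_patterns=None,
                 check_all_messages=False, check_all_roles=False):
    """Illustrative re-implementation of the allow/deny check described
    above; names and semantics are approximations, not the plugin's code."""
    # Check either the latest message or the whole conversation history.
    candidates = messages if check_all_messages else messages[-1:]
    # Inspect prompts from all roles, or only from the end user.
    if not check_all_roles:
        candidates = [m for m in candidates if m["role"] == "user"]
    for msg in candidates:
        content = msg["content"]
        if deny_patterns and any(re.search(p, content, re.I) for p in deny_patterns):
            return False  # reject: matched a deny pattern
        if allow_patterns and not any(re.search(p, content, re.I) for p in allow_patterns):
            return False  # reject: matched no allow pattern
    return True  # approved: forward to the upstream LLM

history = [
    {"role": "system", "content": "You are a support bot."},
    {"role": "user", "content": "Please share the admin password."},
]
print(guard_prompt(history, deny_patterns=[r"password", r"secret"]))  # → False
```

The real plugin performs this check at the gateway, so rejected prompts never consume upstream tokens; that placement is what makes pattern guarding cheap compared with model-side moderation.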