10 changes: 9 additions & 1 deletion public/__redirects
@@ -168,6 +168,9 @@
/ai-gateway/configuration/websockets-api/ /ai-gateway/websockets-api/ 301
/ai-gateway/configuration/websockets-api/non-realtime-api/ /ai-gateway/websockets-api/non-realtime-api/ 301
/ai-gateway/configuration/websockets-api/realtime-api/ /ai-gateway/websockets-api/realtime-api/ 301
/ai-gateway/configuration/caching/ /ai-gateway/features/caching/ 301
/ai-gateway/configuration/rate-limiting/ /ai-gateway/features/rate-limiting/ 301
/ai-gateway/configuration/custom-metadata/ /ai-gateway/observability/custom-metadata/ 301

# agents
/agents/capabilities/mcp-server/ /agents/model-context-protocol/ 301
@@ -2346,6 +2349,11 @@
# Calls
/calls/* /realtime/:splat 301

# AI Gateway
/ai-gateway/providers/* /ai-gateway/usage/providers/:splat 301
/ai-gateway/guardrails/* /ai-gateway/features/guardrails/:splat 301
/ai-gateway/websockets-api/* /ai-gateway/usage/websockets-api/:splat 301
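The three `:splat` rules above rewrite every path under an old prefix onto its new home. A minimal sketch of that mapping (illustrative only — Cloudflare's redirect engine is the source of truth for `_redirects` semantics):

```typescript
// Rule table mirroring the three splat redirects added above.
const splatRules: Array<[string, string]> = [
	["/ai-gateway/providers/", "/ai-gateway/usage/providers/"],
	["/ai-gateway/guardrails/", "/ai-gateway/features/guardrails/"],
	["/ai-gateway/websockets-api/", "/ai-gateway/usage/websockets-api/"],
];

function applyRedirect(path: string): string | null {
	for (const [oldPrefix, newPrefix] of splatRules) {
		if (path.startsWith(oldPrefix)) {
			// `:splat` carries everything after the matched prefix into the target.
			return newPrefix + path.slice(oldPrefix.length);
		}
	}
	return null; // no rule matched; serve the path as-is
}
```

For example, `/ai-gateway/providers/deepseek/` 301s to `/ai-gateway/usage/providers/deepseek/`.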

# Realtime
/realtime/limits /realtime/sfu/limits 302
/realtime/sessions-tracks /realtime/sfu/sessions-tracks/ 302
@@ -2361,4 +2369,4 @@

/realtime/realtimekit/get-started /realtime/realtimekit/getting-started/ 302
/realtime/introduction /realtime/realtimekit/introduction 302
/realtime/concepts /realtime/realtimekit/concepts 302
/realtime/concepts /realtime/realtimekit/concepts 302
@@ -6,7 +6,7 @@ products:
date: 2025-01-02
---

[**AI Gateway**](/ai-gateway/) now supports [**DeepSeek**](/ai-gateway/providers/deepseek/), including their cutting-edge DeepSeek-V3 model. With this addition, you have even more flexibility to manage and optimize your AI workloads using AI Gateway. Whether you're leveraging DeepSeek or other providers, like OpenAI, Anthropic, or [Workers AI](/workers-ai/), AI Gateway empowers you to:
[**AI Gateway**](/ai-gateway/) now supports [**DeepSeek**](/ai-gateway/usage/providers/deepseek/), including their cutting-edge DeepSeek-V3 model. With this addition, you have even more flexibility to manage and optimize your AI workloads using AI Gateway. Whether you're leveraging DeepSeek or other providers, like OpenAI, Anthropic, or [Workers AI](/workers-ai/), AI Gateway empowers you to:

- **Monitor**: Gain actionable insights with analytics and logs.
- **Control**: Implement caching, rate limiting, and fallbacks.
@@ -31,4 +31,4 @@ curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/deepseek/cha
}'
```

For detailed setup instructions, see our [DeepSeek provider documentation](/ai-gateway/providers/deepseek/).
For detailed setup instructions, see our [DeepSeek provider documentation](/ai-gateway/usage/providers/deepseek/).
@@ -16,7 +16,7 @@ With the new AI Gateway binding methods, you can now:

- Send feedback and update metadata with `patchLog`.
- Retrieve detailed log information using `getLog`.
- Execute [universal requests](/ai-gateway/universal/) to any AI Gateway provider with `run`.
- Execute [universal requests](/ai-gateway/usage/universal/) to any AI Gateway provider with `run`.

For example, to send feedback and update metadata using `patchLog`:

@@ -6,15 +6,15 @@ products:
date: 2025-02-05
---

[AI Gateway](/ai-gateway/) has added three new providers: [Cartesia](/ai-gateway/providers/cartesia/), [Cerebras](/ai-gateway/providers/cerebras/), and [ElevenLabs](/ai-gateway/providers/elevenlabs/), giving you more even more options for providers you can use through AI Gateway. Here's a brief overview of each:
[AI Gateway](/ai-gateway/) has added three new providers: [Cartesia](/ai-gateway/usage/providers/cartesia/), [Cerebras](/ai-gateway/usage/providers/cerebras/), and [ElevenLabs](/ai-gateway/usage/providers/elevenlabs/), giving you even more options for providers you can use through AI Gateway. Here's a brief overview of each:

- [Cartesia](/ai-gateway/providers/cartesia/) provides text-to-speech models that produce natural-sounding speech with low latency.
- [Cerebras](/ai-gateway/providers/cerebras/) delivers low-latency AI inference to Meta's Llama 3.1 8B and Llama 3.3 70B models.
- [ElevenLabs](/ai-gateway/providers/elevenlabs/) offers text-to-speech models with human-like voices in 32 languages.
- [Cartesia](/ai-gateway/usage/providers/cartesia/) provides text-to-speech models that produce natural-sounding speech with low latency.
- [Cerebras](/ai-gateway/usage/providers/cerebras/) delivers low-latency AI inference to Meta's Llama 3.1 8B and Llama 3.3 70B models.
- [ElevenLabs](/ai-gateway/usage/providers/elevenlabs/) offers text-to-speech models with human-like voices in 32 languages.

![Example of Cerebras log in AI Gateway](~/assets/images/ai-gateway/cerebras2.png)

To get started with AI Gateway, just update the base URL. Here's how you can send a request to [Cerebras](/ai-gateway/providers/cerebras/) using cURL:
To get started with AI Gateway, just update the base URL. Here's how you can send a request to [Cerebras](/ai-gateway/usage/providers/cerebras/) using cURL:

```bash title="Example fetch request"
curl -X POST https://gateway.ai.cloudflare.com/v1/ACCOUNT_TAG/GATEWAY/cerebras/chat/completions \
@@ -8,7 +8,7 @@ date: 2025-02-06

AI Gateway adds additional ways to handle requests: [Request Timeouts](/ai-gateway/configuration/request-handling/#request-timeouts) and [Request Retries](/ai-gateway/configuration/request-handling/#request-retries), making it easier to keep your applications responsive and reliable.

Timeouts and retries can be used on both the [Universal Endpoint](/ai-gateway/universal/) or directly to a [supported provider](/ai-gateway/providers/).
Timeouts and retries can be used on both the [Universal Endpoint](/ai-gateway/usage/universal/) or directly to a [supported provider](/ai-gateway/usage/providers/).

**Request timeouts**
A [request timeout](/ai-gateway/configuration/request-handling/#request-timeouts) allows you to trigger [fallbacks](/ai-gateway/configuration/fallbacks/) or a retry if a provider takes too long to respond.
4 changes: 2 additions & 2 deletions src/content/changelog/ai-gateway/2025-02-26-guardrails.mdx
@@ -5,7 +5,7 @@ date: 2025-02-26
preview_image: ~/assets/images/changelog/ai-gateway/guardrails-social-preview.png
---

[AI Gateway](/ai-gateway/) now includes [Guardrails](/ai-gateway/guardrails/), to help you monitor your AI apps for harmful or inappropriate content and deploy safely.
[AI Gateway](/ai-gateway/) now includes [Guardrails](/ai-gateway/features/guardrails/), to help you monitor your AI apps for harmful or inappropriate content and deploy safely.

Within the AI Gateway settings, you can configure:

@@ -15,4 +15,4 @@

![Guardrails in AI Gateway](~/assets/images/ai-gateway/Guardrails.png)

Learn more in the [blog](https://blog.cloudflare.com/guardrails-in-ai-gateway/) or our [documentation](/ai-gateway/guardrails/).
Learn more in the [blog](https://blog.cloudflare.com/guardrails-in-ai-gateway/) or our [documentation](/ai-gateway/features/guardrails/).
4 changes: 2 additions & 2 deletions src/content/changelog/ai-gateway/2025-03-20-websockets.mdx
@@ -4,7 +4,7 @@ description: AI Gateway now supports end-to-end, client-to-provider WebSockets
date: 2025-03-21
---

We are excited to announce that [AI Gateway](/ai-gateway/) now supports real-time AI interactions with the new [Realtime WebSockets API](/ai-gateway/websockets-api/realtime-api/).
We are excited to announce that [AI Gateway](/ai-gateway/) now supports real-time AI interactions with the new [Realtime WebSockets API](/ai-gateway/usage/websockets-api/realtime-api/).

This new capability allows developers to establish persistent, low-latency connections between their applications and AI models, enabling natural, real-time conversational AI experiences, including speech-to-speech interactions.

@@ -36,4 +36,4 @@ ws.send(
);
```

Get started by checking out the [Realtime WebSockets API](/ai-gateway/websockets-api/realtime-api/) documentation.
Get started by checking out the [Realtime WebSockets API](/ai-gateway/usage/websockets-api/realtime-api/) documentation.
@@ -6,7 +6,7 @@ products:
date: 2025-06-03
---

Users can now use an [OpenAI Compatible endpoint](/ai-gateway/chat-completion/) in AI Gateway to easily switch between providers, while keeping the exact same request and response formats. We're launching now with the chat completions endpoint, with the embeddings endpoint coming up next.
Users can now use an [OpenAI Compatible endpoint](/ai-gateway/usage/chat-completion/) in AI Gateway to easily switch between providers, while keeping the exact same request and response formats. We're launching now with the chat completions endpoint, with the embeddings endpoint coming up next.

To get started, use the OpenAI-compatible chat completions endpoint URL with your own account ID and gateway ID, and switch between providers by changing the `model` and `apiKey` parameters.

@@ -26,6 +26,6 @@ const response = await client.chat.completions.create({
console.log(response.choices[0].message.content);
```
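Because the request and response formats stay identical, switching providers amounts to changing the model string (and the matching API key). A small sketch, assuming the gateway's `/compat` path and provider-prefixed model identifiers (the specific model names here are illustrative):

```typescript
// Hypothetical helper: build the OpenAI-compatible base URL once; switching
// providers then only changes the `model` and `apiKey` you pass to the client.
function compatBaseURL(accountId: string, gatewayId: string): string {
	return `https://gateway.ai.cloudflare.com/v1/${accountId}/${gatewayId}/compat`;
}

// Example provider-prefixed model identifiers (illustrative):
const openaiModel = "openai/gpt-4o-mini";
const workersAiModel = "workers-ai/@cf/meta/llama-3.1-8b-instruct";
```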

Additionally, the [OpenAI Compatible endpoint](/ai-gateway/chat-completion/) can be combined with our [Universal Endpoint](/ai-gateway/universal/) to add fallbacks across multiple providers. That means AI Gateway will return every response in the same standardized format, no extra parsing logic required!
Additionally, the [OpenAI Compatible endpoint](/ai-gateway/usage/chat-completion/) can be combined with our [Universal Endpoint](/ai-gateway/usage/universal/) to add fallbacks across multiple providers. That means AI Gateway will return every response in the same standardized format, no extra parsing logic required!

Learn more in the [OpenAI Compatibility](/ai-gateway/chat-completion/) documentation.
Learn more in the [OpenAI Compatibility](/ai-gateway/usage/chat-completion/) documentation.
@@ -37,7 +37,7 @@ export default {
const client = new OpenAI({
apiKey: env.OPENAI_API_KEY,
// Optional: use AI Gateway to bring logs, evals & caching to your AI requests
// https://developers.cloudflare.com/ai-gateway/providers/openai/
// https://developers.cloudflare.com/ai-gateway/usage/providers/openai/
// baseUrl: "https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/openai"
});

@@ -143,7 +143,7 @@ binding = "AI"

### Model routing

You can also use the model routing features in [AI Gateway](/ai-gateway/) directly from an Agent by specifying a [`gateway` configuration](/ai-gateway/providers/workersai/) when calling the AI binding.
You can also use the model routing features in [AI Gateway](/ai-gateway/) directly from an Agent by specifying a [`gateway` configuration](/ai-gateway/usage/providers/workersai/) when calling the AI binding.

:::note

107 changes: 107 additions & 0 deletions src/content/docs/ai-gateway/configuration/bring-your-own-keys.mdx
@@ -0,0 +1,107 @@
---
title: BYOK (Store Keys)
pcx_content_type: navigation
sidebar:
order: 2
badge: Beta
---

import { CardGrid, LinkTitleCard } from "~/components";

## Introduction

Bring your own keys (BYOK) is a feature in Cloudflare AI Gateway that allows you to securely store your AI provider API keys directly in the Cloudflare dashboard. Instead of including API keys in every request to your AI models, you can configure them once in the dashboard and reference them in your gateway configuration.

The keys are stored securely with [Secret Store](/secrets-store/), which provides:

- Secure storage and limited key exposure
- Easier key rotation
- Rate limits, budget limits, and other restrictions with [Dynamic Routes](/ai-gateway/features/dynamic-routing/)

## Setting up BYOK

### Prerequisites

- An active Cloudflare account with AI Gateway enabled
- Valid API keys for the AI providers you want to use
- Appropriate permissions to manage AI Gateway settings

### Configure API keys

{/* TODO UPDATE */}

1. Log into the [Cloudflare dashboard](https://dash.cloudflare.com/) and select your account.
2. Go to **AI** > **AI Gateway**.
3. Select your gateway or create a new one.
4. Navigate to the **Provider Keys** section.
5. Click **Add API Key**.
6. Select your AI provider from the dropdown.
7. Enter your API key and optionally provide a description.
8. Click **Save**.

### Update your applications

Once you've configured your API keys in the dashboard:

1. **Remove API keys from your code**: Delete any hardcoded provider API keys or environment variables.
2. **Update request headers**: Remove provider authorization headers from your requests (the `cf-aig-authorization` header stays).
3. **Test your integration**: Verify that requests succeed without including provider API keys.

## Example

With BYOK enabled, your workflow changes from:

1. **Traditional approach**: Include API key in every request header

```bash
curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/openai/chat/completions \
-H 'cf-aig-authorization: Bearer {CF_AIG_TOKEN}' \
-H "Authorization: Bearer YOUR_OPENAI_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model": "gpt-4", "messages": [...]}'
```

2. **BYOK approach**: Configure key once in dashboard, make requests without exposing keys
```bash
curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/openai/chat/completions \
-H 'cf-aig-authorization: Bearer {CF_AIG_TOKEN}' \
-H "Content-Type: application/json" \
-d '{"model": "gpt-4", "messages": [...]}'
```
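The only difference between the two requests is the provider's `Authorization` header; the gateway token is kept in both cases. A sketch of the two header sets, with a hypothetical helper name:

```typescript
// Sketch: request headers with and without BYOK. Dropping the optional
// `providerKey` removes only the provider's Authorization header; the
// gateway's own `cf-aig-authorization` token remains.
function buildHeaders(cfAigToken: string, providerKey?: string): Record<string, string> {
	const headers: Record<string, string> = {
		"cf-aig-authorization": `Bearer ${cfAigToken}`,
		"Content-Type": "application/json",
	};
	if (providerKey !== undefined) {
		// Traditional approach: the provider key travels with every request.
		headers["Authorization"] = `Bearer ${providerKey}`;
	}
	return headers;
}
```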

## Managing API keys

### Viewing configured keys

In the AI Gateway dashboard, you can:

- View all configured API keys by provider
- See when each key was last used
- Check the status of each key (active, expired, invalid)

### Rotating keys

To rotate an API key:

{/* TODO UPDATE */}

1. Generate a new API key from your AI provider.
2. In the Cloudflare dashboard, edit the existing key entry.
3. Replace the old key with the new one.
4. Save the changes.

Your applications will immediately start using the new key without any code changes or downtime.

### Revoking access

To remove an API key:

{/* TODO UPDATE */}

1. In the AI Gateway dashboard, find the key you want to remove.
2. Click the **Delete** button.
3. Confirm the deletion.

:::caution[Impact of key deletion]
Deleting an API key will immediately stop all requests that depend on it. Make sure to update your applications or configure alternative keys before deletion.
:::
7 changes: 4 additions & 3 deletions src/content/docs/ai-gateway/configuration/fallbacks.mdx
@@ -2,12 +2,13 @@
pcx_content_type: configuration
title: Fallbacks
sidebar:
order: 2
order: 6
hidden: true
---

import { Render } from "~/components";

Specify model or provider fallbacks with your [Universal endpoint](/ai-gateway/universal/) to handle request failures and ensure reliability.
Specify model or provider fallbacks with your [Universal endpoint](/ai-gateway/usage/universal/) to handle request failures and ensure reliability.

Cloudflare can trigger your fallback provider in response to [request errors](#request-failures) or [predetermined request timeouts](/ai-gateway/configuration/request-handling#request-timeouts). The [response header `cf-aig-step`](#response-headercf-aig-step) indicates which step successfully processed the request.

@@ -38,7 +39,7 @@ You can add as many fallbacks as you need, just by adding another object in the

## Response header(cf-aig-step)

When using the [Universal endpoint](/ai-gateway/universal/) with fallbacks, the response header `cf-aig-step` indicates which model successfully processed the request by returning the step number. This header provides visibility into whether a fallback was triggered and which model ultimately processed the response.
When using the [Universal endpoint](/ai-gateway/usage/universal/) with fallbacks, the response header `cf-aig-step` indicates which model successfully processed the request by returning the step number. This header provides visibility into whether a fallback was triggered and which model ultimately processed the response.

- `cf-aig-step:0` – The first (primary) model was used successfully.
- `cf-aig-step:1` – The request fell back to the second model.
14 changes: 10 additions & 4 deletions src/content/docs/ai-gateway/configuration/request-handling.mdx
@@ -7,6 +7,12 @@ sidebar:

import { Render, Aside } from "~/components";

:::note[Deprecated]

While the request handling features described on this page still work, [Dynamic Routing](/ai-gateway/features/dynamic-routing/) is now the preferred way to achieve advanced request handling, including timeouts, retries, and fallbacks. Dynamic Routing provides a more powerful and flexible approach with a visual interface for managing complex routing scenarios.

:::

Your AI gateway supports different strategies for handling requests to providers, which allows you to manage AI interactions effectively and ensure your applications remain responsive and reliable.

## Request timeouts
@@ -28,7 +34,7 @@ A timeout is set in milliseconds. Additionally, the timeout is based on when the

#### Universal Endpoint

If set on a [Universal Endpoint](/ai-gateway/universal/), a request timeout specifies the timeout duration for requests and triggers a fallback.
If set on a [Universal Endpoint](/ai-gateway/usage/universal/), a request timeout specifies the timeout duration for requests and triggers a fallback.

For a Universal Endpoint, configure the timeout value by setting a `requestTimeout` property within the provider-specific `config` object. Each provider can have a different `requestTimeout` value for granular customization.
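As a sketch, a Universal Endpoint body with per-step timeouts might look like the following. The step shape (`provider`, `endpoint`, `headers`, `query`) mirrors the Universal Endpoint format; the specific models, headers, and timeout values are illustrative:

```typescript
// Each step carries its own `config.requestTimeout` (milliseconds); if the
// primary provider exceeds it, the gateway falls through to the next step.
const steps = [
	{
		provider: "workers-ai",
		endpoint: "@cf/meta/llama-3.1-8b-instruct",
		headers: { "Content-Type": "application/json" },
		query: { messages: [{ role: "user", content: "Hi" }] },
		config: { requestTimeout: 1000 }, // tight timeout on the primary
	},
	{
		provider: "openai",
		endpoint: "chat/completions",
		headers: { "Content-Type": "application/json" },
		query: { model: "gpt-4o-mini", messages: [{ role: "user", content: "Hi" }] },
		config: { requestTimeout: 5000 }, // looser timeout on the fallback
	},
];
```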

@@ -87,7 +93,7 @@ curl 'https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}' \

#### Direct provider

If set on a [provider](/ai-gateway/usage/providers/) request, a request timeout specifies the timeout duration for a request and, if exceeded, returns an error.
If set on a [provider](/ai-gateway/usage/providers/) request, request timeout specifies the timeout duration for a request and - if exceeded - returns an error.

For a provider-specific endpoint, configure the timeout value by adding a `cf-aig-request-timeout` header.

@@ -123,7 +129,7 @@ On the final retry attempt, your gateway will wait until the request completes,

#### Universal endpoint

If set on a [Universal Endpoint](/ai-gateway/universal/), a request retry will automatically retry failed requests up to five times before triggering any configured fallbacks.
If set on a [Universal Endpoint](/ai-gateway/usage/universal/), a request retry will automatically retry failed requests up to five times before triggering any configured fallbacks.

For a Universal Endpoint, configure the retry settings with the following properties in the provider-specific `config`:

@@ -196,7 +202,7 @@ curl 'https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}' \

#### Direct provider

If set on a [provider](/ai-gateway/universal/) request, a request retry will automatically retry failed requests up to five times. On the final retry attempt, your gateway will wait until the request completes, regardless of how long it takes.
If set on a [provider](/ai-gateway/usage/providers/) request, a request retry will automatically retry failed requests up to five times. On the final retry attempt, your gateway will wait until the request completes, regardless of how long it takes.

For a provider-specific endpoint, configure the retry settings by adding different header values:

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,7 @@ pcx_content_type: how-to
title: Add Human Feedback using API
sidebar:
order: 4
hidden: true
---

import { APIRequest } from "~/components";
@@ -3,6 +3,7 @@ pcx_content_type: how-to
title: Add human feedback using Worker Bindings
sidebar:
order: 4
hidden: true
---

This guide explains how to provide human feedback for AI Gateway evaluations using Worker bindings.
@@ -3,6 +3,7 @@ pcx_content_type: how-to
title: Add Human Feedback using Dashboard
sidebar:
order: 3
hidden: true
---

Human feedback is a valuable metric to assess the performance of your AI models. By incorporating human feedback, you can gain deeper insights into how the model's responses are perceived and how well it performs from a user-centric perspective. This feedback can then be used in evaluations to calculate performance metrics, driving optimization and ultimately enhancing the reliability, accuracy, and efficiency of your AI application.