1 change: 1 addition & 0 deletions public/__redirects
@@ -146,6 +146,7 @@
/ai-gateway/integration/aig-workers-ai-binding/ /ai-gateway/integrations/aig-workers-ai-binding/ 301
/ai-gateway/integration/ /ai-gateway/integrations/ 301
/ai-gateway/providers/open-router/ /ai-gateway/providers/openrouter/ 301
/ai-gateway/providers/universal/ /ai-gateway/universal/ 301
/ai-gateway/configuration/websockets-api/ /ai-gateway/websockets-api/ 301
/ai-gateway/configuration/websockets-api/non-realtime-api/ /ai-gateway/websockets-api/non-realtime-api/ 301
/ai-gateway/configuration/websockets-api/realtime-api/ /ai-gateway/websockets-api/realtime-api/ 301
@@ -16,7 +16,7 @@ With the new AI Gateway binding methods, you can now:

- Send feedback and update metadata with `patchLog`.
- Retrieve detailed log information using `getLog`.
- Execute [universal requests](/ai-gateway/providers/universal/) to any AI Gateway provider with `run`.
- Execute [universal requests](/ai-gateway/universal/) to any AI Gateway provider with `run`.

For example, to send feedback and update metadata using `patchLog`:
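The snippet itself is collapsed in this diff. A minimal sketch of that call, mirroring the `patchLog` example shown further down in this PR (`"my-log-id"` is a placeholder log ID):

```typescript
// Attach feedback, a score, and metadata to a previously logged request.
const gateway = env.AI.gateway("my-gateway");

await gateway.patchLog("my-log-id", {
	feedback: 1,
	score: 100,
	metadata: {
		user: "123",
	},
});
```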

@@ -8,10 +8,10 @@ date: 2025-02-06T11:00:00Z

AI Gateway adds additional ways to handle requests - [Request Timeouts](/ai-gateway/configuration/request-handling/#request-timeouts) and [Request Retries](/ai-gateway/configuration/request-handling/#request-retries), making it easier to keep your applications responsive and reliable.

Timeouts and retries can be used on both the [Universal Endpoint](/ai-gateway/providers/universal) or directly to a [supported provider](/ai-gateway/providers/).
Timeouts and retries can be used on both the [Universal Endpoint](/ai-gateway/universal/) or directly to a [supported provider](/ai-gateway/providers/).

**Request timeouts**
A [request timeout](/ai-gateway/configuration/request-handling/#request-timeouts) allows you to trigger [fallbacks](/ai-gateway/configuration/fallbacks/) or a retry if a provider takes too long to respond.
A [request timeout](/ai-gateway/configuration/request-handling/#request-timeouts) allows you to trigger [fallbacks](/ai-gateway/configuration/fallbacks/) or a retry if a provider takes too long to respond.

To set a request timeout directly to a provider, add a `cf-aig-request-timeout` header.

@@ -22,10 +22,12 @@ curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/workers-ai/@
--header 'cf-aig-request-timeout: 5000' \
--data '{"prompt": "What is Cloudflare?"}'
```

**Request retries**
A [request retry](/ai-gateway/configuration/request-handling/#request-retries) automatically retries failed requests, so you can recover from temporary issues without intervening.

To set up request retries directly to a provider, add the following headers:

- cf-aig-max-attempts (number)
- cf-aig-retry-delay (number)
- cf-aig-backoff ("constant" | "linear" | "exponential")
- cf-aig-backoff ("constant" | "linear" | "exponential")
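As an illustrative sketch only (not part of this change), a request that sets these headers against a Workers AI model might look like the following; the account, gateway, token, and retry values are placeholders:

```typescript
// Sketch: retry a direct provider request up to 3 times, waiting 1000 ms
// between attempts with exponential backoff.
const resp = await fetch(
	"https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/workers-ai/@cf/meta/llama-3.1-8b-instruct",
	{
		method: "POST",
		headers: {
			Authorization: "Bearer {cf_api_token}",
			"Content-Type": "application/json",
			"cf-aig-max-attempts": "3",
			"cf-aig-retry-delay": "1000",
			"cf-aig-backoff": "exponential",
		},
		body: JSON.stringify({ prompt: "What is Cloudflare?" }),
	},
);
```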
2 changes: 1 addition & 1 deletion src/content/docs/ai-gateway/changelog.mdx
@@ -4,7 +4,7 @@ title: Changelog
release_notes_file_name:
- ai-gateway
sidebar:
order: 9
order: 15
---

import { ProductReleaseNotes } from "~/components";
4 changes: 2 additions & 2 deletions src/content/docs/ai-gateway/configuration/fallbacks.mdx
@@ -7,7 +7,7 @@ sidebar:

import { Render } from "~/components";

Specify model or provider fallbacks with your [Universal endpoint](/ai-gateway/providers/universal/) to handle request failures and ensure reliability.
Specify model or provider fallbacks with your [Universal endpoint](/ai-gateway/universal/) to handle request failures and ensure reliability.

Cloudflare can trigger your fallback provider in response to [request errors](#request-failures) or [predetermined request timeouts](/ai-gateway/configuration/request-handling#request-timeouts). The [response header `cf-aig-step`](#response-headercf-aig-step) indicates which step successfully processed the request.
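A sketch of what a fallback chain can look like, assembled from the universal request shape and the Universal Endpoint URL shown elsewhere in this PR; the OpenAI step, model names, and tokens are illustrative assumptions:

```typescript
// Sketch: POST an array of steps to the Universal Endpoint. If step 0 fails
// (or times out), the gateway falls back to step 1; the cf-aig-step response
// header reports which step produced the answer.
const resp = await fetch(
	"https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}",
	{
		method: "POST",
		headers: { "Content-Type": "application/json" },
		body: JSON.stringify([
			{
				provider: "workers-ai",
				endpoint: "@cf/meta/llama-3.1-8b-instruct",
				headers: { Authorization: "Bearer {workers_ai_token}" },
				query: { prompt: "What is Cloudflare?" },
			},
			{
				provider: "openai",
				endpoint: "chat/completions",
				headers: { Authorization: "Bearer {openai_token}" },
				query: {
					model: "gpt-4o-mini",
					messages: [{ role: "user", content: "What is Cloudflare?" }],
				},
			},
		]),
	},
);

console.log(resp.headers.get("cf-aig-step")); // "0" for primary, "1" for fallback
```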

@@ -38,7 +38,7 @@ You can add as many fallbacks as you need, just by adding another object in the

## Response header (cf-aig-step)

When using the [Universal endpoint](/ai-gateway/providers/universal/) with fallbacks, the response header `cf-aig-step` indicates which model successfully processed the request by returning the step number. This header provides visibility into whether a fallback was triggered and which model ultimately processed the response.
When using the [Universal endpoint](/ai-gateway/universal/) with fallbacks, the response header `cf-aig-step` indicates which model successfully processed the request by returning the step number. This header provides visibility into whether a fallback was triggered and which model ultimately processed the response.

- `cf-aig-step:0` – The first (primary) model was used successfully.
- `cf-aig-step:1` – The request fell back to the second model.
2 changes: 1 addition & 1 deletion src/content/docs/ai-gateway/configuration/index.mdx
@@ -4,7 +4,7 @@ title: Configuration
sidebar:
group:
hideIndex: true
order: 5
order: 4
---

import { DirectoryListing } from "~/components";
@@ -28,7 +28,7 @@ A timeout is set in milliseconds. Additionally, the timeout is based on when the

#### Universal Endpoint

If set on a [Universal Endpoint](/ai-gateway/providers/universal/), a request timeout specifies the timeout duration for requests and triggers a fallback.
If set on a [Universal Endpoint](/ai-gateway/universal/), a request timeout specifies the timeout duration for requests and triggers a fallback.

For a Universal Endpoint, configure the timeout value by setting a `requestTimeout` property within the provider-specific `config` object. Each provider can have a different `requestTimeout` value for granular customization.
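As a sketch (not part of this diff), one step of a Universal Endpoint request body with a per-provider timeout might look like this; apart from `config.requestTimeout`, the field names follow the universal request shape used elsewhere in this PR:

```typescript
// Sketch: a universal request step with a 5-second per-provider timeout.
// If the provider does not respond in time, the gateway moves to the next step.
const step = {
	provider: "workers-ai",
	endpoint: "@cf/meta/llama-3.1-8b-instruct",
	headers: { Authorization: "Bearer {workers_ai_token}" },
	query: { prompt: "What is Cloudflare?" },
	config: {
		requestTimeout: 5000, // milliseconds
	},
};
```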

@@ -123,7 +123,7 @@ On the final retry attempt, your gateway will wait until the request completes,

#### Universal endpoint

If set on a [Universal Endpoint](/ai-gateway/providers/universal/), a request retry will automatically retry failed requests up to five times before triggering any configured fallbacks.
If set on a [Universal Endpoint](/ai-gateway/universal/), a request retry will automatically retry failed requests up to five times before triggering any configured fallbacks.

For a Universal Endpoint, configure the retry settings with the following properties in the provider-specific `config`:

@@ -196,7 +196,7 @@ curl 'https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}' \

#### Direct provider

If set on a [provider](/ai-gateway/providers/) request, a request retry will automatically retry failed requests up to five times. On the final retry attempt, your gateway will wait until the request completes, regardless of how long it takes.
If set on a [provider](/ai-gateway/universal/) request, a request retry will automatically retry failed requests up to five times. On the final retry attempt, your gateway will wait until the request completes, regardless of how long it takes.

For a provider-specific endpoint, configure the retry settings by adding different header values:

2 changes: 1 addition & 1 deletion src/content/docs/ai-gateway/glossary.mdx
@@ -16,7 +16,7 @@ AI Gateway supports a variety of headers to help you configure, customize, and m
Settings in AI Gateway can be configured at three levels: **Provider**, **Request**, and **Gateway**. Since the same settings can be configured in multiple locations, the following hierarchy determines which value is applied:

1. **Provider-level headers**:
Relevant only when using the [Universal Endpoint](/ai-gateway/providers/universal/), these headers take precedence over all other configurations.
Relevant only when using the [Universal Endpoint](/ai-gateway/universal/), these headers take precedence over all other configurations.
2. **Request-level headers**:
Apply if no provider-level headers are set.
3. **Gateway-level settings**:
2 changes: 1 addition & 1 deletion src/content/docs/ai-gateway/guardrails/index.mdx
@@ -3,7 +3,7 @@ title: Guardrails
pcx_content_type: navigation
order: 1
sidebar:
order: 8
order: 6
group:
badge: Beta
---
2 changes: 1 addition & 1 deletion src/content/docs/ai-gateway/integrations/index.mdx
@@ -4,5 +4,5 @@ title: Integrations
sidebar:
group:
hideIndex: true
order: 12
order: 7
---
@@ -34,13 +34,17 @@ This configuration sets up the AI binding accessible in your Worker code as `env
To perform an inference task using Workers AI and an AI Gateway, you can use the following code:

```typescript title="src/index.ts"
const resp = await env.AI.run("@cf/meta/llama-3.1-8b-instruct", {
prompt: "tell me a joke"
}, {
gateway: {
id: "my-gateway"
}
});
const resp = await env.AI.run(
"@cf/meta/llama-3.1-8b-instruct",
{
prompt: "tell me a joke",
},
{
gateway: {
id: "my-gateway",
},
},
);
```

Additionally, you can access the latest request log ID with:
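The snippet is collapsed in this diff; it is believed to be the `aiGatewayLogId` property on the binding, sketched here:

```typescript
// Read the log ID of the most recent request made through the AI binding.
const logId = env.AI.aiGatewayLogId;
```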
@@ -64,12 +68,12 @@ Once you have the gateway instance, you can use the following methods:
The `patchLog` method allows you to send feedback, score, and metadata for a specific log ID. All object properties are optional, so you can include any combination of the parameters:

```typescript
gateway.patchLog('my-log-id', {
feedback: 1,
score: 100,
metadata: {
user: "123"
}
gateway.patchLog("my-log-id", {
feedback: 1,
score: 100,
metadata: {
user: "123",
},
});
```

@@ -97,7 +101,7 @@ const baseUrl = await gateway.getUrl();
// Output: https://gateway.ai.cloudflare.com/v1/my-account-id/my-gateway/

// Get a provider-specific URL
const openaiUrl = await gateway.getUrl('openai');
const openaiUrl = await gateway.getUrl("openai");
// Output: https://gateway.ai.cloudflare.com/v1/my-account-id/my-gateway/openai
```

@@ -110,54 +114,57 @@ const openaiUrl = await gateway.getUrl('openai');
The `getUrl` method is particularly useful for integrating with popular AI SDKs:

**OpenAI SDK:**

```typescript
import OpenAI from "openai";

const openai = new OpenAI({
apiKey: "my api key", // defaults to process.env["OPENAI_API_KEY"]
baseURL: await env.AI.gateway('my-gateway').getUrl('openai'),
apiKey: "my api key", // defaults to process.env["OPENAI_API_KEY"]
baseURL: await env.AI.gateway("my-gateway").getUrl("openai"),
});
```

**Vercel AI SDK with OpenAI:**

```typescript
import { createOpenAI } from "@ai-sdk/openai";

const openai = createOpenAI({
baseURL: await env.AI.gateway('my-gateway').getUrl('openai'),
baseURL: await env.AI.gateway("my-gateway").getUrl("openai"),
});
```

**Vercel AI SDK with Anthropic:**

```typescript
import { createAnthropic } from "@ai-sdk/anthropic";

const anthropic = createAnthropic({
baseURL: await env.AI.gateway('my-gateway').getUrl('anthropic'),
baseURL: await env.AI.gateway("my-gateway").getUrl("anthropic"),
});
```

### 3.4. `run`: Universal Requests

The `run` method allows you to execute universal requests. Users can pass either a single universal request object or an array of them. This method supports all AI Gateway providers.

Refer to the [Universal endpoint documentation](/ai-gateway/providers/universal/) for details about the available inputs.
Refer to the [Universal endpoint documentation](/ai-gateway/universal/) for details about the available inputs.

```typescript
const resp = await gateway.run({
provider: "workers-ai",
endpoint: "@cf/meta/llama-3.1-8b-instruct",
headers: {
authorization: "Bearer my-api-token"
},
query: {
prompt: "tell me a joke"
}
provider: "workers-ai",
endpoint: "@cf/meta/llama-3.1-8b-instruct",
headers: {
authorization: "Bearer my-api-token",
},
query: {
prompt: "tell me a joke",
},
});
```

- **Returns**: `Promise<Response>`
- **Example Use Case**: Perform a [universal request](/ai-gateway/providers/universal/) to any supported provider.
- **Example Use Case**: Perform a [universal request](/ai-gateway/universal/) to any supported provider.
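The prose notes that `run` also accepts an array of universal requests, though the diff only shows a single object. A sketch under the assumption that array entries behave like Universal Endpoint fallback steps; the second model name is illustrative:

```typescript
// Sketch: pass an array so later entries can serve as fallbacks, then
// consume the standard Response that run() returns.
const resp = await gateway.run([
	{
		provider: "workers-ai",
		endpoint: "@cf/meta/llama-3.1-8b-instruct",
		headers: { authorization: "Bearer my-api-token" },
		query: { prompt: "tell me a joke" },
	},
	{
		provider: "workers-ai",
		endpoint: "@cf/meta/llama-2-7b-chat-int8",
		headers: { authorization: "Bearer my-api-token" },
		query: { prompt: "tell me a joke" },
	},
]);

const data = await resp.json();
```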

## Conclusion

@@ -168,4 +175,4 @@ With these AI Gateway binding methods, you can now:
- Get gateway URLs for direct API access with `getUrl`, making it easy to integrate with popular AI SDKs.
- Execute universal requests to any AI Gateway provider with `run`.

These methods offer greater flexibility and control over your AI integrations, empowering you to build more sophisticated applications on the Cloudflare Workers platform.
These methods offer greater flexibility and control over your AI integrations, empowering you to build more sophisticated applications on the Cloudflare Workers platform.
2 changes: 1 addition & 1 deletion src/content/docs/ai-gateway/observability/index.mdx
@@ -4,7 +4,7 @@ pcx_content_type: navigation
sidebar:
group:
hideIndex: true
order: 6
order: 5
---

import { DirectoryListing } from "~/components";
5 changes: 2 additions & 3 deletions src/content/docs/ai-gateway/reference/index.mdx
@@ -2,9 +2,8 @@
pcx_content_type: navigation
title: Platform
sidebar:
group:
hideIndex: true
order: 11
order: 9
hideIndex: true
---

import { DirectoryListing } from "~/components";
2 changes: 1 addition & 1 deletion src/content/docs/ai-gateway/tutorials/index.mdx
@@ -3,7 +3,7 @@ pcx_content_type: navigation
title: Tutorials
hideChildren: true
sidebar:
order: 4
order: 8
hideIndex: true
---

@@ -1,9 +1,8 @@
---
title: Universal Endpoint
pcx_content_type: get-started

sidebar:
order: 1
order: 2
---

import { Render, Badge } from "~/components";
4 changes: 2 additions & 2 deletions src/content/docs/ai-gateway/websockets-api/index.mdx
@@ -4,7 +4,7 @@ pcx_content_type: configuration
sidebar:
group:
badge: Beta
order: 5
order: 3
---

The AI Gateway WebSockets API provides a persistent connection for AI interactions, eliminating repeated handshakes and reducing latency. This API is divided into two categories:
@@ -27,7 +27,7 @@ WebSockets are long-lived TCP connections that enable bi-directional, real-time
| :---------------------- | :---------------------------------------------------------------------------------------------------------------------- | :----------------------------------------------------------------------------------------------- |
| **Purpose** | Enables real-time, multimodal AI interactions for providers that offer dedicated WebSocket endpoints. | Supports WebSocket-based AI interactions with providers that do not natively support WebSockets. |
| **Use Case** | Streaming responses for voice, video, and live interactions. | Text-based queries and responses, such as LLM requests. |
| **AI Provider Support** | [Limited to providers offering real-time WebSocket APIs.](/ai-gateway/websockets-api/realtime-api/#supported-providers) | [All AI providers in AI Gateway.](/ai-gateway/providers/) |
| **AI Provider Support** | [Limited to providers offering real-time WebSocket APIs.](/ai-gateway/websockets-api/realtime-api/#supported-providers) | [All AI providers in AI Gateway.](/ai-gateway/providers/) |
| **Streaming Support** | Providers natively support real-time data streaming. | AI Gateway handles streaming via WebSockets. |

For details on implementation, refer to the next sections: