diff --git a/public/__redirects b/public/__redirects
index fa75d5903ecbcae..09239f4aeb595c3 100644
--- a/public/__redirects
+++ b/public/__redirects
@@ -146,6 +146,7 @@
 /ai-gateway/integration/aig-workers-ai-binding/ /ai-gateway/integrations/aig-workers-ai-binding/ 301
 /ai-gateway/integration/ /ai-gateway/integrations/ 301
 /ai-gateway/providers/open-router/ /ai-gateway/providers/openrouter/ 301
+/ai-gateway/providers/universal/ /ai-gateway/universal/ 301
 /ai-gateway/configuration/websockets-api/ /ai-gateway/websockets-api/ 301
 /ai-gateway/configuration/websockets-api/non-realtime-api/ /ai-gateway/websockets-api/non-realtime-api/ 301
 /ai-gateway/configuration/websockets-api/realtime-api/ /ai-gateway/websockets-api/realtime-api/ 301
diff --git a/src/content/changelog/ai-gateway/2025-01-26-worker-binding-methods.mdx b/src/content/changelog/ai-gateway/2025-01-26-worker-binding-methods.mdx
index 7d0f34f41c15e91..b38ca35f7a6d7c3 100644
--- a/src/content/changelog/ai-gateway/2025-01-26-worker-binding-methods.mdx
+++ b/src/content/changelog/ai-gateway/2025-01-26-worker-binding-methods.mdx
@@ -16,7 +16,7 @@ With the new AI Gateway binding methods, you can now:
 - Send feedback and update metadata with `patchLog`.
 - Retrieve detailed log information using `getLog`.
-- Execute [universal requests](/ai-gateway/providers/universal/) to any AI Gateway provider with `run`.
+- Execute [universal requests](/ai-gateway/universal/) to any AI Gateway provider with `run`.
 For example, to send feedback and update metadata using `patchLog`:
diff --git a/src/content/changelog/ai-gateway/2025-02-05-aig-request-handling.mdx b/src/content/changelog/ai-gateway/2025-02-05-aig-request-handling.mdx
index 06f64192fe4cf65..bde707fb4037836 100644
--- a/src/content/changelog/ai-gateway/2025-02-05-aig-request-handling.mdx
+++ b/src/content/changelog/ai-gateway/2025-02-05-aig-request-handling.mdx
@@ -8,10 +8,10 @@ date: 2025-02-06T11:00:00Z
 AI Gateway adds additional ways to handle requests - [Request Timeouts](/ai-gateway/configuration/request-handling/#request-timeouts) and [Request Retries](/ai-gateway/configuration/request-handling/#request-retries), making it easier to keep your applications responsive and reliable.
-Timeouts and retries can be used on both the [Universal Endpoint](/ai-gateway/providers/universal) or directly to a [supported provider](/ai-gateway/providers/).
+Timeouts and retries can be used on both the [Universal Endpoint](/ai-gateway/universal/) or directly to a [supported provider](/ai-gateway/providers/).
 **Request timeouts**
-A [request timeout](/ai-gateway/configuration/request-handling/#request-timeouts) allows you to trigger [fallbacks](/ai-gateway/configuration/fallbacks/) or a retry if a provider takes too long to respond.
+A [request timeout](/ai-gateway/configuration/request-handling/#request-timeouts) allows you to trigger [fallbacks](/ai-gateway/configuration/fallbacks/) or a retry if a provider takes too long to respond.
 To set a request timeout directly to a provider, add a `cf-aig-request-timeout` header.
@@ -22,10 +22,12 @@ curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/workers-ai/@
 --header 'cf-aig-request-timeout: 5000'
 --data '{"prompt": "What is Cloudflare?"}'
 ```
+
 **Request retries**
 A [request retry](/ai-gateway/configuration/request-handling/#request-retries) automatically retries failed requests, so you can recover from temporary issues without intervening.
 To set up request retries directly to a provider, add the following headers:
+
 - cf-aig-max-attempts (number)
 - cf-aig-retry-delay (number)
-- cf-aig-backoff ("constant" | "linear" | "exponential)
+- cf-aig-backoff ("constant" | "linear" | "exponential")
diff --git a/src/content/docs/ai-gateway/changelog.mdx b/src/content/docs/ai-gateway/changelog.mdx
index 5e0aa5fd25468a4..cf677fa62659e3d 100644
--- a/src/content/docs/ai-gateway/changelog.mdx
+++ b/src/content/docs/ai-gateway/changelog.mdx
@@ -4,7 +4,7 @@ title: Changelog
 release_notes_file_name:
   - ai-gateway
 sidebar:
-  order: 9
+  order: 15
 ---
 import { ProductReleaseNotes } from "~/components";
diff --git a/src/content/docs/ai-gateway/configuration/fallbacks.mdx b/src/content/docs/ai-gateway/configuration/fallbacks.mdx
index 162151ca9647e81..a9963a7041704f2 100644
--- a/src/content/docs/ai-gateway/configuration/fallbacks.mdx
+++ b/src/content/docs/ai-gateway/configuration/fallbacks.mdx
@@ -7,7 +7,7 @@ sidebar:
 import { Render } from "~/components";
-Specify model or provider fallbacks with your [Universal endpoint](/ai-gateway/providers/universal/) to handle request failures and ensure reliability.
+Specify model or provider fallbacks with your [Universal endpoint](/ai-gateway/universal/) to handle request failures and ensure reliability.
 Cloudflare can trigger your fallback provider in response to [request errors](#request-failures) or [predetermined request timeouts](/ai-gateway/configuration/request-handling#request-timeouts). The [response header `cf-aig-step`](#response-headercf-aig-step) indicates which step successfully processed the request.
@@ -38,7 +38,7 @@ You can add as many fallbacks as you need, just by adding another object in the
 ## Response header(cf-aig-step)
-When using the [Universal endpoint](/ai-gateway/providers/universal/) with fallbacks, the response header `cf-aig-step` indicates which model successfully processed the request by returning the step number. This header provides visibility into whether a fallback was triggered and which model ultimately processed the response.
+When using the [Universal endpoint](/ai-gateway/universal/) with fallbacks, the response header `cf-aig-step` indicates which model successfully processed the request by returning the step number. This header provides visibility into whether a fallback was triggered and which model ultimately processed the response.
 - `cf-aig-step:0` – The first (primary) model was used successfully.
 - `cf-aig-step:1` – The request fell back to the second model.
diff --git a/src/content/docs/ai-gateway/configuration/index.mdx b/src/content/docs/ai-gateway/configuration/index.mdx
index 285bdbf506c0780..0b4789dcc2174de 100644
--- a/src/content/docs/ai-gateway/configuration/index.mdx
+++ b/src/content/docs/ai-gateway/configuration/index.mdx
@@ -4,7 +4,7 @@ title: Configuration
 sidebar:
   group:
     hideIndex: true
-  order: 5
+  order: 4
 ---
 import { DirectoryListing } from "~/components";
diff --git a/src/content/docs/ai-gateway/configuration/request-handling.mdx b/src/content/docs/ai-gateway/configuration/request-handling.mdx
index 6821aa5480f6544..3af59510d716fe2 100644
--- a/src/content/docs/ai-gateway/configuration/request-handling.mdx
+++ b/src/content/docs/ai-gateway/configuration/request-handling.mdx
@@ -28,7 +28,7 @@ A timeout is set in milliseconds. Additionally, the timeout is based on when the
 #### Universal Endpoint
-If set on a [Universal Endpoint](/ai-gateway/providers/universal/), a request timeout specifies the timeout duration for requests and triggers a fallback.
+If set on a [Universal Endpoint](/ai-gateway/universal/), a request timeout specifies the timeout duration for requests and triggers a fallback.
 For a Universal Endpoint, configure the timeout value by setting a `requestTimeout` property within the provider-specific `config` object. Each provider can have a different `requestTimeout` value for granular customization.
@@ -123,7 +123,7 @@ On the final retry attempt, your gateway will wait until the request completes,
 #### Universal endpoint
-If set on a [Universal Endpoint](/ai-gateway/providers/universal/), a request retry will automatically retry failed requests up to five times before triggering any configured fallbacks.
+If set on a [Universal Endpoint](/ai-gateway/universal/), a request retry will automatically retry failed requests up to five times before triggering any configured fallbacks.
 For a Universal Endpoint, configure the retry settings with the following properties in the provider-specific `config`:
@@ -196,7 +196,7 @@ curl 'https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}' \
 #### Direct provider
-If set on a [provider](/ai-gateway/providers/) request, a request retry will automatically retry failed requests up to five times. On the final retry attempt, your gateway will wait until the request completes, regardless of how long it takes.
+If set on a [provider](/ai-gateway/universal/) request, a request retry will automatically retry failed requests up to five times. On the final retry attempt, your gateway will wait until the request completes, regardless of how long it takes.
 For a provider-specific endpoint, configure the retry settings by adding different header values:
diff --git a/src/content/docs/ai-gateway/glossary.mdx b/src/content/docs/ai-gateway/glossary.mdx
index 8a7e51363c7af32..1c8afb60c8bd7b1 100644
--- a/src/content/docs/ai-gateway/glossary.mdx
+++ b/src/content/docs/ai-gateway/glossary.mdx
@@ -16,7 +16,7 @@ AI Gateway supports a variety of headers to help you configure, customize, and m
 Settings in AI Gateway can be configured at three levels: **Provider**, **Request**, and **Gateway**. Since the same settings can be configured in multiple locations, the following hierarchy determines which value is applied:
 1. **Provider-level headers**:
-   Relevant only when using the [Universal Endpoint](/ai-gateway/providers/universal/), these headers take precedence over all other configurations.
+   Relevant only when using the [Universal Endpoint](/ai-gateway/universal/), these headers take precedence over all other configurations.
 2. **Request-level headers**:
    Apply if no provider-level headers are set.
 3. **Gateway-level settings**:
diff --git a/src/content/docs/ai-gateway/guardrails/index.mdx b/src/content/docs/ai-gateway/guardrails/index.mdx
index e373c051466c30e..bb9b11ede09f8fa 100644
--- a/src/content/docs/ai-gateway/guardrails/index.mdx
+++ b/src/content/docs/ai-gateway/guardrails/index.mdx
@@ -3,7 +3,7 @@ title: Guardrails
 pcx_content_type: navigation
 order: 1
 sidebar:
-  order: 8
+  order: 6
   group:
     badge: Beta
 ---
diff --git a/src/content/docs/ai-gateway/integrations/index.mdx b/src/content/docs/ai-gateway/integrations/index.mdx
index 2b2a4eeaa26434c..9391c09b0021a98 100644
--- a/src/content/docs/ai-gateway/integrations/index.mdx
+++ b/src/content/docs/ai-gateway/integrations/index.mdx
@@ -4,5 +4,5 @@ title: Integrations
 sidebar:
   group:
     hideIndex: true
-  order: 12
+  order: 7
 ---
diff --git a/src/content/docs/ai-gateway/integrations/worker-binding-methods.mdx b/src/content/docs/ai-gateway/integrations/worker-binding-methods.mdx
index 11d5092d732cf9c..12955e050141968 100644
--- a/src/content/docs/ai-gateway/integrations/worker-binding-methods.mdx
+++ b/src/content/docs/ai-gateway/integrations/worker-binding-methods.mdx
@@ -34,13 +34,17 @@ This configuration sets up the AI binding accessible in your Worker code as `env
 To perform an inference task using Workers AI and an AI Gateway, you can use the following code:
 ```typescript title="src/index.ts"
-const resp = await env.AI.run("@cf/meta/llama-3.1-8b-instruct", {
-  prompt: "tell me a joke"
-}, {
-  gateway: {
-    id: "my-gateway"
-  }
-});
+const resp = await env.AI.run(
+  "@cf/meta/llama-3.1-8b-instruct",
+  {
+    prompt: "tell me a joke",
+  },
+  {
+    gateway: {
+      id: "my-gateway",
+    },
+  },
+);
 ```
 Additionally, you can access the latest request log ID with:
@@ -64,12 +68,12 @@ Once you have the gateway instance, you can use the following methods:
 The `patchLog` method allows you to send feedback, score, and metadata for a specific log ID. All object properties are optional, so you can include any combination of the parameters:
 ```typescript
-gateway.patchLog('my-log-id', {
-  feedback: 1,
-  score: 100,
-  metadata: {
-    user: "123"
-  }
+gateway.patchLog("my-log-id", {
+  feedback: 1,
+  score: 100,
+  metadata: {
+    user: "123",
+  },
 });
 ```
@@ -97,7 +101,7 @@ const baseUrl = await gateway.getUrl();
 // Output: https://gateway.ai.cloudflare.com/v1/my-account-id/my-gateway/
 // Get a provider-specific URL
-const openaiUrl = await gateway.getUrl('openai');
+const openaiUrl = await gateway.getUrl("openai");
 // Output: https://gateway.ai.cloudflare.com/v1/my-account-id/my-gateway/openai
 ```
@@ -110,30 +114,33 @@ const openaiUrl = await gateway.getUrl('openai');
 The `getUrl` method is particularly useful for integrating with popular AI SDKs:
 **OpenAI SDK:**
+
 ```typescript
 import OpenAI from "openai";
 const openai = new OpenAI({
-  apiKey: "my api key", // defaults to process.env["OPENAI_API_KEY"]
-  baseURL: await env.AI.gateway('my-gateway').getUrl('openai'),
+  apiKey: "my api key", // defaults to process.env["OPENAI_API_KEY"]
+  baseURL: await env.AI.gateway("my-gateway").getUrl("openai"),
 });
 ```
 **Vercel AI SDK with OpenAI:**
+
 ```typescript
 import { createOpenAI } from "@ai-sdk/openai";
 const openai = createOpenAI({
-  baseURL: await env.AI.gateway('my-gateway').getUrl('openai'),
+  baseURL: await env.AI.gateway("my-gateway").getUrl("openai"),
 });
 ```
 **Vercel AI SDK with Anthropic:**
+
 ```typescript
 import { createAnthropic } from "@ai-sdk/anthropic";
 const anthropic = createAnthropic({
-  baseURL: await env.AI.gateway('my-gateway').getUrl('anthropic'),
+  baseURL: await env.AI.gateway("my-gateway").getUrl("anthropic"),
 });
 ```
@@ -141,23 +148,23 @@ const anthropic = createAnthropic({
 The `run` method allows you to execute universal requests. Users can pass either a single universal request object or an array of them. This method supports all AI Gateway providers.
-Refer to the [Universal endpoint documentation](/ai-gateway/providers/universal/) for details about the available inputs.
+Refer to the [Universal endpoint documentation](/ai-gateway/universal/) for details about the available inputs.
 ```typescript
 const resp = await gateway.run({
-  provider: "workers-ai",
-  endpoint: "@cf/meta/llama-3.1-8b-instruct",
-  headers: {
-    authorization: "Bearer my-api-token"
-  },
-  query: {
-    prompt: "tell me a joke"
-  }
+  provider: "workers-ai",
+  endpoint: "@cf/meta/llama-3.1-8b-instruct",
+  headers: {
+    authorization: "Bearer my-api-token",
+  },
+  query: {
+    prompt: "tell me a joke",
+  },
 });
 ```
 - **Returns**: `Promise`
-- **Example Use Case**: Perform a [universal request](/ai-gateway/providers/universal/) to any supported provider.
+- **Example Use Case**: Perform a [universal request](/ai-gateway/universal/) to any supported provider.
 ## Conclusion
@@ -168,4 +175,4 @@ With these AI Gateway binding methods, you can now:
 - Get gateway URLs for direct API access with `getUrl`, making it easy to integrate with popular AI SDKs.
 - Execute universal requests to any AI Gateway provider with `run`.
-These methods offer greater flexibility and control over your AI integrations, empowering you to build more sophisticated applications on the Cloudflare Workers platform.
\ No newline at end of file
+These methods offer greater flexibility and control over your AI integrations, empowering you to build more sophisticated applications on the Cloudflare Workers platform.
diff --git a/src/content/docs/ai-gateway/observability/index.mdx b/src/content/docs/ai-gateway/observability/index.mdx
index 4bdad045a0e7d48..07fadc42886bc84 100644
--- a/src/content/docs/ai-gateway/observability/index.mdx
+++ b/src/content/docs/ai-gateway/observability/index.mdx
@@ -4,7 +4,7 @@ pcx_content_type: navigation
 sidebar:
   group:
     hideIndex: true
-  order: 6
+  order: 5
 ---
 import { DirectoryListing } from "~/components";
diff --git a/src/content/docs/ai-gateway/reference/index.mdx b/src/content/docs/ai-gateway/reference/index.mdx
index 0b3a9f9d08daf5f..0e4ab6cc5969f5e 100644
--- a/src/content/docs/ai-gateway/reference/index.mdx
+++ b/src/content/docs/ai-gateway/reference/index.mdx
@@ -2,9 +2,8 @@
 pcx_content_type: navigation
 title: Platform
 sidebar:
-  group:
-    hideIndex: true
-  order: 11
+  order: 9
+  hideIndex: true
 ---
 import { DirectoryListing } from "~/components";
diff --git a/src/content/docs/ai-gateway/tutorials/index.mdx b/src/content/docs/ai-gateway/tutorials/index.mdx
index 34a10e4aebdecb8..bcf4619bafa4f95 100644
--- a/src/content/docs/ai-gateway/tutorials/index.mdx
+++ b/src/content/docs/ai-gateway/tutorials/index.mdx
@@ -3,7 +3,7 @@ pcx_content_type: navigation
 title: Tutorials
 hideChildren: true
 sidebar:
-  order: 4
+  order: 8
   hideIndex: true
 ---
diff --git a/src/content/docs/ai-gateway/providers/universal.mdx b/src/content/docs/ai-gateway/universal.mdx
similarity index 99%
rename from src/content/docs/ai-gateway/providers/universal.mdx
rename to src/content/docs/ai-gateway/universal.mdx
index ea3e391552da673..f33cf0c2b61c383 100644
--- a/src/content/docs/ai-gateway/providers/universal.mdx
+++ b/src/content/docs/ai-gateway/universal.mdx
@@ -1,9 +1,8 @@
 ---
 title: Universal Endpoint
 pcx_content_type: get-started
-
 sidebar:
-  order: 1
+  order: 2
 ---
 import { Render, Badge } from "~/components";
diff --git a/src/content/docs/ai-gateway/websockets-api/index.mdx b/src/content/docs/ai-gateway/websockets-api/index.mdx
index f43487e47e35ab0..7780bd024862098 100644
--- a/src/content/docs/ai-gateway/websockets-api/index.mdx
+++ b/src/content/docs/ai-gateway/websockets-api/index.mdx
@@ -4,7 +4,7 @@ pcx_content_type: configuration
 sidebar:
   group:
     badge: Beta
-  order: 5
+  order: 3
 ---
 The AI Gateway WebSockets API provides a persistent connection for AI interactions, eliminating repeated handshakes and reducing latency. This API is divided into two categories:
@@ -27,7 +27,7 @@ WebSockets are long-lived TCP connections that enable bi-directional, real-time
 | :---------------------- | :------------------------------------------------------------------------------------------------------------------- | :------------------------------------------------------------------------------------------------ |
 | **Purpose** | Enables real-time, multimodal AI interactions for providers that offer dedicated WebSocket endpoints. | Supports WebSocket-based AI interactions with providers that do not natively support WebSockets. |
 | **Use Case** | Streaming responses for voice, video, and live interactions. | Text-based queries and responses, such as LLM requests. |
-| **AI Provider Support** | [Limited to providers offering real-time WebSocket APIs.](/ai-gateway/websockets-api/realtime-api/#supported-providers) | [All AI providers in AI Gateway.](/ai-gateway/providers/) |
+| **AI Provider Support** | [Limited to providers offering real-time WebSocket APIs.](/ai-gateway/websockets-api/realtime-api/#supported-providers) | [All AI providers in AI Gateway.](/ai-gateway/providers/) |
 | **Streaming Support** | Providers natively support real-time data streaming. | AI Gateway handles streaming via WebSockets. |
 For details on implementation, refer to the next sections: