Closed
147 commits
1ac75cb
[AI gateway] Request timeouts for fallback providers
kodster28 Jan 23, 2025
2e0d6ba
bearer token inconsistency
kodster28 Jan 23, 2025
771fedb
Update src/content/docs/ai-gateway/providers/universal.mdx
kodster28 Jan 23, 2025
25db0f3
Update fallbacks.mdx
kathayl Jan 27, 2025
1468261
Update universal.mdx
kathayl Jan 28, 2025
51504e6
Update universal.mdx
kathayl Jan 28, 2025
49a4f3a
r2: terraform - make the required options even clearer (#19389)
elithrar Jan 23, 2025
c55f037
[SSL] Update origin-ca and adjust content for SEO (#19315)
RebeccaTamachiro Jan 23, 2025
01bce58
[Vectorize] Mark AOT Support (#19351)
garvit-gupta Jan 23, 2025
4bace5c
Feature nodejs compat issues in troubleshooting (#19288)
thomasgauvin Jan 23, 2025
11d0c45
[ZT] Update GDrive cert procedure (#19374)
maxvp Jan 23, 2025
a4284b9
Update routing.mdx (#19393)
thomasgauvin Jan 23, 2025
5b4aba9
update SaaS apps (#19397)
ranbel Jan 23, 2025
8603830
thomasgauvin: fix db docs to adjust for no default nodejs_compat in c…
thomasgauvin Jan 24, 2025
d364b58
[Rules] Update capitalization in page-rules-migration.mdx (#19399)
thomasgauvin Jan 24, 2025
b80bf77
thomasgauvin: add explanations to connect to private network db from …
thomasgauvin Jan 24, 2025
b2f5b9c
[Workers] Fix broken link in dev-tools/index.mdx (#19406)
ketanhwr Jan 24, 2025
52d7ba0
Hyperlint Automation: Broken Link Fixes (#19353)
hyperlint-ai[bot] Jan 24, 2025
0c0d6b4
[Fundamentals] Added info for account and user tokens (#19372)
dcpena Jan 24, 2025
9724208
[1.1.1.1] Make troubleshooting prominent on the sidenav (#19407)
RebeccaTamachiro Jan 24, 2025
531e354
ZT User Certificates - banner link not working across all required pa…
Vortexmind Jan 24, 2025
75a3fa3
[Docs Site] Bump @cloudflare/workers-types (#19365)
dependabot[bot] Jan 24, 2025
b2a105b
[CF1] ip visibility update (#19354)
deadlypants1973 Jan 24, 2025
e20b10b
[Turnstile] Pre-clearance + Hostname Mgmt overhaul (#19373)
patriciasantaana Jan 24, 2025
a2f470f
Update custom-cache-key.mdx (#19410)
chris-martinelli Jan 24, 2025
ec9734a
Update cache-keys.mdx (#19408)
chris-martinelli Jan 24, 2025
cc8e28f
Free transformation as 9422 (#19075)
deven96 Jan 24, 2025
c109d91
[Docs Site] Adopt Cloudflare styling for badge component (#19390)
KianNH Jan 24, 2025
6fb47a8
dns/changelog: deprecate zone_id and zone_name fields (#19320)
Jan 24, 2025
f2a95bc
Add turnstile e2e testing tutorial for Turnstile (#19415)
olipayne Jan 24, 2025
b42795b
[Docs Site] Add privacy group to Key Transparency and Privacy Gateway…
KianNH Jan 24, 2025
08657e6
Update dns-records.mdx (#19413)
vianaedson Jan 24, 2025
32e1753
[Docs Site] Bump @typescript-eslint/parser from 8.20.0 to 8.21.0 (#19…
dependabot[bot] Jan 24, 2025
1194418
Revert "Update parameters.mdx (#19194)" (#19422)
ranbel Jan 24, 2025
8fc9dc1
Updating 9520 as unsupported format (#19414)
deven96 Jan 24, 2025
4b134eb
Revert "Update index.mdx (#19195)" (#19423)
ranbel Jan 24, 2025
d443c36
LB Changelog: Update to Cloudflare Tunnel Steering (#19427)
tc80 Jan 24, 2025
04efc52
[Gateway] Proxy Happy Eyeballs algorithm (#19432)
maxvp Jan 24, 2025
8461edc
[Docs Site] Bump puppeteer from 24.1.0 to 24.1.1 (#19431)
dependabot[bot] Jan 27, 2025
81ec9c8
typo fix (#19426)
kodster28 Jan 27, 2025
9db2b4a
Update cache-keys.mdx (#19433)
chris-martinelli Jan 27, 2025
28658d7
[WAF, Page Shield] Add links to Learning Center (#19447)
pedrosousa Jan 27, 2025
bb3ffb5
[Support] Revamps 4xx page (#19382)
angelampcosta Jan 27, 2025
ee8f45b
[CF1] public hostname error (#19424)
deadlypants1973 Jan 27, 2025
7bb183e
Adds Fullstack Nextjs Auth Tutorial (#18092)
mackenly Jan 27, 2025
7f1d8ae
[Workers] Update Rust & WebAssembly book link (#19434)
saqibameen Jan 27, 2025
7d0861c
[DDoS Protection/Network Analytics] Clarify log behavior (#19416)
patriciasantaana Jan 27, 2025
88f28fc
PCX-15211 (#19425)
ranbel Jan 27, 2025
33719bf
[WAF] Fix incorrect expression for Zone Lockdown example (#19452)
connorgurney Jan 27, 2025
ccd9404
[Chore] Remove API crawl script (#19454)
kodster28 Jan 27, 2025
5f03ca0
[Fundamentals] Env variables in API calls (#19419)
pedrosousa Jan 27, 2025
3c1e9d7
[Fundamentals] Added role requirement for magic tunnel hcs (#19456)
marciocloudflare Jan 27, 2025
27f3a17
[WAF] Update examples in FAQ (#19458)
pedrosousa Jan 27, 2025
25d689a
[ZT] Fix typo (#19460)
pedrosousa Jan 27, 2025
0978a29
Update index.mdx (#19466)
NuelEdeh Jan 27, 2025
7c74024
[ZT] Update test list CSV (#19435)
maxvp Jan 27, 2025
fa75318
[Bots/Turnstile] JSD clearance cookie (#19463)
patriciasantaana Jan 27, 2025
ea14770
Update update-warp.mdx (#19473)
ranbel Jan 27, 2025
e0f1eaf
[DLP] Payload log expansion (#19398)
maxvp Jan 27, 2025
4cf1683
[ZT] Link between beta and stable release pages (#19472)
ranbel Jan 27, 2025
395e586
update API Discovery's ongoing nature (#19476)
patriciasantaana Jan 27, 2025
0fdb2d2
Update index.mdx (#19479)
ranbel Jan 27, 2025
e58b451
[3rd Party] Update configure-cloudflare-and-heroku-over-https.mdx (#1…
nenizera Jan 28, 2025
ff24f05
SQC-352 SQC-353 create cert command documantion for mtls/CA cert chai…
Ltadrian Jan 28, 2025
a64982b
Update cannot-locate-dashboard-account.mdx (#19481)
smsp Jan 28, 2025
e8fb685
Updates FAQ sections titles (#19482)
angelampcosta Jan 28, 2025
ca62ac1
[BYOIP] Call out no expected downtime when setting up address maps (#…
RebeccaTamachiro Jan 28, 2025
61c687a
clarified AND nature of filters (#19485)
marciocloudflare Jan 28, 2025
2f23b14
[Workers] Fix typos in TypeScript page (#19483)
pedrosousa Jan 28, 2025
e25160b
[BYOIP] Review get-started and IRR guidance (#18941)
RebeccaTamachiro Jan 28, 2025
45a7c20
[ES] Scope of Data Retained for all messages and detections (#19401)
kyouheicf Jan 28, 2025
7e2cdd4
[CF1] adding okta docs on claims (#19455)
deadlypants1973 Jan 28, 2025
675c102
[CF1] windows 10 limit (#19457)
deadlypants1973 Jan 28, 2025
c446e3e
Builds and It's Completely Different but Also Still Build (#18955)
aninibread Jan 28, 2025
5fc1859
Clarify how `wrangler --env` affects loading `.env` and `.dev.vars` (…
vicb Jan 28, 2025
669e85b
[CF1] warp egress ip note (#19450)
deadlypants1973 Jan 28, 2025
203946f
[Email Security] Move Email details to new page (#19487)
Maddy-Cloudflare Jan 28, 2025
c6dc970
Remove beta note from Gradual Deployments (#19488)
WalshyDev Jan 28, 2025
6412637
[Fundamentals] Add HTTP response headers section and rename page (#18…
DRayCloudflare Jan 28, 2025
83cd10b
NOJIRA-99: Update docs for hyperdrive conn limits (#19459)
ReppCodes Jan 28, 2025
f881057
Workers KV 1000 namespace limit announcement (#19409)
thomasgauvin Jan 28, 2025
f19d8e1
Update limits of KV namespaces to 1000 (#19404)
ferhatelmas Jan 28, 2025
111d17f
Adds the usage stats to all Text Gen models (#19492)
craigsdennis Jan 28, 2025
aa8c42f
integrate feedback from #19340 (#19489)
vicb Jan 28, 2025
1ccc203
Adds Deepseek R1 Distill Qwen 32b (#19493)
craigsdennis Jan 28, 2025
a401151
Update vercel-ai-sdk.mdx (#19467)
kathayl Jan 28, 2025
d575551
[Turnstile] Update challenge solve issue troubleshooting (#19499)
patriciasantaana Jan 28, 2025
ab828de
Adding changelog for node compat improvements (#19341)
mikenomitch Jan 28, 2025
a4e0688
[Terraform] Added note about V4 code snippets (#19497)
dcpena Jan 28, 2025
1febb2f
Add ai gateway binding methods (#19484)
G4brym Jan 28, 2025
3a4eaab
[Workers] Properly spell `compatibility` (#19500)
vil02 Jan 28, 2025
8f0ae7e
Removes database mentions from node:net docs (#19503)
mikenomitch Jan 28, 2025
80e9c4e
[Workers AI] Pricing for deepseek r1 distill (#19502)
kodster28 Jan 29, 2025
676e572
[LB] Properly spell `Success criteria` (#19505)
vil02 Jan 29, 2025
98af3bd
[Magic] Updated tunnel health checks ref page (#19490)
marciocloudflare Jan 29, 2025
04c1b15
[Rules] Snippets: Update dashboard instructions (#19511)
pedrosousa Jan 29, 2025
0e63a6e
[All] Cleaning up references to wrangler.toml. (#19403)
Oxyjun Jan 29, 2025
dd7a812
[wrangler] update wrangler global flags and `secret:bulk` -> `secret …
emily-shen Jan 29, 2025
8145f8a
Update connect-to-private-database.mdx (#19494)
thomasgauvin Jan 29, 2025
c40f871
Hyperdrive private database support improved UI docs (#19400)
thomasgauvin Jan 29, 2025
eced0eb
Updating AI Image Playground series (#19517)
jason-cf Jan 29, 2025
e86d7f6
[MWAN] Azure vWAN integration guide (#19518)
marciocloudflare Jan 29, 2025
386420a
PCX-14407: Incorporate Feedback to HTML Rewriter (#19515)
worenga Jan 29, 2025
68edbd7
beta releases (#19501)
ranbel Jan 29, 2025
d8450c2
add download buttons (#19474)
ranbel Jan 29, 2025
2eb56d1
[ZT] Properly spell `reusable` (#19520)
vil02 Jan 29, 2025
2c7dd96
Added Snowflake partner documentation (#19469)
jonesphillip Jan 29, 2025
f5fb735
[Access for SaaS] Properly spell `metadata` (#19524)
vil02 Jan 29, 2025
16bc05d
[ZT] Browser Isolation copy-paste (#18434)
ranbel Jan 29, 2025
ff78b09
[CF1] team name change update (#19519)
deadlypants1973 Jan 29, 2025
b647f57
[Email Security] Rename CES to Email Security (#19523)
Maddy-Cloudflare Jan 29, 2025
2f52c30
Calls: Cleanup Calls TURN FAQ page and add STUN question (#19526)
renandincer Jan 29, 2025
dd16d60
[CF1] Properly spell `visibility` (#19528)
vil02 Jan 29, 2025
0541dc6
[Ruleset Engine] Functions: Add url_decode() example (#19512)
ngayerie Jan 29, 2025
f156cdb
[Docs Site] Exclude Style Guide from sitemap (#19530)
KianNH Jan 29, 2025
36e3262
[Docs Site] changelog-next padding fixes (#19529)
KianNH Jan 29, 2025
a934ece
[Docs Site] Bump @types/node from 22.10.7 to 22.12.0 (#19498)
dependabot[bot] Jan 29, 2025
d3d8607
[Docs Site] Bump @iconify-json/material-symbols from 1.2.12 to 1.2.13…
dependabot[bot] Jan 29, 2025
6627c71
Agents landing page
irvinebroque Jan 29, 2025
98f47c5
[Access for SaaS] Properly spell `consumer` (#19536)
vil02 Jan 29, 2025
fb9a1cd
Updates model (#19507)
craigsdennis Jan 29, 2025
bb5c503
[AIG]add filter table (#19253)
daisyfaithauma Jan 29, 2025
9d17f8a
Update real-time.mdx in agents page (#19538)
rita3ko Jan 29, 2025
a5aecd8
[ZT] Add overview page for implementation guides (#19537)
ranbel Jan 29, 2025
0b675d2
[ZT] Manual per-account certs (#19506)
maxvp Jan 29, 2025
01dd94b
[Cache] Adds information about smart edge revalidation (#19486)
angelampcosta Jan 30, 2025
f0800a4
[DNS, Analytics] Remove mentions to Analytics > DNS (#19545)
RebeccaTamachiro Jan 30, 2025
6d7c300
[Agents] Fix typo (#19540)
thomasgauvin Jan 30, 2025
388ae5f
[ZT] Fix typo (#19544)
vil02 Jan 30, 2025
b223160
Update create-buckets.mdx (#19436)
cf-scott Jan 30, 2025
838a447
Hyperlint Automation: Meta Description Fixes (#19543)
hyperlint-ai[bot] Jan 30, 2025
4874b37
[Magic] BGP Magic routing table info vs PNI peering (#19548)
marciocloudflare Jan 30, 2025
1ebef4d
Renaming "Walkthrough" to "Tutorial" (#19547)
Oxyjun Jan 30, 2025
014041d
removed extra forward slash (#19549)
marciocloudflare Jan 30, 2025
eaad7c7
[DNS] Zone defaults may require contacting account team (#19418)
vianaedson Jan 30, 2025
78b8d6c
Added domains to compat matrix (#18583)
ToriLindsay Jan 30, 2025
861500a
[MWAN/MT] Adds BGP peering to overview features (#19551)
marciocloudflare Jan 30, 2025
5188f54
[Workers AI] Update the Python SDK in the notebooks (#19247)
craigsdennis Jan 30, 2025
cbaab97
Update RAG tutorial to use workflows (#18338)
kristianfreeman Jan 30, 2025
f15abe4
fix dash reference in queues>metrics entry (#19514)
ale-cf Jan 30, 2025
e88f5a4
Hyperlint Automation: Meta Description Fixes (#19509)
hyperlint-ai[bot] Jan 30, 2025
8127ed1
[3rd Party]Update configure-cloudflare-and-heroku-over-https.mdx (#19…
nenizera Jan 30, 2025
233456e
thomasgauvin: added changelog entry for hyperdrive caching at edge (w…
thomasgauvin Jan 30, 2025
0fe62f1
Adds info about logpush permissions (#19558)
angelampcosta Jan 30, 2025
c1139a8
fix: #19464 remove @latest from yarn create cloudflare commands (#19470)
deloreyj Jan 30, 2025
685db15
changelog: Workflows updates (#19563)
elithrar Jan 30, 2025
a8eaae8
Updates dns docs to not use unimplemented method (#19564)
mikenomitch Jan 30, 2025
6 changes: 5 additions & 1 deletion src/content/changelogs/ai-gateway.yaml
@@ -5,11 +5,15 @@ productLink: "/ai-gateway/"
productArea: Developer platform
productAreaLink: /workers/platform/changelog/platform/
entries:
- publish_date: "2025-01-23"
title: Request timeouts for Universal gateways and fallback providers
description: |-
* Added [request timeouts](/ai-gateway/configuration/fallbacks/#request-timeouts) as a configuration option for fallback providers. This property triggers a fallback provider based on a predetermined response time (measured in milliseconds).
- publish_date: "2025-01-02"
title: DeepSeek
description: |-
* **Configuration**: Added [DeepSeek](/ai-gateway/providers/deepseek/) as a new provider.

- publish_date: "2024-12-17"
title: AI Gateway Dashboard
description: |-
88 changes: 85 additions & 3 deletions src/content/docs/ai-gateway/configuration/fallbacks.mdx
@@ -9,11 +9,15 @@ import { Render } from "~/components";

Specify model or provider fallbacks with your [Universal endpoint](/ai-gateway/providers/universal/) to handle request failures and ensure reliability.

Fallbacks are currently triggered only when a request encounters an error. We are working to expand fallback functionality to include time-based triggers, which will allow requests that exceed a predefined response time to timeout and fallback.
Cloudflare can trigger your fallback provider in response to [request errors](#request-failures) or [predetermined request timeouts](#request-timeouts).

## Example
## Request failures

In the following example, a request first goes to the [Workers AI](/workers-ai/) Inference API. If the request fails, it falls back to OpenAI. The response header `cf-aig-step` indicates which provider successfully processed the request.
By default, Cloudflare triggers your fallback if a model request returns an error.

### Request failure example

In the following example, a request first goes to the [Workers AI](/workers-ai/) Inference API. If the request fails, it falls back to OpenAI. The response header `cf-aig-step` indicates which provider successfully processed the request.

1. Sends a request to Workers AI Inference API.
2. If that request fails, proceeds to OpenAI.
@@ -32,6 +36,84 @@ You can add as many fallbacks as you need, just by adding another object in the

<Render file="universal-gateway-example" />

---

## Request timeouts

If set, a request timeout triggers a fallback provider based on a predetermined response time (measured in milliseconds). This feature is helpful for latency-sensitive applications because your gateway does not have to wait for a [request error](#request-failures) before moving to a fallback provider.

You can configure request timeouts by using one or more of the following properties, which are listed in order of priority:

| Priority | Property |
| -------- | ---------------------------------------------------------------------------------------------------------------------- |
| 1 | `requestTimeout` (added as a universal attribute) |
| 2 | `cf-aig-request-timeout` (header included at the [provider level](/ai-gateway/providers/universal/#payload-reference)) |
| 3 | `cf-aig-request-timeout` (header included at the request level) |

Your gateway follows this hierarchy to determine the timeout duration before implementing a fallback.
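
The priority order in the table above can be sketched as a small resolver. This is an illustration only, not the gateway's actual implementation: the function name `resolve_timeout_ms` and the dictionary shapes are assumptions that mirror the JSON payload format used on this page, and header names are assumed to be lowercase.

```python
from typing import Optional

TIMEOUT_HEADER = "cf-aig-request-timeout"


def resolve_timeout_ms(step: dict, request_headers: dict) -> Optional[int]:
    """Return the timeout (in ms) that applies to one provider step, or None.

    Hypothetical helper: it only mirrors the documented priority order.
    """
    # Priority 1: `requestTimeout` inside the step's `config` object.
    config_timeout = step.get("config", {}).get("requestTimeout")
    if config_timeout is not None:
        return int(config_timeout)

    # Priority 2: header set in the step's own `headers` object (provider level).
    provider_timeout = step.get("headers", {}).get(TIMEOUT_HEADER)
    if provider_timeout is not None:
        return int(provider_timeout)

    # Priority 3: header sent on the outer HTTP request (request level).
    request_timeout = request_headers.get(TIMEOUT_HEADER)
    if request_timeout is not None:
        return int(request_timeout)

    # No timeout configured: only request errors would trigger a fallback.
    return None
```

For a step that sets both `requestTimeout: 1000` and a provider-level header of `2000`, with a request-level header of `3000`, this sketch returns `1000`. A step with no timeout settings of its own falls through to the request-level `3000`.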

### Request timeout example

These request timeout values can interact to customize the behavior of your universal gateway.

In this example, the request will try to answer `What is Cloudflare?` within 1000 milliseconds using the `@cf/meta/llama-3.1-8b-instruct` model. The `requestTimeout` property takes precedence over the provider-level `cf-aig-request-timeout` header (2000 milliseconds) set for `@cf/meta/llama-3.1-8b-instruct`.

If the model does not respond within that time, the gateway times out the request and moves to the fallback `@cf/meta/llama-3.1-8b-instruct-fast` model. This model has 3000 milliseconds, determined by the request-level `cf-aig-request-timeout` value, to complete the request and provide an answer.

```bash title="Request" collapse={36-50} {2,11,13-15}
curl 'https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}' \
--header 'cf-aig-request-timeout: 3000' \
--header 'Content-Type: application/json' \
--data '[
{
"provider": "workers-ai",
"endpoint": "@cf/meta/llama-3.1-8b-instruct",
"headers": {
"Authorization": "Bearer {cloudflare_token}",
"Content-Type": "application/json",
"cf-aig-request-timeout": "2000"
},
"config": {
"requestTimeout": 1000
},
"query": {
"messages": [
{
"role": "system",
"content": "You are a friendly assistant"
},
{
"role": "user",
"content": "What is Cloudflare?"
}
]
}
},
{
"provider": "workers-ai",
"endpoint": "@cf/meta/llama-3.1-8b-instruct-fast",
"headers": {
"Authorization": "Bearer {cloudflare_token}",
"Content-Type": "application/json"
},
"query": {
"messages": [
{
"role": "system",
"content": "You are a friendly assistant"
},
{
"role": "user",
"content": "What is Cloudflare?"
}
]
}
}
]'
```
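
The same request could be assembled from Python with only the standard library. This is a minimal sketch, not an official client: `build_request` is a hypothetical helper name, and the three constants are placeholders for your own account ID, gateway ID, and API token.

```python
import json
import urllib.request

# Placeholders: replace with your own values before sending.
ACCOUNT_ID = "your_account_id"
GATEWAY_ID = "your_gateway_id"
CLOUDFLARE_TOKEN = "your_cloudflare_token"

MESSAGES = [
    {"role": "system", "content": "You are a friendly assistant"},
    {"role": "user", "content": "What is Cloudflare?"},
]


def build_request() -> urllib.request.Request:
    """Build (but do not send) the Universal endpoint request shown above."""
    auth = {
        "Authorization": f"Bearer {CLOUDFLARE_TOKEN}",
        "Content-Type": "application/json",
    }
    payload = [
        {
            "provider": "workers-ai",
            "endpoint": "@cf/meta/llama-3.1-8b-instruct",
            # Provider-level header; `requestTimeout` below takes precedence.
            "headers": {**auth, "cf-aig-request-timeout": "2000"},
            "config": {"requestTimeout": 1000},
            "query": {"messages": MESSAGES},
        },
        {
            # Fallback step: no timeout of its own, so the request-level
            # header (3000 ms) applies.
            "provider": "workers-ai",
            "endpoint": "@cf/meta/llama-3.1-8b-instruct-fast",
            "headers": auth,
            "query": {"messages": MESSAGES},
        },
    ]
    return urllib.request.Request(
        f"https://gateway.ai.cloudflare.com/v1/{ACCOUNT_ID}/{GATEWAY_ID}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"cf-aig-request-timeout": "3000", "Content-Type": "application/json"},
        method="POST",
    )
```

To send the request, pass the result to `urllib.request.urlopen()`.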

---

## Response header(cf-aig-step)

When using the [Universal endpoint](/ai-gateway/providers/universal/) with fallbacks, the response header `cf-aig-step` indicates which model successfully processed the request by returning the step number. This header provides visibility into whether a fallback was triggered and which model ultimately processed the response.
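
As a rough sketch, client code could read the header like this. The helper names `fallback_step` and `used_fallback` are hypothetical, and the default `primary_step=0` reflects an assumption that the first provider in the array is reported as step 0; adjust it if your gateway numbers steps differently.

```python
from typing import Mapping, Optional


def fallback_step(response_headers: Mapping[str, str]) -> Optional[int]:
    """Return the step number from `cf-aig-step`, or None if absent or non-numeric."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value for name, value in response_headers.items()}
    raw = normalized.get("cf-aig-step")
    if raw is None:
        return None
    try:
        return int(raw.strip())
    except ValueError:
        return None


def used_fallback(response_headers: Mapping[str, str], primary_step: int = 0) -> bool:
    """Report whether a step other than the primary one handled the request."""
    # ASSUMPTION: the primary provider is reported as `primary_step` (default 0).
    step = fallback_step(response_headers)
    return step is not None and step != primary_step
```

Logging the value returned by `fallback_step` alongside request latency is one way to see how often timeouts or errors push traffic to a fallback.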
10 changes: 8 additions & 2 deletions src/content/docs/ai-gateway/providers/universal.mdx
@@ -16,18 +16,24 @@ https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}

AI Gateway offers multiple endpoints for each Gateway you create - one endpoint per provider, and one Universal Endpoint. The Universal Endpoint requires some adjusting to your schema, but supports additional features. Some of these features are, for example, retrying a request if it fails the first time, or configuring a [fallback model/provider](/ai-gateway/configuration/fallbacks/).

## Payload reference

You can use the Universal endpoint to contact every provider. The payload expects an array of messages, and each message is an object with the following parameters:

- `provider` : the name of the provider you would like to direct this message to. Can be OpenAI, workers-ai, or any of our supported providers.
- `endpoint`: the pathname of the provider API you’re trying to reach. For example, on OpenAI it can be `chat/completions`, and for Workers AI this might be [`@cf/meta/llama-3.1-8b-instruct`](/workers-ai/models/llama-3.1-8b-instruct/). See more in the sections that are specific to [each provider](/ai-gateway/providers/).
- `authorization`: the content of the Authorization HTTP Header that should be used when contacting this provider. This usually starts with “Token” or “Bearer”.
- `headers`:
Review comment (Collaborator, Author): @kathayl, I think this is accurate, but fact check me here :)

Review reply (Contributor): Switched from caching and authentication to caching and custom metadata, because authentication is at the level of the request, not the provider.

- `Authorization`: the content of the Authorization HTTP Header that should be used when contacting this provider. This usually starts with “Token” or “Bearer”.
- Any other custom header in a model's [configuration](/ai-gateway/configuration/), such as [Caching](/ai-gateway/configuration/caching/) or [Authentication](/ai-gateway/configuration/authentication/).
- `query`: the payload as the provider expects it in their official API.
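
The parameters above can be assembled with a small helper. This is a sketch under assumptions: `build_step` is a hypothetical name, and treating `provider` and `endpoint` as required is a choice made for this example, not a rule stated here.

```python
from typing import Any, Dict, Mapping, Optional


def build_step(
    provider: str,
    endpoint: str,
    query: Mapping[str, Any],
    headers: Optional[Mapping[str, str]] = None,
) -> Dict[str, Any]:
    """Build one message object for the Universal endpoint payload array."""
    if not provider or not endpoint:
        raise ValueError("provider and endpoint must be non-empty strings")
    step: Dict[str, Any] = {
        "provider": provider,
        "endpoint": endpoint,
        # `query` is passed through unchanged, as the provider's own API expects it.
        "query": dict(query),
    }
    if headers:
        # Includes `Authorization` and any other per-provider custom headers.
        step["headers"] = dict(headers)
    return step


# The order of the array is the fallback order: first entry is tried first.
payload = [
    build_step(
        "workers-ai",
        "@cf/meta/llama-3.1-8b-instruct",
        {"messages": [{"role": "user", "content": "What is Cloudflare?"}]},
        headers={"Authorization": "Bearer {cloudflare_token}"},
    ),
]
```

Appending further `build_step(...)` results to `payload` adds fallbacks in the order they should be tried.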

## cURL example

The following example shows a simple setup with a primary model and a [fallback](/ai-gateway/configuration/fallbacks/) option.
Review comment (Collaborator, Author): Wanted to add more cross-links over to fallbacks page


<Render file="universal-gateway-example" />

The above will send a request to Workers AI Inference API, if it fails it will proceed to OpenAI. You can add as many fallbacks as you need, just by adding another JSON in the array.
The above will send a request to Workers AI Inference API, if it fails it will proceed to OpenAI. You can add as many [fallbacks](/ai-gateway/configuration/fallbacks/) as you need, just by adding another JSON in the array.

## WebSockets API <Badge text="beta" variant="tip" size="small" />

4 changes: 4 additions & 0 deletions src/content/glossary/ai-gateway.yaml
@@ -41,6 +41,10 @@ entries:
general_definition: |-
Header to [bypass caching for a specific request](/ai-gateway/configuration/caching/#skip-cache-cf-aig-skip-cache).

- term: cf-aig-request-timeout
Review comment (Collaborator, Author): Adds automatically to headers glossary page

general_definition: |-
Header to trigger a fallback provider based on a [predetermined response time](/ai-gateway/configuration/fallbacks/#request-timeouts) (measured in milliseconds).

# Deprecated headers
- term: cf-cache-ttl
general_definition: |-