| 1 | +--- |
| 2 | +pcx_content_type: configuration |
| 3 | +title: Request handling |
| 4 | +sidebar: |
| 5 | + order: 4 |
| 6 | +--- |
| 7 | + |
| 8 | +import { Render, Aside } from "~/components"; |
| 9 | + |
| 10 | +Your AI gateway supports different strategies for handling requests to providers, which allows you to manage AI interactions effectively and ensure your applications remain responsive and reliable. |
| 11 | + |
| 12 | +## Request timeouts |
| 13 | + |
| 14 | +A request timeout allows you to trigger fallbacks or a retry if a provider takes too long to respond. |
| 15 | + |
| 16 | +These timeouts help: |
| 17 | + |
| 18 | +- Improve user experience, by preventing users from waiting too long for a response |
| 19 | +- Proactively handle errors, by detecting unresponsive providers and triggering a fallback option |
| 20 | + |
| 21 | +Request timeouts can be set on a [Universal Endpoint](/ai-gateway/providers/universal/) or directly on a request to any [provider](/ai-gateway/providers/): |
| 22 | + |
| 23 | +- If set on a Universal Endpoint, it specifies the timeout duration for requests and, if exceeded, triggers a fallback. |
| 24 | +- If set on a provider request, it specifies the timeout duration for a request and, if exceeded, returns an error. |
| 25 | + |
| 26 | +### Definitions |
| 27 | + |
| 28 | +A timeout is set in milliseconds. Additionally, the timeout is based on when the first part of the response comes back. As long as the first part of the response returns within the specified timeframe - such as when streaming a response - your gateway will wait for the rest of the response. |
| 29 | + |
| 30 | +### Configuration |
| 31 | + |
| 32 | +#### Universal Endpoint |
| 33 | + |
| 34 | +For a Universal Endpoint, configure the timeout value by using one or more of the following properties, which are listed in order of priority: |
| 37 | + |
| 38 | +| Priority | Property | |
| 39 | +| -------- | ---------------------------------------------------------------------------------------------------------------------- | |
| 40 | +| 1 | `requestTimeout` (added as a universal attribute) | |
| 41 | +| 2 | `cf-aig-request-timeout` (header included at the [provider level](/ai-gateway/providers/universal/#payload-reference)) | |
| 42 | +| 3 | `cf-aig-request-timeout` (header included at the request level) | |
| 43 | + |
| 44 | +Your gateway follows this hierarchy to determine the timeout duration before implementing a fallback. |
| 45 | + |
| 46 | +### Request timeout example |
| 47 | + |
| 48 | +These request timeout values can interact to customize the behavior of your universal gateway. |
| 49 | + |
| 50 | +In this example, the request will try to answer `What is Cloudflare?` within 1000 milliseconds using the normal `@cf/meta/llama-3.1-8b-instruct` model. The `requestTimeout` property takes precedence over the provider-level `cf-aig-request-timeout` header (2000 milliseconds) for `@cf/meta/llama-3.1-8b-instruct`. |
| 51 | + |
| 52 | +If that request does not respond in time, the gateway will time out and move to the fallback `@cf/meta/llama-3.1-8b-instruct-fast` model. This model has 3000 milliseconds - determined by the request-level `cf-aig-request-timeout` value - to complete the request and provide an answer. |
| 53 | + |
| 54 | +```bash title="Request" collapse={36-50} {2,11,13-15} |
| 55 | +curl 'https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}' \ |
| 56 | + --header 'cf-aig-request-timeout: 3000' \ |
| 57 | + --header 'Content-Type: application/json' \ |
| 58 | + --data '[ |
| 59 | + { |
| 60 | + "provider": "workers-ai", |
| 61 | + "endpoint": "@cf/meta/llama-3.1-8b-instruct", |
| 62 | + "headers": { |
| 63 | + "Authorization": "Bearer {cloudflare_token}", |
| 64 | + "Content-Type": "application/json", |
| 65 | + "cf-aig-request-timeout": "2000" |
| 66 | + }, |
| 67 | + "config": { |
| 68 | + "requestTimeout": 1000 |
| 69 | + }, |
| 70 | + "query": { |
| 71 | + "messages": [ |
| 72 | + { |
| 73 | + "role": "system", |
| 74 | + "content": "You are a friendly assistant" |
| 75 | + }, |
| 76 | + { |
| 77 | + "role": "user", |
| 78 | + "content": "What is Cloudflare?" |
| 79 | + } |
| 80 | + ] |
| 81 | + } |
| 82 | + }, |
| 83 | + { |
| 84 | + "provider": "workers-ai", |
| 85 | + "endpoint": "@cf/meta/llama-3.1-8b-instruct-fast", |
| 86 | + "headers": { |
| 87 | + "Authorization": "Bearer {cloudflare_token}", |
| 88 | + "Content-Type": "application/json" |
| 89 | + }, |
| 90 | + "query": { |
| 91 | + "messages": [ |
| 92 | + { |
| 93 | + "role": "system", |
| 94 | + "content": "You are a friendly assistant" |
| 95 | + }, |
| 96 | + { |
| 97 | + "role": "user", |
| 98 | + "content": "What is Cloudflare?" |
| 99 | + } |
| 100 | + ] |
| 101 | + } |
| 102 | + } |
| 103 | +]' |
| 104 | +``` |
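
To make the priority order concrete, the sketch below resolves the effective timeout for each provider in the request above. The `resolve_timeout` helper is hypothetical: it only illustrates the documented hierarchy and is not how the gateway itself is implemented. It assumes an unset value is passed as an empty string.

```bash
#!/usr/bin/env bash
# Hypothetical helper: prints the first non-empty timeout, checked in
# priority order (requestTimeout, provider-level header, request-level header).
resolve_timeout() {
  local request_timeout="$1"   # requestTimeout from the provider's "config"
  local provider_header="$2"   # cf-aig-request-timeout from the provider's "headers"
  local request_header="$3"    # cf-aig-request-timeout on the outer request

  local value
  for value in "$request_timeout" "$provider_header" "$request_header"; do
    if [ -n "$value" ]; then
      echo "$value"
      return 0
    fi
  done
  return 1  # no timeout configured at any level
}

# First provider: all three values are set, so requestTimeout wins.
primary_timeout="$(resolve_timeout 1000 2000 3000)"
# Fallback provider: only the request-level header applies.
fallback_timeout="$(resolve_timeout "" "" 3000)"

echo "primary: ${primary_timeout} ms, fallback: ${fallback_timeout} ms"
```

Under these assumptions, the first provider resolves to 1000 milliseconds and the fallback to 3000 milliseconds, which matches the behavior described for the example request.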