|
| 1 | +--- |
| 2 | +pcx_content_type: configuration |
| 3 | +title: Request handling |
| 4 | +sidebar: |
| 5 | + order: 4 |
| 6 | +--- |
| 7 | + |
| 8 | +import { Render, Aside } from "~/components"; |
| 9 | + |
| 10 | +Your AI gateway supports different strategies for handling requests to providers, which allows you to manage AI interactions effectively and ensure your applications remain responsive and reliable. |
| 11 | + |
| 12 | +## Request timeouts |
| 13 | + |
| 14 | +A request timeout allows you to trigger fallbacks or a retry if a provider takes too long to respond. |
| 15 | + |
| 16 | +These timeouts help: |
| 17 | + |
| 18 | +- Improve user experience, by preventing users from waiting too long for a response |
| 19 | +- Proactively handle errors, by detecting unresponsive providers and triggering a fallback option |
| 20 | + |
| 21 | +Request timeouts can be set on a Universal Endpoint or directly on a request to any provider. |
| 22 | + |
| 23 | +### Definitions |
| 24 | + |
| 25 | +A timeout is set in milliseconds. The timeout is measured against the arrival of the first part of the response: as long as the first part returns within the specified timeframe (for example, when streaming a response), your gateway will wait for the rest of the response. |
| 26 | + |
| 27 | +### Configuration |
| 28 | + |
| 29 | +#### Universal Endpoint |
| 30 | + |
| 31 | +If set on a [Universal Endpoint](/ai-gateway/providers/universal/), a request timeout specifies the timeout duration for requests and, if exceeded, triggers a fallback. |
| 32 | + |
| 33 | +For a Universal Endpoint, configure the timeout value by setting a `requestTimeout` property within the provider-specific `config` object. Each provider can have a different `requestTimeout` value for granular customization. |
| 34 | + |
| 35 | +```bash title="Provider-level config" {11-13} collapse={15-48} |
| 36 | +curl 'https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}' \ |
| 37 | + --header 'Content-Type: application/json' \ |
| 38 | + --data '[ |
| 39 | + { |
| 40 | + "provider": "workers-ai", |
| 41 | + "endpoint": "@cf/meta/llama-3.1-8b-instruct", |
| 42 | + "headers": { |
| 43 | + "Authorization": "Bearer {cloudflare_token}", |
| 44 | + "Content-Type": "application/json" |
| 45 | + }, |
| 46 | + "config": { |
| 47 | + "requestTimeout": 1000 |
| 48 | + }, |
| 49 | + "query": { |
| 50 | + "messages": [ |
| 51 | + { |
| 52 | + "role": "system", |
| 53 | + "content": "You are a friendly assistant" |
| 54 | + }, |
| 55 | + { |
| 56 | + "role": "user", |
| 57 | + "content": "What is Cloudflare?" |
| 58 | + } |
| 59 | + ] |
| 60 | + } |
| 61 | + }, |
| 62 | + { |
| 63 | + "provider": "workers-ai", |
| 64 | + "endpoint": "@cf/meta/llama-3.1-8b-instruct-fast", |
| 65 | + "headers": { |
| 66 | + "Authorization": "Bearer {cloudflare_token}", |
| 67 | + "Content-Type": "application/json" |
| 68 | + }, |
| 69 | + "query": { |
| 70 | + "messages": [ |
| 71 | + { |
| 72 | + "role": "system", |
| 73 | + "content": "You are a friendly assistant" |
| 74 | + }, |
| 75 | + { |
| 76 | + "role": "user", |
| 77 | + "content": "What is Cloudflare?" |
| 78 | + } |
| 79 | + ] |
| 80 | + }, |
| 81 | + "config": { |
| 82 | + "requestTimeout": 3000 |
| 83 | + } |
| 84 | + } |
| 85 | +]' |
| 86 | +``` |
| 87 | + |
| 88 | +#### Direct provider |
| 89 | + |
| 90 | +If set on a [provider](/ai-gateway/providers/) request, a request timeout specifies the timeout duration for a request and, if exceeded, returns an error. |
| 91 | + |
| 92 | +For a provider-specific endpoint, configure the timeout value by adding a `cf-aig-request-timeout` header. |
| 93 | + |
| 94 | +```bash title="Provider-specific endpoint example" {4} |
| 95 | +curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/workers-ai/@cf/meta/llama-3.1-8b-instruct \ |
| 96 | + --header 'Authorization: Bearer {cf_api_token}' \ |
| 97 | + --header 'Content-Type: application/json' \ |
| 98 | + --header 'cf-aig-request-timeout: 5000' \ |
| 99 | + --data '{"prompt": "What is Cloudflare?"}' |
| 100 | +``` |
| 101 | + |
| 102 | +--- |
| 103 | + |
| 104 | +## Request retries |
| 105 | + |
| 106 | +AI Gateway also supports automatic retries for failed requests, with a maximum of five retry attempts. |
| 107 | + |
| 108 | +This feature improves your application's resiliency, ensuring you can recover from temporary issues without manual intervention. |
| 109 | + |
| 110 | +Request retries can be set on a Universal Endpoint or directly on a request to any provider. |
| 111 | + |
| 112 | +### Definitions |
| 113 | + |
| 114 | +With request retries, you can adjust a combination of three properties: |
| 115 | + |
| 116 | +- Number of attempts (maximum of 5 tries) |
| 117 | +- How long before retrying (in milliseconds, maximum of 5 seconds) |
| 118 | +- Backoff method (constant, linear, or exponential) |
| 119 | + |
| 120 | +On the final retry attempt, your gateway will wait until the request completes, regardless of how long it takes. |
| 121 | + |
| 122 | +### Configuration |
| 123 | + |
| 124 | +#### Universal Endpoint |
| 125 | + |
| 126 | +If set on a [Universal Endpoint](/ai-gateway/providers/universal/), a request retry will automatically retry failed requests up to five times before triggering any configured fallbacks. |
| 127 | + |
| 128 | +For a Universal Endpoint, configure the retry settings with the following properties in the provider-specific `config`: |
| 129 | + |
| 130 | +```ts |
| 131 | +config:{ |
| 132 | + maxAttempts?: number; |
| 133 | + retryDelay?: number; |
| 134 | + backoff?: "constant" | "linear" | "exponential"; |
| 135 | +} |
| 136 | +``` |
| 137 | + |
| 138 | +As with the [request timeout](/ai-gateway/configuration/request-handling/#universal-endpoint), each provider can have different retry settings for granular customization. |
| 139 | + |
| 140 | +```bash title="Provider-level config" {11-15} collapse={16-55} |
| 141 | +curl 'https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}' \ |
| 142 | + --header 'Content-Type: application/json' \ |
| 143 | + --data '[ |
| 144 | + { |
| 145 | + "provider": "workers-ai", |
| 146 | + "endpoint": "@cf/meta/llama-3.1-8b-instruct", |
| 147 | + "headers": { |
| 148 | + "Authorization": "Bearer {cloudflare_token}", |
| 149 | + "Content-Type": "application/json" |
| 150 | + }, |
| 151 | + "config": { |
| 152 | + "maxAttempts": 2, |
| 153 | + "retryDelay": 1000, |
| 154 | + "backoff": "constant" |
| 155 | + }, |
| 156 | + "query": { |
| 157 | + "messages": [ |
| 158 | + { |
| 159 | + "role": "system", |
| 160 | + "content": "You are a friendly assistant" |
| 161 | + }, |
| 162 | + { |
| 163 | + "role": "user", |
| 164 | + "content": "What is Cloudflare?" |
| 165 | + } |
| 166 | + ] |
| 167 | + } |
| 168 | + }, |
| 169 | + { |
| 170 | + "provider": "workers-ai", |
| 171 | + "endpoint": "@cf/meta/llama-3.1-8b-instruct-fast", |
| 172 | + "headers": { |
| 173 | + "Authorization": "Bearer {cloudflare_token}", |
| 174 | + "Content-Type": "application/json" |
| 175 | + }, |
| 176 | + "query": { |
| 177 | + "messages": [ |
| 178 | + { |
| 179 | + "role": "system", |
| 180 | + "content": "You are a friendly assistant" |
| 181 | + }, |
| 182 | + { |
| 183 | + "role": "user", |
| 184 | + "content": "What is Cloudflare?" |
| 185 | + } |
| 186 | + ] |
| 187 | + }, |
| 188 | + "config": { |
| 189 | + "maxAttempts": 4, |
| 190 | + "retryDelay": 1000, |
| 191 | + "backoff": "exponential" |
| 192 | + } |
| 193 | + } |
| 194 | +]' |
| 195 | +``` |
| 196 | + |
| 197 | +#### Direct provider |
| 198 | + |
| 199 | +If set on a [provider](/ai-gateway/providers/) request, a request retry will automatically retry failed requests up to five times. On the final retry attempt, your gateway will wait until the request completes, regardless of how long it takes. |
| 200 | + |
| 201 | +For a provider-specific endpoint, configure the retry settings by adding different header values: |
| 202 | + |
| 203 | +- `cf-aig-max-attempts` (number) |
| 204 | +- `cf-aig-retry-delay` (number) |
| 205 | +- `cf-aig-backoff` ("constant" | "linear" | "exponential") |
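
The three retry headers listed above have no accompanying request example, so here is a minimal sketch of how they might be combined on a provider-specific endpoint. The function name `aig_retry_request` and the environment variables `ACCOUNT_ID`, `GATEWAY_ID`, and `CF_API_TOKEN` are hypothetical placeholders, not part of AI Gateway. The client-side checks only mirror the limits stated in the Definitions section (at most 5 attempts, at most 5 seconds of delay); how the gateway itself treats out-of-range values is not covered here.

```bash
#!/usr/bin/env bash
# Sketch only: send one request through a provider-specific endpoint with
# the retry headers set. ACCOUNT_ID, GATEWAY_ID and CF_API_TOKEN are
# placeholder environment variables that you must supply yourself.

aig_retry_request() {
  local max_attempts="$1"  # value for cf-aig-max-attempts (documented maximum: 5)
  local retry_delay="$2"   # value for cf-aig-retry-delay, in milliseconds (documented maximum: 5000)
  local backoff="$3"       # value for cf-aig-backoff: constant, linear, or exponential

  # Reject backoff methods other than the three documented ones.
  case "$backoff" in
    constant|linear|exponential) ;;
    *) echo "invalid backoff: $backoff" >&2; return 2 ;;
  esac

  # Reject values above the documented limits before sending anything.
  if [ "$max_attempts" -gt 5 ] || [ "$retry_delay" -gt 5000 ]; then
    echo "max attempts must be <= 5 and retry delay <= 5000 ms" >&2
    return 2
  fi

  curl "https://gateway.ai.cloudflare.com/v1/${ACCOUNT_ID}/${GATEWAY_ID}/workers-ai/@cf/meta/llama-3.1-8b-instruct" \
    --header "Authorization: Bearer ${CF_API_TOKEN}" \
    --header 'Content-Type: application/json' \
    --header "cf-aig-max-attempts: ${max_attempts}" \
    --header "cf-aig-retry-delay: ${retry_delay}" \
    --header "cf-aig-backoff: ${backoff}" \
    --data '{"prompt": "What is Cloudflare?"}'
}
```

For example, `aig_retry_request 3 1000 exponential` would ask for up to three attempts with a 1000 ms starting delay and exponential backoff, assuming the placeholder variables are exported first.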