Commit b2dbf8e

daisyfaithaumaharshil1712 authored and committed
[AIG] added header hierarchy docs (#18357)

* added header hierarchy docs
* Minor edits

1 parent ade662f commit b2dbf8e

File tree

1 file changed (+75, -5 lines)


src/content/docs/ai-gateway/providers/universal.mdx

Lines changed: 75 additions & 5 deletions
@@ -14,8 +14,6 @@ You can use the Universal Endpoint to contact every provider.
https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}
```

-## Description
-
AI Gateway offers multiple endpoints for each Gateway you create - one endpoint per provider, and one Universal Endpoint. The Universal Endpoint requires some adjustments to your schema, but supports additional features, such as retrying a request if it fails the first time, or configuring a [fallback model/provider](/ai-gateway/configuration/fallbacks/).

You can use the Universal Endpoint to contact every provider. The payload expects an array of messages, and each message is an object with the following parameters:
@@ -25,17 +23,17 @@ You can use the Universal endpoint to contact every provider. The payload is exp
- `authorization`: the content of the Authorization HTTP header that should be used when contacting this provider. This usually starts with “Token” or “Bearer”.
- `query`: the payload as the provider expects it in their official API.

-## Example
+## cURL example

<Render file="universal-gateway-example" />

The above sends a request to the Workers AI Inference API; if it fails, the request proceeds to OpenAI. You can add as many fallbacks as you need by adding another JSON object to the array.

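As an editorial aside, the fallback array described above can be sketched in JavaScript. This is a hypothetical helper that mirrors the payload structure from the docs, not an official client: `buildPayload` and `askGateway` are made-up names, and the account ID, gateway ID, tokens, and models are placeholders you would replace.

```javascript
// Illustrative sketch only: builds the Universal Endpoint fallback payload.
// ACCOUNT_ID, GATEWAY_ID, and the bearer tokens are placeholders.
const ACCOUNT_ID = "{account_id}";
const GATEWAY_ID = "{gateway_id}";

function buildPayload(prompt) {
  // Entries are tried in order: Workers AI first, then OpenAI as fallback.
  return [
    {
      provider: "workers-ai",
      endpoint: "@cf/meta/llama-3.1-8b-instruct",
      headers: {
        Authorization: "Bearer {cloudflare_token}",
        "Content-Type": "application/json",
      },
      query: {
        messages: [{ role: "user", content: prompt }],
      },
    },
    {
      provider: "openai",
      endpoint: "chat/completions",
      headers: {
        Authorization: "Bearer {open_ai_token}",
        "Content-Type": "application/json",
      },
      query: {
        model: "gpt-4o-mini",
        messages: [{ role: "user", content: prompt }],
      },
    },
  ];
}

// Not executed here: POSTs the fallback array to the Universal Endpoint.
async function askGateway(prompt) {
  const res = await fetch(
    `https://gateway.ai.cloudflare.com/v1/${ACCOUNT_ID}/${GATEWAY_ID}`,
    {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(buildPayload(prompt)),
    },
  );
  return res.json();
}

console.log(buildPayload("What is Cloudflare?")[1].provider); // → "openai"
```

Adding a third fallback is just another object appended to the returned array.
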
-## Websockets API <Badge text="beta" variant="tip" size="small" />
+## WebSockets API <Badge text="beta" variant="tip" size="small" />

The Universal Endpoint can also be accessed via a [WebSockets API](/ai-gateway/configuration/websockets-api/), which provides a single persistent connection, enabling continuous communication. This API supports all AI providers connected to AI Gateway, including those that do not natively support WebSockets.

-## Example request
+## WebSockets example

```javascript
import WebSocket from "ws";
@@ -70,3 +68,75 @@ ws.on("message", function incoming(message) {
	console.log(message.toString());
});
```
## Header configuration hierarchy

The Universal Endpoint allows you to set fallback models or providers and customize headers for each provider or request. You can configure headers at three levels:

1. **Provider level**: Headers specific to a particular provider.
2. **Request level**: Headers included in individual requests.
3. **Gateway settings**: Default headers configured in your gateway dashboard.

Since the same settings can be configured in multiple locations, AI Gateway applies a hierarchy to determine which configuration takes precedence:

- **Provider-level headers** override all other configurations.
- **Request-level headers** are used if no provider-level headers are set.
- **Gateway-level settings** are used only if no headers are configured at the provider or request levels.

This hierarchy ensures consistent behavior, prioritizing the most specific configurations. Use provider-level and request-level headers for fine-tuned control, and gateway settings for general defaults.

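The precedence rules above can be sketched as a small resolution function. This is a hypothetical illustration of the "most specific wins" logic, not Cloudflare's actual implementation; `resolveHeader` and its parameter names are made up for this sketch.

```javascript
// Hypothetical illustration of the precedence described above:
// provider-level beats request-level, which beats gateway defaults.
function resolveHeader(
  name,
  { providerHeaders = {}, requestHeaders = {}, gatewayDefaults = {} } = {},
) {
  if (name in providerHeaders) return providerHeaders[name]; // most specific wins
  if (name in requestHeaders) return requestHeaders[name];
  return gatewayDefaults[name]; // may be undefined if never configured
}

// Mirrors the hierarchy example that follows: the request asks for a
// 3600-second cache TTL, but the OpenAI provider entry sets it to 0.
const ttl = resolveHeader("cf-aig-cache-ttl", {
  requestHeaders: { "cf-aig-cache-ttl": "3600" },
  providerHeaders: { "cf-aig-cache-ttl": "0" },
});
console.log(ttl); // → "0" (the provider-level value wins)
```
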
## Hierarchy example

This example demonstrates how headers set at different levels affect caching behavior:

- **Request-level header**: `cf-aig-cache-ttl` is set to `3600` seconds, applying this caching duration to the request by default.
- **Provider-level header**: For the fallback provider (OpenAI), `cf-aig-cache-ttl` is explicitly set to `0` seconds, overriding the request-level header and disabling caching for responses when OpenAI is used as the provider.

This shows how provider-level headers take precedence over request-level headers, allowing for granular control over caching behavior.
```bash
curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id} \
  --header 'Content-Type: application/json' \
  --header 'cf-aig-cache-ttl: 3600' \
  --data '[
  {
    "provider": "workers-ai",
    "endpoint": "@cf/meta/llama-3.1-8b-instruct",
    "headers": {
      "Authorization": "Bearer {cloudflare_token}",
      "Content-Type": "application/json"
    },
    "query": {
      "messages": [
        {
          "role": "system",
          "content": "You are a friendly assistant"
        },
        {
          "role": "user",
          "content": "What is Cloudflare?"
        }
      ]
    }
  },
  {
    "provider": "openai",
    "endpoint": "chat/completions",
    "headers": {
      "Authorization": "Bearer {open_ai_token}",
      "Content-Type": "application/json",
      "cf-aig-cache-ttl": "0"
    },
    "query": {
      "model": "gpt-4o-mini",
      "stream": true,
      "messages": [
        {
          "role": "user",
          "content": "What is Cloudflare?"
        }
      ]
    }
  }
]'
```
