Commit fd0e5ea

daisyfaithauma, hyperlint-ai[bot], kathayl, and Maddy-Cloudflare authored
[AIG]Initial websocket documentation (cloudflare#18247)
* Initial websocket documentation
* Update src/content/docs/ai-gateway/get-started.mdx
  Co-authored-by: hyperlint-ai[bot] <154288675+hyperlint-ai[bot]@users.noreply.github.com>
* Update src/content/docs/ai-gateway/providers/universal.mdx
  Co-authored-by: hyperlint-ai[bot] <154288675+hyperlint-ai[bot]@users.noreply.github.com>
* Update src/content/docs/ai-gateway/configuration/websockets-api.mdx
  Co-authored-by: hyperlint-ai[bot] <154288675+hyperlint-ai[bot]@users.noreply.github.com>
* Update src/content/docs/ai-gateway/configuration/websockets-api.mdx
  Co-authored-by: hyperlint-ai[bot] <154288675+hyperlint-ai[bot]@users.noreply.github.com>
* Update src/content/docs/ai-gateway/configuration/websockets-api.mdx
  Co-authored-by: hyperlint-ai[bot] <154288675+hyperlint-ai[bot]@users.noreply.github.com>
* grammar
* Update websockets-api.mdx
* added a badge
* Update src/content/docs/ai-gateway/configuration/websockets-api.mdx
  Co-authored-by: Maddy <[email protected]>

---------

Co-authored-by: hyperlint-ai[bot] <154288675+hyperlint-ai[bot]@users.noreply.github.com>
Co-authored-by: Kathy <[email protected]>
Co-authored-by: Maddy <[email protected]>
1 parent 0bd2818 commit fd0e5ea

3 files changed: +180 -1 lines changed

src/content/docs/ai-gateway/configuration/websockets-api.mdx

Lines changed: 137 additions & 0 deletions
@@ -0,0 +1,137 @@
---
title: WebSockets API
pcx_content_type: configuration
sidebar:
  badge:
    text: Beta
---

The AI Gateway WebSockets API provides a single persistent connection for continuous communication. With WebSockets, you can send multiple AI requests over one connection, avoiding repeated handshakes and TLS negotiations, which reduces latency. This API supports all AI providers connected to AI Gateway, including those that do not natively support WebSockets.

## When to use WebSockets?

WebSockets are long-lived TCP connections that enable bi-directional, real-time communication between client and server. Unlike separate HTTP requests, which can each incur connection setup and handshake costs, a WebSocket keeps the connection open, supporting continuous data exchange with reduced overhead. WebSockets are ideal for applications needing low-latency, real-time data, such as voice assistants.

## Key benefits

- **Reduced Overhead**: Avoid the overhead of repeated handshakes and TLS negotiations by maintaining a single, persistent connection.
- **Provider Compatibility**: Works with all AI providers in AI Gateway. Even if your chosen provider does not support WebSockets, AI Gateway manages the requests to that provider for you.

## Set up WebSockets API

1. Generate an AI Gateway token with the AI Gateway Run permission and opt in to using an authenticated gateway.
2. Modify your Universal Endpoint URL by replacing `https://` with `wss://` to initiate a WebSocket connection:
   ```
   wss://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}
   ```
3. Open a WebSocket connection authenticated with a Cloudflare token with the AI Gateway Run permission.

:::note
If you are using a browser WebSocket, you can instead authenticate via the `sec-websocket-protocol` header.
:::
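
Step 2 above can be sketched as a small helper. This is only an illustration, not part of the AI Gateway API: `toWebSocketUrl` is a hypothetical name, and the account and gateway IDs in the URL are placeholders.

```javascript
// Hypothetical helper (not part of AI Gateway): swaps the scheme of a
// Universal Endpoint URL from https:// to wss://, as described in step 2.
function toWebSocketUrl(universalEndpointUrl) {
  if (!universalEndpointUrl.startsWith("https://")) {
    throw new Error("Expected a Universal Endpoint URL starting with https://");
  }
  return universalEndpointUrl.replace(/^https:\/\//, "wss://");
}

// Placeholder account and gateway IDs.
const wsUrl = toWebSocketUrl(
  "https://gateway.ai.cloudflare.com/v1/my-account-id/my-gateway",
);
console.log(wsUrl);
```

The helper rejects anything that is not an `https://` URL, so a mistyped endpoint fails early rather than at connection time.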

## Example request

```javascript
import WebSocket from "ws";

// Connect to the gateway; the AI Gateway token authenticates the connection.
const ws = new WebSocket(
  "wss://gateway.ai.cloudflare.com/v1/my-account-id/my-gateway/",
  {
    headers: {
      "cf-aig-authorization": "Bearer AI_GATEWAY_TOKEN",
    },
  },
);

// Send only after the connection is open; calling send() while the socket
// is still connecting throws an error.
ws.on("open", () => {
  ws.send(
    JSON.stringify({
      type: "universal.create",
      request: {
        eventId: "my-request",
        provider: "workers-ai",
        endpoint: "@cf/meta/llama-3.1-8b-instruct",
        headers: {
          Authorization: "Bearer WORKERS_AI_TOKEN",
          "Content-Type": "application/json",
        },
        query: {
          prompt: "tell me a joke",
        },
      },
    }),
  );
});

// Log each message received from the gateway.
ws.on("message", function incoming(message) {
  console.log(message.toString());
});
```
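
Because one connection can carry multiple requests, it can help to build each `universal.create` message with its own `eventId`. The following is a minimal sketch that assumes the message shape from the example request; `buildCreateMessage` and the `joke-1`/`joke-2` IDs are hypothetical names, and the token is a placeholder.

```javascript
// Hypothetical helper: builds a `universal.create` message in the shape
// shown in the example request. The caller picks a distinct eventId.
function buildCreateMessage({ eventId, provider, endpoint, token, query }) {
  return JSON.stringify({
    type: "universal.create",
    request: {
      eventId,
      provider,
      endpoint,
      headers: {
        Authorization: `Bearer ${token}`,
        "Content-Type": "application/json",
      },
      query,
    },
  });
}

// Two requests intended for the same connection, told apart by eventId.
const messages = ["joke-1", "joke-2"].map((eventId) =>
  buildCreateMessage({
    eventId,
    provider: "workers-ai",
    endpoint: "@cf/meta/llama-3.1-8b-instruct",
    token: "WORKERS_AI_TOKEN", // placeholder, not a real token
    query: { prompt: "tell me a joke" },
  }),
);
console.log(messages.length);
```

Each string can be passed to `ws.send()` once the connection is open.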

## Example response

```json
{
  "type": "universal.created",
  "metadata": {
    "cacheStatus": "MISS",
    "eventId": "my-request",
    "logId": "01JC3R94FRD97JBCBX3S0ZAXKW",
    "step": "0",
    "contentType": "application/json"
  },
  "response": {
    "result": {
      "response": "Why was the math book sad? Because it had too many problems. Would you like to hear another one?"
    },
    "success": true,
    "errors": [],
    "messages": []
  }
}
```
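
A non-streaming reply like the one above could be read as follows. This is a sketch under assumptions: `readCreatedMessage` is a hypothetical name, and the `response.result.response` path matches only the Workers AI example shown here; other providers may return a different `response` payload.

```javascript
// Hypothetical reader for a `universal.created` message.
// Returns null for any other message type.
function readCreatedMessage(raw) {
  const message = JSON.parse(raw);
  if (message.type !== "universal.created") {
    return null;
  }
  const { eventId, logId, cacheStatus } = message.metadata;
  // Assumption: this path follows the Workers AI example response above.
  const text = message.response?.result?.response ?? null;
  return { eventId, logId, cacheStatus, text };
}

const sample = JSON.stringify({
  type: "universal.created",
  metadata: {
    cacheStatus: "MISS",
    eventId: "my-request",
    logId: "01JC3R94FRD97JBCBX3S0ZAXKW",
    step: "0",
    contentType: "application/json",
  },
  response: { result: { response: "A joke." }, success: true },
});
console.log(readCreatedMessage(sample));
```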

## Example streaming request

For streaming requests, AI Gateway sends an initial message with request metadata indicating the stream is starting:

```json
{
  "type": "universal.created",
  "metadata": {
    "cacheStatus": "MISS",
    "eventId": "my-request",
    "logId": "01JC40RB3NGBE5XFRZGBN07572",
    "step": "0",
    "contentType": "text/event-stream"
  }
}
```

After this initial message, all streaming chunks are relayed in real time to the WebSocket connection as they arrive from the inference provider. Only the `eventId` field is included in the metadata for these streaming chunks. Because `eventId` is the client-defined ID from the request, you can use it to match each chunk to the request that produced it.

```json
{
  "type": "universal.stream",
  "metadata": {
    "eventId": "my-request"
  },
  "response": {
    "response": "would"
  }
}
```

Once all chunks for a request have been streamed, AI Gateway sends a final message to signal that the request is complete. For added flexibility, this message repeats the full metadata that was sent in the initial message.

```json
{
  "type": "universal.done",
  "metadata": {
    "cacheStatus": "MISS",
    "eventId": "my-request",
    "logId": "01JC40RB3NGBE5XFRZGBN07572",
    "step": "0",
    "contentType": "text/event-stream"
  }
}
```
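
The three message types above suggest a simple client-side pattern: open a buffer on `universal.created`, append on `universal.stream`, and flush on `universal.done`. The sketch below assumes the chunk shape from the example (`response.response` holding the text); `createStreamCollector` and its callback are hypothetical names, not part of AI Gateway.

```javascript
// Hypothetical collector: groups streamed text by eventId and calls
// onDone(eventId, text, metadata) when `universal.done` arrives.
function createStreamCollector(onDone) {
  const buffers = new Map(); // eventId -> array of text chunks

  return function handleMessage(raw) {
    const message = JSON.parse(raw);
    const eventId = message.metadata?.eventId;

    switch (message.type) {
      case "universal.created":
        // Only streaming responses are followed by chunks.
        if (message.metadata?.contentType === "text/event-stream") {
          buffers.set(eventId, []);
        }
        break;
      case "universal.stream": {
        // Assumption: chunk text lives at response.response, as in the example.
        const chunk = message.response?.response ?? "";
        if (!buffers.has(eventId)) buffers.set(eventId, []);
        buffers.get(eventId).push(chunk);
        break;
      }
      case "universal.done": {
        const text = (buffers.get(eventId) ?? []).join("");
        buffers.delete(eventId);
        onDone(eventId, text, message.metadata);
        break;
      }
      default:
        break; // message types not covered in this section are ignored
    }
  };
}

const results = [];
const handle = createStreamCollector((eventId, text) => {
  results.push({ eventId, text });
});
const meta = { eventId: "my-request", contentType: "text/event-stream" };
handle(JSON.stringify({ type: "universal.created", metadata: meta }));
handle(JSON.stringify({ type: "universal.stream", metadata: { eventId: "my-request" }, response: { response: "would" } }));
handle(JSON.stringify({ type: "universal.stream", metadata: { eventId: "my-request" }, response: { response: " you" } }));
handle(JSON.stringify({ type: "universal.done", metadata: meta }));
console.log(results);
```

With the `ws` client from the example request, the handler could be attached as `ws.on("message", (data) => handle(data.toString()))`.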

src/content/docs/ai-gateway/get-started.mdx

Lines changed: 2 additions & 0 deletions
@@ -34,6 +34,8 @@ Next, connect your AI provider to your gateway.
 
 AI Gateway offers multiple endpoints for each Gateway you create - one endpoint per provider, and one Universal Endpoint.
 
+Additionally, AI Gateway has a [WebSockets API](/ai-gateway/configuration/websockets-api/), which provides a single persistent connection, enabling continuous communication. This API supports all AI providers connected to AI Gateway, including those that do not natively support WebSockets.
+
 Below is a list of our supported model providers:
 
 <DirectoryListing folder="ai-gateway/providers" />

src/content/docs/ai-gateway/providers/universal.mdx

Lines changed: 41 additions & 1 deletion
@@ -6,7 +6,7 @@ sidebar:
   order: 1
 ---
 
-import { Render } from "~/components";
+import { Render, Badge } from "~/components";
 
 You can use the Universal Endpoint to contact every provider.
 
@@ -30,3 +30,43 @@ You can use the Universal endpoint to contact every provider. The payload is exp
 <Render file="universal-gateway-example" />
 
 The above will send a request to the Workers AI Inference API; if it fails, it will proceed to OpenAI. You can add as many fallbacks as you need by adding another JSON object to the array.
+
+## WebSockets API <Badge text="beta" variant="tip" size="small" />
+
+The Universal Endpoint can also be accessed via a [WebSockets API](/ai-gateway/configuration/websockets-api/), which provides a single persistent connection, enabling continuous communication. This API supports all AI providers connected to AI Gateway, including those that do not natively support WebSockets.
+
+## Example request
+
+```javascript
+import WebSocket from "ws";
+
+const ws = new WebSocket(
+  "wss://gateway.ai.cloudflare.com/v1/my-account-id/my-gateway/",
+  {
+    headers: {
+      "cf-aig-authorization": "Bearer AI_GATEWAY_TOKEN",
+    },
+  },
+);
+
+// Send only after the connection is open; calling send() while the socket
+// is still connecting throws an error.
+ws.on("open", () => {
+  ws.send(
+    JSON.stringify({
+      type: "universal.create",
+      request: {
+        eventId: "my-request",
+        provider: "workers-ai",
+        endpoint: "@cf/meta/llama-3.1-8b-instruct",
+        headers: {
+          Authorization: "Bearer WORKERS_AI_TOKEN",
+          "Content-Type": "application/json",
+        },
+        query: {
+          prompt: "tell me a joke",
+        },
+      },
+    }),
+  );
+});
+
+ws.on("message", function incoming(message) {
+  console.log(message.toString());
+});
+```
