[AIG]new docs for Websockets #20890
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,26 @@ | ||
| --- | ||
| title: WebSockets API | ||
| pcx_content_type: configuration | ||
| sidebar: | ||
| group: | ||
| badge: Beta | ||
| --- | ||
|
|
||
| The AI Gateway WebSockets API provides a persistent connection for AI interactions, eliminating repeated handshakes and reducing latency. This API is divided into two categories: | ||
|
|
||
| 1. **Non-Realtime APIs** - Supports standard WebSocket communication for AI providers, including those that do not natively support WebSockets. | ||
| 2. **Realtime APIs** - Designed for AI providers that offer low-latency, multimodal interactions over WebSockets. | ||
|
|
||
| ## **Key differences** | ||
|
|
||
| | Feature | Non-Realtime APIs | Realtime APIs | | ||
| | :---------------------- | :----------------------------------------------------------------------------------------------- | :---------------------------------------------------------------------------------------------------- | | ||
| | **Purpose** | Supports WebSocket-based AI interactions with providers that do not natively support WebSockets. | Enables real-time, multimodal AI interactions for providers that offer dedicated WebSocket endpoints. | | ||
| | **Use Case** | Text-based queries and responses, such as LLM requests. | Streaming responses for voice, video, and live interactions. | | ||
| | **AI Provider Support** | All AI providers in AI Gateway. | Limited to providers offering real-time WebSocket APIs. | | ||
| | **Streaming Support** | AI Gateway handles streaming via WebSockets. | Providers natively support real-time data streaming. | | ||
|
|
||
| For implementation details, see the following pages: | ||
|
|
||
| - [Non-Realtime WebSockets API](/ai-gateway/configuration/websockets/non-realtime-api.mdx) | ||
| - [Realtime WebSockets API](/ai-gateway/configuration/websockets/realtime-api.mdx) | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,127 @@ | ||
| --- | ||
| pcx_content_type: configuration | ||
| title: Non-realtime WebSockets API | ||
| sidebar: | ||
| order: 2 | ||
| --- | ||
|
|
||
| The Non-realtime WebSockets API allows you to establish persistent connections for AI requests without requiring repeated handshakes. This approach is ideal for applications that do not require real-time interactions but still benefit from reduced latency and continuous communication. | ||
|
|
||
| ## Set up WebSockets API | ||
|
|
||
| 1. Generate an AI Gateway token with the AI Gateway Run permission and opt in to using an authenticated gateway. | ||
| 2. Modify your Universal Endpoint URL by replacing `https://` with `wss://` to initiate a WebSocket connection: | ||
| ``` | ||
| wss://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id} | ||
| ``` | ||
| 3. Open a WebSocket connection authenticated with a Cloudflare token with the AI Gateway Run permission. | ||
|
|
||
| :::note | ||
| Alternatively, if you are using a browser WebSocket, you can authenticate via the `sec-websocket-protocol` header. | ||
| ::: | ||
|
|
||
| ## Example request | ||
|
|
||
| ```javascript | ||
| import WebSocket from "ws"; | ||
|
|
||
| const ws = new WebSocket( | ||
| "wss://gateway.ai.cloudflare.com/v1/my-account-id/my-gateway/", | ||
| { | ||
| headers: { | ||
| "cf-aig-authorization": "Bearer AI_GATEWAY_TOKEN", | ||
| }, | ||
| }, | ||
| ); | ||
|
|
||
| // Send only after the connection is open; calling send() while connecting throws. | ||
| ws.on("open", () => { | ||
| ws.send( | ||
| JSON.stringify({ | ||
| type: "universal.create", | ||
| request: { | ||
| eventId: "my-request", | ||
| provider: "workers-ai", | ||
| endpoint: "@cf/meta/llama-3.1-8b-instruct", | ||
| headers: { | ||
| Authorization: "Bearer WORKERS_AI_TOKEN", | ||
| "Content-Type": "application/json", | ||
| }, | ||
| query: { | ||
| prompt: "tell me a joke", | ||
| }, | ||
| }, | ||
| }), | ||
| ); | ||
| }); | ||
|
|
||
| ws.on("message", function incoming(message) { | ||
| console.log(message.toString()); | ||
| }); | ||
| ``` | ||
|
|
||
| ## Example response | ||
|
||
|
|
||
| ```json | ||
| { | ||
| "type": "universal.created", | ||
| "metadata": { | ||
| "cacheStatus": "MISS", | ||
| "eventId": "my-request", | ||
| "logId": "01JC3R94FRD97JBCBX3S0ZAXKW", | ||
| "step": "0", | ||
| "contentType": "application/json" | ||
| }, | ||
| "response": { | ||
| "result": { | ||
| "response": "Why was the math book sad? Because it had too many problems. Would you like to hear another one?" | ||
| }, | ||
| "success": true, | ||
| "errors": [], | ||
| "messages": [] | ||
| } | ||
| } | ||
| ``` | ||
|
|
||
| ## Example streaming request | ||
|
|
||
| For streaming requests, AI Gateway sends an initial message with request metadata indicating the stream is starting: | ||
|
|
||
| ```json | ||
| { | ||
| "type": "universal.created", | ||
| "metadata": { | ||
| "cacheStatus": "MISS", | ||
| "eventId": "my-request", | ||
| "logId": "01JC40RB3NGBE5XFRZGBN07572", | ||
| "step": "0", | ||
| "contentType": "text/event-stream" | ||
| } | ||
| } | ||
| ``` | ||
|
|
||
| After this initial message, all streaming chunks are relayed to the WebSocket connection in real time as they arrive from the inference provider. The metadata for these chunks includes only the `eventId` field. Because `eventId` is a client-defined ID, you can use it to match each message to the request that produced it, even in a streaming WebSocket environment. | ||
|
|
||
| ```json | ||
| { | ||
| "type": "universal.stream", | ||
| "metadata": { | ||
| "eventId": "my-request" | ||
| }, | ||
| "response": { | ||
| "response": "would" | ||
| } | ||
| } | ||
| ``` | ||
|
|
||
| Once all chunks for a request have been streamed, AI Gateway sends a final message to signal the completion of the request. For added flexibility, this message includes all the metadata again, even though it was initially provided at the start of the streaming process. | ||
|
|
||
| ```json | ||
| { | ||
| "type": "universal.done", | ||
| "metadata": { | ||
| "cacheStatus": "MISS", | ||
| "eventId": "my-request", | ||
| "logId": "01JC40RB3NGBE5XFRZGBN07572", | ||
| "step": "0", | ||
| "contentType": "text/event-stream" | ||
| } | ||
| } | ||
| ``` | ||
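The message sequence described above (`universal.created`, then `universal.stream` chunks, then `universal.done`) could be consumed on the client roughly as follows. This is a minimal sketch, not part of the PR: `createStreamCollector` is a hypothetical helper name, and it assumes each chunk carries its text in `response.response`, as in the Workers AI example above. Other providers may shape their chunks differently.

```javascript
// Hypothetical helper (not part of AI Gateway): accumulates streamed text per eventId.
function createStreamCollector() {
  const buffers = new Map();

  // Takes one raw WebSocket message (a JSON string).
  // Returns a result object when a request completes, otherwise null.
  return function handle(raw) {
    const msg = JSON.parse(raw);
    const eventId = msg.metadata?.eventId;

    switch (msg.type) {
      case "universal.created":
        if ("response" in msg) {
          // Non-streaming request: the full response arrives in this message.
          return { eventId, response: msg.response, metadata: msg.metadata };
        }
        // Streaming request: start an empty buffer and wait for chunks.
        buffers.set(eventId, "");
        return null;

      case "universal.stream":
        // Assumption: the chunk text lives in response.response (Workers AI shape).
        buffers.set(
          eventId,
          (buffers.get(eventId) ?? "") + (msg.response?.response ?? ""),
        );
        return null;

      case "universal.done": {
        const text = buffers.get(eventId) ?? "";
        buffers.delete(eventId);
        return { eventId, text, metadata: msg.metadata };
      }

      default:
        return null;
    }
  };
}
```

With the `ws` client from the example request, this could be wired up as `const handle = createStreamCollector(); ws.on("message", (m) => { const result = handle(m.toString()); if (result) console.log(result); });`. Keying the buffers by `eventId` keeps chunks from different requests apart if more than one request is in flight on the same connection.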
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,133 @@ | ||
| --- | ||
| pcx_content_type: configuration | ||
| title: Realtime WebSockets API | ||
| sidebar: | ||
| order: 3 | ||
| --- | ||
|
|
||
| Some AI providers support real-time, low-latency interactions over WebSockets. AI Gateway allows seamless integration with these APIs, supporting multimodal interactions such as text, audio, and video. | ||
|
|
||
| ## Supported Providers | ||
|
|
||
| - [OpenAI](https://platform.openai.com/docs/guides/realtime-websocket) | ||
| - [Google AI Studio](https://ai.google.dev/gemini-api/docs/multimodal-live) | ||
| - [Cartesia](https://docs.cartesia.ai/api-reference/tts/tts) | ||
| - [ElevenLabs](https://elevenlabs.io/docs/conversational-ai/api-reference/conversational-ai/websocket) | ||
|
|
||
| ## Authentication | ||
|
|
||
| For real-time WebSockets, you can authenticate using: | ||
|
|
||
| - Headers (for non-browser environments) | ||
| - `sec-websocket-protocol` (for browsers) | ||
|
|
||
| ## Examples | ||
|
|
||
| ### OpenAI | ||
|
||
|
|
||
| ```javascript | ||
| import WebSocket from "ws"; | ||
|
|
||
| const url = | ||
| "wss://gateway.ai.cloudflare.com/v1/<account_id>/<gateway>/openai?model=gpt-4o-realtime-preview-2024-12-17"; | ||
| const ws = new WebSocket(url, { | ||
| headers: { | ||
| "cf-aig-authorization": process.env.CLOUDFLARE_API_KEY, | ||
| Authorization: "Bearer " + process.env.OPENAI_API_KEY, | ||
| "OpenAI-Beta": "realtime=v1", | ||
| }, | ||
| }); | ||
|
|
||
| ws.on("open", () => console.log("Connected to server.")); | ||
| ws.on("message", (message) => console.log(JSON.parse(message.toString()))); | ||
|
|
||
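| // Send only after the connection is open; calling send() while connecting throws. | ||
| ws.on("open", () => { | ||
| ws.send( | ||
| JSON.stringify({ | ||
| type: "response.create", | ||
| response: { modalities: ["text"], instructions: "Tell me a joke" }, | ||
| }), | ||
| ); | ||
| }); | ||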
| ``` | ||
|
|
||
| ### Google AI Studio | ||
|
|
||
| ```javascript | ||
| const ws = new WebSocket( | ||
| "wss://gateway.ai.cloudflare.com/v1/<account_id>/<gateway>/google?api_key=<google_api_key>", | ||
| ["cf-aig-authorization.<cloudflare_token>"], | ||
| ); | ||
|
|
||
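| // Use the standard event API, which works in browsers and with the ws package. | ||
| ws.addEventListener("open", () => { | ||
| console.log("Connected to server."); | ||
| // Send the setup message only once the connection is open. | ||
| ws.send( | ||
| JSON.stringify({ | ||
| setup: { | ||
| model: "models/gemini-2.0-flash-exp", | ||
| generationConfig: { responseModalities: ["TEXT"] }, | ||
| }, | ||
| }), | ||
| ); | ||
| }); | ||
|
|
||
||
| ws.addEventListener("message", (event) => console.log(event.data)); | ||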
| ``` | ||
|
|
||
| ### Cartesia | ||
|
|
||
| ```javascript | ||
| const ws = new WebSocket( | ||
| "wss://gateway.ai.cloudflare.com/v1/<account_id>/<gateway>/cartesia?cartesia_version=2024-06-10&api_key=<cartesia_api_key>", | ||
| ["cf-aig-authorization.<cloudflare_token>"], | ||
| ); | ||
|
|
||
| // Use the standard event API, which works in browsers and with the ws package. | ||
| ws.addEventListener("open", () => { | ||
| console.log("Connected to server."); | ||
| // Send the generation request only once the connection is open. | ||
| ws.send( | ||
| JSON.stringify({ | ||
| model_id: "sonic", | ||
| transcript: "Hello, world! I'm generating audio on ", | ||
| voice: { mode: "id", id: "a0e99841-438c-4a64-b679-ae501e7d6091" }, | ||
| language: "en", | ||
| context_id: "happy-monkeys-fly", | ||
| output_format: { | ||
| container: "raw", | ||
| encoding: "pcm_s16le", | ||
| sample_rate: 8000, | ||
| }, | ||
| add_timestamps: true, | ||
| continue: true, | ||
| }), | ||
| ); | ||
| }); | ||
|
|
||
||
| ws.addEventListener("message", (event) => { | ||
| console.log(event.data); | ||
| }); | ||
| ``` | ||
|
|
||
| ### ElevenLabs | ||
|
|
||
| ```javascript | ||
| const ws = new WebSocket( | ||
| "wss://gateway.ai.cloudflare.com/v1/<account_id>/<gateway>/elevenlabs?agent_id=<elevenlabs_agent_id>", | ||
| [ | ||
| "xi-api-key.<elevenlabs_api_key>", | ||
| "cf-aig-authorization.<cloudflare_token>", | ||
| ], | ||
| ); | ||
|
|
||
| // Use the standard event API, which works in browsers and with the ws package. | ||
| ws.addEventListener("open", () => { | ||
| console.log("Connected to server."); | ||
| // Send the text payload only once the connection is open. | ||
| ws.send( | ||
| JSON.stringify({ | ||
| text: "This is a sample text ", | ||
| voice_settings: { stability: 0.8, similarity_boost: 0.8 }, | ||
| generation_config: { chunk_length_schedule: [120, 160, 250, 290] }, | ||
| }), | ||
| ); | ||
| }); | ||
|
|
||
||
| ws.addEventListener("message", (event) => { | ||
| console.log(event.data); | ||
| }); | ||
| ``` | ||
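The browser examples above pass credentials as WebSocket subprotocols in the form `<name>.<value>`, which is how they reach the `sec-websocket-protocol` header. The sketch below builds that list; `buildGatewayProtocols` is a hypothetical helper name, not an AI Gateway API, and the entry format is inferred only from the examples in this diff.

```javascript
// Hypothetical helper: builds the subprotocol list used for browser authentication.
// Each entry is "<name>.<value>", mirroring the examples above.
function buildGatewayProtocols(cloudflareToken, providerKeys = {}) {
  // Provider-specific credentials first, e.g. { "xi-api-key": "<elevenlabs_api_key>" }.
  const protocols = Object.entries(providerKeys).map(
    ([name, value]) => `${name}.${value}`,
  );
  // The AI Gateway token goes in as "cf-aig-authorization.<cloudflare_token>".
  protocols.push(`cf-aig-authorization.${cloudflareToken}`);
  return protocols;
}

// Usage (placeholders as in the examples above):
// const ws = new WebSocket(url, buildGatewayProtocols("<cloudflare_token>"));
```

Because the tokens end up in a request header that browser code can read, this approach suits cases where the client is already trusted with those credentials.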
I think your folder casing is off here.
`websockets-api`: We avoid spaces + capitalization in filenames.