Merged
Changes from 3 commits
@@ -0,0 +1,35 @@
---
title: WebSockets API
pcx_content_type: configuration
sidebar:
  group:
    badge: Beta
---

The AI Gateway WebSockets API provides a persistent connection for AI interactions, eliminating repeated handshakes and reducing latency. This API is divided into two categories:

1. **Non-Realtime APIs** - Supports standard WebSocket communication for AI providers, including those that do not natively support WebSockets.
2. **Realtime APIs** - Designed for AI providers that offer low-latency, multimodal interactions over WebSockets.

## When to use WebSockets?

WebSockets are long-lived TCP connections that enable bidirectional communication between client and server, whether or not the exchange is real-time. Unlike HTTP, where each request can require a new handshake, a WebSocket keeps the connection open, supporting continuous data exchange with reduced overhead. WebSockets are ideal for applications that need low-latency, real-time data, such as voice assistants.

## Key benefits

- **Reduced Overhead**: Avoid the overhead of repeated handshakes and TLS negotiations by maintaining a single, persistent connection.
- **Provider Compatibility**: Works with all AI providers in AI Gateway. Even if your chosen provider does not support WebSockets, AI Gateway accepts the WebSocket connection and manages the requests to your preferred AI provider.

## Key differences

| Feature | Non-Realtime APIs | Realtime APIs |
| :---------------------- | :----------------------------------------------------------------------------------------------- | :------------------------------------------------------------------------------------------------------------------------------------ |
| **Purpose** | Supports WebSocket-based AI interactions with providers that do not natively support WebSockets. | Enables real-time, multimodal AI interactions for providers that offer dedicated WebSocket endpoints. |
| **Use Case** | Text-based queries and responses, such as LLM requests. | Streaming responses for voice, video, and live interactions. |
| **AI Provider Support** | [All AI providers in AI Gateway.](/ai-gateway/providers) | [Limited to providers offering real-time WebSocket APIs.](/ai-gateway/configuration/websockets-api/realtime-api/#supported-providers) |
| **Streaming Support** | AI Gateway handles streaming via WebSockets. | Providers natively support real-time data streaming. |

For implementation details, see the following pages:

- [Realtime WebSockets API](/ai-gateway/configuration/websockets-api/realtime-api/)
- [Non-Realtime WebSockets API](/ai-gateway/configuration/websockets-api/non-realtime-api/)
@@ -1,21 +1,11 @@
---
title: WebSockets API
pcx_content_type: configuration
title: Non-realtime WebSockets API
sidebar:
  badge:
    text: Beta
  order: 3
---

The AI Gateway WebSockets API provides a single persistent connection, enabling continuous communication. By using WebSockets, you can establish a single connection for multiple AI requests, eliminating the need for repeated handshakes and TLS negotiations, which enhances performance and reduces latency. This API supports all AI providers connected to AI Gateway, including those that do not natively support WebSockets.

## When to use WebSockets?

WebSockets are long-lived TCP connections that enable bi-directional, real-time communication between client and server. Unlike HTTP connections, which require repeated handshakes for each request, WebSockets maintain the connection, supporting continuous data exchange with reduced overhead. WebSockets are ideal for applications needing low-latency, real-time data, such as voice assistants.

## Key benefits

- **Reduced Overhead**: Avoid overhead of repeated handshakes and TLS negotiations by maintaining a single, persistent connection.
- **Provider Compatibility**: Works with all AI providers in AI Gateway. Even if your chosen provider does not support WebSockets, we handle it for you, managing the requests to your preferred AI provider.
The Non-realtime WebSockets API allows you to establish persistent connections for AI requests without requiring repeated handshakes. This approach is ideal for applications that do not require real-time interactions but still benefit from reduced latency and continuous communication.

## Set up WebSockets API

@@ -0,0 +1,133 @@
---
pcx_content_type: configuration
title: Realtime WebSockets API
sidebar:
  order: 2
---

Some AI providers support real-time, low-latency interactions over WebSockets. AI Gateway integrates with these APIs and supports multimodal interactions such as text, audio, and video.

## Supported Providers

- [OpenAI](https://platform.openai.com/docs/guides/realtime-websocket)
- [Google AI Studio](https://ai.google.dev/gemini-api/docs/multimodal-live)
- [Cartesia](https://docs.cartesia.ai/api-reference/tts/tts)
- [ElevenLabs](https://elevenlabs.io/docs/conversational-ai/api-reference/conversational-ai/websocket)

## Authentication

For the Realtime WebSockets API, you can authenticate in one of two ways:

- Headers (for non-browser environments)
- `sec-websocket-protocol` (for browsers)
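
The two options above can be sketched as a small helper that returns either a headers object or a subprotocol list. This is a minimal illustration, not part of AI Gateway: the function name `gatewayAuth` and its return shape are hypothetical, and the `cf-aig-authorization.<token>` subprotocol form is taken from the examples on this page. The token is passed through unchanged, so supply it in whatever form your gateway expects.

```javascript
// Hypothetical helper: the name and return shape are illustrative only.
function gatewayAuth(token, { browser = false } = {}) {
  if (!token) {
    throw new Error("An AI Gateway token is required");
  }
  if (browser) {
    // Browsers cannot set custom headers on a WebSocket handshake, so the
    // token travels as an entry in `sec-websocket-protocol`.
    return { protocols: [`cf-aig-authorization.${token}`] };
  }
  // Non-browser clients (for example Node.js with the `ws` package)
  // can send the token as a regular request header.
  return { headers: { "cf-aig-authorization": token } };
}
```

In a browser, the result would be passed as the second argument, for example `new WebSocket(url, gatewayAuth(token, { browser: true }).protocols)`. With the `ws` package, the headers object goes in the options argument instead.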

## Examples

### OpenAI

```javascript
import WebSocket from "ws";

const url =
  "wss://gateway.ai.cloudflare.com/v1/<account_id>/<gateway>/openai?model=gpt-4o-realtime-preview-2024-12-17";
const ws = new WebSocket(url, {
  headers: {
    "cf-aig-authorization": process.env.CLOUDFLARE_API_KEY,
    Authorization: "Bearer " + process.env.OPENAI_API_KEY,
    "OpenAI-Beta": "realtime=v1",
  },
});

// Send only after the connection is open; sending while the socket is
// still connecting throws in the `ws` package.
ws.on("open", () => {
  console.log("Connected to server.");
  ws.send(
    JSON.stringify({
      type: "response.create",
      response: { modalities: ["text"], instructions: "Tell me a joke" },
    }),
  );
});

// Log each server event after parsing it as JSON.
ws.on("message", (message) => console.log(JSON.parse(message.toString())));
```
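
The message handler in the OpenAI example parses every frame as JSON. A slightly more defensive sketch is shown here, assuming the `ws` package delivers each frame as a `Buffer` (its default) and that some frames, such as audio, may not be JSON. The helper name `parseFrame` and its `{ kind, value }` return shape are hypothetical.

```javascript
// Hypothetical helper, not part of `ws` or AI Gateway.
function parseFrame(data) {
  // Accept either a string or a Buffer-like payload.
  const text =
    typeof data === "string" ? data : Buffer.from(data).toString("utf8");
  try {
    return { kind: "json", value: JSON.parse(text) };
  } catch {
    // Not valid JSON: hand back the original payload untouched.
    return { kind: "raw", value: data };
  }
}
```

It could be used as `ws.on("message", (message) => console.log(parseFrame(message)))`, so that a non-JSON frame does not raise an uncaught exception inside the handler.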

### Google AI Studio

```javascript
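import WebSocket from "ws";

const ws = new WebSocket(
  "wss://gateway.ai.cloudflare.com/v1/<account_id>/<gateway>/google?api_key=<google_api_key>",
  ["cf-aig-authorization.<cloudflare_token>"],
);

// Send the setup message once the connection is open.
ws.on("open", () => {
  console.log("Connected to server.");
  ws.send(
    JSON.stringify({
      setup: {
        model: "models/gemini-2.0-flash-exp",
        generationConfig: { responseModalities: ["TEXT"] },
      },
    }),
  );
});

// The `ws` package passes the payload directly (a Buffer by default),
// not a browser-style event with a `data` property.
ws.on("message", (message) => console.log(message.toString()));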
```

### Cartesia

```javascript
import WebSocket from "ws";

const ws = new WebSocket(
  "wss://gateway.ai.cloudflare.com/v1/<account_id>/<gateway>/cartesia?cartesia_version=2024-06-10&api_key=<cartesia_api_key>",
  ["cf-aig-authorization.<cloudflare_token>"],
);

// Send the generation request once the connection is open.
ws.on("open", function open() {
  console.log("Connected to server.");
  ws.send(
    JSON.stringify({
      model_id: "sonic",
      transcript: "Hello, world! I'm generating audio on ",
      voice: { mode: "id", id: "a0e99841-438c-4a64-b679-ae501e7d6091" },
      language: "en",
      context_id: "happy-monkeys-fly",
      output_format: {
        container: "raw",
        encoding: "pcm_s16le",
        sample_rate: 8000,
      },
      add_timestamps: true,
      continue: true,
    }),
  );
});

ws.on("message", function incoming(message) {
  // The `ws` package passes the payload directly (a Buffer by default),
  // not a browser-style event with a `data` property.
  console.log(message.toString());
});
```

### ElevenLabs

```javascript
import WebSocket from "ws";

const ws = new WebSocket(
  "wss://gateway.ai.cloudflare.com/v1/<account_id>/<gateway>/elevenlabs?agent_id=<elevenlabs_agent_id>",
  [
    "xi-api-key.<elevenlabs_api_key>",
    "cf-aig-authorization.<cloudflare_token>",
  ],
);

// Send the first message once the connection is open.
ws.on("open", function open() {
  console.log("Connected to server.");
  ws.send(
    JSON.stringify({
      text: "This is a sample text ",
      voice_settings: { stability: 0.8, similarity_boost: 0.8 },
      generation_config: { chunk_length_schedule: [120, 160, 250, 290] },
    }),
  );
});

ws.on("message", function incoming(message) {
  // The `ws` package passes the payload directly (a Buffer by default),
  // not a browser-style event with a `data` property.
  console.log(message.toString());
});
```
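
The four examples share one URL shape: the gateway host, then `/v1/<account_id>/<gateway>/<provider>`, then provider-specific query parameters. As a rough sketch of that pattern, the helper below assembles such a URL with the standard `URL` API. The function name `realtimeUrl` and its signature are hypothetical, and which query parameters a provider needs is taken only from the examples on this page.

```javascript
// Hypothetical helper: the name and signature are illustrative only.
function realtimeUrl(accountId, gateway, provider, query = {}) {
  const path = [accountId, gateway, provider]
    .map((part) => encodeURIComponent(part))
    .join("/");
  const url = new URL(`wss://gateway.ai.cloudflare.com/v1/${path}`);
  // Provider-specific parameters (for example `model` or `agent_id`)
  // are appended as query string entries.
  for (const [key, value] of Object.entries(query)) {
    url.searchParams.set(key, value);
  }
  return url.toString();
}
```

For instance, `realtimeUrl("<account_id>", "<gateway>", "openai", { model: "gpt-4o-realtime-preview-2024-12-17" })` would be called with real account and gateway values in place of the placeholders. Note that API keys placed in the query string can end up in logs, so prefer header or subprotocol authentication where the provider allows it.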