Agents can communicate with AI models hosted on any provider, including [Workers AI](/workers-ai/), OpenAI, Anthropic, and Google's Gemini, and use the model routing features in [AI Gateway](/ai-gateway/) to route across providers, evaluate responses, and manage AI provider rate limits.
Because Agents are built on top of [Durable Objects](/durable-objects/), each Agent or chat session is associated with a stateful compute instance. Traditional serverless architectures often present challenges for the persistent connections needed in real-time applications like chat.
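
For example, a single Agent instance can hold a WebSocket open and keep per-session state in memory between messages. The sketch below shows the pattern; it assumes the `onConnect` and `onMessage` handlers described in the [WebSockets example](/agents/examples/websockets/), and the class and field names are illustrative.

<TypeScriptExample file="src/index.ts">

```ts
import { Agent, Connection } from "@cloudflare/agents";

interface Env {}

export class ChatAgent extends Agent<Env> {
  // Plain class fields live as long as this Durable Object instance does,
  // so per-session state survives across messages on the same connection.
  private messageCount = 0;

  async onConnect(connection: Connection) {
    connection.send("Connected to your stateful chat session");
  }

  async onMessage(connection: Connection, message: string) {
    this.messageCount++;
    connection.send(`Message #${this.messageCount}: ${message}`);
  }
}
```

</TypeScriptExample>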
### Workers AI
### Inference endpoints
You can use [any of the models available in Workers AI](/workers-ai/models/) within your Agent by [configuring a binding](/workers-ai/configuration/bindings/).
Workers AI supports streaming responses out-of-the-box by setting `stream: true`, and we strongly recommend using them to avoid buffering and delaying responses, especially for larger models or reasoning models that require more time to generate a response.
<TypeScriptExample file="src/index.ts">
```ts
import { Agent } from "@cloudflare/agents";

interface Env {
  AI: Ai;
}

export class MyAgent extends Agent<Env> {
  async onRequest(request: Request) {
    const response = await this.env.AI.run(
      "@cf/deepseek-ai/deepseek-r1-distill-qwen-32b",
      {
        prompt: "Build me a Cloudflare Worker that returns JSON.",
        stream: true, // Stream a response and don't block the client!
      },
    );

    // Return the stream directly to the client
    return new Response(response, {
      headers: { "content-type": "text/event-stream" },
    });
  }
}
```
</TypeScriptExample>
Your wrangler configuration will need an `ai` binding added:
<WranglerConfig>
```toml
[ai]
binding = "AI"
```
</WranglerConfig>
### Model routing
You can also use the model routing features in [AI Gateway](/ai-gateway/) directly from an Agent by specifying a [`gateway` configuration](/ai-gateway/providers/workersai/) when calling the AI binding.
:::note
Model routing allows you to route requests to different AI models based on whether a provider is reachable, whether it is rate-limiting your client, and/or whether you've exceeded your cost budget for that provider.
:::
<TypeScriptExample file="src/index.ts">
```ts
import { Agent } from "@cloudflare/agents";

interface Env {
  AI: Ai;
}

export class MyAgent extends Agent<Env> {
  async onRequest(request: Request) {
    const response = await this.env.AI.run(
      "@cf/deepseek-ai/deepseek-r1-distill-qwen-32b",
      {
        prompt: "Build me a Cloudflare Worker that returns JSON.",
      },
      {
        gateway: {
          id: "{gateway_id}", // Specify your AI Gateway ID here
          skipCache: false,
          cacheTtl: 3360,
        },
      },
    );

    return Response.json(response);
  }
}
```
</TypeScriptExample>
Your wrangler configuration will need an `ai` binding added. This is shared across both Workers AI and AI Gateway.
<WranglerConfig>
```toml
[ai]
binding = "AI"
```
</WranglerConfig>
Visit the [AI Gateway documentation](/ai-gateway/) to learn how to configure a gateway and retrieve a gateway ID.
### AI SDK
The [AI SDK](https://sdk.vercel.ai/docs/introduction) provides a unified API for using AI models, including for text generation, tool calling, structured responses, image generation, and more.
To use the AI SDK, install the `ai` package and use it within your Agent. The example below shows how to use it to generate text on request, but you can use it from any method within your Agent, including WebSocket handlers, as part of a scheduled task, or even when the Agent is initialized.
```sh
npm install ai @ai-sdk/openai
```
<TypeScriptExample file="src/index.ts">
```ts
import { Agent } from "@cloudflare/agents";
import { generateText } from "ai";
import { openai } from "@ai-sdk/openai";

interface Env {
  OPENAI_API_KEY: string;
}

export class MyAgent extends Agent<Env> {
  async onRequest(request: Request): Promise<Response> {
    // The model and prompt here are illustrative: use any model
    // supported by your configured AI SDK provider.
    const { text } = await generateText({
      model: openai("gpt-4o"),
      prompt: "Build me an AI agent on Cloudflare Workers",
    });

    return Response.json({ modelResponse: text });
  }
}
```

</TypeScriptExample>

### OpenAI SDK
Agents can call models on any service, including those that support the OpenAI API. For example, you can use the OpenAI SDK to call one of [Google's Gemini models](https://ai.google.dev/gemini-api/docs/openai#node.js) directly from your Agent.
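
As a rough sketch of that approach, you could point the OpenAI SDK at Gemini's OpenAI-compatible endpoint. This assumes a hypothetical `GEMINI_API_KEY` binding; the base URL comes from Google's OpenAI compatibility documentation linked above, and the model name is illustrative.

<TypeScriptExample file="src/index.ts">

```ts
import OpenAI from "openai";
import { Agent } from "@cloudflare/agents";

interface Env {
  GEMINI_API_KEY: string;
}

export class MyAgent extends Agent<Env> {
  async onRequest(request: Request) {
    // Point the OpenAI SDK at Google's OpenAI-compatible endpoint
    const client = new OpenAI({
      apiKey: this.env.GEMINI_API_KEY,
      baseURL: "https://generativelanguage.googleapis.com/v1beta/openai/",
    });

    const completion = await client.chat.completions.create({
      model: "gemini-2.0-flash",
      messages: [
        { role: "user", content: "Build me a Cloudflare Worker that returns JSON." },
      ],
    });

    return Response.json(completion.choices[0].message);
  }
}
```

</TypeScriptExample>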
Agents can stream responses back over HTTP using Server-Sent Events (SSE) from within an `onRequest` handler, or use the native [WebSockets](/agents/examples/websockets/) API to stream responses back to a client over a long-running WebSocket.
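
As a minimal sketch of the SSE approach, building on the hypothetical Gemini client above, you can set `stream: true` on the completion and forward each chunk to the client as an SSE event:

<TypeScriptExample file="src/index.ts">

```ts
import OpenAI from "openai";
import { Agent } from "@cloudflare/agents";

interface Env {
  GEMINI_API_KEY: string;
}

export class MyAgent extends Agent<Env> {
  async onRequest(request: Request) {
    const client = new OpenAI({
      apiKey: this.env.GEMINI_API_KEY,
      baseURL: "https://generativelanguage.googleapis.com/v1beta/openai/",
    });

    // stream: true returns an async iterable of completion chunks
    const stream = await client.chat.completions.create({
      model: "gemini-2.0-flash",
      messages: [
        { role: "user", content: "Build me a Cloudflare Worker that returns JSON." },
      ],
      stream: true,
    });

    const encoder = new TextEncoder();
    const body = new ReadableStream({
      async start(controller) {
        // Forward each token delta to the client as an SSE event
        for await (const chunk of stream) {
          const delta = chunk.choices[0]?.delta?.content ?? "";
          controller.enqueue(encoder.encode(`data: ${JSON.stringify(delta)}\n\n`));
        }
        controller.enqueue(encoder.encode("data: [DONE]\n\n"));
        controller.close();
      },
    });

    return new Response(body, {
      headers: { "content-type": "text/event-stream" },
    });
  }
}
```

</TypeScriptExample>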
0 commit comments