guides, ai models

elithrar · elithrar · commit 731d6bb44d45 · 2025-02-24T15:59:15.000-05:00
diff --git a/src/content/docs/agents/examples/using-ai-models.mdx b/src/content/docs/agents/examples/using-ai-models.mdx
@@ -6,7 +6,7 @@ sidebar:
 
 ---
 
-import { MetaInfo, Render, Type, TypeScriptExample, WranglerConfig } from "~/components";
+import { AnchorHeading, MetaInfo, Render, Type, TypeScriptExample, WranglerConfig } from "~/components";
 
 Agents can communicate with AI models hosted on any provider, including [Workers AI](/workers-ai/), OpenAI, Anthropic, and Google's Gemini, and use the model routing features in [AI Gateway](/ai-gateway/) to route across providers, eval responses, and manage AI provider rate limits.
 
@@ -16,9 +16,69 @@ A user can disconnect during a long-running response from a modern reasoning mod
 
 ## Calling AI Models
 
+You can call models from any method within an Agent, including from HTTP requests using the [`onRequest`](/agents/api-reference/sdk/) handler, when a [scheduled task](/agents/examples/schedule-tasks/) runs, when handling a WebSocket message in the [`onMessage`](/agents/examples/websockets/) handler, or from any of your own methods.
+
+Importantly, Agents can call AI models on their own — autonomously — and can handle long-running responses that can take minutes (or longer) to respond in full.
+
+### Long-running model requests {/*long-running-model-requests*/}
+
+Modern [reasoning models](https://platform.openai.com/docs/guides/reasoning), or "thinking" models, can take some time to both generate a response _and_ stream the response back to the client.
+
+Instead of buffering the entire response, or risking the client disconnecting, you can stream the response back to the client by using the [WebSocket API](/agents/examples/websockets/).
+
+<TypeScriptExample file="src/index.ts">
+
+```ts
+import { Agent, type Connection, type ConnectionContext, type WSMessage } from "@cloudflare/agents"
+import { OpenAI } from "openai"
+
+export class MyAgent extends Agent<Env> {
+	async onConnect(connection: Connection, ctx: ConnectionContext) {
+		// Omitted for simplicity: authenticating the user
+		connection.accept()
+	}
+
+	async onMessage(connection: Connection, message: WSMessage) {
+		let msg = JSON.parse(message as string) // assumes the client sends text frames
+		// This can run as long as it needs to, and return as many messages as it needs to!
+		await this.queryReasoningModel(connection, msg.prompt)
+	}
+
+	async queryReasoningModel(connection: Connection, userPrompt: string) {
+		const client = new OpenAI({
+			apiKey: this.env.OPENAI_API_KEY,
+		});
+
+		try {
+			const stream = await client.chat.completions.create({
+				model: this.env.MODEL || 'o3-mini',
+				messages: [{ role: 'user', content: userPrompt }],
+				stream: true,
+			});
+
+			// Stream responses back as WebSocket messages
+			for await (const chunk of stream) {
+				const content = chunk.choices[0]?.delta?.content || '';
+				if (content) {
+					connection.send(JSON.stringify({ type: 'chunk', content }));
+				}
+			}
+
+			// Send completion message
+			connection.send(JSON.stringify({ type: 'done' }));
+		} catch (error) {
+			connection.send(JSON.stringify({ type: 'error', error: String(error) }));
+		}
+	}
+```
+
+</TypeScriptExample>
+
+You can also persist AI model responses back to the [Agent's internal state](/agents/examples/manage-and-sync-state/) by using the `this.setState` method. For example, if you run a [scheduled task](/agents/examples/schedule-tasks/), you can store the output of the task and read it later. Or, if a user disconnects, you can read the message history back and send it to the user when they reconnect.
+
 ### Workers AI
 
-### Inference endpoints
+### Hosted models
 
 You can use [any of the models available in Workers AI](/workers-ai/models/) within your Agent by [configuring a binding](/workers-ai/configuration/bindings/).
 
@@ -63,7 +123,6 @@ binding = "AI"
 
 </WranglerConfig>
 
-
 ### Model routing
 
 You can also use the model routing features in [AI Gateway](/ai-gateway/) directly from an Agent by specifying a [`gateway` configuration](/ai-gateway/providers/workersai/) when calling the AI binding.
@@ -149,11 +208,11 @@ export class MyAgent extends Agent<Env> {
 
 </TypeScriptExample>
 
-### OpenAI SDK
+### OpenAI compatible endpoints
 
 Agents can call models across any service, including those that support the OpenAI API. For example, you can use the OpenAI SDK to use one of [Google's Gemini models](https://ai.google.dev/gemini-api/docs/openai#node.js) directly from your Agent.
 
-Agents can stream responses back over HTTP using Server Sent Events (SSE) from within an `onRequest` handler, or by using the native [WebSockets](/agents/examples/websockets/) API to responses back to a a client over a long running WebSocket.
+Agents can stream responses back over HTTP using Server Sent Events (SSE) from within an `onRequest` handler, or by using the native [WebSockets](/agents/examples/websockets/) API in your Agent to stream responses back to a client, which is especially useful for larger models that can take more than 30 seconds to reply.
 
 <TypeScriptExample file="src/index.ts">
 
diff --git a/src/content/docs/agents/examples/websockets.mdx b/src/content/docs/agents/examples/websockets.mdx
@@ -1,5 +1,5 @@
 ---
-title: Real-time and WebSockets
+title: Using WebSockets
 pcx_content_type: concept
 sidebar:
   order: 2
diff --git a/src/content/docs/agents/getting-started/observing-agents.mdx b/src/content/docs/agents/getting-started/observing-agents.mdx
diff --git a/src/content/docs/agents/guides/anthropic-agent-patterns.mdx b/src/content/docs/agents/guides/anthropic-agent-patterns.mdx
@@ -0,0 +1,9 @@
+---
+pcx_content_type: navigation
+title: Build a Human-in-the-loop Agent
+external_link: https://github.com/cloudflare/agents/tree/main/guides/human-in-the-loop
+sidebar:
+  order: 2
+head: []
+description: Implement human-in-the-loop functionality using Cloudflare Agents, allowing AI agents to request human approval before executing certain actions
+---
diff --git a/src/content/docs/agents/guides/human-in-the-loop.mdx b/src/content/docs/agents/guides/human-in-the-loop.mdx
@@ -0,0 +1,9 @@
+---
+pcx_content_type: navigation
+title: Implement Effective Agent Patterns
+external_link: https://github.com/cloudflare/agents/tree/main/guides/anthropic-patterns
+sidebar:
+  order: 3
+head: []
+description: Implement common agent patterns using the `@cloudflare/agents` framework.
+---
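
Reviewer note: the `queryReasoningModel` example added in `using-ai-models.mdx` sends three JSON message shapes over the WebSocket (`chunk`, `done`, and `error`), but the change does not show the receiving side. The following is a minimal client-side sketch of how those messages could be folded into a single response. The names `StreamMessage`, `StreamState`, `initialState`, and `applyMessage` are hypothetical and are not part of `@cloudflare/agents`; only the message shapes come from the diff.

```ts
// Hypothetical client-side helper. The message shapes mirror the JSON that
// queryReasoningModel sends in the diff above; everything else is an assumption.
type StreamMessage =
	| { type: "chunk"; content: string }
	| { type: "done" }
	| { type: "error"; error: unknown };

interface StreamState {
	text: string; // response text accumulated so far
	done: boolean; // true once the stream has finished or failed
	error: string | null; // error description, if the Agent reported one
}

const initialState: StreamState = { text: "", done: false, error: null };

// Fold one raw WebSocket text frame into the accumulated state.
// Returns a new state object rather than mutating the previous one.
function applyMessage(state: StreamState, raw: string): StreamState {
	let msg: StreamMessage;
	try {
		msg = JSON.parse(raw) as StreamMessage;
	} catch {
		return { ...state, done: true, error: "malformed message" };
	}

	switch (msg.type) {
		case "chunk":
			return { ...state, text: state.text + msg.content };
		case "done":
			return { ...state, done: true };
		case "error": {
			const detail =
				typeof msg.error === "string" ? msg.error : JSON.stringify(msg.error);
			return { ...state, done: true, error: detail };
		}
		default:
			// Unknown message types are ignored so the protocol can grow.
			return state;
	}
}
```

In a browser, this could be wired to a socket with something like `socket.addEventListener("message", (event) => { state = applyMessage(state, event.data); })`, assuming the Agent only sends text frames. Keeping the reducer free of any WebSocket dependency makes it straightforward to unit test.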