Commit 1a4e622

Update readme
1 parent cfc9b4c commit 1a4e622

File tree

  • src/agentex/lib/core/temporal/plugins/openai_agents

1 file changed: +234 −48 lines changed


src/agentex/lib/core/temporal/plugins/openai_agents/README.md

Lines changed: 234 additions & 48 deletions

```diff
@@ -16,8 +16,9 @@ We modified Temporal's OpenAI Agents plugin to add real-time streaming to Redis/
 3. [The Streaming Challenge](#the-streaming-challenge)
 4. [Our Streaming Solution](#our-streaming-solution)
 5. [Implementation Details](#implementation-details)
-6. [Usage](#usage)
-7. [Drawbacks and Maintenance](#drawbacks-and-maintenance)
+6. [Streaming Lifecycle Events with Hooks](#streaming-lifecycle-events-with-hooks)
+7. [Usage](#usage)
+8. [Drawbacks and Maintenance](#drawbacks-and-maintenance)
 
 ---
```

```diff
@@ -431,38 +432,13 @@ class StreamingModel(Model):
         )
         await streaming_context.stream_update(update)
 
-        # 5. Handle tool calls (sent as complete messages, not streamed)
-        if tool_calls:
-            for tool_call_data in tool_calls.values():
-                tool_request = ToolRequestContent(
-                    author="agent",
-                    tool_call_id=tool_call_data["id"],
-                    name=tool_call_data["function"]["name"],
-                    arguments=json.loads(tool_call_data["function"]["arguments"])
-                )
+        # 5. Handle reasoning tokens - streamed as they're generated
+        # Reasoning summaries (for o1/o3 models) are streamed token-by-token
+        # as ReasoningSummaryDelta events during the response stream
 
-                # Tool calls use StreamTaskMessageFull (complete message)
-                async with adk.streaming.streaming_task_message_context(
-                    task_id=task_id,
-                    initial_content=tool_request
-                ) as tool_context:
-                    await tool_context.stream_update(
-                        StreamTaskMessageFull(
-                            parent_task_message=tool_context.task_message,
-                            content=tool_request,
-                            type="full"
-                        )
-                    )
-
-        # 6. Handle reasoning tokens (o1 models)
-        if reasoning_content:  # For o1 models
-            reasoning = ReasoningContent(
-                author="agent",
-                summary=[reasoning_content],
-                type="reasoning"
-            )
-            # Stream reasoning as complete message
-            await stream_reasoning_update(reasoning)
+        # Note: Tool calls are NOT streamed from get_response
+        # They are collected during streaming and returned in the final response
+        # Use TemporalStreamingHooks to stream tool lifecycle events separately
 
         # 7. Context auto-closes and saves to DB
         # The streaming_task_message_context:
```

````diff
@@ -501,28 +477,22 @@ await streaming_context.stream_update(
     StreamTaskMessageDelta(delta=TextDelta(text_delta=chunk))
 )
 
-# Tool calls - sent as complete messages
-await streaming_context.stream_update(
-    StreamTaskMessageFull(content=ToolRequestContent(...))
-)
-
-# Reasoning (o1 models) - sent as complete
-await streaming_context.stream_update(
-    StreamTaskMessageFull(content=ReasoningContent(...))
+# Reasoning summaries (o1/o3 models) - streamed token by token
+await reasoning_context.stream_update(
+    StreamTaskMessageDelta(delta=ReasoningSummaryDelta(summary_delta=chunk))
 )
 
-# Guardrails - sent as complete
-await streaming_context.stream_update(
-    StreamTaskMessageFull(content=GuardrailContent(...))
-)
+# NOTE: Tool calls and handoffs are NOT streamed from streaming_model.py
+# They are collected during streaming and returned in the final ModelResponse
+# To stream tool lifecycle events, use TemporalStreamingHooks (see below)
 ```
 
 #### UI Subscription
 
 The frontend subscribes to `stream:{task_id}` and receives:
-1. Real-time text chunks as they're generated
-2. Complete tool calls when they're ready
-3. Reasoning summaries for o1 models
+1. Real-time text chunks as they're generated (from StreamingModel)
+2. Reasoning summaries for o1/o3 models (from StreamingModel)
+3. Tool lifecycle events (from TemporalStreamingHooks - see section below)
 4. DONE signal when complete
 
 This decoupling means we can stream anything we want through Redis!
````
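
As a rough illustration (not part of this commit), a consumer of that channel could look like the following sketch. It assumes a plain Redis pub/sub channel named `stream:{task_id}` and a textual `DONE` sentinel; the actual AgentEx payloads are serialized `StreamTaskMessage*` events, so a real client would parse them accordingly.

```python
import asyncio

import redis.asyncio as redis

async def subscribe_to_task_stream(task_id: str) -> None:
    client = redis.Redis()
    pubsub = client.pubsub()
    await pubsub.subscribe(f"stream:{task_id}")  # channel name from the README above
    async for message in pubsub.listen():
        if message["type"] != "message":
            continue  # skip subscription confirmations
        data = message["data"].decode()
        if data == "DONE":  # hypothetical completion sentinel
            break
        print(data)  # in practice: parse a serialized StreamTaskMessage* event

asyncio.run(subscribe_to_task_stream("abc123"))
```
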
@@ -543,6 +513,222 @@ class ExampleWorkflow:

---

## Streaming Lifecycle Events with Hooks

### Overview: Two Types of Streaming

The streaming implementation described above (StreamingModel) only streams **LLM text responses and reasoning tokens**. It does NOT stream tool calls or agent handoffs - those are collected during execution and returned in the final response.

To stream **agent lifecycle events** (tool requests, tool responses, handoffs), we provide **`TemporalStreamingHooks`** - a simpler, complementary approach that works alongside the streaming model.

| What Gets Streamed | StreamingModel | TemporalStreamingHooks |
|--------------------|----------------|------------------------|
| **LLM text responses** | ✅ Token-by-token | ❌ |
| **Reasoning summaries (o1/o3)** | ✅ Token-by-token | ❌ |
| **Tool requests** | ❌ | ✅ When tool starts |
| **Tool responses** | ❌ | ✅ When tool completes |
| **Agent handoffs** | ❌ | ✅ When handoff occurs |

**Best practice**: Use both together for complete streaming visibility!

### What are Hooks?

Hooks are callbacks provided by the OpenAI Agents SDK that fire at key points in the agent execution lifecycle. They provide interception points for the following events (signatures sketched below):
- `on_agent_start` - When an agent begins execution
- `on_agent_end` - When an agent completes execution
- `on_tool_start` - When a tool is about to be invoked
- `on_tool_end` - When a tool completes execution
- `on_handoff` - When control transfers between agents
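
For reference, the full callback signatures look roughly like this no-op subclass. This is a sketch based on the SDK's `RunHooks` interface; verify the exact signatures against your installed `agents` version:

```python
from typing import Any

from agents import Agent, RunContextWrapper, RunHooks, Tool

class PrintHooks(RunHooks):
    """No-op hooks that print each lifecycle event (illustration only)."""

    async def on_agent_start(self, context: RunContextWrapper, agent: Agent) -> None:
        print(f"agent started: {agent.name}")

    async def on_agent_end(self, context: RunContextWrapper, agent: Agent, output: Any) -> None:
        print(f"agent finished: {agent.name}")

    async def on_handoff(self, context: RunContextWrapper, from_agent: Agent, to_agent: Agent) -> None:
        print(f"handoff: {from_agent.name} -> {to_agent.name}")

    async def on_tool_start(self, context: RunContextWrapper, agent: Agent, tool: Tool) -> None:
        print(f"tool started: {tool.name}")

    async def on_tool_end(self, context: RunContextWrapper, agent: Agent, tool: Tool, result: str) -> None:
        print(f"tool finished: {tool.name}")
```
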
### Why Use Hooks vs. Streaming Model?

The streaming model approach operates at the LLM response level - it sees tokens as they're generated but has no visibility into tool lifecycle events. Hooks provide a simpler, more configurable way to track what the agent is doing, without needing to understand the plugin architecture's internals.

### Quick Start with TemporalStreamingHooks

```python
from agentex.lib.core.temporal.plugins.openai_agents import TemporalStreamingHooks
from agents import Agent, Runner

# Create an agent
agent = Agent(
    name="Assistant",
    model="gpt-4o",
    instructions="You are a helpful assistant",
    tools=[my_tool]  # Assume we have some tools
)

# Initialize hooks with your task_id
hooks = TemporalStreamingHooks(task_id="abc123")

# Run the agent - lifecycle events automatically stream to the UI!
result = await Runner.run(agent, "Hello", hooks=hooks)
```

That's it! Tool requests, tool responses, and handoffs are now automatically streamed to the AgentEx UI in real time.
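
The snippet above assumes `my_tool` already exists. For completeness, a minimal tool definition using the SDK's `function_tool` decorator might look like:

```python
from agents import function_tool

@function_tool
def my_tool(city: str) -> str:
    """Toy tool: report the weather for a city."""
    return f"It is sunny in {city}."
```
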
### What Gets Streamed by Hooks

The `TemporalStreamingHooks` class automatically streams:

1. **Tool Requests** (`on_tool_start`):
   - Fires when a tool is about to execute
   - Streams `ToolRequestContent` with tool name and call ID
   - Shows in the UI that a tool is being invoked
   - **Note**: Tool arguments are not available due to the OpenAI SDK architecture

2. **Tool Responses** (`on_tool_end`):
   - Fires when a tool completes execution
   - Streams `ToolResponseContent` with the tool result (sketched after this list)
   - Shows the tool output in the UI

3. **Agent Handoffs** (`on_handoff`):
   - Fires when control transfers between agents
   - Streams `TextContent` with a "Handoff from AgentA to AgentB" message
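
The `on_tool_start` side is shown in the implementation details section below. For the `on_tool_end` side, a sketch might look like the following; the `ToolResponseContent` field names here are assumptions for illustration, not the actual model definition:

```python
# Sketch only - mirrors the on_tool_start implementation shown later.
# ToolResponseContent field names are assumed; check the actual model.
async def on_tool_end(self, context, agent, tool, result):
    tool_call_id = context.tool_call_id if isinstance(context, ToolContext) else f"call_{id(tool)}"
    await workflow.execute_activity_method(
        stream_lifecycle_content,
        args=[
            self.task_id,
            ToolResponseContent(
                author="agent",
                tool_call_id=tool_call_id,
                name=tool.name,
                content=str(result),  # assumed field name for the tool output
            ),
        ],
        start_to_close_timeout=self.timeout,
    )
```
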
### Using Hooks in Temporal Workflows

When using hooks inside a Temporal workflow, combine them with the streaming context:

```python
from agents import Agent
from agents.run import get_default_agent_runner
from temporalio import workflow

from agentex.lib.core.temporal.plugins.openai_agents import TemporalStreamingHooks

@workflow.defn
class MyWorkflow:
    @workflow.run
    async def run(self, params):
        agent = Agent(
            name="Assistant",
            instructions="You are a helpful assistant",
            model="gpt-4o",
            tools=[my_search_tool, my_calculator_tool]
        )

        # Create hooks with the task_id for lifecycle event streaming
        hooks = TemporalStreamingHooks(task_id=params.task.id)

        # Pass context for LLM response streaming (from StreamingModel)
        context = {"task_id": params.task.id}

        # Run with BOTH hooks AND context for complete streaming coverage
        runner = get_default_agent_runner()
        result = await runner.run(
            agent,
            params.event.content,
            context=context,  # Enables LLM text/reasoning streaming
            hooks=hooks       # Enables tool/handoff lifecycle streaming
        )

        return result.final_output
```

### Advanced: Custom Hooks

Need custom behavior? Subclass `TemporalStreamingHooks` and override methods:

```python
from agents import Runner

from agentex.lib.core.temporal.plugins.openai_agents import TemporalStreamingHooks

class MyCustomHooks(TemporalStreamingHooks):
    async def on_tool_start(self, context, agent, tool):
        # Add custom logic before streaming
        print(f"About to call tool: {tool.name}")
        await self.my_custom_logging(tool)

        # Call parent to stream to UI
        await super().on_tool_start(context, agent, tool)

    async def on_agent_start(self, context, agent):
        # Override empty methods for additional tracking
        print(f"Agent {agent.name} started")
        await self.track_agent_metrics(agent)

    async def on_agent_end(self, context, agent, output):
        # Track completion metrics
        print(f"Agent {agent.name} completed")
        await self.save_agent_output(agent, output)

# Use your custom hooks
hooks = MyCustomHooks(task_id="abc123")
result = await Runner.run(agent, input, hooks=hooks)
```

### Configuration Options

```python
from datetime import timedelta

# Customize timeout for streaming activities
hooks = TemporalStreamingHooks(
    task_id="abc123",
    timeout=timedelta(seconds=30)  # Default is 10 seconds
)
```

### Hooks Implementation Details

Under the hood, `TemporalStreamingHooks` uses Temporal activities to stream events:

```python
# From hooks.py - simplified for illustration
class TemporalStreamingHooks(RunHooks):
    async def on_tool_start(self, context, agent, tool):
        # Extract tool_call_id from context
        tool_call_id = context.tool_call_id if isinstance(context, ToolContext) else f"call_{id(tool)}"

        # Stream via Temporal activity
        await workflow.execute_activity_method(
            stream_lifecycle_content,
            args=[
                self.task_id,
                ToolRequestContent(
                    author="agent",
                    tool_call_id=tool_call_id,
                    name=tool.name,
                    arguments={},  # Not available in hook context
                ),
            ],
            start_to_close_timeout=self.timeout,
        )
```

The `stream_lifecycle_content` activity then uses the AgentEx streaming infrastructure to push events to Redis, just like the streaming model does.
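
For intuition, such an activity could be implemented with the same streaming context the StreamingModel uses for complete messages. This is a sketch, not the actual `activities.py`; the import paths for `adk` and `StreamTaskMessageFull` are assumptions:

```python
from temporalio import activity

from agentex.lib import adk  # assumed import path
from agentex.types import StreamTaskMessageFull  # assumed import path

@activity.defn
async def stream_lifecycle_content(task_id: str, content) -> None:
    # Open a streaming context for the task and publish the lifecycle
    # event as one complete message, mirroring how the StreamingModel
    # publishes non-delta content (see the walkthrough above).
    async with adk.streaming.streaming_task_message_context(
        task_id=task_id,
        initial_content=content,
    ) as streaming_context:
        await streaming_context.stream_update(
            StreamTaskMessageFull(
                parent_task_message=streaming_context.task_message,
                content=content,
                type="full",
            )
        )
```
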
### Limitations

**Important**: Tool arguments are not available in `on_tool_start` hooks due to the OpenAI SDK architecture. The hook signature doesn't include tool arguments - they're only passed to the actual tool function. This is why `arguments={}` appears in `ToolRequestContent`.

If you need tool arguments in your streaming data, you'll need to:
1. Stream them from within the tool function itself (sketched below), or
2. Wait for `on_tool_end`, where you can log the full tool context
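
A sketch of option 1, streaming the real arguments from inside the tool itself. It reuses the streaming context pattern from above, reads the `task_id` out of the run context passed to the runner, and glosses over Temporal activity boundaries for brevity; the import paths and the placeholder `tool_call_id` are assumptions:

```python
from agents import RunContextWrapper, function_tool

from agentex.lib import adk  # assumed import path
from agentex.types import StreamTaskMessageFull, ToolRequestContent  # assumed import paths

@function_tool
async def search(ctx: RunContextWrapper, query: str) -> str:
    """Search tool that streams its own arguments before running."""
    task_id = ctx.context["task_id"]  # the context dict passed to runner.run

    content = ToolRequestContent(
        author="agent",
        tool_call_id="manual",  # the hook-provided call ID is not visible here
        name="search",
        arguments={"query": query},  # the real arguments, unavailable to on_tool_start
    )
    async with adk.streaming.streaming_task_message_context(
        task_id=task_id,
        initial_content=content,
    ) as sctx:
        await sctx.stream_update(
            StreamTaskMessageFull(parent_task_message=sctx.task_message, content=content, type="full")
        )

    return f"Results for {query}"
```
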
### Power Users: Direct RunHooks Subclassing

If you need complete control, skip `TemporalStreamingHooks` and subclass `agents.RunHooks` directly:

```python
from datetime import timedelta

from agents import RunHooks
from temporalio import workflow

from agentex.lib.core.temporal.plugins.openai_agents.activities import stream_lifecycle_content

class MyPowerUserHooks(RunHooks):
    def __init__(self, task_id: str):
        super().__init__()
        self.task_id = task_id

    async def on_tool_start(self, context, agent, tool):
        # Implement completely custom streaming logic
        await workflow.execute_activity_method(
            stream_lifecycle_content,
            args=[self.task_id, my_custom_content],  # my_custom_content: whatever content you build
            start_to_close_timeout=timedelta(seconds=10)
        )
```

---

## Usage

### Installation
