Commit cf96695

Add observability, checkpointing, and tool approval features

Major Features:
- OpenTelemetry integration following Gen-AI semantic conventions
- Workflow checkpoint system with file and memory storage backends
- Tool approval system with @tool decorator and ApprovalMode support
- Enhanced middleware pipeline with approval flow hooks
- Context management improvements and agent context support
- Memory tools and comprehensive examples

Infrastructure:
- Add poethepoet for task automation (poe test, poe check, etc.)
- Add OpenTelemetry optional dependencies group
- Enhanced Web UI with tool approval banners and example tasks display
- Improved debug panel with detailed execution traces

Examples:
- Memory management examples (examples/memory/)
- OpenTelemetry instrumentation examples (examples/otel/)
- Tool approval workflow examples (examples/tools/)
- Workflow checkpointing examples (examples/workflows/)

Testing:
- Add tests for memory tool functionality
- Add tests for OpenTelemetry integration
- Add tests for tool approval system
- Reorganize workflow tests to tests/workflow/

Cleanup:
- Remove deprecated planning tools module
- Clean up old workflow test files from src
- Update frontend build artifacts

1 parent 3c70ecc commit cf96695

78 files changed: +8561 -1047 lines changed

.gitignore

Lines changed: 13 additions & 3 deletions
@@ -6,18 +6,28 @@ __pycache__/
 *$py.class
 CLAUDE.md
 *.DS_Store
-
+checkpoints/*
 */work_dir
 .chainlit
-.files
+.files
 chainlit.md
 *.env
 /data*
-.run_samples.sh
+.run_samples.sh
 tests/test_fresh_install.sh
 
 
 test_fresh_install.sh
+
+# Agent example artifacts - created when running agent examples
+agent_workspace/
+agent_memory/
+*_workspace/
+*_memory/
+*.json
+!pyproject.toml
+!package.json
+!tsconfig.json
 # C extensions
 *.so
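
The JSON rules added in this hunk depend on ordering: `*.json` ignores every JSON file, and the later `!` patterns re-include specific names. The following is a minimal sketch of that last-match-wins evaluation, assuming basename-only patterns and leaving out Git's directory, anchoring, and parent-exclusion rules. `is_ignored` and `PATTERNS` are hypothetical names for illustration, not part of Git or this repository.

```python
from fnmatch import fnmatchcase

# The four patterns added by this commit, in file order.
PATTERNS = ["*.json", "!pyproject.toml", "!package.json", "!tsconfig.json"]


def is_ignored(filename: str, patterns=PATTERNS) -> bool:
    """Simplified gitignore check: the last matching pattern decides."""
    ignored = False
    for pattern in patterns:
        negated = pattern.startswith("!")
        glob = pattern[1:] if negated else pattern
        if fnmatchcase(filename, glob):
            # A plain pattern ignores the file; a "!" pattern re-includes it.
            ignored = not negated
    return ignored


for name in ("results.json", "package.json", "tsconfig.json", "pyproject.toml"):
    print(name, is_ignored(name))
```

Among these four patterns, `pyproject.toml` is never matched by `*.json`, so its negation only matters if some other rule in the file ignores it.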

README.md

Lines changed: 12 additions & 10 deletions
@@ -6,7 +6,7 @@ Official code repository for [Designing Multi-Agent Systems: Principles, Pattern
 
 ![Designing Multi-Agent Systems](./docs/images/bookcover.png)
 
-Learn to build effective multi-agent systems from first principles through complete, tested implementations. This repository includes **PicoAgents**—a full-featured multi-agent framework built entirely from scratch for the sole purpose of teaching you how multi-agent systems work. Every component, from agent reasoning loops to orchestration patterns, is implemented with clarity and transparency so you can understand exactly how production systems are built.
+Learn to build effective multi-agent systems from first principles through complete, tested implementations. This repository includes **PicoAgents**—a full-featured multi-agent framework built entirely from scratch for the sole purpose of teaching you how multi-agent systems work. Every component, from agent reasoning loops to orchestration patterns, is implemented with clarity and transparency.
 
 [📖 Buy Digital Edition](https://buy.multiagentbook.com) | [🛒 Buy Print - Coming Soon]()
 
@@ -39,7 +39,7 @@ The book is organized across 4 parts, taking you from theory to production:
3939

4040
| Chapter | Title | Code | Learning Outcome |
4141
| -------- | ------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------- |
42-
| **Ch 4** | Building Your First Agent | [`agents/_agent.py`](picoagents/src/picoagents/agents/_agent.py), [`basic-agent.py`](examples/agents/basic-agent.py), [`memory.py`](examples/agents/memory.py), [`middleware.py`](examples/agents/middleware.py) <br> [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/victordibia/designing-multiagent-systems/blob/main/examples/notebooks/01_basic_agent.ipynb) | Create production agents with tools, memory, streaming, and middleware |
42+
| **Ch 4** | Building Your First Agent | [`agents/_agent.py`](picoagents/src/picoagents/agents/_agent.py), [`basic-agent.py`](examples/agents/basic-agent.py), [`memory.py`](examples/agents/memory.py), [`middleware.py`](examples/agents/middleware.py) <br> [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/victordibia/designing-multiagent-systems/blob/main/examples/notebooks/01_basic_agent.ipynb) | Build agents with tools, memory, streaming, and middleware |
4343
| **Ch 5** | Computer Use Agents | [`agents/_computer_use/`](picoagents/src/picoagents/agents/_computer_use/), [`computer_use.py`](examples/agents/computer_use.py) | Build browser automation agents with multimodal reasoning |
4444
| **Ch 5** | Building Multi-Agent Workflows | [`workflow/`](picoagents/src/picoagents/workflow/), [`data_visualization/`](examples/workflows/data_visualization/) | Build type-safe workflows with streaming observability |
4545
| **Ch 6** | Autonomous Multi-Agent Orchestration | [`orchestration/`](picoagents/src/picoagents/orchestration/), [`round-robin.py`](examples/orchestration/round-robin.py), [`ai-driven.py`](examples/orchestration/ai-driven.py), [`plan-based.py`](examples/orchestration/plan-based.py) | Implement GroupChat, LLM-driven, and plan-based orchestration (Magentic One patterns) |
@@ -62,7 +62,7 @@ The book is organized across 4 parts, taking you from theory to production:
 
 ### Option 1: Interactive Notebooks
 
-Click Colab badges in the chapter tables below to run examples in your browser. No installation required.
+Click Colab badges in the chapter tables above to run examples in your browser. No installation required.
 
 ### Option 2: GitHub Codespaces
 
@@ -204,12 +204,14 @@ examples/
 
 ## Key Features
 
-🎯 **Production Patterns**
+🎯 **Production-Ready Patterns**
 
-- Two-stage filtering: Reduce LLM costs by 90% (YC Analysis case study)
-- Structured outputs: Eliminate hallucination with Pydantic models
-- Checkpointing: Resumable workflows with state persistence
-- Think tool: Structured reasoning for 54% performance improvement
+Illustrated through real-world case studies (see [YC Analysis workflow](examples/workflows/yc_analysis/)):
+
+- Cost optimization: Two-stage filtering for 90% LLM cost reduction
+- Type safety: Structured outputs with Pydantic validation
+- Reliability: Checkpointing and resumable workflows
+- Advanced reasoning: Think tool for improved problem-solving (54% performance gain)
 
 🖥️ **Computer Use Agents**
 
@@ -237,8 +239,8 @@ examples/
 
 This repository implements every concept from the book. The book provides the theory, design trade-offs, and production considerations you need to build effective multi-agent systems.
 
-- Digital Edition - [Link](https://buy.multiagentbook.com)
-- Buy Print Edition on [Amazon - Coming Soon]()
+- [Buy Digital Edition](https://buy.multiagentbook.com)
+- Buy Print Edition on Amazon - Coming Soon
 
 ## Questions and Feedback

examples/agents/agent_as_tool.py

Lines changed: 3 additions & 0 deletions
@@ -62,6 +62,9 @@ def tool_agents():
         instructions="You solve tasks by delegating to the relevant agents or tools",
         model_client=model_client,
         tools=[weather_agent.as_tool(), analysis_agent.as_tool()],
+        example_tasks=[
+            "Get the current weather in New York and analyze recent sales data.",
+            "Provide a brief report on the weather in San Francisco and its impact on outdoor events.",]
     )
 

examples/agents/basic-agent.py

Lines changed: 6 additions & 0 deletions
@@ -33,6 +33,12 @@ def calculate(expression: str) -> str:
     instructions="You are a helpful assistant with access to weather and calculation tools. Use them when appropriate.",
     model_client=OpenAIChatCompletionClient(model="gpt-4.1-mini"),
     tools=[get_weather, calculate],
+    example_tasks=[
+        "What's the weather in San Francisco?",
+        "Calculate 125 * 48",
+        "What's the weather in Tokyo and what's 15% of 240?",
+        "Is it sunny in London?",
+    ],
 )
 

examples/agents/middleware_custom.py

Lines changed: 6 additions & 5 deletions
@@ -357,7 +357,7 @@ def get_metrics(self) -> Dict:
 
         return metrics
 
-    async def process_request(self, context: MiddlewareContext) -> MiddlewareContext:
+    async def process_request(self, context: MiddlewareContext):
         """Start operation tracking."""
         context.metadata["start_time"] = datetime.now()
 
@@ -369,21 +369,21 @@ async def process_request(self, context: MiddlewareContext) -> MiddlewareContext
         if context.operation == "model_call" and isinstance(context.data, list):
             logger.info(f" Context size: {len(context.data)} messages")
 
-        return context
+        yield context
 
-    async def process_response(self, context: MiddlewareContext, result: Any) -> Any:
+    async def process_response(self, context: MiddlewareContext, result: Any):
         """Record successful operation."""
         duration = (datetime.now() - context.metadata["start_time"]).total_seconds()
         self._record_metric(context.operation, duration, success=True)
 
         if self.enable_detailed_logging:
             logger.info(f"✅ Completed {context.operation} in {duration:.3f}s")
 
-        return result
+        yield result
 
     async def process_error(
         self, context: MiddlewareContext, error: Exception
-    ) -> Optional[Any]:
+    ):
         """Record failed operation."""
         duration = (datetime.now() - context.metadata["start_time"]).total_seconds()
         self._record_metric(context.operation, duration, success=False)
@@ -392,6 +392,7 @@ async def process_error(
         logger.error(f"❌ Failed {context.operation} after {duration:.3f}s: {error_msg}")
 
         raise error
+        yield  # pragma: no cover
 
 
 # =============================================================================
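
The hooks in this file switch from `return` to `yield`, which turns each one into an async generator. The trailing `yield` after `raise error` is unreachable, but its presence is what makes `process_error` an async generator function too. The following is a minimal sketch of how a caller might consume such hooks, assuming the caller keeps the last yielded item. `Context`, `DemoMiddleware`, and `last_item` are hypothetical stand-ins, and the actual PicoAgents pipeline may consume the hooks differently.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Any, AsyncGenerator, Dict


@dataclass
class Context:
    """Hypothetical stand-in for MiddlewareContext."""
    operation: str
    metadata: Dict[str, Any] = field(default_factory=dict)


class DemoMiddleware:
    async def process_request(self, context: Context) -> AsyncGenerator[Any, None]:
        context.metadata["seen"] = True
        yield context  # yielding (not returning) makes this an async generator

    async def process_error(
        self, context: Context, error: Exception
    ) -> AsyncGenerator[Any, None]:
        context.metadata["failed"] = True
        raise error
        yield  # unreachable; keeps the function an async generator


async def last_item(agen: AsyncGenerator[Any, None]) -> Any:
    """Drain an async generator and return its final yielded item."""
    result = None
    async for item in agen:
        result = item
    return result


async def demo():
    mw = DemoMiddleware()
    ctx = await last_item(mw.process_request(Context(operation="model_call")))
    try:
        await last_item(mw.process_error(ctx, ValueError("boom")))
    except ValueError as exc:
        # The exception raised inside the generator propagates through `async for`.
        return ctx.metadata, str(exc)


result = asyncio.run(demo())
print(result)
```

Because every hook has the same generator shape, a caller can iterate over all of them uniformly, including hooks that only ever raise.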
