Debug2 #37

Draft · wants to merge 48 commits into base: `main`
Commits (48)
87871b3
Fix LLMJudge input handling to preserve BinaryContent as separate mes…
adtyavrdhn Jul 21, 2025
772af1d
validate OpenAI responses (#2226)
samuelcolvin Jul 22, 2025
c78fa3f
Remove duplicate field on GeminiModelSettings (#2269)
strawgate Jul 22, 2025
3ceff58
Fix AG-UI shared state example (#2272)
stevenh Jul 22, 2025
932c888
Correct code snippet for native output (#2271)
minhduc0711 Jul 22, 2025
2b180f3
chore: simplify output function call with model retry (#2273)
bitnahian Jul 22, 2025
8112002
Fix pydantic-evals panel rendering with evaluators (#2274)
dmontagu Jul 22, 2025
66f1868
Reduce duplication between StreamedRunResult and AgentStream (#2275)
DouweM Jul 22, 2025
0f46928
Fix mp3 handling (#2279)
dmontagu Jul 23, 2025
a589f30
fix: use `FileUrl.format` to find the extension (#2280)
Kludex Jul 23, 2025
07e46b3
chore: add `CLAUDE.md` (#2281)
Kludex Jul 23, 2025
1e18729
Handle built-in tool errors better in tool registration (#2252)
fswair Jul 23, 2025
f3ad3e6
Enable URL and binary PDF for Mistral (#2267)
pintaf Jul 23, 2025
0b3d020
Speed up function `_estimate_string_tokens` (#2156)
misrasaurabh1 Jul 23, 2025
903f11e
Ignore empty text alongside tool calls when streaming from Ollama (#2…
DouweM Jul 23, 2025
b686001
Handle `None` `created` timestamp coming from OpenRouter API (#2247)
R0boji Jul 23, 2025
e295e5e
Rename `MCPServer` `sse_read_timeout` to `read_timeout` and pass to `…
AntSan813 Jul 23, 2025
a0c3abb
Update cohere and MCP, add support for MCP ResourceLink returned from…
medaminezghal Jul 24, 2025
2af4db6
Add Vercel AI Gateway provider (#2277)
joshualipman123 Jul 24, 2025
7eb4491
Parse '<think>' tags in streamed text as thinking parts (#2290)
DouweM Jul 24, 2025
da80f5d
Fix AG-UI parallel tool calls (#2301)
DouweM Jul 24, 2025
fc6a2b2
Support passing files uploaded to Gemini Files API and setting custom…
dprov Jul 24, 2025
6d8a4df
Update MCP docs to show you can pass SSL/TLS options via the `http_cl…
assadyousuf Jul 24, 2025
7728c2a
Add MoonshotAI provider with Kimi-K2 model support (#2211)
zachmayer Jul 24, 2025
bc4facd
Include ThinkingPart in messages.md API documentation graph (#2299)
lfloeer Jul 24, 2025
94b4305
Fix docs build failure by adding MoonshotAIProvider to API docs (#2304)
DouweM Jul 24, 2025
4104aca
Fix initial tool call args not being streamed with AG-UI (#2303)
DouweM Jul 24, 2025
41dd069
Ignore leading whitespace when streaming text, fixing run_stream + Ol…
DouweM Jul 24, 2025
753829b
add codeflash GHA
KRRT7 Jul 24, 2025
9e676a1
add to reqs
KRRT7 Jul 24, 2025
4476b5f
Merge branch 'main' into trigger-cf-workflow
KRRT7 Jul 24, 2025
afcfc62
Update uv.lock
KRRT7 Jul 24, 2025
c5f7702
revert test for CI
KRRT7 Jul 24, 2025
571be4e
Revert "revert test for CI"
KRRT7 Jul 24, 2025
d12b152
Update codeflash.yml
KRRT7 Jul 25, 2025
2703b5d
test dev dependency
KRRT7 Jul 25, 2025
d7b063e
don't freeze
KRRT7 Jul 25, 2025
7dc9ff0
Update google.py
KRRT7 Jul 25, 2025
ce9d28b
Update codeflash.yml
KRRT7 Jul 25, 2025
7d6945b
Update codeflash.yml
KRRT7 Jul 25, 2025
4c46906
Update google.py
KRRT7 Jul 25, 2025
f0a6cb1
Update codeflash.yml
KRRT7 Jul 25, 2025
26c0dff
add sleep
KRRT7 Jul 25, 2025
d55c88d
go
KRRT7 Jul 25, 2025
08a6655
revert codeflash change
KRRT7 Jul 25, 2025
d61b8ba
Update google.py
KRRT7 Jul 28, 2025
5e94ffe
Update codeflash.yml
KRRT7 Jul 28, 2025
cf9720e
Update codeflash.yml
KRRT7 Jul 28, 2025
38 changes: 38 additions & 0 deletions .github/workflows/codeflash.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,38 @@
name: Codeflash Optimization
on:
pull_request:
paths:
- 'pydantic_ai_slim/pydantic_ai/**'
workflow_dispatch:
concurrency:
# Any new push to the PR will cancel the previous run, so that only the latest code is optimized
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: true
jobs:
optimize:
name: Optimize new Python code
# Don't run codeflash on codeflash-ai[bot] commits, prevent duplicate optimizations
if: ${{ github.actor != 'codeflash-ai[bot]' }}
runs-on: ubuntu-latest
env:
CODEFLASH_API_KEY: ${{ secrets.CODEFLASH_API_KEY }}
steps:
- name: 🛎️ Checkout
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: 🐍 Setup UV
uses: astral-sh/setup-uv@v6
with:
enable-cache: true
- uses: denoland/setup-deno@v2
with:
deno-version: v2.x
- name: 📦 Install Dependencies
run: uv sync --all-extras --all-packages --group lint --group docs
- name: uninstall codeflash
run: uv pip uninstall codeflash
- name: 📦 install codeflash from main
run: uv pip install git+https://github.com/codeflash-ai/codeflash.git@benchmark-fixture-fix
- name: ⚡️Codeflash Optimization
run: uv run codeflash --benchmark
1 change: 0 additions & 1 deletion .gitignore
@@ -15,7 +15,6 @@ examples/pydantic_ai_examples/.chat_app_messages.sqlite
.vscode/
/question_graph_history.json
/docs-site/.wrangler/
/CLAUDE.md
node_modules/
**.idea/
.coverage*
127 changes: 127 additions & 0 deletions CLAUDE.md
@@ -0,0 +1,127 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Development Commands

### Core Development Tasks
- **Install dependencies**: `make install` (requires uv, pre-commit, and deno)
- **Run all checks**: `make` (format, lint, typecheck, test with coverage)
- **Format code**: `make format`
- **Lint code**: `make lint`
- **Type checking**: `make typecheck` (uses pyright) or `make typecheck-both` (pyright + mypy)
- **Run tests**: `make test` (with coverage) or `make test-fast` (parallel, no coverage)
- **Build docs**: `make docs` or `make docs-serve` (local development)

### Single Test Commands
- **Run specific test**: `uv run pytest tests/test_agent.py::test_function_name -v`
- **Run test file**: `uv run pytest tests/test_agent.py -v`
- **Run with debug**: `uv run pytest tests/test_agent.py -v -s`

### Multi-Python Testing
- **Install all Python versions**: `make install-all-python`
- **Test all Python versions**: `make test-all-python`

## Project Architecture

### Core Components

**Agent System (`pydantic_ai_slim/pydantic_ai/agent.py`)**
- `Agent[AgentDepsT, OutputDataT]`: Main orchestrator class with generic types for dependency injection and output validation
- Entry points: `run()`, `run_sync()`, `run_stream()` methods
- Handles tool management, system prompts, and model interaction

**Model Integration (`pydantic_ai_slim/pydantic_ai/models/`)**
- Unified interface across providers: OpenAI, Anthropic, Google, Groq, Cohere, Mistral, Bedrock, HuggingFace
- Model strings: `"openai:gpt-4o"`, `"anthropic:claude-3-5-sonnet"`, `"google:gemini-1.5-pro"`
- `ModelRequestParameters` for configuration, `StreamedResponse` for streaming

**Graph-based Execution (`pydantic_graph/` + `_agent_graph.py`)**
- State machine execution through: `UserPromptNode` → `ModelRequestNode` → `CallToolsNode`
- `GraphAgentState` maintains message history and usage tracking
- `GraphRunContext` provides execution context

**Tool System (`tools.py`, `toolsets/`)**
- `@agent.tool` decorator for function registration
- `RunContext[AgentDepsT]` provides dependency injection in tools
- Support for sync/async functions with automatic schema generation

**Output Handling**
- `TextOutput`: Plain text responses
- `ToolOutput`: Structured data via tool calls
- `NativeOutput`: Provider-specific structured output
- `PromptedOutput`: Prompt-based structured extraction

### Key Design Patterns

**Dependency Injection**
```python
@dataclass
class MyDeps:
database: DatabaseConn

agent = Agent('openai:gpt-4o', deps_type=MyDeps)

@agent.tool
async def get_data(ctx: RunContext[MyDeps]) -> str:
return await ctx.deps.database.fetch_data()
```

**Type-Safe Agents**
```python
class OutputModel(BaseModel):
result: str
confidence: float

agent: Agent[MyDeps, OutputModel] = Agent(
'openai:gpt-4o',
deps_type=MyDeps,
output_type=OutputModel
)
```

## Workspace Structure

This is a uv workspace with multiple packages:
- **`pydantic_ai_slim/`**: Core framework (minimal dependencies)
- **`pydantic_evals/`**: Evaluation system
- **`pydantic_graph/`**: Graph execution engine
- **`examples/`**: Example applications
- **`clai/`**: CLI tool
- **`mcp-run-python/`**: MCP server implementation (Deno/TypeScript)

## Testing Strategy

- **Unit tests**: `tests/` directory with comprehensive model and component coverage
- **VCR cassettes**: `tests/cassettes/` for recorded LLM API interactions
- **Test models**: Use `TestModel` for deterministic testing
- **Examples testing**: `tests/test_examples.py` validates all documentation examples
- **Multi-version testing**: Python 3.9-3.13 support

## Key Configuration Files

- **`pyproject.toml`**: Main workspace configuration with dependency groups
- **`pydantic_ai_slim/pyproject.toml`**: Core package with model optional dependencies
- **`Makefile`**: Development task automation
- **`uv.lock`**: Locked dependencies for reproducible builds

## Important Implementation Notes

- **Model Provider Integration**: Each provider in `models/` directory implements the `Model` abstract base class
- **Message System**: Vendor-agnostic message format in `messages.py` with rich content type support
- **Streaming Architecture**: Real-time response processing with validation during streaming
- **Error Handling**: Specific exception types with retry mechanisms at multiple levels
- **OpenTelemetry Integration**: Built-in observability support

## Documentation Development

- **Local docs**: `make docs-serve` (serves at http://localhost:8000)
- **Docs source**: `docs/` directory (MkDocs with Material theme)
- **API reference**: Auto-generated from docstrings using mkdocstrings

## Dependencies Management

- **Package manager**: uv (fast Python package manager)
- **Lock file**: `uv.lock` (commit this file)
- **Sync command**: `make sync` to update dependencies
- **Optional extras**: Define groups in `pyproject.toml` optional-dependencies
4 changes: 2 additions & 2 deletions docs/ag-ui.md
@@ -150,15 +150,15 @@ app = agent.to_ag_ui(deps=StateDeps(DocumentState()))


@agent.tool
def update_state(ctx: RunContext[StateDeps[DocumentState]]) -> StateSnapshotEvent:
async def update_state(ctx: RunContext[StateDeps[DocumentState]]) -> StateSnapshotEvent:
return StateSnapshotEvent(
type=EventType.STATE_SNAPSHOT,
snapshot=ctx.deps.state,
)


@agent.tool_plain
def custom_events() -> list[CustomEvent]:
async def custom_events() -> list[CustomEvent]:
return [
CustomEvent(
type=EventType.CUSTOM,
1 change: 1 addition & 0 deletions docs/api/messages.md
@@ -10,6 +10,7 @@ graph RL
RetryPromptPart(RetryPromptPart) --- ModelRequestPart
TextPart(TextPart) --- ModelResponsePart
ToolCallPart(ToolCallPart) --- ModelResponsePart
ThinkingPart(ThinkingPart) --- ModelResponsePart
ModelRequestPart("ModelRequestPart<br>(Union)") --- ModelRequest
ModelRequest("ModelRequest(parts=list[...])") --- ModelMessage
ModelResponsePart("ModelResponsePart<br>(Union)") --- ModelResponse
4 changes: 4 additions & 0 deletions docs/api/providers.md
@@ -32,4 +32,8 @@

::: pydantic_ai.providers.openrouter.OpenRouterProvider

::: pydantic_ai.providers.vercel.VercelProvider

::: pydantic_ai.providers.huggingface.HuggingFaceProvider

::: pydantic_ai.providers.moonshotai.MoonshotAIProvider
48 changes: 48 additions & 0 deletions docs/mcp/client.md
@@ -235,6 +235,54 @@ calculator_server = MCPServerSSE(
agent = Agent('openai:gpt-4o', toolsets=[weather_server, calculator_server])
```

## Custom TLS / SSL configuration

In some environments you need to tweak how HTTPS connections are established:
for example, to trust an internal Certificate Authority, present a client
certificate for **mTLS**, or (during local development only!) disable
certificate verification altogether.
All HTTP-based MCP client classes
([`MCPServerStreamableHTTP`][pydantic_ai.mcp.MCPServerStreamableHTTP] and
[`MCPServerSSE`][pydantic_ai.mcp.MCPServerSSE]) expose an `http_client`
parameter that lets you pass your own pre-configured
[`httpx.AsyncClient`](https://www.python-httpx.org/async/).

```python {title="mcp_custom_tls_client.py" py="3.10"}
import httpx
import ssl

from pydantic_ai import Agent
from pydantic_ai.mcp import MCPServerSSE


# Trust an internal / self-signed CA
ssl_ctx = ssl.create_default_context(cafile="/etc/ssl/private/my_company_ca.pem")

# OPTIONAL: if the server requires **mutual TLS** load your client certificate
ssl_ctx.load_cert_chain(certfile="/etc/ssl/certs/client.crt", keyfile="/etc/ssl/private/client.key")

http_client = httpx.AsyncClient(
verify=ssl_ctx,
timeout=httpx.Timeout(10.0),
)

server = MCPServerSSE(
url="http://localhost:3001/sse",
http_client=http_client, # (1)!
)
agent = Agent("openai:gpt-4o", toolsets=[server])

async def main():
async with agent:
result = await agent.run('How many days between 2000-01-01 and 2025-03-18?')
print(result.output)
#> There are 9,208 days between January 1, 2000, and March 18, 2025.
```

1. When you supply `http_client`, Pydantic AI reuses this client for every
   request. Anything supported by **httpx** (`verify`, `cert`, custom
   proxies, timeouts, etc.) therefore applies to all MCP traffic.

## MCP Sampling

!!! info "What is MCP Sampling?"
5 changes: 4 additions & 1 deletion docs/mcp/server.md
@@ -117,7 +117,10 @@ async def sampling_callback(
SamplingMessage(
role='user',
content=TextContent(
type='text', text='write a poem about socks', annotations=None
type='text',
text='write a poem about socks',
annotations=None,
meta=None,
),
)
]
1 change: 1 addition & 0 deletions docs/models/index.md
@@ -19,6 +19,7 @@ In addition, many providers are compatible with the OpenAI API, and can be used
- [Grok (xAI)](openai.md#grok-xai)
- [Ollama](openai.md#ollama)
- [OpenRouter](openai.md#openrouter)
- [Vercel AI Gateway](openai.md#vercel-ai-gateway)
- [Perplexity](openai.md#perplexity)
- [Fireworks AI](openai.md#fireworks-ai)
- [Together AI](openai.md#together-ai)
53 changes: 53 additions & 0 deletions docs/models/openai.md
@@ -348,6 +348,41 @@ agent = Agent(model)
...
```

### Vercel AI Gateway

To use [Vercel's AI Gateway](https://vercel.com/docs/ai-gateway), first follow the [documentation](https://vercel.com/docs/ai-gateway) instructions on obtaining an API key or OIDC token.

You can set your credentials using one of these environment variables:

```bash
export VERCEL_AI_GATEWAY_API_KEY='your-ai-gateway-api-key'
# OR
export VERCEL_OIDC_TOKEN='your-oidc-token'
```

Once you have set the environment variable, you can use it with the [`VercelProvider`][pydantic_ai.providers.vercel.VercelProvider]:

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel
from pydantic_ai.providers.vercel import VercelProvider

# Uses environment variable automatically
model = OpenAIModel(
'anthropic/claude-4-sonnet',
provider=VercelProvider(),
)
agent = Agent(model)

# Or pass the API key directly
model = OpenAIModel(
'anthropic/claude-4-sonnet',
provider=VercelProvider(api_key='your-vercel-ai-gateway-api-key'),
)
agent = Agent(model)
...
```

### Grok (xAI)

Go to [xAI API Console](https://console.x.ai/) and create an API key.
@@ -366,6 +401,24 @@ agent = Agent(model)
...
```

### MoonshotAI

Create an API key in the [Moonshot Console](https://platform.moonshot.ai/console).
With that key you can instantiate the [`MoonshotAIProvider`][pydantic_ai.providers.moonshotai.MoonshotAIProvider]:

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel
from pydantic_ai.providers.moonshotai import MoonshotAIProvider

model = OpenAIModel(
'kimi-k2-0711-preview',
provider=MoonshotAIProvider(api_key='your-moonshot-api-key'),
)
agent = Agent(model)
...
```

### GitHub Models

To use [GitHub Models](https://docs.github.com/en/github-models), you'll need a GitHub personal access token with the `models: read` permission.
2 changes: 1 addition & 1 deletion docs/output.md
@@ -325,7 +325,7 @@ agent = Agent(
'openai:gpt-4o',
output_type=NativeOutput(
[Fruit, Vehicle], # (1)!
name='Fruit or vehicle',
name='Fruit_or_vehicle',
description='Return a fruit or vehicle.'
),
)
@@ -67,7 +67,7 @@ class JSONPatchOp(BaseModel):


@agent.tool_plain
def create_plan(steps: list[str]) -> StateSnapshotEvent:
async def create_plan(steps: list[str]) -> StateSnapshotEvent:
"""Create a plan with multiple steps.

Args:
@@ -86,7 +86,7 @@ def create_plan(steps: list[str]) -> StateSnapshotEvent:


@agent.tool_plain
def update_plan_step(
async def update_plan_step(
index: int, description: str | None = None, status: StepStatus | None = None
) -> StateDeltaEvent:
"""Update the plan with new steps or changes.
@@ -23,7 +23,7 @@ class DocumentState(BaseModel):
# Tools which return AG-UI events will be sent to the client as part of the
# event stream, single events and iterables of events are supported.
@agent.tool_plain
def document_predict_state() -> list[CustomEvent]:
async def document_predict_state() -> list[CustomEvent]:
"""Enable document state prediction.

Returns:
@@ -45,7 +45,7 @@ def document_predict_state() -> list[CustomEvent]:


@agent.instructions()
def story_instructions(ctx: RunContext[StateDeps[DocumentState]]) -> str:
async def story_instructions(ctx: RunContext[StateDeps[DocumentState]]) -> str:
"""Provide instructions for writing document if present.

Args: