Commit 9d31dd0

Merge branch 'main' into feature/1783-minimal-mcp-resources-support
2 parents 4352f45 + eae558b commit 9d31dd0

31 files changed: 1088 additions and 179 deletions.

docs/durable_execution/temporal.md

Lines changed: 1 addition & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -172,7 +172,7 @@ As workflows and activities run in separate processes, any values passed between
172172

173173
To account for these limitations, tool functions and the [event stream handler](#streaming) running inside activities receive a limited version of the agent's [`RunContext`][pydantic_ai.tools.RunContext], and it's your responsibility to make sure that the [dependencies](../dependencies.md) object provided to [`TemporalAgent.run()`][pydantic_ai.durable_exec.temporal.TemporalAgent.run] can be serialized using Pydantic.
174174

175-
Specifically, only the `deps`, `run_id`, `retries`, `tool_call_id`, `tool_name`, `tool_call_approved`, `retry`, `max_retries`, `run_step` and `partial_output` fields are available by default, and trying to access `model`, `usage`, `prompt`, `messages`, or `tracer` will raise an error.
175+
Specifically, only the `deps`, `run_id`, `retries`, `tool_call_id`, `tool_name`, `tool_call_approved`, `retry`, `max_retries`, `run_step`, `usage`, and `partial_output` fields are available by default, and trying to access `model`, `prompt`, `messages`, or `tracer` will raise an error.
176176
If you need one or more of these attributes to be available inside activities, you can create a [`TemporalRunContext`][pydantic_ai.durable_exec.temporal.TemporalRunContext] subclass with custom `serialize_run_context` and `deserialize_run_context` class methods and pass it to [`TemporalAgent`][pydantic_ai.durable_exec.temporal.TemporalAgent] as `run_context_type`.
177177

178178
### Streaming

docs/logfire.md

Lines changed: 16 additions & 35 deletions
@@ -106,49 +106,30 @@ We can also query data with SQL in Logfire to monitor the performance of an appl
106106

107107
### Monitoring HTTP Requests
108108

109-
!!! tip "\"F**k you, show me the prompt.\""
110-
As per Hamel Husain's influential 2024 blog post ["Fuck You, Show Me The Prompt."](https://hamel.dev/blog/posts/prompt/)
111-
(bear with the capitalization, the point is valid), it's often useful to be able to view the raw HTTP requests and responses made to model providers.
109+
As per Hamel Husain's influential 2024 blog post ["Fuck You, Show Me The Prompt."](https://hamel.dev/blog/posts/prompt/)
110+
(bear with the capitalization, the point is valid), it's often useful to be able to view the raw HTTP requests and responses made to model providers.
112111

113-
To observe raw HTTP requests made to model providers, you can use Logfire's [HTTPX instrumentation](https://logfire.pydantic.dev/docs/integrations/http-clients/httpx/) since all provider SDKs use the [HTTPX](https://www.python-httpx.org/) library internally.
112+
To observe raw HTTP requests made to model providers, you can use Logfire's [HTTPX instrumentation](https://logfire.pydantic.dev/docs/integrations/http-clients/httpx/) since all provider SDKs (except for [Bedrock](models/bedrock.md)) use the [HTTPX](https://www.python-httpx.org/) library internally:
114113

115-
=== "With HTTP instrumentation"
116114

117-
```py {title="with_logfire_instrument_httpx.py" hl_lines="7"}
118-
import logfire
119-
120-
from pydantic_ai import Agent
121-
122-
logfire.configure()
123-
logfire.instrument_pydantic_ai()
124-
logfire.instrument_httpx(capture_all=True) # (1)!
125-
agent = Agent('openai:gpt-5')
126-
result = agent.run_sync('What is the capital of France?')
127-
print(result.output)
128-
#> The capital of France is Paris.
129-
```
130-
131-
1. See the [`logfire.instrument_httpx` docs][logfire.Logfire.instrument_httpx] for more details. `capture_all=True` means both headers and body are captured for both the request and response.
132-
133-
![Logfire with HTTPX instrumentation](img/logfire-with-httpx.png)
134-
135-
=== "Without HTTP instrumentation"
115+
```py {title="with_logfire_instrument_httpx.py" hl_lines="7"}
116+
import logfire
136117

137-
```py {title="without_logfire_instrument_httpx.py"}
138-
import logfire
118+
from pydantic_ai import Agent
139119

140-
from pydantic_ai import Agent
120+
logfire.configure()
121+
logfire.instrument_pydantic_ai()
122+
logfire.instrument_httpx(capture_all=True) # (1)!
141123

142-
logfire.configure()
143-
logfire.instrument_pydantic_ai()
124+
agent = Agent('openai:gpt-5')
125+
result = agent.run_sync('What is the capital of France?')
126+
print(result.output)
127+
#> The capital of France is Paris.
128+
```
144129

145-
agent = Agent('openai:gpt-5')
146-
result = agent.run_sync('What is the capital of France?')
147-
print(result.output)
148-
#> The capital of France is Paris.
149-
```
130+
1. See the [`logfire.instrument_httpx` docs][logfire.Logfire.instrument_httpx] for more details. `capture_all=True` means both headers and body are captured for both the request and response.
150131

151-
![Logfire without HTTPX instrumentation](img/logfire-without-httpx.png)
132+
![Logfire with HTTPX instrumentation](img/logfire-with-httpx.png)
152133

153134
## Using OpenTelemetry
154135

docs/mcp/client.md

Lines changed: 23 additions & 0 deletions
@@ -338,6 +338,29 @@ calculator_server = MCPServerSSE(
338338
agent = Agent('openai:gpt-5', toolsets=[weather_server, calculator_server])
339339
```
340340

341+
## Server Instructions
342+
343+
MCP servers can provide instructions during initialization that give context about how to best interact with the server's tools. These instructions are accessible via the [`instructions`][pydantic_ai.mcp.MCPServer.instructions] property after the server connection is established.
344+
345+
```python {title="mcp_server_instructions.py"}
346+
from pydantic_ai import Agent
347+
from pydantic_ai.mcp import MCPServerStreamableHTTP
348+
349+
server = MCPServerStreamableHTTP('http://localhost:8000/mcp')
350+
agent = Agent('openai:gpt-5', toolsets=[server])
351+
352+
@agent.instructions
353+
async def mcp_server_instructions():
354+
return server.instructions # (1)!
355+
356+
async def main():
357+
result = await agent.run('What is 7 plus 5?')
358+
print(result.output)
359+
#> The answer is 12.
360+
```
361+
362+
1. The server connection is guaranteed to be established by this point, so `server.instructions` is available.
363+
341364
## Tool metadata
342365

343366
MCP tools can include metadata that provides additional information about the tool's characteristics, which can be useful when [filtering tools][pydantic_ai.toolsets.FilteredToolset]. The `meta`, `annotations`, and `output_schema` fields can be found on the `metadata` dict on the [`ToolDefinition`][pydantic_ai.tools.ToolDefinition] object that's passed to filter functions.

docs/models/anthropic.md

Lines changed: 5 additions & 4 deletions
@@ -83,8 +83,8 @@ agent = Agent(model)
8383
Anthropic supports [prompt caching](https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching) to reduce costs by caching parts of your prompts. Pydantic AI provides three ways to use prompt caching:
8484

8585
1. **Cache User Messages with [`CachePoint`][pydantic_ai.messages.CachePoint]**: Insert a `CachePoint` marker in your user messages to cache everything before it
86-
2. **Cache System Instructions**: Enable the [`AnthropicModelSettings.anthropic_cache_instructions`][pydantic_ai.models.anthropic.AnthropicModelSettings.anthropic_cache_instructions] [model setting](../agents.md#model-run-settings) to cache your system prompt
87-
3. **Cache Tool Definitions**: Enable the [`AnthropicModelSettings.anthropic_cache_tool_definitions`][pydantic_ai.models.anthropic.AnthropicModelSettings.anthropic_cache_tool_definitions] [model setting](../agents.md#model-run-settings) to cache your tool definitions
86+
2. **Cache System Instructions**: Set [`AnthropicModelSettings.anthropic_cache_instructions`][pydantic_ai.models.anthropic.AnthropicModelSettings.anthropic_cache_instructions] to `True` (uses 5m TTL by default) or specify `'5m'` / `'1h'` directly
87+
3. **Cache Tool Definitions**: Set [`AnthropicModelSettings.anthropic_cache_tool_definitions`][pydantic_ai.models.anthropic.AnthropicModelSettings.anthropic_cache_tool_definitions] to `True` (uses 5m TTL by default) or specify `'5m'` / `'1h'` directly
8888

8989
You can combine all three strategies for maximum savings:
9090

@@ -96,8 +96,9 @@ agent = Agent(
9696
'anthropic:claude-sonnet-4-5',
9797
system_prompt='Detailed instructions...',
9898
model_settings=AnthropicModelSettings(
99+
# Use True for default 5m TTL, or specify '5m' / '1h' directly
99100
anthropic_cache_instructions=True,
100-
anthropic_cache_tool_definitions=True,
101+
anthropic_cache_tool_definitions='1h', # Longer cache for tool definitions
101102
),
102103
)
103104

@@ -134,7 +135,7 @@ agent = Agent(
134135
'anthropic:claude-sonnet-4-5',
135136
system_prompt='Instructions...',
136137
model_settings=AnthropicModelSettings(
137-
anthropic_cache_instructions=True
138+
anthropic_cache_instructions=True # Default 5m TTL
138139
),
139140
)
140141
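
The changed docs say the two cache settings accept `True` (5m TTL by default) or an explicit `'5m'` / `'1h'`. A minimal sketch of that mapping follows. `resolve_cache_ttl` is a hypothetical helper written for illustration; how the library resolves the value internally is not shown in this diff.

```python
from typing import Optional, Union


def resolve_cache_ttl(setting: Union[bool, str, None]) -> Optional[str]:
    """Hypothetical helper: map a cache setting to a TTL string, or None when disabled."""
    if setting is True:
        # Per the docs in this diff, True means the default 5m TTL.
        return '5m'
    if setting is False or setting is None:
        return None
    if setting in ('5m', '1h'):
        # Explicit TTLs pass through unchanged.
        return setting
    raise ValueError(f'Unsupported cache setting: {setting!r}')


print(resolve_cache_ttl(True))
#> 5m
print(resolve_cache_ttl('1h'))
#> 1h
```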

docs/models/google.md

Lines changed: 3 additions & 3 deletions
@@ -214,22 +214,22 @@ from pydantic_ai.models.google import GoogleModel, GoogleModelSettings
214214
settings = GoogleModelSettings(
215215
temperature=0.2,
216216
max_tokens=1024,
217-
google_thinking_config={'thinking_budget': 2048},
217+
google_thinking_config={'thinking_level': 'low'},
218218
google_safety_settings=[
219219
{
220220
'category': HarmCategory.HARM_CATEGORY_HATE_SPEECH,
221221
'threshold': HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,
222222
}
223223
]
224224
)
225-
model = GoogleModel('gemini-2.5-flash')
225+
model = GoogleModel('gemini-2.5-pro')
226226
agent = Agent(model, model_settings=settings)
227227
...
228228
```
229229

230230
### Disable thinking
231231

232-
You can disable thinking by setting the `thinking_budget` to `0` on the `google_thinking_config`:
232+
On models older than Gemini 2.5 Pro, you can disable thinking by setting the `thinking_budget` to `0` on the `google_thinking_config`:
233233

234234
```python
235235
from pydantic_ai import Agent

pydantic_ai_slim/pydantic_ai/_agent_graph.py

Lines changed: 6 additions & 0 deletions
@@ -216,6 +216,12 @@ async def run( # noqa: C901
216216
ctx.state.message_history = messages
217217
ctx.deps.new_message_index = len(messages)
218218

219+
# Validate that message history starts with a user message
220+
if messages and isinstance(messages[0], _messages.ModelResponse):
221+
raise exceptions.UserError(
222+
'Message history cannot start with a `ModelResponse`. Conversations must begin with a user message.'
223+
)
224+
219225
if self.deferred_tool_results is not None:
220226
return await self._handle_deferred_tool_results(self.deferred_tool_results, messages, ctx)
221227
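
The check added in this hunk can be exercised in isolation. The sketch below is not the library code: `ModelRequest`, `ModelResponse`, and `UserError` are hypothetical stand-ins for the `pydantic_ai` types, and `validate_message_history` is an illustrative name. Only the condition and the error message are taken from the diff.

```python
from dataclasses import dataclass


class UserError(RuntimeError):
    """Hypothetical stand-in for `pydantic_ai.exceptions.UserError`."""


@dataclass
class ModelRequest:
    """Hypothetical stand-in for a message sent to the model."""

    content: str


@dataclass
class ModelResponse:
    """Hypothetical stand-in for a message returned by the model."""

    content: str


def validate_message_history(messages: list) -> None:
    # An empty history passes; only a leading ModelResponse is rejected.
    if messages and isinstance(messages[0], ModelResponse):
        raise UserError(
            'Message history cannot start with a `ModelResponse`. '
            'Conversations must begin with a user message.'
        )


# A history that starts with a request is accepted.
validate_message_history([ModelRequest('Hi'), ModelResponse('Hello!')])

# A history that starts with a response is rejected.
try:
    validate_message_history([ModelResponse('Hello!')])
except UserError as exc:
    print(exc)
```

Note that the check only inspects the first message, so it does not validate the ordering of the rest of the history.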

pydantic_ai_slim/pydantic_ai/_output.py

Lines changed: 4 additions & 30 deletions
@@ -470,7 +470,7 @@ def __init__(
470470
allows_image: bool,
471471
):
472472
super().__init__(
473-
processor=PromptedOutputProcessor(processor),
473+
processor=processor,
474474
allows_deferred_tools=allows_deferred_tools,
475475
allows_image=allows_image,
476476
)
@@ -494,13 +494,6 @@ def build_instructions(cls, template: str, object_def: OutputObjectDefinition) -
494494

495495
return template.format(schema=json.dumps(schema))
496496

497-
def instructions(self, default_template: str) -> str: # pragma: no cover
498-
"""Get instructions to tell model to output JSON matching the schema."""
499-
template = self.template or default_template
500-
object_def = self.object_def
501-
assert object_def is not None
502-
return self.build_instructions(template, object_def)
503-
504497

505498
@dataclass(init=False)
506499
class ToolOutputSchema(OutputSchema[OutputDataT]):
@@ -542,28 +535,6 @@ class BaseObjectOutputProcessor(BaseOutputProcessor[OutputDataT]):
542535
object_def: OutputObjectDefinition
543536

544537

545-
@dataclass(init=False)
546-
class PromptedOutputProcessor(BaseObjectOutputProcessor[OutputDataT]):
547-
wrapped: BaseObjectOutputProcessor[OutputDataT]
548-
549-
def __init__(self, wrapped: BaseObjectOutputProcessor[OutputDataT]):
550-
self.wrapped = wrapped
551-
super().__init__(object_def=wrapped.object_def)
552-
553-
async def process(
554-
self,
555-
data: str,
556-
run_context: RunContext[AgentDepsT],
557-
allow_partial: bool = False,
558-
wrap_validation_errors: bool = True,
559-
) -> OutputDataT:
560-
text = _utils.strip_markdown_fences(data)
561-
562-
return await self.wrapped.process(
563-
text, run_context, allow_partial=allow_partial, wrap_validation_errors=wrap_validation_errors
564-
)
565-
566-
567538
@dataclass(init=False)
568539
class ObjectOutputProcessor(BaseObjectOutputProcessor[OutputDataT]):
569540
outer_typed_dict_key: str | None = None
@@ -653,6 +624,9 @@ async def process(
653624
Returns:
654625
Either the validated output data (left) or a retry message (right).
655626
"""
627+
if isinstance(data, str):
628+
data = _utils.strip_markdown_fences(data)
629+
656630
try:
657631
output = self.validate(data, allow_partial)
658632
except ValidationError as e:

pydantic_ai_slim/pydantic_ai/_utils.py

Lines changed: 4 additions & 2 deletions
@@ -467,12 +467,14 @@ def validate_empty_kwargs(_kwargs: dict[str, Any]) -> None:
467467
raise exceptions.UserError(f'Unknown keyword arguments: {unknown_kwargs}')
468468

469469

470+
_MARKDOWN_FENCES_PATTERN = re.compile(r'```(?:\w+)?\n(\{.*\})', flags=re.DOTALL)
471+
472+
470473
def strip_markdown_fences(text: str) -> str:
471474
if text.startswith('{'):
472475
return text
473476

474-
regex = r'```(?:\w+)?\n(\{.*\})\n```'
475-
match = re.search(regex, text, re.DOTALL)
477+
match = re.search(_MARKDOWN_FENCES_PATTERN, text)
476478
if match:
477479
return match.group(1)
478480
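
The behavioural change here is that the new pattern no longer requires a closing fence, so a JSON object inside an unterminated fenced block is still extracted. The sketch below reproduces the function from this hunk to show that. The final `return text` is an assumption, because the end of the function falls outside the hunk.

```python
import re

# Pattern as added in this commit: the trailing fence is no longer part of the match.
_MARKDOWN_FENCES_PATTERN = re.compile(r'```(?:\w+)?\n(\{.*\})', flags=re.DOTALL)


def strip_markdown_fences(text: str) -> str:
    if text.startswith('{'):
        return text

    match = re.search(_MARKDOWN_FENCES_PATTERN, text)
    if match:
        return match.group(1)

    # Assumption: the fallback is not visible in the diff; return the input unchanged.
    return text


fenced = 'Here you go:\n```json\n{"city": "Paris"}\n```'
unterminated = '```json\n{"city": "Paris"}'

print(strip_markdown_fences(fenced))
#> {"city": "Paris"}
print(strip_markdown_fences(unterminated))
#> {"city": "Paris"}
```

Because `.*` is greedy under `re.DOTALL`, the capture runs to the last `}` in the text, which keeps nested objects intact.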

pydantic_ai_slim/pydantic_ai/durable_exec/temporal/_run_context.py

Lines changed: 2 additions & 1 deletion
@@ -14,7 +14,7 @@
1414
class TemporalRunContext(RunContext[AgentDepsT]):
1515
"""The [`RunContext`][pydantic_ai.tools.RunContext] subclass to use to serialize and deserialize the run context for use inside a Temporal activity.
1616
17-
By default, only the `deps`, `run_id`, `retries`, `tool_call_id`, `tool_name`, `tool_call_approved`, `retry`, `max_retries`, `run_step` and `partial_output` attributes will be available.
17+
By default, only the `deps`, `run_id`, `retries`, `tool_call_id`, `tool_name`, `tool_call_approved`, `retry`, `max_retries`, `run_step`, `usage`, and `partial_output` attributes will be available.
1818
To make another attribute available, create a `TemporalRunContext` subclass with a custom `serialize_run_context` class method that returns a dictionary that includes the attribute and pass it to [`TemporalAgent`][pydantic_ai.durable_exec.temporal.TemporalAgent].
1919
"""
2020

@@ -51,6 +51,7 @@ def serialize_run_context(cls, ctx: RunContext[Any]) -> dict[str, Any]:
5151
'max_retries': ctx.max_retries,
5252
'run_step': ctx.run_step,
5353
'partial_output': ctx.partial_output,
54+
'usage': ctx.usage,
5455
}
5556

5657
@classmethod

pydantic_ai_slim/pydantic_ai/mcp.py

Lines changed: 11 additions & 0 deletions
@@ -260,6 +260,7 @@ class MCPServer(AbstractToolset[Any], ABC):
260260
_write_stream: MemoryObjectSendStream[SessionMessage]
261261
_server_info: mcp_types.Implementation
262262
_server_capabilities: ServerCapabilities
263+
_instructions: str | None
263264

264265
def __init__(
265266
self,
@@ -346,6 +347,15 @@ def capabilities(self) -> ServerCapabilities:
346347
f'The `{self.__class__.__name__}.capabilities` is only instantiated after initialization.'
347348
)
348349
return self._server_capabilities
350+
351+
@property
352+
def instructions(self) -> str | None:
353+
"""Access the instructions sent by the MCP server during initialization."""
354+
if not hasattr(self, '_instructions'):
355+
raise AttributeError(
356+
f'The `{self.__class__.__name__}.instructions` is only available after initialization.'
357+
)
358+
return self._instructions
349359

350360
async def list_tools(self) -> list[mcp_types.Tool]:
351361
"""Retrieve tools that are currently active on the server.
@@ -566,6 +576,7 @@ async def __aenter__(self) -> Self:
566576
result = await self._client.initialize()
567577
self._server_info = result.serverInfo
568578
self._server_capabilities = _mcp.map_from_mcp_server_capabilities(result.capabilities)
579+
self._instructions = result.instructions
569580
if log_level := self.log_level:
570581
await self._client.set_logging_level(log_level)
571582
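
The `instructions` property relies on `_instructions` being declared as an annotation only, so the attribute does not exist until `__aenter__` assigns it. A minimal sketch of that pattern follows, using a hypothetical `FakeServer` class with a `connect` method standing in for the real initialization step. Neither is part of `pydantic_ai`.

```python
from __future__ import annotations


class FakeServer:
    """Hypothetical stand-in illustrating the pattern; not the real `MCPServer`."""

    # Annotation only: no attribute is created until connect() assigns it.
    _instructions: str | None

    def connect(self, instructions: str | None) -> None:
        # Stand-in for the initialization handshake that stores the server's instructions.
        self._instructions = instructions

    @property
    def instructions(self) -> str | None:
        if not hasattr(self, '_instructions'):
            raise AttributeError(
                f'The `{self.__class__.__name__}.instructions` is only available after initialization.'
            )
        return self._instructions


server = FakeServer()
try:
    server.instructions
except AttributeError as exc:
    print(exc)
#> The `FakeServer.instructions` is only available after initialization.

server.connect('Prefer metric units.')
print(server.instructions)
#> Prefer metric units.
```

Using `hasattr` rather than a `None` default lets the property distinguish "not connected yet" from "connected, but the server sent no instructions".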
