Commit ba9433f

Merge branch 'main' into 3225
2 parents 79253f1 + 468e60a

File tree: 48 files changed (+3290, -375 lines)

docs/durable_execution/temporal.md

Lines changed: 1 addition & 1 deletion
@@ -172,7 +172,7 @@ As workflows and activities run in separate processes, any values passed between

 To account for these limitations, tool functions and the [event stream handler](#streaming) running inside activities receive a limited version of the agent's [`RunContext`][pydantic_ai.tools.RunContext], and it's your responsibility to make sure that the [dependencies](../dependencies.md) object provided to [`TemporalAgent.run()`][pydantic_ai.durable_exec.temporal.TemporalAgent.run] can be serialized using Pydantic.

-Specifically, only the `deps`, `run_id`, `retries`, `tool_call_id`, `tool_name`, `tool_call_approved`, `retry`, `max_retries`, `run_step` and `partial_output` fields are available by default, and trying to access `model`, `usage`, `prompt`, `messages`, or `tracer` will raise an error.
+Specifically, only the `deps`, `run_id`, `retries`, `tool_call_id`, `tool_name`, `tool_call_approved`, `retry`, `max_retries`, `run_step`, `usage`, and `partial_output` fields are available by default, and trying to access `model`, `prompt`, `messages`, or `tracer` will raise an error.

 If you need one or more of these attributes to be available inside activities, you can create a [`TemporalRunContext`][pydantic_ai.durable_exec.temporal.TemporalRunContext] subclass with custom `serialize_run_context` and `deserialize_run_context` class methods and pass it to [`TemporalAgent`][pydantic_ai.durable_exec.temporal.TemporalAgent] as `run_context_type`.

 ### Streaming
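
To illustrate the customization this doc describes, here is a minimal sketch of a `TemporalRunContext` subclass that exposes one extra attribute; the `prompt` field and the agent wiring are illustrative assumptions, not part of the diff, and a matching `deserialize_run_context` override may also be needed, as the doc notes:

```python
from typing import Any

from pydantic_ai.durable_exec.temporal import TemporalRunContext
from pydantic_ai.tools import RunContext


class RunContextWithPrompt(TemporalRunContext):
    """Illustrative subclass that also carries `prompt` into Temporal activities."""

    @classmethod
    def serialize_run_context(cls, ctx: RunContext[Any]) -> dict[str, Any]:
        # Keep the default serialized fields and add the extra attribute.
        return {**super().serialize_run_context(ctx), 'prompt': ctx.prompt}


# Then pass it to the agent wrapper, e.g.:
# agent = TemporalAgent(my_agent, run_context_type=RunContextWithPrompt)
```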

docs/models/anthropic.md

Lines changed: 5 additions & 4 deletions
@@ -83,8 +83,8 @@ agent = Agent(model)

 Anthropic supports [prompt caching](https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching) to reduce costs by caching parts of your prompts. Pydantic AI provides three ways to use prompt caching:

 1. **Cache User Messages with [`CachePoint`][pydantic_ai.messages.CachePoint]**: Insert a `CachePoint` marker in your user messages to cache everything before it
-2. **Cache System Instructions**: Enable the [`AnthropicModelSettings.anthropic_cache_instructions`][pydantic_ai.models.anthropic.AnthropicModelSettings.anthropic_cache_instructions] [model setting](../agents.md#model-run-settings) to cache your system prompt
-3. **Cache Tool Definitions**: Enable the [`AnthropicModelSettings.anthropic_cache_tool_definitions`][pydantic_ai.models.anthropic.AnthropicModelSettings.anthropic_cache_tool_definitions] [model setting](../agents.md#model-run-settings) to cache your tool definitions
+2. **Cache System Instructions**: Set [`AnthropicModelSettings.anthropic_cache_instructions`][pydantic_ai.models.anthropic.AnthropicModelSettings.anthropic_cache_instructions] to `True` (uses 5m TTL by default) or specify `'5m'` / `'1h'` directly
+3. **Cache Tool Definitions**: Set [`AnthropicModelSettings.anthropic_cache_tool_definitions`][pydantic_ai.models.anthropic.AnthropicModelSettings.anthropic_cache_tool_definitions] to `True` (uses 5m TTL by default) or specify `'5m'` / `'1h'` directly

 You can combine all three strategies for maximum savings:

@@ -96,8 +96,9 @@ agent = Agent(
     'anthropic:claude-sonnet-4-5',
     system_prompt='Detailed instructions...',
     model_settings=AnthropicModelSettings(
+        # Use True for default 5m TTL, or specify '5m' / '1h' directly
         anthropic_cache_instructions=True,
-        anthropic_cache_tool_definitions=True,
+        anthropic_cache_tool_definitions='1h',  # Longer cache for tool definitions
     ),
 )

@@ -134,7 +135,7 @@ agent = Agent(
     'anthropic:claude-sonnet-4-5',
     system_prompt='Instructions...',
     model_settings=AnthropicModelSettings(
-        anthropic_cache_instructions=True
+        anthropic_cache_instructions=True  # Default 5m TTL
     ),
 )
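
For the first caching strategy in the list above, which the diff's own code samples don't show, a `CachePoint` marker goes directly into the user content. A rough sketch, assuming an Anthropic API key is configured and with made-up prompt text:

```python
from pydantic_ai import Agent
from pydantic_ai.messages import CachePoint

agent = Agent('anthropic:claude-sonnet-4-5')

result = agent.run_sync([
    'Here is a long reference document that stays the same across requests...',
    CachePoint(),  # everything before this marker is cached
    'Summarize the key points.',
])
print(result.output)
```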

docs/models/google.md

Lines changed: 3 additions & 3 deletions
@@ -214,22 +214,22 @@ from pydantic_ai.models.google import GoogleModel, GoogleModelSettings
 settings = GoogleModelSettings(
     temperature=0.2,
     max_tokens=1024,
-    google_thinking_config={'thinking_budget': 2048},
+    google_thinking_config={'thinking_level': 'low'},
     google_safety_settings=[
         {
             'category': HarmCategory.HARM_CATEGORY_HATE_SPEECH,
             'threshold': HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,
         }
     ]
 )
-model = GoogleModel('gemini-2.5-flash')
+model = GoogleModel('gemini-2.5-pro')
 agent = Agent(model, model_settings=settings)
 ...
 ```

 ### Disable thinking

-You can disable thinking by setting the `thinking_budget` to `0` on the `google_thinking_config`:
+On models older than Gemini 2.5 Pro, you can disable thinking by setting the `thinking_budget` to `0` on the `google_thinking_config`:

 ```python
 from pydantic_ai import Agent
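
The last hunk cuts off at the start of the doc's example; a minimal sketch of the disable-thinking pattern it describes (the model choice here is an assumption) would be:

```python
from pydantic_ai import Agent
from pydantic_ai.models.google import GoogleModel, GoogleModelSettings

# Per the updated doc, thinking can be disabled on models older than
# Gemini 2.5 Pro by setting the thinking budget to zero.
settings = GoogleModelSettings(google_thinking_config={'thinking_budget': 0})
model = GoogleModel('gemini-2.5-flash')
agent = Agent(model, model_settings=settings)
```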

pydantic_ai_slim/pydantic_ai/_agent_graph.py

Lines changed: 6 additions & 0 deletions
@@ -216,6 +216,12 @@ async def run(  # noqa: C901
         ctx.state.message_history = messages
         ctx.deps.new_message_index = len(messages)

+        # Validate that message history starts with a user message
+        if messages and isinstance(messages[0], _messages.ModelResponse):
+            raise exceptions.UserError(
+                'Message history cannot start with a `ModelResponse`. Conversations must begin with a user message.'
+            )
+
         if self.deferred_tool_results is not None:
             return await self._handle_deferred_tool_results(self.deferred_tool_results, messages, ctx)
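
A rough sketch of what this new validation rejects from the caller's side, using the built-in `test` model and a deliberately malformed history:

```python
from pydantic_ai import Agent
from pydantic_ai.exceptions import UserError
from pydantic_ai.messages import ModelResponse, TextPart

agent = Agent('test')

try:
    agent.run_sync(
        'Hello',
        # A history that starts with an assistant turn is now rejected.
        message_history=[ModelResponse(parts=[TextPart(content='Hi there!')])],
    )
except UserError as e:
    print(e)
    # Message history cannot start with a `ModelResponse`. Conversations must begin with a user message.
```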

pydantic_ai_slim/pydantic_ai/_json_schema.py

Lines changed: 3 additions & 4 deletions
@@ -25,7 +25,7 @@ def __init__(
         *,
         strict: bool | None = None,
         prefer_inlined_defs: bool = False,
-        simplify_nullable_unions: bool = False,
+        simplify_nullable_unions: bool = False,  # TODO (v2): Remove this, no longer used
     ):
         self.schema = schema

@@ -146,10 +146,9 @@ def _handle_union(self, schema: JsonSchema, union_kind: Literal['anyOf', 'oneOf'

         handled = [self._handle(member) for member in members]

-        # convert nullable unions to nullable types
+        # TODO (v2): Remove this feature, no longer used
         if self.simplify_nullable_unions:
             handled = self._simplify_nullable_union(handled)
-
         if len(handled) == 1:
             # In this case, no need to retain the union
             return handled[0] | schema

@@ -161,7 +160,7 @@ def _handle_union(self, schema: JsonSchema, union_kind: Literal['anyOf', 'oneOf'

     @staticmethod
     def _simplify_nullable_union(cases: list[JsonSchema]) -> list[JsonSchema]:
-        # TODO: Should we move this to relevant subclasses? Or is it worth keeping here to make reuse easier?
+        # TODO (v2): Remove this method, no longer used
         if len(cases) == 2 and {'type': 'null'} in cases:
             # Find the non-null schema
             non_null_schema = next(
pydantic_ai_slim/pydantic_ai/durable_exec/temporal/_run_context.py

Lines changed: 2 additions & 1 deletion
@@ -14,7 +14,7 @@
 class TemporalRunContext(RunContext[AgentDepsT]):
     """The [`RunContext`][pydantic_ai.tools.RunContext] subclass to use to serialize and deserialize the run context for use inside a Temporal activity.

-    By default, only the `deps`, `run_id`, `retries`, `tool_call_id`, `tool_name`, `tool_call_approved`, `retry`, `max_retries`, `run_step` and `partial_output` attributes will be available.
+    By default, only the `deps`, `run_id`, `retries`, `tool_call_id`, `tool_name`, `tool_call_approved`, `retry`, `max_retries`, `run_step`, `usage`, and `partial_output` attributes will be available.
     To make another attribute available, create a `TemporalRunContext` subclass with a custom `serialize_run_context` class method that returns a dictionary that includes the attribute and pass it to [`TemporalAgent`][pydantic_ai.durable_exec.temporal.TemporalAgent].
     """

@@ -51,6 +51,7 @@ def serialize_run_context(cls, ctx: RunContext[Any]) -> dict[str, Any]:
             'max_retries': ctx.max_retries,
             'run_step': ctx.run_step,
             'partial_output': ctx.partial_output,
+            'usage': ctx.usage,
         }

     @classmethod
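
Because `usage` is now serialized by default, a tool running inside a Temporal activity can read it from the limited run context. A small illustrative sketch; the agent, model choice, and tool are hypothetical:

```python
from pydantic_ai import Agent, RunContext

agent = Agent('anthropic:claude-sonnet-4-5', deps_type=str)


@agent.tool
def tokens_so_far(ctx: RunContext[str]) -> str:
    # ctx.usage is part of the serialized run context, so this also works
    # when the tool executes inside a Temporal activity.
    return f'Total tokens used so far: {ctx.usage.total_tokens}'
```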

pydantic_ai_slim/pydantic_ai/messages.py

Lines changed: 14 additions & 1 deletion
@@ -627,6 +627,13 @@ class CachePoint:
     kind: Literal['cache-point'] = 'cache-point'
     """Type identifier, this is available on all parts as a discriminator."""

+    ttl: Literal['5m', '1h'] = '5m'
+    """The cache time-to-live, either "5m" (5 minutes) or "1h" (1 hour).
+
+    Supported by:
+
+    * Anthropic. See https://docs.claude.com/en/docs/build-with-claude/prompt-caching#1-hour-cache-duration for more information."""
+

 MultiModalContent = ImageUrl | AudioUrl | DocumentUrl | VideoUrl | BinaryContent
 UserContent: TypeAlias = str | MultiModalContent | CachePoint

@@ -970,6 +977,9 @@ class ModelRequest:
     run_id: str | None = None
     """The unique identifier of the agent run in which this message originated."""

+    metadata: dict[str, Any] | None = None
+    """Additional data that can be accessed programmatically by the application but is not sent to the LLM."""
+
     @classmethod
     def user_text_prompt(cls, user_prompt: str, *, instructions: str | None = None) -> ModelRequest:
         """Create a `ModelRequest` with a single user prompt as text."""

@@ -1060,7 +1070,7 @@ class FilePart:

     def has_content(self) -> bool:
         """Return `True` if the file content is non-empty."""
-        return bool(self.content)  # pragma: no cover
+        return bool(self.content.data)

     __repr__ = _utils.dataclasses_no_defaults_repr

@@ -1214,6 +1224,9 @@ class ModelResponse:
     run_id: str | None = None
     """The unique identifier of the agent run in which this message originated."""

+    metadata: dict[str, Any] | None = None
+    """Additional data that can be accessed programmatically by the application but is not sent to the LLM."""
+
     @property
     def text(self) -> str | None:
         """Get the text in the response."""

pydantic_ai_slim/pydantic_ai/models/__init__.py

Lines changed: 10 additions & 6 deletions
@@ -145,24 +145,28 @@
     'cohere:command-r7b-12-2024',
     'deepseek:deepseek-chat',
     'deepseek:deepseek-reasoner',
+    'google-gla:gemini-flash-latest',
+    'google-gla:gemini-flash-lite-latest',
     'google-gla:gemini-2.0-flash',
     'google-gla:gemini-2.0-flash-lite',
     'google-gla:gemini-2.5-flash',
+    'google-gla:gemini-2.5-flash-preview-09-2025',
+    'google-gla:gemini-2.5-flash-image',
     'google-gla:gemini-2.5-flash-lite',
     'google-gla:gemini-2.5-flash-lite-preview-09-2025',
-    'google-gla:gemini-2.5-flash-preview-09-2025',
     'google-gla:gemini-2.5-pro',
-    'google-gla:gemini-flash-latest',
-    'google-gla:gemini-flash-lite-latest',
+    'google-gla:gemini-3-pro-preview',
+    'google-vertex:gemini-flash-latest',
+    'google-vertex:gemini-flash-lite-latest',
     'google-vertex:gemini-2.0-flash',
     'google-vertex:gemini-2.0-flash-lite',
     'google-vertex:gemini-2.5-flash',
+    'google-vertex:gemini-2.5-flash-preview-09-2025',
+    'google-vertex:gemini-2.5-flash-image',
     'google-vertex:gemini-2.5-flash-lite',
     'google-vertex:gemini-2.5-flash-lite-preview-09-2025',
-    'google-vertex:gemini-2.5-flash-preview-09-2025',
     'google-vertex:gemini-2.5-pro',
-    'google-vertex:gemini-flash-latest',
-    'google-vertex:gemini-flash-lite-latest',
+    'google-vertex:gemini-3-pro-preview',
     'grok:grok-2-image-1212',
     'grok:grok-2-vision-1212',
     'grok:grok-3',
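
These entries are known model name strings, so once added they can be passed straight to `Agent`. A minimal sketch using one of the names introduced here, assuming the relevant Google credentials are configured:

```python
from pydantic_ai import Agent

agent = Agent('google-gla:gemini-3-pro-preview')
result = agent.run_sync('Say hello.')
print(result.output)
```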
