Commit fa3dd3b

Include thinking parts in model requests to improve performance and cache hit rates (#2823)
1 parent f897c5c commit fa3dd3b

45 files changed, +6355 -2212 lines changed

docs/thinking.md

Lines changed: 33 additions & 31 deletions
@@ -6,27 +6,20 @@ providing its final answer.
 This capability is typically disabled by default and depends on the specific model being used.
 See the sections below for how to enable thinking for each provider.
 
-Internally, if the model doesn't provide thinking objects, Pydantic AI will convert thinking blocks
-(`"<think>..."</think>"`) in provider-specific text parts to `ThinkingPart`s. We have also made
-the decision not to send `ThinkingPart`s back to the provider in multi-turn conversations -
-this helps save costs for users. In the future, we plan to add a setting to customize this behavior.
-
 ## OpenAI
 
-When using the [`OpenAIChatModel`][pydantic_ai.models.openai.OpenAIChatModel], thinking objects are not created
-by default. However, the text content may contain `"<think>"` tags. When this happens, Pydantic AI will
-convert them to [`ThinkingPart`][pydantic_ai.messages.ThinkingPart] objects.
+When using the [`OpenAIChatModel`][pydantic_ai.models.openai.OpenAIChatModel], text output inside `<think>` tags are converted to [`ThinkingPart`][pydantic_ai.messages.ThinkingPart] objects.
+You can customize the tags using the [`thinking_tags`][pydantic_ai.profiles.ModelProfile.thinking_tags] field on the [model profile](models/openai.md#model-profile).
 
-In contrast, the [`OpenAIResponsesModel`][pydantic_ai.models.openai.OpenAIResponsesModel] does
-generate thinking parts. To enable this functionality, you need to set the `openai_reasoning_effort` and
-`openai_reasoning_summary` fields in the
+The [`OpenAIResponsesModel`][pydantic_ai.models.openai.OpenAIResponsesModel] can generate native thinking parts.
+To enable this functionality, you need to set the `openai_reasoning_effort` and `openai_reasoning_summary` fields in the
 [`OpenAIResponsesModelSettings`][pydantic_ai.models.openai.OpenAIResponsesModelSettings].
 
 ```python {title="openai_thinking_part.py"}
 from pydantic_ai import Agent
 from pydantic_ai.models.openai import OpenAIResponsesModel, OpenAIResponsesModelSettings
 
-model = OpenAIResponsesModel('o3-mini')
+model = OpenAIResponsesModel('gpt-5')
 settings = OpenAIResponsesModelSettings(
     openai_reasoning_effort='low',
     openai_reasoning_summary='detailed',
@@ -37,29 +30,44 @@ agent = Agent(model, model_settings=settings)
 
 ## Anthropic
 
-Unlike other providers, Anthropic includes a signature in the thinking part. This signature is used to
-ensure that the thinking part has not been tampered with. To enable thinking, use the `anthropic_thinking`
-field in the [`AnthropicModelSettings`][pydantic_ai.models.anthropic.AnthropicModelSettings].
+To enable thinking, use the `anthropic_thinking` field in the [`AnthropicModelSettings`][pydantic_ai.models.anthropic.AnthropicModelSettings].
 
 ```python {title="anthropic_thinking_part.py"}
 from pydantic_ai import Agent
 from pydantic_ai.models.anthropic import AnthropicModel, AnthropicModelSettings
 
-model = AnthropicModel('claude-3-7-sonnet-latest')
+model = AnthropicModel('claude-sonnet-4-0')
 settings = AnthropicModelSettings(
     anthropic_thinking={'type': 'enabled', 'budget_tokens': 1024},
 )
 agent = Agent(model, model_settings=settings)
 ...
 ```
 
+## Google
+
+To enable thinking, use the `google_thinking_config` field in the
+[`GoogleModelSettings`][pydantic_ai.models.google.GoogleModelSettings].
+
+```python {title="google_thinking_part.py"}
+from pydantic_ai import Agent
+from pydantic_ai.models.google import GoogleModel, GoogleModelSettings
+
+model = GoogleModel('gemini-2.5-pro')
+settings = GoogleModelSettings(google_thinking_config={'include_thoughts': True})
+agent = Agent(model, model_settings=settings)
+...
+```
+
+## Bedrock
+
 ## Groq
 
 Groq supports different formats to receive thinking parts:
 
-- `"raw"`: The thinking part is included in the text content with the `"<think>"` tag.
+- `"raw"`: The thinking part is included in the text content inside `<think>` tags, which are automatically converted to [`ThinkingPart`][pydantic_ai.messages.ThinkingPart] objects.
 - `"hidden"`: The thinking part is not included in the text content.
-- `"parsed"`: The thinking part has its own [`ThinkingPart`][pydantic_ai.messages.ThinkingPart] object.
+- `"parsed"`: The thinking part has its own structured part in the response which is converted into a [`ThinkingPart`][pydantic_ai.messages.ThinkingPart] object.
 
 To enable thinking, use the `groq_reasoning_format` field in the
 [`GroqModelSettings`][pydantic_ai.models.groq.GroqModelSettings]:
@@ -74,21 +82,15 @@ agent = Agent(model, model_settings=settings)
 ...
 ```
 
-## Google
+## Mistral
 
-To enable thinking, use the `google_thinking_config` field in the
-[`GoogleModelSettings`][pydantic_ai.models.google.GoogleModelSettings].
+Thinking is supported by the `magistral` family of models. It does not need to be specifically enabled.
 
-```python {title="google_thinking_part.py"}
-from pydantic_ai import Agent
-from pydantic_ai.models.google import GoogleModel, GoogleModelSettings
+## Cohere
 
-model = GoogleModel('gemini-2.5-pro-preview-03-25')
-settings = GoogleModelSettings(google_thinking_config={'include_thoughts': True})
-agent = Agent(model, model_settings=settings)
-...
-```
+Thinking is supported by the `command-a-reasoning-08-2025` model. It does not need to be specifically enabled.
 
-## Mistral / Cohere
+## Hugging Face
 
-Neither Mistral nor Cohere generate thinking parts.
+Text output inside `<think>` tags is automatically converted to [`ThinkingPart`][pydantic_ai.messages.ThinkingPart] objects.
+You can customize the tags using the [`thinking_tags`][pydantic_ai.profiles.ModelProfile.thinking_tags] field on the [model profile](models/openai.md#model-profile).
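
The tag-based conversion described in the docs above can be pictured with a small standalone sketch. This is not the library's implementation: `split_thinking`, `TextSegment` and `ThinkingSegment` are hypothetical names used only for illustration, and the `tags` argument merely mirrors the idea of a configurable (start, end) pair such as `thinking_tags`.

```python
from dataclasses import dataclass


@dataclass
class TextSegment:
    """Hypothetical stand-in for a plain text part."""
    content: str


@dataclass
class ThinkingSegment:
    """Hypothetical stand-in for a thinking part."""
    content: str


def split_thinking(text, tags=('<think>', '</think>')):
    """Split text into plain and thinking segments using a (start, end) tag pair."""
    start_tag, end_tag = tags
    segments = []
    pos = 0
    while pos < len(text):
        start = text.find(start_tag, pos)
        if start == -1:
            # No further opening tag: the rest is plain text.
            segments.append(TextSegment(text[pos:]))
            break
        if start > pos:
            segments.append(TextSegment(text[pos:start]))
        end = text.find(end_tag, start + len(start_tag))
        if end == -1:
            # Unterminated tag: this sketch keeps the remainder as plain text.
            segments.append(TextSegment(text[start:]))
            break
        segments.append(ThinkingSegment(text[start + len(start_tag):end]))
        pos = end + len(end_tag)
    return segments


print(split_thinking('<think>plan</think>Answer'))
```

Passing a different pair, for example `split_thinking('a[[x]]b', ('[[', ']]'))`, shows how customized tags would change what counts as thinking content.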

pydantic_ai_slim/pydantic_ai/_parts_manager.py

Lines changed: 8 additions & 10 deletions
@@ -156,6 +156,7 @@ def handle_thinking_delta(
         content: str | None = None,
         id: str | None = None,
         signature: str | None = None,
+        provider_name: str | None = None,
     ) -> ModelResponseStreamEvent:
         """Handle incoming thinking content, creating or updating a ThinkingPart in the manager as appropriate.
 
@@ -170,6 +171,7 @@ def handle_thinking_delta(
             content: The thinking content to append to the appropriate ThinkingPart.
             id: An optional id for the thinking part.
             signature: An optional signature for the thinking content.
+            provider_name: An optional provider name for the thinking part.
 
         Returns:
             A `PartStartEvent` if a new part was created, or a `PartDeltaEvent` if an existing part was updated.
@@ -199,24 +201,20 @@ def handle_thinking_delta(
             if content is not None:
                 # There is no existing thinking part that should be updated, so create a new one
                 new_part_index = len(self._parts)
-                part = ThinkingPart(content=content, id=id, signature=signature)
+                part = ThinkingPart(content=content, id=id, signature=signature, provider_name=provider_name)
                 if vendor_part_id is not None: # pragma: no branch
                     self._vendor_id_to_part_index[vendor_part_id] = new_part_index
                 self._parts.append(part)
                 return PartStartEvent(index=new_part_index, part=part)
             else:
                 raise UnexpectedModelBehavior('Cannot create a ThinkingPart with no content')
         else:
-            if content is not None:
-                # Update the existing ThinkingPart with the new content delta
-                existing_thinking_part, part_index = existing_thinking_part_and_index
-                part_delta = ThinkingPartDelta(content_delta=content)
-                self._parts[part_index] = part_delta.apply(existing_thinking_part)
-                return PartDeltaEvent(index=part_index, delta=part_delta)
-            elif signature is not None:
-                # Update the existing ThinkingPart with the new signature delta
+            if content is not None or signature is not None:
+                # Update the existing ThinkingPart with the new content and/or signature delta
                 existing_thinking_part, part_index = existing_thinking_part_and_index
-                part_delta = ThinkingPartDelta(signature_delta=signature)
+                part_delta = ThinkingPartDelta(
+                    content_delta=content, signature_delta=signature, provider_name=provider_name
+                )
                 self._parts[part_index] = part_delta.apply(existing_thinking_part)
                 return PartDeltaEvent(index=part_index, delta=part_delta)
             else:
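
The create-or-update flow in this hunk can be sketched in isolation. The snippet below is a simplified model, not the real parts manager: `Part` and `MiniPartsManager` are hypothetical, vendor part ids are ignored, the tuple return values stand in for the start and delta events, and raising in the final branch is an assumption because the diff cuts off at that `else:`.

```python
from __future__ import annotations

from dataclasses import dataclass, replace


@dataclass
class Part:
    """Hypothetical stand-in for a thinking part."""
    content: str
    signature: str | None = None
    provider_name: str | None = None


class MiniPartsManager:
    """Tracks a single thinking part; the real manager handles many parts."""

    def __init__(self):
        self.parts: list[Part] = []

    def handle_thinking_delta(self, *, content=None, signature=None, provider_name=None):
        if not self.parts:
            # No existing part: content is required to create one.
            if content is None:
                raise ValueError('Cannot create a thinking part with no content')
            self.parts.append(Part(content, signature, provider_name))
            return ('start', 0)
        if content is None and signature is None:
            # Assumption: the sketch raises; the diff does not show this branch.
            raise ValueError('Cannot update a thinking part with no content or signature')
        old = self.parts[-1]
        # Content and signature can now arrive together in a single update.
        self.parts[-1] = replace(
            old,
            content=old.content + (content or ''),
            signature=signature if signature is not None else old.signature,
            provider_name=provider_name if provider_name is not None else old.provider_name,
        )
        return ('delta', 0)


manager = MiniPartsManager()
manager.handle_thinking_delta(content='Let me ', provider_name='anthropic')
manager.handle_thinking_delta(content='think', signature='sig')
print(manager.parts[0])
```

The second call carries both content and a signature, which the pre-change `if`/`elif` structure would have handled as content only.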

pydantic_ai_slim/pydantic_ai/messages.py

Lines changed: 30 additions & 6 deletions
@@ -895,7 +895,18 @@ class ThinkingPart:
     signature: str | None = None
     """The signature of the thinking.
 
-    The signature is only available on the Anthropic models.
+    Supported by:
+
+    * Anthropic (corresponds to the `signature` field)
+    * Bedrock (corresponds to the `signature` field)
+    * Google (corresponds to the `thought_signature` field)
+    * OpenAI (corresponds to the `encrypted_content` field)
+    """
+
+    provider_name: str | None = None
+    """The name of the provider that generated the response.
+
+    Signatures are only sent back to the same provider.
     """
 
     part_kind: Literal['thinking'] = 'thinking'
@@ -980,7 +991,10 @@ class BuiltinToolCallPart(BaseToolCallPart):
     _: KW_ONLY
 
     provider_name: str | None = None
-    """The name of the provider that generated the response."""
+    """The name of the provider that generated the response.
+
+    Built-in tool calls are only sent back to the same provider.
+    """
 
     part_kind: Literal['builtin-tool-call'] = 'builtin-tool-call'
     """Part type identifier, this is available on all parts as a discriminator."""
@@ -1198,6 +1212,12 @@ class ThinkingPartDelta:
     Note this is never treated as a delta — it can replace None.
     """
 
+    provider_name: str | None = None
+    """Optional provider name for the thinking part.
+
+    Signatures are only sent back to the same provider.
+    """
+
     part_delta_kind: Literal['thinking'] = 'thinking'
     """Part delta type identifier, used as a discriminator."""
 
@@ -1222,14 +1242,18 @@ def apply(self, part: ModelResponsePart | ThinkingPartDelta) -> ThinkingPart | T
         if isinstance(part, ThinkingPart):
             new_content = part.content + self.content_delta if self.content_delta else part.content
             new_signature = self.signature_delta if self.signature_delta is not None else part.signature
-            return replace(part, content=new_content, signature=new_signature)
+            new_provider_name = self.provider_name if self.provider_name is not None else part.provider_name
+            return replace(part, content=new_content, signature=new_signature, provider_name=new_provider_name)
         elif isinstance(part, ThinkingPartDelta):
             if self.content_delta is None and self.signature_delta is None:
                 raise ValueError('Cannot apply ThinkingPartDelta with no content or signature')
-            if self.signature_delta is not None:
-                return replace(part, signature_delta=self.signature_delta)
             if self.content_delta is not None:
-                return replace(part, content_delta=self.content_delta)
+                part = replace(part, content_delta=(part.content_delta or '') + self.content_delta)
+            if self.signature_delta is not None:
+                part = replace(part, signature_delta=self.signature_delta)
+            if self.provider_name is not None:
+                part = replace(part, provider_name=self.provider_name)
+            return part
         raise ValueError( # pragma: no cover
             f'Cannot apply ThinkingPartDeltas to non-ThinkingParts or non-ThinkingPartDeltas ({part=}, {self=})'
         )
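
The delta-on-delta branch above changes from replacing a field and returning early to accumulating content and then applying the signature and provider name. A minimal standalone sketch of that merge follows. `Delta`, `merge_into` and `signature_for` are hypothetical names, and the provider check at the end is only one plausible reading of "Signatures are only sent back to the same provider"; how the library enforces that rule is not shown in this diff.

```python
from __future__ import annotations

from dataclasses import dataclass, replace


@dataclass
class Delta:
    """Hypothetical stand-in for a thinking part delta."""
    content_delta: str | None = None
    signature_delta: str | None = None
    provider_name: str | None = None

    def merge_into(self, earlier: Delta) -> Delta:
        """Apply this delta on top of an earlier, not yet applied delta."""
        if self.content_delta is None and self.signature_delta is None:
            raise ValueError('Cannot apply a delta with no content or signature')
        merged = earlier
        if self.content_delta is not None:
            # Content is appended rather than overwritten.
            merged = replace(merged, content_delta=(merged.content_delta or '') + self.content_delta)
        if self.signature_delta is not None:
            # The signature is replaced as a whole, never concatenated.
            merged = replace(merged, signature_delta=self.signature_delta)
        if self.provider_name is not None:
            merged = replace(merged, provider_name=self.provider_name)
        return merged


def signature_for(delta: Delta, target_provider: str) -> str | None:
    """Assumed rule: expose the signature only to the provider that produced it."""
    return delta.signature_delta if delta.provider_name == target_provider else None


first = Delta(content_delta='foo', provider_name='openai')
second = Delta(content_delta='bar', signature_delta='abc')
merged = second.merge_into(first)
print(merged)
print(signature_for(merged, 'openai'), signature_for(merged, 'anthropic'))
```

With the pre-change logic, applying `second` to `first` would have returned after setting only the signature, so the `'bar'` content would have been dropped.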
