 1. **Cache User Messages with [`CachePoint`][pydantic_ai.messages.CachePoint]**: Insert a `CachePoint` marker in your user messages to cache everything before it
 2. **Cache System Instructions**: Set [`AnthropicModelSettings.anthropic_cache_instructions`][pydantic_ai.models.anthropic.AnthropicModelSettings.anthropic_cache_instructions] to `True` (uses 5m TTL by default) or specify `'5m'` / `'1h'` directly
 3. **Cache Tool Definitions**: Set [`AnthropicModelSettings.anthropic_cache_tool_definitions`][pydantic_ai.models.anthropic.AnthropicModelSettings.anthropic_cache_tool_definitions] to `True` (uses 5m TTL by default) or specify `'5m'` / `'1h'` directly
-4. **Cache All (Convenience)**: Set [`AnthropicModelSettings.anthropic_cache_all`][pydantic_ai.models.anthropic.AnthropicModelSettings.anthropic_cache_all] to `True` to automatically cache both system instructions and the last user message
+4. **Cache Last Message (Convenience)**: Set [`AnthropicModelSettings.anthropic_cache_messages`][pydantic_ai.models.anthropic.AnthropicModelSettings.anthropic_cache_messages] to `True` to automatically cache the last user message

 You can combine multiple strategies for maximum savings:

 ```python {test="skip"}
 from pydantic_ai import Agent, CachePoint, RunContext
 from pydantic_ai.models.anthropic import AnthropicModelSettings

-# Option 1: Use anthropic_cache_all for convenience (caches system + last message)
+# Option 1: Use anthropic_cache_messages for convenience (caches last message only)
 agent = Agent(
     'anthropic:claude-sonnet-4-5',
     system_prompt='Detailed instructions...',
     model_settings=AnthropicModelSettings(
-        anthropic_cache_all=True,  # Caches both system prompt and last message
+        anthropic_cache_messages=True,  # Caches the last user message
     ),
 )

@@ -159,35 +159,77 @@ async def main():

 ### Cache Point Limits

-Anthropic enforces a maximum of 4 cache points per request. Pydantic AI automatically manages this limit:
+Anthropic enforces a maximum of 4 cache points per request. Pydantic AI automatically manages this limit to ensure your requests always comply without errors.

-- **`anthropic_cache_all`**: Uses 2 cache points (system instructions + last message)
-- **`anthropic_cache_instructions`**: Uses 1 cache point
-- **`anthropic_cache_tool_definitions`**: Uses 1 cache point
-- **`CachePoint` markers**: Use remaining available cache points
+#### How Cache Points Are Allocated

-When the total exceeds 4 cache points, Pydantic AI automatically removes cache points from **older messages** (keeping the most recent ones), ensuring your requests always comply with Anthropic's limits without errors.
+Cache points can be placed in three locations:
+
+1. **System Prompt**: Via `anthropic_cache_instructions` setting (adds cache point to last system prompt block)
+2. **Tool Definitions**: Via `anthropic_cache_tool_definitions` setting (adds cache point to last tool definition)
+3. **Messages**: Via `CachePoint` markers or `anthropic_cache_messages` setting (adds cache points to message content)
+
+Each setting uses **at most 1 cache point**, but you can combine them:

 ```python {test="skip"}
 from pydantic_ai import Agent, CachePoint
 from pydantic_ai.models.anthropic import AnthropicModelSettings

+# Example: Using all 3 cache point sources
+agent = Agent(
+    'anthropic:claude-sonnet-4-5',
+    system_prompt='Detailed instructions...',
+    model_settings=AnthropicModelSettings(
+        anthropic_cache_instructions=True,  # 1 cache point
+        anthropic_cache_tool_definitions=True,  # 1 cache point
+        anthropic_cache_messages=True,  # 1 cache point
+    ),
+)
+
+@agent.tool_plain
+def my_tool() -> str:
+    return 'result'
+
+async def main():
+    # This uses 3 cache points (instructions + tools + last message)
+    # You can add 1 more CachePoint marker before hitting the limit
+    result = await agent.run([
+        'Context', CachePoint(),  # 4th cache point - OK
+        'Question'
+    ])
+```
+
+#### Automatic Cache Point Limiting
+
+When cache points from all sources (settings + `CachePoint` markers) exceed 4, Pydantic AI automatically removes excess cache points from **older message content** (keeping the most recent ones):
+
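
One way to picture the trimming rule is the standalone sketch below. It does not show Pydantic AI's actual implementation. `Marker` is a hypothetical stand-in for `CachePoint`, `trim_markers` is an illustrative helper, and the sketch assumes message content is a flat list in which earlier items are older.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Marker:
    """Hypothetical stand-in for a CachePoint marker in message content."""


def trim_markers(parts: list, budget: int) -> list:
    """Keep only the `budget` most recent markers, dropping older ones."""
    marker_positions = [i for i, p in enumerate(parts) if isinstance(p, Marker)]
    excess = max(len(marker_positions) - budget, 0)
    # Older markers appear earlier in the list, so the first `excess` are dropped.
    to_drop = set(marker_positions[:excess])
    return [p for i, p in enumerate(parts) if i not in to_drop]


parts = ['A', Marker(), 'B', Marker(), 'C', Marker(), 'Question']
# With 2 cache points taken by settings, 4 - 2 = 2 markers may remain.
print(trim_markers(parts, budget=2))
# ['A', 'B', Marker(), 'C', Marker(), 'Question']
```

Only the markers are removed. The surrounding content is left in place and simply stops being a cache boundary.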
+```python {test="skip"}
 agent = Agent(
     'anthropic:claude-sonnet-4-5',
     system_prompt='Instructions...',
     model_settings=AnthropicModelSettings(
-        anthropic_cache_all=True,  # Uses 2 cache points
+        anthropic_cache_instructions=True,  # 1 cache point
+        anthropic_cache_tool_definitions=True,  # 1 cache point
     ),
 )

+@agent.tool_plain
+def search() -> str:
+    return 'data'
+
 async def main():
-    # Even with multiple CachePoint markers, only 2 more will be kept
-    # (4 total limit - 2 from cache_all = 2 available)
+    # Already using 2 cache points (instructions + tools)
+    # Can add 2 more CachePoint markers (4 total limit)