1. The optional `metadata` parameter can attach arbitrary context to deferred tool calls, accessible in `DeferredToolRequests.metadata` keyed by `tool_call_id`.
_(This example is complete, it can be run "as is")_
1. Generate a task ID that can be tracked independently of the tool call ID.
2. The optional `metadata` parameter passes the `task_id` so it can be matched with results later, accessible in `DeferredToolRequests.metadata` keyed by `tool_call_id`.
3. In reality, this would typically happen in a separate process that polls for the task status or is notified when all pending tasks are complete.
_(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)_
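The matching logic the notes above describe can be sketched in pure Python. The dict shapes below are hypothetical stand-ins, not the actual `DeferredToolRequests` API: the point is only that metadata is keyed by `tool_call_id`, so a stored `task_id` lets you match background-task results back to their originating tool calls.

```python
# Hypothetical stand-ins for deferred tool bookkeeping: metadata is
# keyed by tool_call_id, and each entry carries the task_id we generated.
metadata = {
    'call_1': {'task_id': 'task_abc'},
    'call_2': {'task_id': 'task_def'},
}
# Results of the background tasks, keyed by task_id.
task_results = {'task_abc': '42', 'task_def': 'Paris'}

# Match each finished task back to its originating tool call.
results_by_call = {
    call_id: task_results[meta['task_id']]
    for call_id, meta in metadata.items()
}
print(results_by_call)
#> {'call_1': '42', 'call_2': 'Paris'}
```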
`docs/durable_execution/temporal.md` (1 addition, 1 deletion)

As workflows and activities run in separate processes, any values passed between them need to be serializable.
To account for these limitations, tool functions and the [event stream handler](#streaming) running inside activities receive a limited version of the agent's [`RunContext`][pydantic_ai.tools.RunContext], and it's your responsibility to make sure that the [dependencies](../dependencies.md) object provided to [`TemporalAgent.run()`][pydantic_ai.durable_exec.temporal.TemporalAgent.run] can be serialized using Pydantic.
Specifically, only the `deps`, `run_id`, `retries`, `tool_call_id`, `tool_name`, `tool_call_approved`, `retry`, `max_retries`, `run_step`, `usage`, and `partial_output` fields are available by default, and trying to access `model`, `prompt`, `messages`, or `tracer` will raise an error.
If you need one or more of these attributes to be available inside activities, you can create a [`TemporalRunContext`][pydantic_ai.durable_exec.temporal.TemporalRunContext] subclass with custom `serialize_run_context` and `deserialize_run_context` class methods and pass it to [`TemporalAgent`][pydantic_ai.durable_exec.temporal.TemporalAgent] as `run_context_type`.
`docs/gateway.md` (7 additions, 7 deletions)
# Pydantic AI Gateway
**[Pydantic AI Gateway](https://pydantic.dev/ai-gateway)** is a unified interface for accessing multiple AI providers with a single key. Features include built-in OpenTelemetry observability, real-time cost monitoring, failover management, and native integration with the other tools in the [Pydantic stack](https://pydantic.dev/).
!!! note "Free while in Beta"
    The Pydantic AI Gateway is currently in Beta. You can bring your own key (BYOK) or buy inference through the Gateway (we will eat the card fee for now).
To help you get started with [Pydantic AI Gateway](https://gateway.pydantic.dev), the key features include:
- **BYOK and managed providers:** Bring your own API keys (BYOK) from LLM providers, or pay for inference directly through the platform.
- **Multi-provider support:** Access models from OpenAI, Anthropic, Google Vertex, Groq, and AWS Bedrock. _More providers coming soon_.
- **Backend observability:** Log every request through [Pydantic Logfire](https://pydantic.dev/logfire) or any OpenTelemetry backend (_coming soon_).
- **Zero translation**: Unlike traditional AI gateways that translate everything to one common schema, **Pydantic AI Gateway** allows requests to flow through directly in each provider's native format. This gives you immediate access to new model features as soon as they are released.
- **Open source with self-hosting**: The Pydantic AI Gateway core is [open source](https://github.com/pydantic/pydantic-ai-gateway/) (under [AGPL-3.0](https://www.gnu.org/licenses/agpl-3.0.en.html)), allowing self-hosting with file-based configuration instead of using the managed service.
- **Enterprise ready**: Includes SSO (with OIDC support), granular permissions, and flexible deployment options. Deploy to your Cloudflare account, or run on-premises with our [consulting support](https://pydantic.dev/contact).
```python {title="hello_world.py"}
from pydantic_ai import Agent

# Route the request through the Gateway by prefixing the provider
# with `gateway/` (format: `gateway/<api_format>:<model_name>`).
agent = Agent('gateway/openai:gpt-5')
result = agent.run_sync('Hello, world!')
print(result.output)
```
## Usage
After setting up your account with the instructions above, you will be able to make an AI model request with the Pydantic AI Gateway.
The code snippets below show how you can use Pydantic AI Gateway with different frameworks and SDKs.
You can add `gateway/` as a prefix to every known provider that is supported.
To use different models, change the model string `gateway/<api_format>:<model_name>` to other models offered by the supported providers.
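For illustration, the model string format above can be built with a small helper (the helper itself is ours, not part of the Gateway API):

```python
def gateway_model(api_format: str, model_name: str) -> str:
    """Build a Gateway model string: gateway/<api_format>:<model_name>."""
    return f'gateway/{api_format}:{model_name}'


print(gateway_model('openai', 'gpt-5'))
#> gateway/openai:gpt-5
print(gateway_model('anthropic', 'claude-sonnet-4-5'))
#> gateway/anthropic:claude-sonnet-4-5
```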
Before you start, make sure you are on version 1.16 or later of `pydantic-ai`.
Set the `PYDANTIC_AI_GATEWAY_API_KEY` environment variable to your Gateway API key:
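For example, in a POSIX shell (`your-api-key` is a placeholder for your actual key):

```shell
export PYDANTIC_AI_GATEWAY_API_KEY='your-api-key'
```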
`docs/logfire.md` (16 additions, 35 deletions)

We can also query data with SQL in Logfire to monitor the performance of an application.
### Monitoring HTTP Requests
!!! tip "\"F**k you, show me the prompt.\""
    As per Hamel Husain's influential 2024 blog post ["Fuck You, Show Me The Prompt."](https://hamel.dev/blog/posts/prompt/)
    (bear with the capitalization, the point is valid), it's often useful to be able to view the raw HTTP requests and responses made to model providers.

To observe raw HTTP requests made to model providers, you can use Logfire's [HTTPX instrumentation](https://logfire.pydantic.dev/docs/integrations/http-clients/httpx/) since all provider SDKs (except for [Bedrock](models/bedrock.md)) use the [HTTPX](https://www.python-httpx.org/) library internally:
```python
import logfire

from pydantic_ai import Agent

logfire.configure()
logfire.instrument_httpx(capture_all=True)  # (1)!

agent = Agent('openai:gpt-5')
result = agent.run_sync('What is the capital of France?')
print(result.output)
#> The capital of France is Paris.
```

1. See the [`logfire.instrument_httpx` docs][logfire.Logfire.instrument_httpx] for more details; `capture_all=True` means both headers and body are captured for both the request and response.


MCP servers can provide instructions during initialization that give context about how to best interact with the server's tools. These instructions are accessible via the [`instructions`][pydantic_ai.mcp.MCPServer.instructions] property after the server connection is established.
```python {title="mcp_server_instructions.py"}
from pydantic_ai import Agent
from pydantic_ai.mcp import MCPServerStreamableHTTP

server = MCPServerStreamableHTTP('http://localhost:8000/mcp')
agent = Agent('openai:gpt-5', toolsets=[server])


@agent.instructions
async def mcp_server_instructions():
    return server.instructions  # (1)!


async def main():
    result = await agent.run('What is 7 plus 5?')
    print(result.output)
    #> The answer is 12.
```
1. The server connection is guaranteed to be established by this point, so `server.instructions` is available.
## Tool metadata
MCP tools can include metadata that provides additional information about the tool's characteristics, which can be useful when [filtering tools][pydantic_ai.toolsets.FilteredToolset]. The `meta`, `annotations`, and `output_schema` fields can be found on the `metadata` dict on the [`ToolDefinition`][pydantic_ai.tools.ToolDefinition] object that's passed to filter functions.
`docs/models/anthropic.md` (5 additions, 4 deletions)
Anthropic supports [prompt caching](https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching) to reduce costs by caching parts of your prompts. Pydantic AI provides three ways to use prompt caching:
1. **Cache User Messages with [`CachePoint`][pydantic_ai.messages.CachePoint]**: Insert a `CachePoint` marker in your user messages to cache everything before it
2. **Cache System Instructions**: Set [`AnthropicModelSettings.anthropic_cache_instructions`][pydantic_ai.models.anthropic.AnthropicModelSettings.anthropic_cache_instructions] to `True` (uses 5m TTL by default) or specify `'5m'` / `'1h'` directly
3. **Cache Tool Definitions**: Set [`AnthropicModelSettings.anthropic_cache_tool_definitions`][pydantic_ai.models.anthropic.AnthropicModelSettings.anthropic_cache_tool_definitions] to `True` (uses 5m TTL by default) or specify `'5m'` / `'1h'` directly
You can combine all three strategies for maximum savings:
```python
from pydantic_ai import Agent
from pydantic_ai.models.anthropic import AnthropicModelSettings

agent = Agent(
    'anthropic:claude-sonnet-4-5',
    system_prompt='Detailed instructions...',
    model_settings=AnthropicModelSettings(
        # Use True for default 5m TTL, or specify '5m' / '1h' directly
        anthropic_cache_instructions=True,
        anthropic_cache_tool_definitions='1h',  # Longer cache for tool definitions
    ),
)
```