pydantic
diff --git a/.gitignore b/.gitignore
Lines changed: 1 addition & 0 deletions
diff --git a/docs/api/models/mcp-sampling.md b/docs/api/models/mcp-sampling.md
Lines changed: 3 additions & 0 deletions
diff --git a/docs/common-tools.md b/docs/common-tools.md
Lines changed: 2 additions & 2 deletions
diff --git a/docs/dependencies.md b/docs/dependencies.md
Lines changed: 1 addition & 1 deletion
diff --git a/docs/evals.md b/docs/evals.md
Lines changed: 2 additions & 2 deletions
diff --git a/docs/graph.md b/docs/graph.md
Lines changed: 6 additions & 6 deletions
diff --git a/docs/input.md b/docs/input.md
Lines changed: 4 additions & 4 deletions
diff --git a/docs/mcp/client.md b/docs/mcp/client.md
Lines changed: 110 additions & 3 deletions
diff --git a/docs/mcp/run-python.md b/docs/mcp/run-python.md
Lines changed: 1 addition & 1 deletion
@@ -19,3 +19,4 @@ examples/pydantic_ai_examples/.chat_app_messages.sqlite
 node_modules/
 **.idea/
 .coverage*
+/test_tmp/
@@ -0,0 +1,3 @@
+# pydantic_ai.models.mcp_sampling
+
+::: pydantic_ai.models.mcp_sampling
@@ -20,7 +20,7 @@ pip/uv-add "pydantic-ai-slim[duckduckgo]"
 
 Here's an example of how you can use the DuckDuckGo search tool with an agent:
 
-```py {title="main.py" test="skip"}
+```py {title="duckduckgo_search.py" test="skip"}
 from pydantic_ai import Agent
 from pydantic_ai.common_tools.duckduckgo import duckduckgo_search_tool
 
@@ -103,7 +103,7 @@ pip/uv-add "pydantic-ai-slim[tavily]"
 
 Here's an example of how you can use the Tavily search tool with an agent:
 
-```py {title="main.py" test="skip"}
+```py {title="tavily_search.py" test="skip"}
 import os
 
 from pydantic_ai.agent import Agent
 
@@ -276,7 +276,7 @@ async def application_code(prompt: str) -> str:  # (3)!
 
 _(This example is complete, it can be run "as is")_
 
-```python {title="test_joke_app.py" hl_lines="10-12" call_name="test_application_code"}
+```python {title="test_joke_app.py" hl_lines="10-12" call_name="test_application_code" requires="joke_app.py"}
 from joke_app import MyDeps, application_code, joke_agent
 
 
 
@@ -55,7 +55,7 @@ Evaluators are the components that analyze and score the results of your task wh
 
 Pydantic Evals includes several built-in evaluators and allows you to create custom evaluators:
 
-```python {title="simple_eval_evaluator.py"}
+```python {title="simple_eval_evaluator.py" requires="simple_eval_dataset.py"}
 from dataclasses import dataclass
 
 from simple_eval_dataset import dataset
@@ -616,7 +616,7 @@ _(This example is complete, it can be run "as is" — you'll need to add `asynci
 
 You can also write datasets as JSON files:
 
-```python {title="generate_dataset_example_json.py"}
+```python {title="generate_dataset_example_json.py" requires="generate_dataset_example.py"}
 from pathlib import Path
 
 from generate_dataset_example import AnswerOutput, MetadataType, QuestionInputs
 
@@ -167,7 +167,7 @@ _(This example is complete, it can be run "as is" with Python 3.10+)_
 
 A [mermaid diagram](#mermaid-diagrams) for this graph can be generated with the following code:
 
-```py {title="graph_example_diagram.py" py="3.10"}
+```py {title="graph_example_diagram.py" py="3.10" requires="graph_example.py"}
 from graph_example import DivisibleBy5, fives_graph
 
 fives_graph.mermaid_code(start_node=DivisibleBy5)
@@ -308,7 +308,7 @@ _(This example is complete, it can be run "as is" with Python 3.10+ — you'll n
 
 A [mermaid diagram](#mermaid-diagrams) for this graph can be generated with the following code:
 
-```py {title="vending_machine_diagram.py" py="3.10"}
+```py {title="vending_machine_diagram.py" py="3.10" requires="vending_machine.py"}
 from vending_machine import InsertCoin, vending_machine_graph
 
 vending_machine_graph.mermaid_code(start_node=InsertCoin)
@@ -524,7 +524,7 @@ Alternatively, you can drive iteration manually with the [`GraphRun.next`][pydan
 
 Below is a contrived example that stops whenever the counter is at 2, ignoring any node runs beyond that:
 
-```python {title="count_down_next.py" noqa="I001" py="3.10"}
+```python {title="count_down_next.py" noqa="I001" py="3.10" requires="count_down.py"}
 from pydantic_graph import End, FullStatePersistence
 from count_down import CountDown, CountDownState, count_down_graph
 
@@ -593,7 +593,7 @@ We can run the `count_down_graph` from [above](#iterating-over-a-graph), using [
 
 As you can see in this code, `run_node` requires no external application state (apart from state persistence) to be run, meaning graphs can easily be executed by distributed execution and queueing systems.
 
-```python {title="count_down_from_persistence.py" noqa="I001" py="3.10"}
+```python {title="count_down_from_persistence.py" noqa="I001" py="3.10" requires="count_down.py"}
 from pathlib import Path
 
 from pydantic_graph import End
@@ -746,7 +746,7 @@ Instead of running the entire graph in a single process invocation, we run the g
 
     _(This example is complete, it can be run "as is" with Python 3.10+)_
 
-```python {title="ai_q_and_a_run.py" noqa="I001" py="3.10"}
+```python {title="ai_q_and_a_run.py" noqa="I001" py="3.10" requires="ai_q_and_a_graph.py"}
 import sys
 from pathlib import Path
 
@@ -965,7 +965,7 @@ You can specify the direction of the state diagram using one of the following va
 - `'BT'`: Bottom to top, the diagram flows vertically from bottom to top.
 
 Here is an example of how to do this using 'Left to Right' (LR) instead of the default 'Top to Bottom' (TB):
-```py {title="vending_machine_diagram.py" py="3.10"}
+```py {title="vending_machine_diagram.py" py="3.10" requires="vending_machine.py"}
 from vending_machine import InsertCoin, vending_machine_graph
 
 vending_machine_graph.mermaid_code(start_node=InsertCoin, direction='LR')
 
@@ -10,7 +10,7 @@ Some LLMs are now capable of understanding audio, video, image and document cont
 
 If you have a direct URL for the image, you can use [`ImageUrl`][pydantic_ai.ImageUrl]:
 
-```py {title="main.py" test="skip" lint="skip"}
+```py {title="image_input.py" test="skip" lint="skip"}
 from pydantic_ai import Agent, ImageUrl
 
 agent = Agent(model='openai:gpt-4o')
@@ -26,7 +26,7 @@ print(result.output)
 
 If you have the image locally, you can also use [`BinaryContent`][pydantic_ai.BinaryContent]:
 
-```py {title="main.py" test="skip" lint="skip"}
+```py {title="local_image_input.py" test="skip" lint="skip"}
 import httpx
 
 from pydantic_ai import Agent, BinaryContent
@@ -69,7 +69,7 @@ You can provide document input using either [`DocumentUrl`][pydantic_ai.Document
 
 If you have a direct URL for the document, you can use [`DocumentUrl`][pydantic_ai.DocumentUrl]:
 
-```py {title="main.py" test="skip" lint="skip"}
+```py {title="document_input.py" test="skip" lint="skip"}
 from pydantic_ai import Agent, DocumentUrl
 
 agent = Agent(model='anthropic:claude-3-sonnet')
@@ -87,7 +87,7 @@ The supported document formats vary by model.
 
 You can also use [`BinaryContent`][pydantic_ai.BinaryContent] to pass document data directly:
 
-```py {title="main.py" test="skip" lint="skip"}
+```py {title="binary_content_input.py" test="skip" lint="skip"}
 from pathlib import Path
 from pydantic_ai import Agent, BinaryContent
 
 
@@ -98,7 +98,7 @@ Will display as follows:
 
 Before creating the Streamable HTTP client, we need to run a server that supports the Streamable HTTP transport.
 
-```python {title="streamable_http_server.py" py="3.10" test="skip"}
+```python {title="streamable_http_server.py" py="3.10" dunder_name="not_main"}
 from mcp.server.fastmcp import FastMCP
 
 app = FastMCP()
@@ -107,7 +107,8 @@ app = FastMCP()
 def add(a: int, b: int) -> int:
     return a + b
 
-app.run(transport='streamable-http')
+if __name__ == '__main__':
+    app.run(transport='streamable-http')
 ```
 
 Then we can create the client:
@@ -194,7 +195,7 @@ async def process_tool_call(
     return await call_tool(tool_name, args, metadata={'deps': ctx.deps})
 
 
-server = MCPServerStdio('python', ['-m', 'tests.mcp_server'], process_tool_call=process_tool_call)
+server = MCPServerStdio('python', ['mcp_server.py'], process_tool_call=process_tool_call)
 agent = Agent(
     model=TestModel(call_tools=['echo_deps']),
     deps_type=int,
@@ -275,3 +276,109 @@ agent = Agent('openai:gpt-4o', mcp_servers=[python_server, js_server])
 ```
 
 When the model interacts with these servers, it will see the prefixed tool names, but the prefixes will be automatically handled when making tool calls.
+
+## MCP Sampling
+
+!!! info "What is MCP Sampling?"
+    In MCP, [sampling](https://modelcontextprotocol.io/docs/concepts/sampling) is a mechanism by which an MCP server can make LLM calls via the MCP client, effectively proxying requests to an LLM via the client over whatever transport is being used.
+
+    Sampling is extremely useful when MCP servers need to use Gen AI but you don't want to provision each of them with its own LLM credentials, or when a public MCP server would like the connecting client to pay for LLM calls.
+
+    Confusingly, it has nothing to do with the concept of "sampling" in observability, or frankly the concept of "sampling" in any other domain.
+
+    ??? info "Sampling Diagram"
+        Here's a mermaid diagram that may or may not make the data flow clearer:
+
+        ```mermaid
+        sequenceDiagram
+            participant LLM
+            participant MCP_Client as MCP client
+            participant MCP_Server as MCP server
+
+            MCP_Client->>LLM: LLM call
+            LLM->>MCP_Client: LLM tool call response
+
+            MCP_Client->>MCP_Server: tool call
+            MCP_Server->>MCP_Client: sampling "create message"
+
+            MCP_Client->>LLM: LLM call
+            LLM->>MCP_Client: LLM text response
+
+            MCP_Client->>MCP_Server: sampling response
+            MCP_Server->>MCP_Client: tool call response
+        ```
+
+Pydantic AI supports sampling as both a client and server. See the [server](./server.md#mcp-sampling) documentation for details on how to use sampling within a server.
+
+Sampling is automatically supported by Pydantic AI agents when they act as a client.
+
+Let's say we have an MCP server that wants to use sampling (in this case to generate an SVG as per the tool arguments).
+
+??? example "Sampling MCP Server"
+
+    ```python {title="generate_svg.py" py="3.10"}
+    import re
+    from pathlib import Path
+
+    from mcp import SamplingMessage
+    from mcp.server.fastmcp import Context, FastMCP
+    from mcp.types import TextContent
+
+    app = FastMCP()
+
+
+    @app.tool()
+    async def image_generator(ctx: Context, subject: str, style: str) -> str:
+        prompt = f'{subject=} {style=}'
+        # `ctx.session.create_message` is the sampling call
+        result = await ctx.session.create_message(
+            [SamplingMessage(role='user', content=TextContent(type='text', text=prompt))],
+            max_tokens=1_024,
+            system_prompt='Generate an SVG image as per the user input',
+        )
+        assert isinstance(result.content, TextContent)
+
+        path = Path(f'{subject}_{style}.svg')
+        # remove triple backticks if the svg was returned within markdown
+        if m := re.search(r'^```\w*$(.+?)```$', result.content.text, re.S | re.M):
+            path.write_text(m.group(1))
+        else:
+            path.write_text(result.content.text)
+        return f'See {path}'
+
+
+    if __name__ == '__main__':
+        # run the server via stdio
+        app.run()
+    ```
+
+Using this server with an `Agent` will automatically allow sampling:
+
+```python {title="sampling_mcp_client.py" py="3.10" requires="generate_svg.py"}
+from pydantic_ai import Agent
+from pydantic_ai.mcp import MCPServerStdio
+
+server = MCPServerStdio(command='python', args=['generate_svg.py'])
+agent = Agent('openai:gpt-4o', mcp_servers=[server])
+
+
+async def main():
+    async with agent.run_mcp_servers():
+        result = await agent.run('Create an image of a robot in a punk style.')
+    print(result.output)
+    #> Image file written to robot_punk.svg.
+```
+
+_(This example is complete, it can be run "as is" with Python 3.10+)_
+
+You can disallow sampling by setting [`allow_sampling=False`][pydantic_ai.mcp.MCPServerStdio.allow_sampling] when creating the server reference, e.g.:
+
+```python {title="sampling_disallowed.py" hl_lines="6" py="3.10"}
+from pydantic_ai.mcp import MCPServerStdio
+
+server = MCPServerStdio(
+    command='python',
+    args=['generate_svg.py'],
+    allow_sampling=False,
+)
+```
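Note on the fence-stripping step in `generate_svg.py` above: the regex can be exercised on its own, without an MCP server or an LLM. The sketch below reuses the same pattern and flags from the diff; the helper name `strip_markdown_fence` and the sample strings are illustrative assumptions and are not part of this change. It shows that the captured group keeps the newlines around the SVG body, and that unfenced text is passed through unchanged.

```python
import re

# Same pattern and flags as in generate_svg.py: an opening fence line
# (optionally with a language tag), a lazily matched body, then a closing
# fence that ends its line.
FENCE_RE = re.compile(r'^```\w*$(.+?)```$', re.S | re.M)


def strip_markdown_fence(text: str) -> str:
    """Hypothetical helper: return the body of the first fenced block, or the text unchanged."""
    m = FENCE_RE.search(text)
    return m.group(1) if m else text


# Fenced reply: the captured body keeps its leading and trailing newline.
print(repr(strip_markdown_fence('```svg\n<svg></svg>\n```')))
#> '\n<svg></svg>\n'

# Unfenced reply: no match, so the text is returned as-is.
print(repr(strip_markdown_fence('<svg></svg>')))
#> '<svg></svg>'
```

If the surrounding newlines are unwanted in the written file, calling `.strip()` on the result would remove them; the code in the diff writes the group as captured.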
@@ -122,7 +122,7 @@ As introduced in PEP 723, explained [here](https://packaging.python.org/en/lates
 
 This allows use of dependencies that aren't imported in the code, and is more explicit.
 
-```py {title="inline_script_metadata.py" py="3.10"}
+```py {title="inline_script_metadata.py" py="3.10" requires="mcp_run_python.py"}
 from mcp import ClientSession
 from mcp.client.stdio import stdio_client