Add WebFetchTool builtin tool support #3427
base: main
Changes from all commits
9a92f05
8a0bbcc
3e36dc5
a95b249
f72bbe8
726957b
4ddee67
4d5da14
31dbe36
cd8b04b
4e54dfb
d8ef825
f8f9cd5
ba7f1fd
1e59fce
89751ce
b054020
ced98b4
e8f7a25
17fb30c
c048d15
52f2f86
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -9,7 +9,7 @@ Pydantic AI supports the following built-in tools: | |
| - **[`WebSearchTool`][pydantic_ai.builtin_tools.WebSearchTool]**: Allows agents to search the web | ||
| - **[`CodeExecutionTool`][pydantic_ai.builtin_tools.CodeExecutionTool]**: Enables agents to execute code in a secure environment | ||
| - **[`ImageGenerationTool`][pydantic_ai.builtin_tools.ImageGenerationTool]**: Enables agents to generate images | ||
| - **[`UrlContextTool`][pydantic_ai.builtin_tools.UrlContextTool]**: Enables agents to pull URL contents into their context | ||
| - **[`WebFetchTool`][pydantic_ai.builtin_tools.WebFetchTool]**: Enables agents to fetch web pages | ||
| - **[`MemoryTool`][pydantic_ai.builtin_tools.MemoryTool]**: Enables agents to use memory | ||
| - **[`MCPServerTool`][pydantic_ai.builtin_tools.MCPServerTool]**: Enables agents to use remote MCP servers with communication handled by the model provider | ||
|
|
||
|
|
@@ -306,18 +306,18 @@ For more details, check the [API documentation][pydantic_ai.builtin_tools.ImageG | |
| | `quality` | ✅ | ❌ | | ||
| | `size` | ✅ | ❌ | | ||
|
|
||
| ## URL Context Tool | ||
| ## Web Fetch Tool | ||
|
|
||
| The [`UrlContextTool`][pydantic_ai.builtin_tools.UrlContextTool] enables your agent to pull URL contents into its context, | ||
| The [`WebFetchTool`][pydantic_ai.builtin_tools.WebFetchTool] enables your agent to pull URL contents into its context, | ||
| allowing it to pull up-to-date information from the web. | ||
|
|
||
| ### Provider Support | ||
|
|
||
| | Provider | Supported | Notes | | ||
| |----------|-----------|-------| | ||
| | Anthropic | ✅ | Full feature support. Uses Anthropic's [Web Fetch Tool](https://docs.claude.com/en/docs/agents-and-tools/tool-use/web-fetch-tool) internally to retrieve URL contents. | | ||
| | Google | ✅ | No [`BuiltinToolCallPart`][pydantic_ai.messages.BuiltinToolCallPart] or [`BuiltinToolReturnPart`][pydantic_ai.messages.BuiltinToolReturnPart] is currently generated; please submit an issue if you need this. Using built-in tools and function tools (including [output tools](output.md#tool-output)) at the same time is not supported; to use structured output, use [`PromptedOutput`](output.md#prompted-output) instead. | | ||
|
Collaborator
While we're at it, would you be up for fixing "No [ Per https://ai.google.dev/gemini-api/docs/url-context#contextual-response the data is available, and I see the same in the It may be better to do that in a future PR (not necessarily you), unless you feel like doing it now :)
||
| | OpenAI | ❌ | | | ||
| | Anthropic | ❌ | | | ||
| | Groq | ❌ | | | ||
| | Bedrock | ❌ | | | ||
| | Mistral | ❌ | | | ||
|
|
@@ -327,10 +327,10 @@ allowing it to pull up-to-date information from the web. | |
|
|
||
| ### Usage | ||
|
|
||
| ```py {title="url_context_basic.py"} | ||
| from pydantic_ai import Agent, UrlContextTool | ||
| ```py {title="web_fetch_basic.py"} | ||
| from pydantic_ai import Agent, WebFetchTool | ||
|
|
||
| agent = Agent('google-gla:gemini-2.5-flash', builtin_tools=[UrlContextTool()]) | ||
| agent = Agent('google-gla:gemini-2.5-flash', builtin_tools=[WebFetchTool()]) | ||
|
|
||
| result = agent.run_sync('What is this? https://ai.pydantic.dev') | ||
| print(result.output) | ||
|
|
@@ -339,6 +339,48 @@ print(result.output) | |
|
|
||
| _(This example is complete, it can be run "as is")_ | ||
|
|
||
| ### Parameters | ||
|
Collaborator
We call them Configuration Options in all the other examples; please make sure the wording is consistent, as well as the way the table is structured, the Provider Support subsection, etc.
||
|
|
||
| The [`WebFetchTool`][pydantic_ai.builtin_tools.WebFetchTool] supports several configuration parameters. The parameters that are actually used depend on the model provider. | ||
|
|
||
| | Parameter | Type | Description | Supported by | | ||
| |-----------|------|-------------|--------------| | ||
| | `max_uses` | `int \| None` | Limit the number of URL fetches per request | Anthropic | | ||
| | `allowed_domains` | `list[str] \| None` | Only fetch from these domains | Anthropic | | ||
| | `blocked_domains` | `list[str] \| None` | Never fetch from these domains | Anthropic | | ||
| | `citations_enabled` | `bool` | Enable citations for fetched content | Anthropic | | ||
| | `max_content_tokens` | `int \| None` | Maximum content length in tokens | Anthropic | | ||
|
|
||
| !!! note | ||
| With Anthropic, you can only use one of `blocked_domains` or `allowed_domains`, not both. | ||
|
|
||
| !!! note | ||
| Google's URL context tool does not support any configuration parameters. The limits are fixed at 20 URLs per request with a maximum of 34MB per URL. | ||
|
Collaborator
This should be in a Provider Support Notes column
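The Anthropic note in the diff above says only one of `allowed_domains` or `blocked_domains` may be set. The diff does not show any client-side check for this, so the following is only a minimal sketch of what such a guard could look like. `check_domain_filters` is a hypothetical helper, not something this PR adds.

```python
from __future__ import annotations


def check_domain_filters(
    allowed_domains: list[str] | None,
    blocked_domains: list[str] | None,
) -> None:
    """Raise if both filters are set. Hypothetical helper, not part of the PR."""
    if allowed_domains and blocked_domains:
        raise ValueError('Use either `allowed_domains` or `blocked_domains`, not both.')


# A single filter passes silently.
check_domain_filters(['ai.pydantic.dev', 'docs.pydantic.dev'], None)
```

Whether such a check belongs in the tool class or should be left to the provider's own error response is a design choice the PR does not settle.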
||
|
|
||
| Example with parameters (Anthropic only): | ||
|
|
||
| ```py {title="web_fetch_with_params.py"} | ||
| from pydantic_ai import Agent, WebFetchTool | ||
|
|
||
| # Configure WebFetchTool with domain filtering and limits | ||
| web_fetch = WebFetchTool( | ||
| allowed_domains=['ai.pydantic.dev', 'docs.pydantic.dev'], | ||
| max_uses=10, | ||
| citations_enabled=True, | ||
| max_content_tokens=50000, | ||
| ) | ||
|
|
||
| agent = Agent('anthropic:claude-sonnet-4-0', builtin_tools=[web_fetch]) | ||
|
|
||
| result = agent.run_sync( | ||
| 'Compare the documentation at https://ai.pydantic.dev and https://docs.pydantic.dev' | ||
| ) | ||
| print(result.output) | ||
| """ | ||
| Both sites provide comprehensive documentation for Pydantic projects. ai.pydantic.dev focuses on PydanticAI, a framework for building AI agents, while docs.pydantic.dev covers Pydantic, the data validation library. They share similar documentation styles and both emphasize type safety and developer experience. | ||
| """ | ||
| ``` | ||
|
|
||
| ## Memory Tool | ||
|
|
||
| The [`MemoryTool`][pydantic_ai.builtin_tools.MemoryTool] enables your agent to use memory. | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -6,13 +6,14 @@ | |
|
|
||
| import pydantic | ||
| from pydantic_core import core_schema | ||
| from typing_extensions import TypedDict | ||
| from typing_extensions import TypedDict, deprecated | ||
|
|
||
| __all__ = ( | ||
| 'AbstractBuiltinTool', | ||
| 'WebSearchTool', | ||
| 'WebSearchUserLocation', | ||
| 'CodeExecutionTool', | ||
| 'WebFetchTool', | ||
| 'UrlContextTool', | ||
| 'ImageGenerationTool', | ||
| 'MemoryTool', | ||
|
|
@@ -166,18 +167,79 @@ class CodeExecutionTool(AbstractBuiltinTool): | |
|
|
||
|
|
||
| @dataclass(kw_only=True) | ||
| class UrlContextTool(AbstractBuiltinTool): | ||
| class WebFetchTool(AbstractBuiltinTool): | ||
| """Allows your agent to access contents from URLs. | ||
|
|
||
| The parameters that PydanticAI passes depend on the model, as some parameters may not be supported by certain models. | ||
|
|
||
| Supported by: | ||
|
|
||
| * Anthropic | ||
| """ | ||
|
|
||
| kind: str = 'url_context' | ||
| max_uses: int | None = None | ||
| """If provided, the tool will stop fetching URLs after the given number of uses. | ||
|
|
||
| Supported by: | ||
|
|
||
| * Anthropic | ||
| """ | ||
|
|
||
| allowed_domains: list[str] | None = None | ||
| """If provided, only these domains will be fetched. | ||
|
|
||
| With Anthropic, you can only use one of `blocked_domains` or `allowed_domains`, not both. | ||
|
|
||
| Supported by: | ||
|
|
||
| * Anthropic, see <https://docs.anthropic.com/en/docs/agents-and-tools/tool-use/web-fetch-tool#domain-filtering> | ||
| """ | ||
|
|
||
| blocked_domains: list[str] | None = None | ||
| """If provided, these domains will never be fetched. | ||
|
|
||
| With Anthropic, you can only use one of `blocked_domains` or `allowed_domains`, not both. | ||
|
|
||
| Supported by: | ||
|
|
||
| * Anthropic, see <https://docs.anthropic.com/en/docs/agents-and-tools/tool-use/web-fetch-tool#domain-filtering> | ||
| """ | ||
|
|
||
| citations_enabled: bool = False | ||
| """If True, enables citations for fetched content. | ||
|
|
||
| Supported by: | ||
|
|
||
| * Anthropic | ||
| """ | ||
|
|
||
| max_content_tokens: int | None = None | ||
| """Maximum content length in tokens for fetched content. | ||
|
|
||
| Supported by: | ||
|
|
||
| * Anthropic | ||
| """ | ||
|
|
||
| kind: str = 'web_fetch' | ||
| """The kind of tool.""" | ||
|
|
||
|
|
||
| @deprecated('Use `WebFetchTool` instead.') | ||
| class UrlContextTool(WebFetchTool): | ||
| """Deprecated alias for WebFetchTool. Use WebFetchTool instead.""" | ||
|
|
||
| def __init_subclass__(cls, **kwargs: Any) -> None: | ||
| # Skip registration in _BUILTIN_TOOL_TYPES to avoid breaking the discriminated union | ||
| pass | ||
|
|
||
|
|
||
| # Remove UrlContextTool from _BUILTIN_TOOL_TYPES and restore WebFetchTool | ||
| # This ensures the discriminated union only includes WebFetchTool | ||
|
Collaborator
Would that cause issues with old payloads that are being deserialized now? Or old code that is now giving a deprecation warning but hasn't actually been updated yet? Would be worth testing in test_builtin_tools.
||
| _BUILTIN_TOOL_TYPES['url_context'] = WebFetchTool | ||
|
|
||
|
|
||
| @dataclass(kw_only=True) | ||
| class ImageGenerationTool(AbstractBuiltinTool): | ||
| """A builtin tool that allows your agent to generate images. | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -13,7 +13,7 @@ | |
| from .. import ModelHTTPError, UnexpectedModelBehavior, _utils, usage | ||
| from .._run_context import RunContext | ||
| from .._utils import guard_tool_call_id as _guard_tool_call_id | ||
| from ..builtin_tools import CodeExecutionTool, MCPServerTool, MemoryTool, WebSearchTool | ||
| from ..builtin_tools import CodeExecutionTool, MCPServerTool, MemoryTool, WebFetchTool, WebSearchTool | ||
| from ..exceptions import UserError | ||
| from ..messages import ( | ||
| BinaryContent, | ||
|
|
@@ -60,6 +60,7 @@ | |
| BetaBase64PDFBlockParam, | ||
| BetaBase64PDFSourceParam, | ||
| BetaCacheControlEphemeralParam, | ||
| BetaCitationsConfigParam, | ||
| BetaCitationsDelta, | ||
| BetaCodeExecutionTool20250522Param, | ||
| BetaCodeExecutionToolResultBlock, | ||
|
|
@@ -107,12 +108,18 @@ | |
| BetaToolUnionParam, | ||
| BetaToolUseBlock, | ||
| BetaToolUseBlockParam, | ||
| BetaWebFetchTool20250910Param, | ||
| BetaWebFetchToolResultBlock, | ||
| BetaWebFetchToolResultBlockParam, | ||
| BetaWebSearchTool20250305Param, | ||
| BetaWebSearchToolResultBlock, | ||
| BetaWebSearchToolResultBlockContent, | ||
| BetaWebSearchToolResultBlockParam, | ||
| BetaWebSearchToolResultBlockParamContentParam, | ||
| ) | ||
| from anthropic.types.beta.beta_web_fetch_tool_result_block_param import ( | ||
| Content as WebFetchToolResultBlockParamContent, | ||
| ) | ||
| from anthropic.types.beta.beta_web_search_tool_20250305_param import UserLocation | ||
| from anthropic.types.model_param import ModelParam | ||
|
|
||
|
|
@@ -412,6 +419,8 @@ def _process_response(self, response: BetaMessage) -> ModelResponse: | |
| items.append(_map_web_search_tool_result_block(item, self.system)) | ||
| elif isinstance(item, BetaCodeExecutionToolResultBlock): | ||
| items.append(_map_code_execution_tool_result_block(item, self.system)) | ||
| elif isinstance(item, BetaWebFetchToolResultBlock): | ||
| items.append(_map_web_fetch_tool_result_block(item, self.system)) | ||
| elif isinstance(item, BetaRedactedThinkingBlock): | ||
| items.append( | ||
| ThinkingPart(id='redacted_thinking', content='', signature=item.data, provider_name=self.system) | ||
|
|
@@ -507,6 +516,20 @@ def _add_builtin_tools( | |
| elif isinstance(tool, CodeExecutionTool): # pragma: no branch | ||
| tools.append(BetaCodeExecutionTool20250522Param(name='code_execution', type='code_execution_20250522')) | ||
| beta_features.append('code-execution-2025-05-22') | ||
| elif isinstance(tool, WebFetchTool): # pragma: no branch | ||
| citations = BetaCitationsConfigParam(enabled=tool.citations_enabled) if tool.citations_enabled else None | ||
|
Collaborator
Let's name our field
||
| tools.append( | ||
| BetaWebFetchTool20250910Param( | ||
| name='web_fetch', | ||
| type='web_fetch_20250910', | ||
| max_uses=tool.max_uses, | ||
| allowed_domains=tool.allowed_domains, | ||
| blocked_domains=tool.blocked_domains, | ||
| citations=citations, | ||
| max_content_tokens=tool.max_content_tokens, | ||
| ) | ||
| ) | ||
| beta_features.append('web-fetch-2025-09-10') | ||
| elif isinstance(tool, MemoryTool): # pragma: no branch | ||
| if 'memory' not in model_request_parameters.tool_defs: | ||
| raise UserError("Built-in `MemoryTool` requires a 'memory' tool to be defined.") | ||
|
|
@@ -616,6 +639,7 @@ async def _map_message( # noqa: C901 | |
| | BetaServerToolUseBlockParam | ||
| | BetaWebSearchToolResultBlockParam | ||
| | BetaCodeExecutionToolResultBlockParam | ||
| | BetaWebFetchToolResultBlockParam | ||
| | BetaThinkingBlockParam | ||
| | BetaRedactedThinkingBlockParam | ||
| | BetaMCPToolUseBlockParam | ||
|
|
@@ -678,6 +702,14 @@ async def _map_message( # noqa: C901 | |
| input=response_part.args_as_dict(), | ||
| ) | ||
| assistant_content_params.append(server_tool_use_block_param) | ||
| elif response_part.tool_name == WebFetchTool.kind: | ||
| server_tool_use_block_param = BetaServerToolUseBlockParam( | ||
| id=tool_use_id, | ||
| type='server_tool_use', | ||
| name='web_fetch', | ||
| input=response_part.args_as_dict(), | ||
| ) | ||
| assistant_content_params.append(server_tool_use_block_param) | ||
| elif ( | ||
| response_part.tool_name.startswith(MCPServerTool.kind) | ||
| and (server_id := response_part.tool_name.split(':', 1)[1]) | ||
|
|
@@ -724,6 +756,19 @@ async def _map_message( # noqa: C901 | |
| ), | ||
| ) | ||
| ) | ||
| elif response_part.tool_name == WebFetchTool.kind and isinstance( | ||
| response_part.content, dict | ||
| ): | ||
| assistant_content_params.append( | ||
| BetaWebFetchToolResultBlockParam( | ||
| tool_use_id=tool_use_id, | ||
| type='web_fetch_tool_result', | ||
| content=cast( | ||
| WebFetchToolResultBlockParamContent, | ||
| response_part.content, # pyright: ignore[reportUnknownMemberType] | ||
| ), | ||
| ) | ||
| ) | ||
| elif response_part.tool_name.startswith(MCPServerTool.kind) and isinstance( | ||
| response_part.content, dict | ||
| ): # pragma: no branch | ||
|
|
@@ -944,6 +989,11 @@ async def _get_event_iterator(self) -> AsyncIterator[ModelResponseStreamEvent]: | |
| vendor_part_id=event.index, | ||
| part=_map_code_execution_tool_result_block(current_block, self.provider_name), | ||
| ) | ||
| elif isinstance(current_block, BetaWebFetchToolResultBlock): # pragma: lax no cover | ||
| yield self._parts_manager.handle_part( | ||
| vendor_part_id=event.index, | ||
| part=_map_web_fetch_tool_result_block(current_block, self.provider_name), | ||
| ) | ||
| elif isinstance(current_block, BetaMCPToolUseBlock): | ||
| call_part = _map_mcp_server_use_block(current_block, self.provider_name) | ||
| builtin_tool_calls[call_part.tool_call_id] = call_part | ||
|
|
@@ -1050,7 +1100,14 @@ def _map_server_tool_use_block(item: BetaServerToolUseBlock, provider_name: str) | |
| args=cast(dict[str, Any], item.input) or None, | ||
| tool_call_id=item.id, | ||
| ) | ||
| elif item.name in ('web_fetch', 'bash_code_execution', 'text_editor_code_execution'): # pragma: no cover | ||
| elif item.name == 'web_fetch': | ||
| return BuiltinToolCallPart( | ||
| provider_name=provider_name, | ||
| tool_name=WebFetchTool.kind, | ||
| args=cast(dict[str, Any], item.input) or None, | ||
| tool_call_id=item.id, | ||
| ) | ||
| elif item.name in ('bash_code_execution', 'text_editor_code_execution'): # pragma: no cover | ||
| raise NotImplementedError(f'Anthropic built-in tool {item.name!r} is not currently supported.') | ||
| else: | ||
| assert_never(item.name) | ||
|
|
@@ -1086,6 +1143,16 @@ def _map_code_execution_tool_result_block( | |
| ) | ||
|
|
||
|
|
||
| def _map_web_fetch_tool_result_block(item: BetaWebFetchToolResultBlock, provider_name: str) -> BuiltinToolReturnPart: | ||
| return BuiltinToolReturnPart( | ||
| provider_name=provider_name, | ||
| tool_name=WebFetchTool.kind, | ||
| # Store just the content field (BetaWebFetchBlock) which has {content, type, url, retrieved_at} | ||
| content=item.content.model_dump(mode='json'), | ||
| tool_call_id=item.tool_use_id, | ||
| ) | ||
|
|
||
|
|
||
| def _map_mcp_server_use_block(item: BetaMCPToolUseBlock, provider_name: str) -> BuiltinToolCallPart: | ||
| return BuiltinToolCallPart( | ||
| provider_name=provider_name, | ||
|
|
||