Add `Agent.to_mcp()` method #3076
Conversation
Bug: Image Data Decoding Error

The `map_from_pai_messages` function incorrectly calls `base64.b64decode()` on image `chunk.data`. Since `chunk.data` is already binary, not a base64 string, this causes a decoding error and prevents proper image content conversion for MCP.

pydantic_ai_slim/pydantic_ai/_mcp.py, lines 93 to 95 in 0f5e987:

```python
if isinstance(chunk, str):
    add_msg('user', mcp_types.TextContent(type='text', text=chunk))
elif isinstance(chunk, messages.BinaryContent) and chunk.is_image:
```
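A minimal sketch of the fix, assuming MCP's `ImageContent` expects base64-encoded text: the raw bytes should be base64-*encoded*, not decoded.

```python
import base64


def image_chunk_to_mcp_data(data: bytes) -> str:
    """Convert raw image bytes to the base64 string MCP ImageContent expects."""
    # chunk.data is already binary, so it must be encoded, not decoded.
    return base64.b64encode(data).decode('ascii')
```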
```python
server_name: str | None = None,
tool_name: str | None = None,
tool_description: str | None = None,
# TODO(Marcelo): Should this actually be a factory that is created in every tool call?
```
I think a union of static deps and a deps factory makes sense, if the deps factory would get the tool call `_meta`.
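A hedged sketch of what that union could look like; the `DepsFactory` alias, `resolve_deps` helper, and the shape of the `_meta` payload are all assumptions for illustration, not part of the current API:

```python
from typing import Any, Callable, TypeVar, Union

AgentDepsT = TypeVar('AgentDepsT')

# Hypothetical: a factory that receives the tool call's `_meta` dict and builds deps.
DepsFactory = Callable[[dict[str, Any]], AgentDepsT]


def resolve_deps(
    deps: Union[AgentDepsT, DepsFactory[AgentDepsT]],
    meta: dict[str, Any],
) -> AgentDepsT:
    """Return static deps as-is, or call a factory with the tool call _meta.

    Note: this simple `callable()` check would misroute static deps that
    happen to be callable objects; a real API would need a clearer signal.
    """
    if callable(deps):
        return deps(meta)
    return deps
```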
```python
name=tool_name,
description=tool_description,
inputSchema={'type': 'object', 'properties': {'prompt': {'type': 'string'}}},
# TODO(Marcelo): How do I get this?
```
There's not currently a nice way to get this, but it'd be useful to have a new `output_json_schema` property on `AbstractAgent`.

In the case of the concrete `Agent`, it would depend on `agent._output_schema`:

- if it's `StructuredTextOutputSchema` (superclass of `NativeOutputSchema` and `PromptedOutputSchema`), get it from `.object_def.json_schema`
- if it's `OutputSchemaWithoutMode`, get it from `.processor.object_def.json_schema`
- if it's `PlainTextOutputSchema`, it's just `{'type': 'string'}`
- if it's `ToolOutputSchema`, we need to create a union schema of all `.toolset.processors` using `UnionOutputProcessor`, which currently takes a sequence of output types and creates `ObjectOutputProcessor`s for them on the fly, but could also take a sequence of `ObjectOutputProcessor`s (the ones we get from `.toolset.processors`) directly. Once we have that `UnionOutputProcessor`, the schema is on `.object_def.json_schema`
- if it's `ToolOrTextOutputSchema`, it's the above or `{'type': 'string'}`

Note that this changes a bit with some refactoring I did in #2970, but directionally it's the same: there's not currently a nice way to get this, and it's especially tricky for tool output, because we don't have a union of all types handy.
I should be able to implement this pretty quickly though, once that images PR with the output types refactor merges.
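A rough sketch of how that property's dispatch could look. The real `_output_schema` class hierarchy is internal to pydantic-ai, so stand-in stub classes are used here; all names are illustrative assumptions:

```python
from typing import Any


# Stand-in stubs for the internal output-schema classes discussed above.
class PlainTextOutputSchema: ...


class StructuredTextOutputSchema:
    def __init__(self, json_schema: dict[str, Any]):
        # Mimics the `.object_def.json_schema` attribute path mentioned above.
        self.object_def = type('ObjectDef', (), {'json_schema': json_schema})()


def output_json_schema(schema: object) -> dict[str, Any]:
    """Resolve a JSON schema for an agent's output, per the cases above."""
    if isinstance(schema, StructuredTextOutputSchema):
        return schema.object_def.json_schema
    if isinstance(schema, PlainTextOutputSchema):
        return {'type': 'string'}
    # Tool output would need a union schema built from all toolset processors.
    raise NotImplementedError('tool output needs a union schema of all processors')
```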
Should this PR wait for it then?
Yep
I'll wait for it then.
```python
    raise ValueError(f'Unknown tool: {name}')

# TODO(Marcelo): Should we pass the `message_history` instead?
prompt = cast(str, args['prompt'])
```
Can we use a typed dict for args so we don't have to cast?
No, that would be incorrect... What I actually need is to check that `'prompt'` is in `args`, and that it's a string.
I'd expect the library to validate the args match the type hint, no?
Explained via Slack - answering here: no.
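A minimal sketch of that manual check, replacing the `cast`; the `extract_prompt` helper name is an assumption, and `args` is assumed to arrive as an untyped dict from the MCP library:

```python
from typing import Any


def extract_prompt(args: dict[str, Any]) -> str:
    """Validate that the tool call args contain a string 'prompt'."""
    prompt = args.get('prompt')
    if not isinstance(prompt, str):
        # The MCP library does not validate args against a schema here,
        # so the check has to happen at the call site.
        raise ValueError("tool call arguments must include a string 'prompt'")
    return prompt
```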
```python
if name != tool_name:
    raise ValueError(f'Unknown tool: {name}')

# TODO(Marcelo): Should we pass the `message_history` instead?
```
I think just the prompt is fine, when would the LLM generate an entire message history?
Hmmm, I think the point is that we need to maintain the history in the session...
Good point!
We may need to create a database abstraction here. 🤔
Are you sure the tool should be stateful like that? If it's essentially a subagent, wouldn't multiple calls be expected to start separate subagents? I think continuing the conversation should be explicit, with some conversation ID returned and passed in
Yes, if the client wants to create a new conversation, they can open a new session.

> I think continuing the conversation should be explicit, with some conversation ID returned and passed in

The MCP spec handles this with a session ID.
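One way the per-session history could be kept, sketched with a plain in-memory dict keyed by the MCP session ID (the `session_histories` store and `history_for` helper are assumptions; a real implementation might sit behind the database abstraction discussed above):

```python
from typing import Any

# Hypothetical in-memory store: MCP session ID -> accumulated message history.
session_histories: dict[str, list[Any]] = {}


def history_for(session_id: str) -> list[Any]:
    """Fetch (or start) the message history for one MCP session."""
    # A new session starts empty, so opening a new session is how a
    # client starts a fresh conversation, matching the discussion above.
    return session_histories.setdefault(session_id, [])
```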
```python
return dict(result=result.output)

app.list_tools()(list_tools)
```
These could be decorators right?
Type checkers treat decorators applied to functions defined inside another function as misused here, so the plain call form is needed.
Lame
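For context, the call form and the decorator form are equivalent at runtime; the objection above is only about how type checkers see the decorator form inside a function body. A small self-contained illustration (the `make_registry`/`register` names are stand-ins, not the MCP SDK API):

```python
def make_registry():
    registered = {}

    def register(name):
        def wrapper(fn):
            registered[name] = fn
            return fn
        return wrapper

    def list_tools():
        return []

    # Call form, analogous to `app.list_tools()(list_tools)` above.
    # The decorator form would be `@register('list_tools')` on the def.
    register('list_tools')(list_tools)

    return registered
```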
PR pending work on #3076 (comment)