Fix image output validators, unknown tool retries, incomplete tool calls, and parallel task cancellation #4325

Draft
dsfaccini wants to merge 12 commits into pydantic:main from dsfaccini:fix/agent-graph-bugfixes

@dsfaccini dsfaccini commented Feb 15, 2026

Closes #4385

Summary

1. Image output validators were being skipped (bug fix)

Files: _agent_graph.py, result.py

When you configure an agent with output_type=BinaryImage, you can register output validators: functions that run on the output before it is returned to you. These validators were never called for image outputs. Both the non-streaming and streaming paths now loop through the registered validators, as they already do for text and tool-based outputs.
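For illustration, here is a minimal standard-library sketch of the intended behaviour. It does not use pydantic-ai: `FakeImage`, `run_validators`, and the two validator functions are hypothetical stand-ins. The only point is that an image result passes through every registered validator in order, the same way text results do.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class FakeImage:
    """Hypothetical stand-in for an image output (not pydantic-ai's BinaryImage)."""
    data: bytes
    media_type: str


Validator = Callable[[FakeImage], FakeImage]


def run_validators(output: FakeImage, validators: list[Validator]) -> FakeImage:
    """Pass the output through each validator in registration order."""
    for validate in validators:
        output = validate(output)
    return output


def require_png(image: FakeImage) -> FakeImage:
    if image.media_type != "image/png":
        raise ValueError(f"expected image/png, got {image.media_type}")
    return image


def require_non_empty(image: FakeImage) -> FakeImage:
    if not image.data:
        raise ValueError("image is empty")
    return image


validators = [require_png, require_non_empty]
checked = run_validators(FakeImage(b"\x89PNG", "image/png"), validators)
```

Before this fix, the equivalent of `run_validators` was simply not invoked on the image path, so a validator such as `require_non_empty` would never have had a chance to reject the output.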

2. Unknown tool calls were exhausting retries prematurely (bug fix)

File: _agent_graph.py

When a model returns a tool call for a tool that doesn't exist, pydantic-ai tells the model "that tool doesn't exist" and retries. The old code incremented a global retry counter for unknown tools before processing any valid tools in the same response. So if the model returned both an unknown tool AND a valid tool together, the global counter could hit the limit and fail the whole run. Unknown tools now use per-tool retry tracking instead.
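The difference between the two counting strategies can be sketched as follows. This is a simplified model, not the code in _agent_graph.py: `MAX_RETRIES`, `RetriesExceeded`, and both handler functions are hypothetical, and the starting counter value is an assumption chosen to show the failure mode.

```python
from collections import Counter

MAX_RETRIES = 1  # assumed limit for this sketch


class RetriesExceeded(Exception):
    """Hypothetical error raised when a retry budget is used up."""


def handle_with_global_counter(calls, known_tools, global_retries):
    """Old shape: unknown tools spend the shared budget before valid tools run."""
    for name in calls:
        if name not in known_tools:
            global_retries += 1
            if global_retries > MAX_RETRIES:
                raise RetriesExceeded(f"unknown tool {name!r}")
    executed = [name for name in calls if name in known_tools]
    return executed, global_retries


def handle_with_per_tool_counter(calls, known_tools, tool_retries):
    """New shape: each unknown tool name has its own budget."""
    executed = []
    for name in calls:
        if name in known_tools:
            executed.append(name)
            continue
        tool_retries[name] += 1
        if tool_retries[name] > MAX_RETRIES:
            raise RetriesExceeded(f"unknown tool {name!r}")
    return executed


known = {"get_weather"}
calls = ["get_wether", "get_weather"]  # one misspelled call, one valid call

# Assume the shared budget was already spent by an earlier, unrelated retry.
try:
    handle_with_global_counter(calls, known, global_retries=1)
    global_outcome = "ok"
except RetriesExceeded:
    global_outcome = "failed"

per_tool = Counter()
executed = handle_with_per_tool_counter(calls, known, per_tool)
```

With the shared counter the run fails before `get_weather` is ever executed; with per-tool tracking the valid call runs and the misspelled one is charged only against its own name.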

3. Deduplicate output validator execution and make ctx.retry consistent for output validators (#4385)

Files: _agent_graph.py, _tool_manager.py

Output validator logic was duplicated between _handle_text_response and _handle_image_response. Extracted shared helpers (_build_output_run_context, _run_output_validators).

Additionally, @agent.output_validator received inconsistent ctx.retry values: text/image paths used the global retry counter, but the tool output path used per-tool retry count. Now output tools consistently use the global output retry counter in the RunContext passed to validators.
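The ctx.retry change can be pictured with a small model. Everything here is hypothetical (`SketchRunContext`, `old_context`, `new_context`, and the counter values are assumptions for illustration); it only shows which counter a validator would observe on each output path before and after the change.

```python
from dataclasses import dataclass


@dataclass
class SketchRunContext:
    """Hypothetical, trimmed-down stand-in for the context passed to a validator."""
    retry: int


def old_context(path: str, output_retries: int, tool_retries: int) -> SketchRunContext:
    """Before: the tool output path reported the per-tool retry count."""
    return SketchRunContext(retry=tool_retries if path == "tool" else output_retries)


def new_context(path: str, output_retries: int, tool_retries: int) -> SketchRunContext:
    """After: every output path reports the run-level output retry counter."""
    return SketchRunContext(retry=output_retries)


paths = ("text", "image", "tool")
# Assumed state: two output retries so far, none recorded against the output tool.
before = {p: old_context(p, output_retries=2, tool_retries=0).retry for p in paths}
after = {p: new_context(p, output_retries=2, tool_retries=0).retry for p in paths}
```

In this model a validator that branches on ctx.retry would have behaved differently on the tool path than on the text and image paths; after the change all three agree.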

4. Truncated tool calls now retry instead of failing immediately

File: _agent_graph.py

Removed the proactive _check_incomplete_tool_calls call from process_tool_calls. When a model's response gets truncated (finish_reason == 'length'), incomplete tool call args now go through normal validation/retry flow instead of immediately raising IncompleteToolCall. The helper is still used in increment_retries to provide a clear error message when max retries are exceeded.
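A rough sketch of that flow, using only the standard library. `IncompleteToolCallSketch` and `handle_tool_args` are hypothetical names, and the real validation in pydantic-ai is more involved than a bare `json.loads`; the sketch only shows truncated arguments being retried first and the clearer error being raised once retries run out.

```python
import json


class IncompleteToolCallSketch(Exception):
    """Hypothetical stand-in for the IncompleteToolCall error."""


def handle_tool_args(args_json, finish_reason, retries, max_retries):
    """Return ('ok', args) or ('retry', message); raise once retries run out."""
    try:
        return "ok", json.loads(args_json)
    except json.JSONDecodeError as exc:
        if retries >= max_retries:
            if finish_reason == "length":
                # Out of retries and the response was cut off: explain why.
                raise IncompleteToolCallSketch(
                    "response was truncated before the tool call was complete"
                ) from exc
            raise
        # Still within budget: report the problem back so the model can retry.
        return "retry", f"invalid arguments: {exc.msg}"


truncated = '{"city": "Par'
first_attempt = handle_tool_args(truncated, "length", retries=0, max_retries=1)
complete = handle_tool_args('{"city": "Paris"}', "stop", retries=1, max_retries=1)
```

Here the first truncated attempt produces a retry rather than an immediate failure, and a later complete response succeeds.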

5. Parallel tool tasks weren't cancelled on non-cancellation errors (bug fix)

File: _agent_graph.py

When pydantic-ai runs multiple tools in parallel and one fails, it should cancel the others. The except block only caught asyncio.CancelledError, so regular exceptions (like RuntimeError) left sibling tasks running as orphans. Changed to except BaseException so any exception triggers cancellation.
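The cancellation pattern looks roughly like this in plain asyncio. It is a sketch of the general technique, not the code in _agent_graph.py: `run_all`, `failing`, `slow`, and the `events` list are made up for the example.

```python
import asyncio

events = []


async def run_all(coros):
    """Run coroutines concurrently; on any failure, cancel the rest and re-raise."""
    tasks = [asyncio.create_task(c) for c in coros]
    try:
        return await asyncio.gather(*tasks)
    except BaseException:
        # Catch everything, not only CancelledError, so siblings never linger.
        for task in tasks:
            task.cancel()
        # Wait for the cancellations to be processed before propagating.
        await asyncio.gather(*tasks, return_exceptions=True)
        raise


async def failing():
    await asyncio.sleep(0.01)
    raise RuntimeError("tool failed")


async def slow():
    try:
        await asyncio.sleep(10)
        events.append("slow finished")
    except asyncio.CancelledError:
        events.append("slow cancelled")
        raise


async def main():
    try:
        await run_all([failing(), slow()])
    except RuntimeError as exc:
        events.append(f"caught: {exc}")


asyncio.run(main())
```

If the handler were `except asyncio.CancelledError:` instead, the `RuntimeError` from `failing` would bypass it and `slow` would keep running in the background after `run_all` had already raised.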

🤖 Generated with Claude Code

…lls, and parallel task cancellation

- Run output validators on image outputs (both streaming and non-streaming)
- Use per-tool retry tracking for unknown tool calls instead of global counter
- Detect and raise clear IncompleteToolCall error on truncated model responses
- Cancel all sibling tasks on any exception, not just CancelledError

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@github-actions github-actions bot added the size: S (Small PR, ≤100 weighted lines) and bug (Report that something isn't working, or PR implementing a fix) labels Feb 15, 2026

@devin-ai-integration devin-ai-integration bot left a comment

Devin Review found 2 potential issues.

View 4 additional findings in Devin Review.


@dsfaccini dsfaccini marked this pull request as draft February 16, 2026 00:58
@github-actions github-actions bot added the size: M (Medium PR, 101-500 weighted lines) label and removed the size: S (Small PR, ≤100 weighted lines) label Feb 16, 2026
@github-actions

This PR bundles 4 separate, unrelated changes together (image output validators, unknown tool retries, incomplete tool calls, parallel task cancellation) with no linked GitHub issues for any of them. This makes it harder to review, discuss, and (if needed) revert individual changes independently.

Per the project's contribution guidelines, each of these would benefit from its own issue with clearly defined scope and maintainer input, and its own PR. This is especially true for the behavioral changes to unknown tool retry tracking and the new IncompleteToolCall detection, which alter core agent loop semantics.

I'd recommend splitting this into 4 separate PRs, each linked to its own issue, so maintainers can evaluate each change on its own merits. I've left inline comments on the specific changes below in case you proceed with this PR as-is. @DouweM

dsfaccini and others added 2 commits February 20, 2026 14:26
…4385)

- Deduplicate output validator execution between text/image response handlers
- Remove proactive _check_incomplete_tool_calls (truncated responses now retry)
- Fix ctx.retry for output validators on tool path to use global counter
- Differentiate output tools vs function tools in _call_tool RunContext

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Resolve conflicts with upstream refactoring:
- _agent_graph.py: keep both our dedup helpers and main's _emit_skipped_output_tool;
  inline _check_incomplete_tool_calls into increment_retries (matching main);
  add increment_retries call for unknown tools (matching main)
- _tool_manager.py: apply output-tool retry differentiation to main's
  _build_tool_context method
- test_agent.py: keep main's tests, update error expectations for unknown tools
- Update test snapshots in test_streaming, test_ui, test_vercel_ai

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Development

Successfully merging this pull request may close these issues.

@agent.output_validator receives inconsistent ctx.retry depending on text vs tool output path
