
Conversation


@kazmer97 kazmer97 commented Sep 30, 2025

Summary

Map Anthropic's 'pause_turn' finish reason to 'incomplete' instead of 'stop' so the agent automatically continues with a new request when this finish reason is encountered.

Background

When using Anthropic's built-in tools (like web_search) with output_type set, long-running operations may trigger the pause_turn finish reason. Previously, this was mapped to 'stop', causing the agent to halt execution and attempt a retry. This led to malformed requests with errors like:

ModelHTTPError: status_code: 400, model_name: claude-sonnet-4-20250514, 
body: {'type': 'error', 'error': {'type': 'invalid_request_error', 
'message': "messages.1: 'web_search' tool use with id '<some-id>' was found 
without a corresponding 'web_search_tool_result' block"}}

Changes:

  • Add 'incomplete' to FinishReason TypeAlias with documentation
  • Map Anthropic's 'pause_turn' to 'incomplete' instead of None
  • Update agent graph to recognize 'incomplete' and continue with empty request

Benefits:

  • Provider-agnostic: other models can use 'incomplete' for similar behavior
  • Proper separation of concerns: agent graph doesn't check provider-specific details

The agent graph now checks for finish_reason == 'incomplete' instead of checking
provider_details for 'pause_turn', maintaining clean architectural boundaries.

Fix: Don't increment retries for 'incomplete' finish reason

The 'incomplete' finish reason indicates expected mid-execution pausing
(e.g., Anthropic's pause_turn during long-running builtin tools), not an
error condition. This is normal behavior where the model is saying 'I'm
working on something, let me continue' rather than 'something went wrong'.

Fixes #2600

@kazmer97 kazmer97 force-pushed the fix-anthropic-pause-turn-2600 branch from 4d76ed1 to a5ae73a Compare September 30, 2025 23:49
DouweM (Collaborator) commented Sep 30, 2025

@kazmer97 Thanks for working on this! Looks like most of the changes are missing though :)

kazmer97 (Author)

> @kazmer97 Thanks for working on this! Looks like most of the changes are missing though :)

Just started; will move to draft. Trying to fix it in my own project first, in parallel.

@kazmer97 kazmer97 marked this pull request as draft September 30, 2025 23:52

DouweM commented Sep 30, 2025

@kazmer97 Sounds good, ping me once this is ready for review!


kazmer97 commented Oct 1, 2025

> @kazmer97 Sounds good, ping me once this is ready for review!

Quick question while I've got your attention:

pydantic_ai_slim/pydantic_ai/_agent_graph.py:590

else:
    # we got an empty response with no tool calls, text, thinking, or built-in tool calls.
    # this sometimes happens with anthropic (and perhaps other models)
    # when the model has already returned text alongside tool calls.
    # in this scenario, if text responses are allowed, we return text from the most recent
    # model response, if any
    if isinstance(ctx.deps.output_schema, _output.TextOutputSchema):
        for message in reversed(ctx.state.message_history):
            if isinstance(message, _messages.ModelResponse):
                text = ''
                for part in message.parts:
                    if isinstance(part, _messages.TextPart):
                        text += part.content
                    elif isinstance(part, _messages.BuiltinToolCallPart):
                        # Text parts before a built-in tool call are essentially thoughts,
                        # not part of the final result output, so we reset the accumulated text
                        text = ''  # pragma: no cover
                if text:
                    self._next_node = await self._handle_text_response(ctx, text)
                    return

    # Go back to the model request node with an empty request, which means we'll essentially
    # resubmit the most recent request that resulted in an empty response,
    # as the empty response and request will not create any items in the API payload,
    # in the hope the model will return a non-empty response this time.

Setting the pause_turn mapping to None would, by my reading, land us in this else branch, which retries with unaltered message history, which I believe is what we want. So the solution might actually be ready, but please correct me if I missed something obvious; I'm still wrapping my head around the agent graph steps.

Monkeypatching the mapper in my project seems to have solved the behaviour.
@DouweM
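A generic sketch of the monkeypatching pattern mentioned above; the `provider` namespace and function names are hypothetical stand-ins, not pydantic-ai's actual module layout:

```python
# Generic monkeypatch pattern for swapping out a finish-reason mapper at runtime.
# `provider` stands in for the real module being patched; all names are hypothetical.
import types

provider = types.SimpleNamespace(
    map_finish_reason=lambda stop_reason: 'stop',  # original (problematic) behaviour
)

_original = provider.map_finish_reason

def _patched(stop_reason):
    if stop_reason == 'pause_turn':
        return None  # let the agent graph's empty-response fallback retry instead
    return _original(stop_reason)

provider.map_finish_reason = _patched
```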


DouweM commented Oct 1, 2025

@kazmer97 We'd want to automatically do a new request right? I'd expect that to be implemented inside AnthropicModel.request/AnthropicModel.request_stream. Just having finish_reason=None does not cause Pydantic AI to re-issue a request, there's no logic for that currently.
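A minimal sketch of the in-model loop this suggests: keep re-issuing the request while the provider reports pause_turn, feeding the paused assistant turn back in. The client below is a stub standing in for the real Anthropic SDK call, and all names are hypothetical rather than pydantic-ai's actual API:

```python
# Sketch of handling pause_turn inside the model's request method: the paused
# assistant turn is sent back verbatim so the server can resume the
# long-running built-in tool. `client` is a stub, not the real Anthropic SDK.

def request_with_pause_handling(client, messages, max_continues=5):
    for _ in range(max_continues):
        response = client.create(messages=messages)
        if response['stop_reason'] != 'pause_turn':
            return response
        # pause_turn: append the partial assistant turn and ask again.
        messages = messages + [{'role': 'assistant', 'content': response['content']}]
    raise RuntimeError('model kept pausing; giving up')
```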

@kazmer97 kazmer97 force-pushed the fix-anthropic-pause-turn-2600 branch 3 times, most recently from 0d880f5 to 625f1c1 Compare October 1, 2025 21:28
@kazmer97 kazmer97 marked this pull request as ready for review October 1, 2025 21:35

kazmer97 commented Oct 1, 2025

> @kazmer97 We'd want to automatically do a new request right? I'd expect that to be implemented inside AnthropicModel.request/AnthropicModel.request_stream. Just having finish_reason=None does not cause Pydantic AI to re-issue a request, there's no logic for that currently.

I went with a slightly different approach, let me know what you think.
@DouweM

@kazmer97 kazmer97 force-pushed the fix-anthropic-pause-turn-2600 branch from 625f1c1 to 0ffbf24 Compare October 2, 2025 00:17
Review comment on these FinishReason values:

    'content_filter',
    'tool_call',
    'error',
    'incomplete',
DouweM (Collaborator) commented:

We can't add this here, unfortunately, as these values need to match those supported by the OpenTelemetry gen_ai spec.

So if we want to implement this generically in the agent graph, we'd need to add a new incomplete=True boolean to ModelResponse. I'm hesitant to do that though, for something that so far only Anthropic requires; we typically don't add fields unless they're supported by 2 major providers.

So if at all possible, can you find a way to implement this exclusively inside the AnthropicModel?

kazmer97 (Author) commented Oct 3, 2025:

I can look into that, but as far as I understand the project design so far, isn't it the agent graph that drives the API requests to the model? So far, model classes have only been responsible for message translation. Wouldn't it pollute the class if we started introducing API requests at the model level? Or do you have another mechanism in mind that would allow manipulating the message chain before it reaches the agent graph?

kazmer97 (Author) commented:

To address your points:

  1. As far as I can read the OTel gen_ai spec, the Pydantic AI finish reasons already extend beyond the examples that are documented.
  2. Other models don't exhibit this behaviour yet, but it's a very likely pattern if the trend of provider-hosted tools becoming more common continues. Moonshot AI has its own internal web search tool, as do OpenAI and Google. The pattern is likely to emerge for web search and code execution tools that run longer server-side, since pause_turn allows the client application to prevent timeouts on the API.

DouweM (Collaborator) commented:

> I can look into that, but as far as I understand the project design so far, isn't it the agent graph that drives the API requests to the model? So far, model classes have only been responsible for message translation. Wouldn't it pollute the class if we started introducing API requests at the model level? Or do you have another mechanism in mind that would allow manipulating the message chain before it reaches the agent graph?

Model requests are already performed by the request and request_stream methods on Model subclasses. The agent graph essentially just goes back and forth, on every conversation step, between ModelRequestNode, which calls Model.request{_stream}, and CallToolsNode, which executes tools and handles output. Some API clients like OpenAI already auto-retry failed requests, and we support HTTP retries for most model classes using https://ai.pydantic.dev/retries/. This case, where an incomplete response is "retried" with a new request in the same conversation step, also feels like the responsibility of the model request methods rather than the agent graph.

> As far as I can read the OTel gen_ai spec, the Pydantic AI finish reasons already extend beyond the examples that are documented.

The values supported by OTel are defined here: https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-output-messages.json. Those are the same ones Pydantic AI uses.

> 2. The pattern is likely to emerge for web search and code execution tools that run longer server-side, since pause_turn allows the client application to prevent timeouts on the API.

I agree it would make sense for other providers to end up with a pattern like this as well, but despite Google and OpenAI both supporting code execution and web search too, a pause_turn equivalent hasn't shown up yet :)

Commit message (same content as the PR description above): introduces a new 'incomplete' finish reason in the FinishReason type alias to represent cases where a model pauses mid-execution and will continue (e.g. Anthropic's pause_turn during long-running built-in tools).
@kazmer97 kazmer97 force-pushed the fix-anthropic-pause-turn-2600 branch from 0ffbf24 to 3b45734 Compare October 3, 2025 19:39
Successfully merging this pull request may close these issues.

Anthropic stop_reason pause_turn is not handled correctly, resulting in errors with long-running built-in tools