
fix: Anthropic extended thinking with tool use and minor fixes #2487


Open · wants to merge 1 commit into base: main

Conversation

Kaushal-26

Fixes: #2425 (issue)

- With Anthropic extended thinking enabled, sending `tool_choice: {"type": "any"}` or `tool_choice: {"type": "tool", "name": "..."}` results in an error.
	- Source: https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking#extended-thinking-with-tool-use
- Fixes the Anthropic models file to explicitly handle the thinking-enabled flag.
- Adds test cases for the output variations `pydantic.BaseModel`, `ToolOutput`, and `StructuredDict`; these failed before the fix.
- Adds a warning for `temperature < 1`: with thinking enabled, Anthropic only allows `temperature == 1` and requires `0.95 <= top_p <= 1`.
	- Source: https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking#feature-compatibility
- Minor fix in `.gitignore`: `.venv` variants for Python 3.9, 3.10, etc. are now covered, and `__pycache__` and the Ruff cache are ignored directly instead of relying on the `.gitignore` files inside those folders.
- Fixes CLAUDE.md to use `pydantic_ai_slim/pydantic_ai/agent/` instead of `pydantic_ai_slim/pydantic_ai/agent.py`, as the file doesn't exist.
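The tool-choice constraint described in the first bullet can be sketched as follows. This is a minimal, illustrative sketch of the PR's approach with hypothetical names, not the actual pydantic-ai internals:

```python
from typing import Optional


def resolve_tool_choice(
    has_tools: bool, allow_text_output: bool, thinking_enabled: bool
) -> Optional[dict]:
    """Illustrative sketch: pick a tool_choice value for the Anthropic API."""
    if not has_tools:
        return None
    if not allow_text_output:
        if thinking_enabled:
            # Forced tool use ({"type": "any"} or {"type": "tool", ...})
            # errors when extended thinking is enabled, so fall back to
            # letting the model decide.
            return {'type': 'auto'}
        return {'type': 'any'}
    return {'type': 'auto'}
```

With thinking disabled the forced `{'type': 'any'}` is kept; with thinking enabled the sketch downgrades to `{'type': 'auto'}`, which is the behavior this PR implements (and which the maintainer later pushes back on).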
Contributor

hyperlint-ai bot commented Aug 8, 2025

PR Change Summary

Enhanced Anthropic extended thinking functionality and addressed several issues.

  • Fixed tool choice handling in extended thinking to prevent errors.
  • Updated anthropic models to manage the thinking enabled flag explicitly.
  • Added test cases for output variations in Pydantic.
  • Implemented a warning for temperature settings in accordance with Anthropic guidelines.

Modified Files

  • CLAUDE.md

How can I customize these reviews?

Check out the Hyperlint AI Reviewer docs for more information on how to customize the review.

If you just want to ignore it on this PR, you can add the hyperlint-ignore label to the PR. Future changes won't trigger a Hyperlint review.

Note on link checks: we only check the first 30 links in a file and cache the results for several hours (so a just-added page may be flagged). Our recommendation is to add hyperlint-ignore to the PR to ignore the link check for this PR.

@DouweM
Collaborator

DouweM commented Aug 12, 2025

@Kaushal-26 Simply setting tool_choice to auto instead of any when thinking is enabled is not a proper fix, as this may cause the model to never actually call the output tool. The proper solution is to not use tool output mode, and instead use PromptedOutput. We can document that, and/or raise an error telling the user to do that. GPT-5 has a similar issue when built-in tools are used (#2488 (comment)) and I suggested raising an error for the user to fix. Can you please update the PR to do that here as well?
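The error-raising approach suggested here might look like the sketch below. The `UserError` class here is a stand-in for pydantic-ai's exception type, and the function name and wording are assumptions, not the actual implementation:

```python
class UserError(Exception):
    """Stand-in for pydantic-ai's UserError exception (an assumption)."""


def check_anthropic_output_mode(thinking_enabled: bool, forces_tool_use: bool) -> None:
    # Raise up front instead of silently downgrading tool_choice, so the
    # user switches to PromptedOutput themselves.
    if thinking_enabled and forces_tool_use:
        raise UserError(
            'Anthropic does not support forced tool use together with extended '
            'thinking; use PromptedOutput instead of tool output.'
        )
```

The design choice being argued for: failing loudly keeps the output-tool guarantee intact, whereas quietly switching to `auto` may let the model skip the output tool entirely.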

Collaborator

Please revert these changes, which seem specific to your environment.


```python
if not tools:
    tool_choice = None
else:
    if not model_request_parameters.allow_text_output:
        tool_choice = {'type': 'any'}
    if thinking and thinking.get('type') == 'enabled':
        tool_choice = {'type': 'auto'}
```
```python
    if temperature and temperature != 1:
```
Collaborator

I'd rather have the user see an error and adjust this manually, than to change it for them and log warnings. So please remove this from the PR for now and keep it focused on the tool choice issue. If others report the temperature/top_p issue and find it hard to work around, we can consider something like this then.

@@ -2163,3 +2163,80 @@ async def test_anthropic_web_search_tool_stream(allow_model_requests: None, anth

Additional notable stories include Vietnam's plan to ban fossil-fuel motorcycles in the heart of Hanoi starting July 2026, aiming to cut air pollution and move toward cleaner transport, and ongoing restoration efforts for Copenhagen's Old Stock Exchange, which is taking shape 15 months after a fire destroyed more than half of the 400-year-old building.\
""")


async def test_anthropic_extended_thinking_with_tool_use(allow_model_requests: None, anthropic_api_key: str):
Collaborator

With the suggested change to raise an error instead, we can modify this to try just one output_type and then use pytest.raises to verify the error
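The suggested test shape might look like the sketch below. To keep it self-contained, `build_anthropic_request` is a minimal stub of the behavior under test and `UserError` stands in for pydantic-ai's exception type; in the real test the agent fixture and recorded cassette would be used instead:

```python
import pytest


class UserError(Exception):
    """Stand-in for pydantic-ai's UserError exception (an assumption)."""


def build_anthropic_request(thinking_enabled: bool, forces_tool_use: bool) -> dict:
    # Minimal stub of the behavior under test: forced tool use plus
    # extended thinking should raise rather than be silently downgraded.
    if thinking_enabled and forces_tool_use:
        raise UserError('Anthropic tool_choice forces tool use, which is not supported with extended thinking.')
    return {'tool_choice': {'type': 'any'} if forces_tool_use else {'type': 'auto'}}


def test_extended_thinking_with_forced_tool_use_raises():
    # One output_type, one assertion: the error is raised, as suggested.
    with pytest.raises(UserError, match='forces tool use'):
        build_anthropic_request(thinking_enabled=True, forces_tool_use=True)
```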

@Kaushal-26
Author

> @Kaushal-26 Simply setting tool_choice to auto instead of any when thinking is enabled is not a proper fix, as this may cause the model to never actually call the output tool. The proper solution is to not use tool output mode, and instead use PromptedOutput. We can document that, and/or raise an error telling the user to do that. GPT-5 has a similar issue when built-in tools are used (#2488 (comment)) and I suggested raising an error for the user to fix. Can you please update the PR to do that here as well?

Okay, I agree with you and am happy to make the requested changes.

But let's talk about this: will the model actually use the tool in `auto` mode? Anthropic says:

> Our testing has shown that this should not reduce performance. If you would like to keep chain-of-thought (particularly with Opus) while still requesting that the model use a specific tool, you can use {"type": "auto"} for tool_choice (the default) and add explicit instructions in a user message. For example: What's the weather like in London? Use the get_weather tool in your response.

from here: https://docs.anthropic.com/en/docs/agents-and-tools/tool-use/implement-tool-use#forcing-tool-use

I feel we should allow both. Let me know your point of view.
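The pattern from the quoted Anthropic docs can be sketched as a request shape like the one below. The model name and token budgets are placeholders, not recommendations; the key points are `tool_choice` staying at `auto` and the explicit instruction living in the user message:

```python
# Sketch of the docs' pattern: thinking enabled, tool_choice 'auto',
# and the tool-use instruction placed in the user message itself.
request = {
    'model': 'claude-sonnet-4-0',  # placeholder model name
    'max_tokens': 2048,
    'thinking': {'type': 'enabled', 'budget_tokens': 1024},
    'tool_choice': {'type': 'auto'},
    'messages': [
        {
            'role': 'user',
            'content': "What's the weather like in London? Use the get_weather tool in your response.",
        }
    ],
}
```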

@DouweM
Collaborator

DouweM commented Aug 12, 2025

@Kaushal-26 Ah, good find. That says we should add explicit instructions in a user message though, which we don't currently do, although we do include this in the final_result tool description:

DEFAULT_OUTPUT_TOOL_DESCRIPTION = 'The final response which ends this conversation'

It may help to make that more strongly worded, but I'm not sure if that will give the same good result as having it in the user message.

Comment on lines +329 to +335
```yaml
- content:
    - text: |-
        Validation feedback:
        Plain text responses are not permitted, please include your response in a tool call

        Fix the errors and try again.
      type: text
```
Author

@DouweM I tested a few things and you are right: the tool is not guaranteed to be called.

Even in the test cassette, it only gave a response after a retry.

Development

Successfully merging this pull request may close these issues.

Anthropic thinking returns error: "Thinking may not be enabled when tool_choice forces tool use."