
Conversation

serialx (Contributor) commented Sep 15, 2025

Anthropic's API requires thinking blocks to be the first content in assistant messages when reasoning is enabled and tool calls are present.
ref: https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking

This change:

  • Stores thinking text in ResponseReasoningItem.content field
  • Stores signature in encrypted_content field for metadata
  • Reconstructs thinking blocks when converting items->messages
  • Handles both streaming and non-streaming cases
  • Only affects Anthropic models with reasoning enabled
  • Adds regression test to prevent future breakage

Fixes compatibility with Claude models when using tools with interleaved thinking.

This fixes #765
This fixes #678
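The storage mapping in the bullets above can be sketched as follows (a hedged illustration, not the actual SDK code; the helper name is hypothetical): the thinking text of an Anthropic thinking block goes into the reasoning item's `content`, and the signature into `encrypted_content`.

```python
def thinking_block_to_reasoning_fields(block: dict) -> dict:
    """Split an Anthropic thinking block into ResponseReasoningItem-style fields.

    Hypothetical helper illustrating the PR's storage scheme: text is kept as a
    `reasoning_text` content entry, the signature as `encrypted_content`.
    """
    return {
        "content": [{"type": "reasoning_text", "text": block.get("thinking", "")}],
        "encrypted_content": block.get("signature"),
    }
```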

Code to reproduce:

import asyncio

from openai.types.shared import Reasoning
from pydantic import BaseModel

from agents import (
    Agent,
    AgentUpdatedStreamEvent,
    ModelSettings,
    RawResponsesStreamEvent,
    RunItemStreamEvent,
    Runner,
    function_tool,
)


class WeatherTemp(BaseModel):
    city: str
    temperature_range: str


class WeatherConditions(BaseModel):
    city: str
    conditions: str


@function_tool
def get_weather_temp(city: str) -> WeatherTemp:
    """Get the current weather temperature information for a specified city."""
    print("[debug] get_weather_temp called")
    return WeatherTemp(city=city, temperature_range="14-20C")


@function_tool
def get_weather_conditions(city: str) -> WeatherConditions:
    """Get the current weather conditions information for a specified city."""
    print("[debug] get_weather_conditions called")
    return WeatherConditions(city=city, conditions="Sunny with wind.")


agent = Agent(
    name="Hello world",
    instructions="You are a helpful agent.",
    tools=[get_weather_temp, get_weather_conditions],
    model="litellm/anthropic/claude-sonnet-4-20250514",
    model_settings=ModelSettings(
        reasoning=Reasoning(effort="medium", summary="auto"),
        # Works both with and without interleaved thinking
        extra_headers={"anthropic-beta": "interleaved-thinking-2025-05-14"},
    ),
)

PROMPT = (
    "What's the temperature in Tokyo? If the temp is below 30C, get the weather "
    "conditions and if it's sunny with wind, get the temperature in Tokyo's nearest "
    "neighbour country's capital"
)


async def agent_run():
    print("=== Runner.run ===")
    result = await Runner.run(
        agent,
        input=PROMPT,
    )
    print(result.final_output)
    print()


async def agent_run_streamed():
    print("=== Runner.run_streamed ===")
    result = Runner.run_streamed(
        agent,
        input=PROMPT,
    )
    async for event in result.stream_events():
        match event:
            case RawResponsesStreamEvent(data=data):
                delta = data.delta if hasattr(data, "delta") else ""
                print(f"Raw response: {data.type} {repr(delta)}")
            case RunItemStreamEvent():
                print("Run item:", event.type)
            case AgentUpdatedStreamEvent():
                print("Agent updated:", event.type)
    print(result.final_output)


async def main():
    await agent_run_streamed()
    await agent_run()


if __name__ == "__main__":
    asyncio.run(main())

Running the above code against the current main HEAD results in this error:

litellm.exceptions.BadRequestError: litellm.BadRequestError: AnthropicException - {"type":"error","error":{"type":"invalid_request_error","message":"messages.1.content.0.type: Expected `thinking` or `redacted_thinking`, but found `text`. When `thinking` is enabled, a final `assistant` message must start with a thinking block (preceeding the lastmost set of `tool_use` and `tool_result` blocks). We recommend you include thinking blocks from previous turns. To avoid this requirement, disable `thinking`. Please consult our documentation at https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking"}}
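For context, the error refers to Anthropic's required ordering: when thinking is enabled, the assistant turn that carries a tool call must begin with a thinking (or redacted_thinking) block. A rough sketch of the expected message shape, with placeholder values based on the linked docs:

```python
# Illustrative only: placeholder ids, text, and signature values.
assistant_message = {
    "role": "assistant",
    "content": [
        {
            # The thinking block must come first...
            "type": "thinking",
            "thinking": "The user wants the temperature first...",
            "signature": "WaUjzkyp...",  # opaque signature returned by the API
        },
        {
            # ...before any tool_use block in the same message.
            "type": "tool_use",
            "id": "toolu_01",
            "name": "get_weather_temp",
            "input": {"city": "Tokyo"},
        },
    ],
}
```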

seratch (Member) commented Sep 15, 2025

@codex review this

chatgpt-codex-connector (bot) left a comment

Codex Review: Here are some suggestions.

Reply with @codex fix comments to fix any unresolved comments.

About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you open a pull request for review, mark a draft as ready, or comment "@codex review". If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex fix this CI failure" or "@codex address that feedback".

Comment on lines 509 to 528
elif reasoning_item := cls.maybe_reasoning_message(item):
    # Reconstruct thinking blocks from content (text) and encrypted_content (signature)
    content_items = reasoning_item.get("content", [])
    signature = reasoning_item.get("encrypted_content")

    if content_items:
        # Reconstruct thinking blocks from content and signature
        pending_thinking_blocks = []
        for content_item in content_items:
            if (
                isinstance(content_item, dict)
                and content_item.get("type") == "reasoning_text"
            ):
                thinking_block = {
                    "type": "thinking",
                    "thinking": content_item.get("text", ""),
                }
                # Add signature if available
                if signature:
                    thinking_block["signature"] = signature

[P1] Preserve redacted thinking blocks when building chat messages

The reasoning item is only converted into thinking blocks when content contains reasoning_text. When Anthropic redacts the reasoning text, the SDK still returns a signature in encrypted_content but leaves content empty. In that case this branch never runs, so pending_thinking_blocks stays None and the assistant message is emitted with plain text before the tool call, which continues to violate Anthropic’s requirement that the message start with a thinking/redacted_thinking block. Consider generating a placeholder thinking block whenever a signature exists even if no text is supplied so that redacted reasoning still surfaces as the first content element.


serialx (Contributor, Author) replied:

@codex actually, when the reasoning is redacted the form is completely different. See the example in the official docs:

The second block below is an example of redacted thinking content:

{
  "content": [
    {
      "type": "thinking",
      "thinking": "Let me analyze this step by step...",
      "signature": "WaUjzkypQ2mUEVM36O2TxuC06KN8xyfbJwyem2dw3URve/op91XWHOEBLLqIOMfFG/UvLEczmEsUjavL...."
    },
    {
      "type": "redacted_thinking",
      "data": "EmwKAhgBEgy3va3pzix/LafPsn4aDFIT2Xlxh0L5L8rLVyIwxtE3rAFBa8cr3qpPkNRj2YfWXGmKDxH4mPnZ5sQ7vB9URj2pLmN3kF8/dW5hR7xJ0aP1oLs9yTcMnKVf2wRpEGjH9XZaBt4UvDcPrQ..."
    },
    {
      "type": "text",
      "text": "Based on my analysis..."
    }
  ]
}

My PR currently doesn't support this case since I was unable to test this. The special test prompt mentioned in the documentation didn't work.

Comment on lines 545 to +557
if state.reasoning_content_index_and_output:
    outputs.append(state.reasoning_content_index_and_output[1])
    reasoning_item = state.reasoning_content_index_and_output[1]
    # Store thinking text in content and signature in encrypted_content
    if state.thinking_text:
        # Add thinking text as a Content object
        if not reasoning_item.content:
            reasoning_item.content = []
        reasoning_item.content.append(
            Content(text=state.thinking_text, type="reasoning_text")
        )
    # Store signature in encrypted_content
    if state.thinking_signature:
        reasoning_item.encrypted_content = state.thinking_signature


[P1] Streaming path drops thinking block when Anthropic redacts reasoning text

The streaming handler only appends a reasoning_text content item when state.thinking_text is non-empty. For streams where Anthropic suppresses the thinking text and provides only a signature in thinking_blocks, thinking_text remains empty while thinking_signature is set. The resulting ResponseReasoningItem carries the signature but no content, so downstream conversion still produces an assistant message that begins with normal text and fails Anthropic’s extended-thinking requirement. A redacted thinking block should be emitted whenever a signature is collected even if no text arrives.


serialx (Contributor, Author) replied:

@codex This is the same as above.


chatgpt-codex-connector (bot) left a comment

Codex Review: Here are some suggestions.


Comment on lines 508 to 528
# 7) reasoning message => extract thinking blocks if present
elif reasoning_item := cls.maybe_reasoning_message(item):
    # Reconstruct thinking blocks from content (text) and encrypted_content (signature)
    content_items = reasoning_item.get("content", [])
    signature = reasoning_item.get("encrypted_content")

    if content_items:
        # Reconstruct thinking blocks from content and signature
        pending_thinking_blocks = []
        for content_item in content_items:
            if (
                isinstance(content_item, dict)
                and content_item.get("type") == "reasoning_text"
            ):
                thinking_block = {
                    "type": "thinking",
                    "thinking": content_item.get("text", ""),
                }
                # Add signature if available
                if signature:
                    thinking_block["signature"] = signature


[P1] Handle redacted thinking blocks when rebuilding assistant messages

This code only reconstructs thinking blocks from reasoning items when the content array contains reasoning_text, and it always emits blocks of type "thinking" with a thinking string. The Anthropic API can also return blocks of type "redacted_thinking" where the plaintext is replaced with a data payload, and those blocks must still be preserved at the start of the assistant message. In the redacted case reasoning_item.content will be empty and the encrypted payload is never captured, so no thinking block is emitted and the next assistant message begins with plain text again, triggering the same 400 error (“Expected thinking or redacted_thinking”) this change is meant to fix. Consider detecting redacted_thinking blocks, storing their data alongside the signature, and emitting {"type": "redacted_thinking", ...} when rebuilding the assistant message.


serialx (Contributor, Author) replied:

@codex My PR currently doesn't support this case since I was unable to test this. The special test prompt mentioned in the documentation didn't work. In practice, I didn't observe any redacted_thinking in the wild. I think this is a very rare condition.


serialx (Contributor, Author) commented Sep 15, 2025

If anyone wants to test this, add the following to your pyproject.toml:

[tool.uv.sources]
openai-agents = { git = "https://github.com/openai/openai-agents-python", branch = "refs/pull/1744/head" }

Move the thinking block insertion from `ensure_assistant_message()` to the `elif func_call := cls.maybe_function_tool_call(item):` branch to be more specific and prevent unintended side effects.
Be more selective/surgical about reasoning message preservation.
When we hand off from a Claude 4 Sonnet thinking agent to a non-thinking agent, we get errors because non-thinking models expect no thinking blocks in the request. This fixes that edge case by only preserving blocks when reasoning effort is not None.
Since this parameter is Anthropic-specific logic, renamed it to use Anthropic terminology.
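The handoff guard described above might look roughly like this (a hedged sketch with a hypothetical helper name, assuming attributes shaped like `ModelSettings.reasoning.effort` from the reproduction script):

```python
from types import SimpleNamespace


def should_preserve_thinking_blocks(model_settings) -> bool:
    """Only re-attach thinking blocks when the target model has reasoning
    enabled, so a non-thinking agent never receives them after a handoff.

    Hypothetical helper illustrating the "effort is not None" guard.
    """
    reasoning = getattr(model_settings, "reasoning", None)
    return reasoning is not None and getattr(reasoning, "effort", None) is not None
```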
@serialx serialx requested a review from seratch September 16, 2025 08:24
seratch (Member) left a comment:

Confirmed this works with the model.

@seratch seratch merged commit 605611c into openai:main Sep 16, 2025
5 checks passed


Successfully merging this pull request may close these issues:

  • Tool calling with LiteLLM + claude thinking models fails
  • Unable to use reasoning models with tool calls using LitellmModel