-
Notifications
You must be signed in to change notification settings - Fork 2.7k
feat: Support Anthropic extended thinking and interleaved thinking #1744
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Anthropic's API requires thinking blocks to be the first content in assistant messages when reasoning is enabled and tool calls are present. ref: https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking This change: - Stores thinking text in ResponseReasoningItem.content field - Stores signature in encrypted_content field for metadata - Reconstructs thinking blocks when converting items->messages - Handles both streaming and non-streaming cases - Only affects Anthropic models with reasoning enabled - Adds regression test to prevent future breakage Fixes compatibility with Claude models when using tools with interleaved thinking.
@codex review this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Codex Review: Here are some suggestions.
Reply with @codex fix comments
to fix any unresolved comments.
About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you open a pull request for review, mark a draft as ready, or comment "@codex review". If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex fix this CI failure" or "@codex address that feedback".
elif reasoning_item := cls.maybe_reasoning_message(item): | ||
# Reconstruct thinking blocks from content (text) and encrypted_content (signature) | ||
content_items = reasoning_item.get("content", []) | ||
signature = reasoning_item.get("encrypted_content") | ||
|
||
if content_items: | ||
# Reconstruct thinking blocks from content and signature | ||
pending_thinking_blocks = [] | ||
for content_item in content_items: | ||
if ( | ||
isinstance(content_item, dict) | ||
and content_item.get("type") == "reasoning_text" | ||
): | ||
thinking_block = { | ||
"type": "thinking", | ||
"thinking": content_item.get("text", ""), | ||
} | ||
# Add signature if available | ||
if signature: | ||
thinking_block["signature"] = signature |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[P1] Preserve redacted thinking blocks when building chat messages
The reasoning item is only converted into thinking
blocks when content
contains reasoning_text
. When Anthropic redacts the reasoning text, the SDK still returns a signature in encrypted_content
but leaves content
empty. In that case this branch never runs, so pending_thinking_blocks
stays None
and the assistant message is emitted with plain text before the tool call, which continues to violate Anthropic’s requirement that the message start with a thinking
/redacted_thinking
block. Consider generating a placeholder thinking block whenever a signature exists even if no text is supplied so that redacted reasoning still surfaces as the first content element.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@codex actually, when the reasoning is redacted the form is completely different. See the example in the official docs:
The second block is the example of a redacted thinking content:
{
"content": [
{
"type": "thinking",
"thinking": "Let me analyze this step by step...",
"signature": "WaUjzkypQ2mUEVM36O2TxuC06KN8xyfbJwyem2dw3URve/op91XWHOEBLLqIOMfFG/UvLEczmEsUjavL...."
},
{
"type": "redacted_thinking",
"data": "EmwKAhgBEgy3va3pzix/LafPsn4aDFIT2Xlxh0L5L8rLVyIwxtE3rAFBa8cr3qpPkNRj2YfWXGmKDxH4mPnZ5sQ7vB9URj2pLmN3kF8/dW5hR7xJ0aP1oLs9yTcMnKVf2wRpEGjH9XZaBt4UvDcPrQ..."
},
{
"type": "text",
"text": "Based on my analysis..."
}
]
}
My PR currently doesn't support this case since I was unable to test this. The special test prompt mentioned in the documentation didn't work.
if state.reasoning_content_index_and_output: | ||
outputs.append(state.reasoning_content_index_and_output[1]) | ||
reasoning_item = state.reasoning_content_index_and_output[1] | ||
# Store thinking text in content and signature in encrypted_content | ||
if state.thinking_text: | ||
# Add thinking text as a Content object | ||
if not reasoning_item.content: | ||
reasoning_item.content = [] | ||
reasoning_item.content.append( | ||
Content(text=state.thinking_text, type="reasoning_text") | ||
) | ||
# Store signature in encrypted_content | ||
if state.thinking_signature: | ||
reasoning_item.encrypted_content = state.thinking_signature |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[P1] Streaming path drops thinking block when Anthropic redacts reasoning text
The streaming handler only appends a reasoning_text
content item when state.thinking_text
is non-empty. For streams where Anthropic suppresses the thinking text and provides only a signature in thinking_blocks
, thinking_text
remains empty while thinking_signature
is set. The resulting ResponseReasoningItem
carries the signature but no content, so downstream conversion still produces an assistant message that begins with normal text and fails Anthropic’s extended-thinking requirement. A redacted thinking block should be emitted whenever a signature is collected even if no text arrives.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@codex This is the same as above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To use Codex here, create an environment for this repo.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Codex Review: Here are some suggestions.
About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you open a pull request for review, mark a draft as ready, or comment "@codex review". If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex fix this CI failure" or "@codex address that feedback".
# 7) reasoning message => extract thinking blocks if present | ||
elif reasoning_item := cls.maybe_reasoning_message(item): | ||
# Reconstruct thinking blocks from content (text) and encrypted_content (signature) | ||
content_items = reasoning_item.get("content", []) | ||
signature = reasoning_item.get("encrypted_content") | ||
|
||
if content_items: | ||
# Reconstruct thinking blocks from content and signature | ||
pending_thinking_blocks = [] | ||
for content_item in content_items: | ||
if ( | ||
isinstance(content_item, dict) | ||
and content_item.get("type") == "reasoning_text" | ||
): | ||
thinking_block = { | ||
"type": "thinking", | ||
"thinking": content_item.get("text", ""), | ||
} | ||
# Add signature if available | ||
if signature: | ||
thinking_block["signature"] = signature |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[P1] Handle redacted thinking blocks when rebuilding assistant messages
This code only reconstructs thinking blocks from reasoning items when the content array contains reasoning_text
, and it always emits blocks of type "thinking"
with a thinking
string. The Anthropic API can also return blocks of type "redacted_thinking"
where the plaintext is replaced with a data
payload, and those blocks must still be preserved at the start of the assistant message. In the redacted case reasoning_item.content
will be empty and the encrypted payload is never captured, so no thinking block is emitted and the next assistant message begins with plain text again, triggering the same 400 error (“Expected thinking
or redacted_thinking
”) this change is meant to fix. Consider detecting redacted_thinking
blocks, storing their data
alongside the signature, and emitting {"type": "redacted_thinking", ...}
when rebuilding the assistant message.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@codex My PR currently doesn't support this case since I was unable to test this. The special test prompt mentioned in the documentation didn't work. In practice, I didn't observe any redacted_thinking in the wild. I think this is a very rare condition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To use Codex here, create an environment for this repo.
If anyone want to test this, you can add this to your [tool.uv.sources]
openai-agents = { git = "https://github.com/openai/openai-agents-python", branch = "refs/pull/1744/head" } |
Move the location of thinking block insertion from `ensure_assistant_message()` to `elif func_call := cls.maybe_function_tool_call(item):` to be more specific and prevent unintended side-effects.
Be more selective/surgical on the reasoning message preservation. When we handoff from Claude 4 Sonnet Thinking to non-thinking agent, we get errors because non-thinking models expects no thinking blocks in the request. This fixes this edge case by only preserving blocks when reasoning effort is not None.
Since this parameter is anthropic-specific logic, changed the name to anthropic terminology
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Confirmed this works with the model
Anthropic's API requires thinking blocks to be the first content in assistant messages when reasoning is enabled and tool calls are present.
ref: https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking
This change:
Fixes compatibility with Claude models when using tools with interleaved thinking.
This fixes #765
This fixes #678
Code to reproduce:
Above code results in error in current main HEAD: