Add claude-opus-4-5, claude-opus-4-6 and claude-opus-4-7 to EXTENDED_THINKING_MODELS#2862

Open
juanmichelini wants to merge 3 commits into main from add-claude-opus-4-7-extended-thinking

@juanmichelini
Collaborator

@juanmichelini juanmichelini commented Apr 16, 2026

Summary

This PR adds claude-opus-4-5, claude-opus-4-6 and claude-opus-4-7 to the EXTENDED_THINKING_MODELS list in model_features.py.

Problem

The OpenHands SDK currently lists only claude-sonnet-4-5, claude-sonnet-4-6, and claude-haiku-4-5 in EXTENDED_THINKING_MODELS. Claude Opus 4.5, 4.6, and 4.7 also support extended thinking and should be included alongside the other Claude 4 models.

Changes

  • Added "claude-opus-4-5" to EXTENDED_THINKING_MODELS
  • Added "claude-opus-4-6" to EXTENDED_THINKING_MODELS
  • Added "claude-opus-4-7" to EXTENDED_THINKING_MODELS

This follows the existing pattern in PROMPT_CACHE_MODELS, where Opus 4.5, 4.6, and 4.7 are all already listed.

Testing

The change is straightforward: it adds model identifiers to an existing list. The model_matches function performs case-insensitive substring matching, so claude-opus-4-6 and claude-opus-4-7 will also match versioned, provider-prefixed identifiers such as anthropic/claude-opus-4-6-20251120 and anthropic/claude-opus-4-7-20250709.
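The matching behavior described above can be illustrated with a minimal sketch. The helper `matches_any` is a hypothetical stand-in, not the SDK's actual `model_matches` implementation, and the list simply mirrors the entries named in this PR description.

```python
# Stand-in for the list in model_features.py after this PR
# (entries taken from the PR description, not copied from the repository).
EXTENDED_THINKING_MODELS: list[str] = [
    "claude-sonnet-4-5",
    "claude-sonnet-4-6",
    "claude-haiku-4-5",
    "claude-opus-4-5",
    "claude-opus-4-6",
    "claude-opus-4-7",
]


def matches_any(model: str, patterns: list[str]) -> bool:
    """Hypothetical helper: case-insensitive substring match against each pattern."""
    name = model.strip().lower()
    return any(pattern.lower() in name for pattern in patterns)


if __name__ == "__main__":
    for candidate in (
        "anthropic/claude-opus-4-6-20251120",  # provider prefix + date suffix
        "Anthropic/Claude-Opus-4-7-20250709",  # mixed case
        "claude-3-7-sonnet",                   # not in the list
    ):
        print(candidate, matches_any(candidate, EXTENDED_THINKING_MODELS))
```

Because the check is a plain substring test, a provider prefix or a date suffix around the listed name does not prevent a match, while a differently ordered name such as claude-3-7-sonnet does not match any entry.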


This PR was created by an AI assistant (OpenHands) on behalf of the user.


Agent Server images for this PR

GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

Variant | Architectures | Base Image | Docs / Tags
java | amd64, arm64 | eclipse-temurin:17-jdk | Link
python | amd64, arm64 | nikolaik/python-nodejs:python3.13-nodejs22-slim | Link
golang | amd64, arm64 | golang:1.21-bookworm | Link

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:0706778-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-0706778-python \
  ghcr.io/openhands/agent-server:0706778-python

All tags pushed for this build

ghcr.io/openhands/agent-server:0706778-golang-amd64
ghcr.io/openhands/agent-server:0706778-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:0706778-golang-arm64
ghcr.io/openhands/agent-server:0706778-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:0706778-java-amd64
ghcr.io/openhands/agent-server:0706778-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:0706778-java-arm64
ghcr.io/openhands/agent-server:0706778-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:0706778-python-amd64
ghcr.io/openhands/agent-server:0706778-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-amd64
ghcr.io/openhands/agent-server:0706778-python-arm64
ghcr.io/openhands/agent-server:0706778-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-arm64
ghcr.io/openhands/agent-server:0706778-golang
ghcr.io/openhands/agent-server:0706778-java
ghcr.io/openhands/agent-server:0706778-python

About Multi-Architecture Support

  • Each variant tag (e.g., 0706778-python) is a multi-arch manifest supporting both amd64 and arm64
  • Docker automatically pulls the correct architecture for your platform
  • Individual architecture tags (e.g., 0706778-python-amd64) are also available if needed

Both models support extended thinking capability and should be included
in the EXTENDED_THINKING_MODELS list alongside other Claude 4 models.
@github-actions
Contributor

github-actions bot commented Apr 16, 2026

Python API breakage checks — ✅ PASSED

Result: PASSED

Action log

@github-actions
Contributor

github-actions bot commented Apr 16, 2026

REST API breakage checks (OpenAPI) — ✅ PASSED

Result: PASSED

Action log

Collaborator

@all-hands-bot all-hands-bot left a comment

🟢 Good taste - Clean addition following existing patterns.

One clarification: I notice claude-opus-4-5 is in PROMPT_CACHE_MODELS (line 113) but not being added to EXTENDED_THINKING_MODELS. Is this intentional (e.g., opus-4-5 doesn't support extended thinking, or wasn't deemed beneficial for agents)? Just want to confirm it's not an oversight.

Risk: 🟢 LOW - Straightforward configuration change adding model support.
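The gap the reviewer points out can be found mechanically by comparing the two lists. The sketch below uses illustrative excerpts rather than the real module contents: the prompt-cache excerpt contains only the Opus entries the PR description says are present, the extended-thinking list is the one reported for the PR branch at review time, and `missing_from` is a hypothetical helper.

```python
# Illustrative excerpts only; the real lists live in model_features.py.
PROMPT_CACHE_OPUS_EXCERPT: list[str] = [
    "claude-opus-4-5",
    "claude-opus-4-6",
    "claude-opus-4-7",
]
EXTENDED_THINKING_AT_REVIEW: list[str] = [
    "claude-sonnet-4-5",
    "claude-sonnet-4-6",
    "claude-haiku-4-5",
    "claude-opus-4-6",
    "claude-opus-4-7",
]


def missing_from(reference: list[str], target: list[str]) -> list[str]:
    """Return entries of `reference` that do not appear in `target`, in order."""
    present = set(target)
    return [name for name in reference if name not in present]


if __name__ == "__main__":
    # Opus entries that have prompt caching but not extended thinking.
    print(missing_from(PROMPT_CACHE_OPUS_EXCERPT, EXTENDED_THINKING_AT_REVIEW))
```

With these excerpts the only entry reported is claude-opus-4-5, which is the model the reviewer asks about.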

Collaborator

@all-hands-bot all-hands-bot left a comment

⚠️ QA Report: PASS WITH ISSUES

The code changes work correctly and achieve the stated goal, but the PR is incomplete - it breaks the existing test suite.

Does this PR achieve its stated goal?

Partially. The code correctly adds claude-opus-4-6 and claude-opus-4-7 to the EXTENDED_THINKING_MODELS list, and the model matching works as expected with provider prefixes and version suffixes. However, the PR breaks the test suite by not updating test expectations, which means it cannot be merged as-is.

Phase | Result
Environment Setup | ✅ Clean build, all dependencies installed
CI & Tests | ❌ sdk-tests FAILED - 1 test failure due to outdated expectations
Functional Verification | ✅ Model matching works correctly for both opus-4-6 and opus-4-7

Functional Verification

Test 1: Baseline (main branch)

Step 1 — Establish baseline (main branch without the changes):

Checked out origin/main and verified the EXTENDED_THINKING_MODELS list:

EXTENDED_THINKING_MODELS: list[str] = [
    "claude-sonnet-4-5",
    "claude-sonnet-4-6",
    "claude-haiku-4-5",
]

Ran test script:

2. Test claude-opus-4-6:
   supports_extended_thinking: False

3. Test claude-opus-4-7:
   supports_extended_thinking: False

This confirms the baseline behavior: on main, the SDK does not report extended thinking support for Opus 4.6 or 4.7.

Step 2 — Apply the PR's changes:

Checked out PR branch add-claude-opus-4-7-extended-thinking which adds:

+ "claude-opus-4-6",
+ "claude-opus-4-7",

Step 3 — Re-run with the fix in place:

Ran test script on PR branch:

1. Check EXTENDED_THINKING_MODELS list:
   ['claude-sonnet-4-5', 'claude-sonnet-4-6', 'claude-haiku-4-5', 'claude-opus-4-6', 'claude-opus-4-7']

2. Test claude-opus-4-6:
   supports_extended_thinking: True
   supports_reasoning_effort: True
   supports_prompt_cache: True

3. Test claude-opus-4-7:
   supports_extended_thinking: True
   supports_reasoning_effort: True
   supports_prompt_cache: True

The code correctly identifies both models as supporting extended thinking.

Test 2: Provider Prefixes and Version Suffixes

Tested various model identifier formats:

4. Test with provider prefix (anthropic/claude-opus-4-6):
   supports_extended_thinking: True

5. Test with versioned identifier (anthropic/claude-opus-4-6-20251120):
   supports_extended_thinking: True

6. Test with versioned identifier (anthropic/claude-opus-4-7-20250709):
   supports_extended_thinking: True

The substring matching works correctly across all identifier formats.

Test 3: Existing Models Not Affected

Verified other Claude models still work correctly:

✓ claude-sonnet-4-5: True (expected: True)
✓ claude-sonnet-4-6: True (expected: True)
✓ claude-haiku-4-5: True (expected: True)
✓ claude-opus-4-5: False (expected: False)
✓ claude-3-7-sonnet: False (expected: False)

No regressions in existing model detection.

Test 4: Existing Test Suite

Ran the existing test suite:

$ uv run pytest tests/sdk/llm/test_model_features.py::test_extended_thinking_support -v

Result:

FAILED tests/sdk/llm/test_model_features.py::test_extended_thinking_support[claude-opus-4-6-False]
- assert True == False

The test expects supports_extended_thinking to be False for claude-opus-4-6, but the code now correctly returns True. The test expectations need to be updated.

Issues Found

  • 🔴 Critical: The PR breaks the test suite. Test test_extended_thinking_support[claude-opus-4-6-False] fails because the test expects False but the code now returns True (which is correct behavior).
  • 🟠 Important: Missing test case for claude-opus-4-7 - the test suite should include test cases for the newly added model.

Comment thread openhands-sdk/openhands/sdk/llm/utils/model_features.py
@juanmichelini juanmichelini changed the title Add claude-opus-4-6 and claude-opus-4-7 to EXTENDED_THINKING_MODELS Add claude-opus-4-5 and claude-opus-4-6 and claude-opus-4-7 to EXTENDED_THINKING_MODELS Apr 16, 2026
@juanmichelini juanmichelini changed the title Add claude-opus-4-5 and claude-opus-4-6 and claude-opus-4-7 to EXTENDED_THINKING_MODELS Add claude-opus-4-5, claude-opus-4-6 and claude-opus-4-7 to EXTENDED_THINKING_MODELS Apr 16, 2026
@juanmichelini
Collaborator Author

🟢 Good taste - Clean addition following existing patterns.

One clarification: I notice claude-opus-4-5 is in PROMPT_CACHE_MODELS (line 113) but not being added to EXTENDED_THINKING_MODELS. Is this intentional (e.g., opus-4-5 doesn't support extended thinking, or wasn't deemed beneficial for agents)? Just want to confirm it's not an oversight.

Risk: 🟢 LOW - Straightforward configuration change adding model support.

good catch!

@juanmichelini
Collaborator Author

@juanmichelini juanmichelini enabled auto-merge (squash) April 16, 2026 21:46
@juanmichelini juanmichelini disabled auto-merge April 17, 2026 05:46
@juanmichelini juanmichelini enabled auto-merge (squash) April 17, 2026 05:46
@juanmichelini juanmichelini disabled auto-merge April 17, 2026 05:51
@juanmichelini
Collaborator Author

@OpenHands can you see why checks are failing?

@openhands-ai

openhands-ai bot commented Apr 17, 2026

I'm on it! juanmichelini can track my progress at all-hands.dev

The tests previously expected claude-opus-4-5 and claude-opus-4-6 to NOT
support extended thinking. Update test parametrization to reflect the
model_features.py change and add claude-opus-4-7 test cases.

Co-authored-by: openhands <openhands@all-hands.dev>
Collaborator Author

The CI failures were:

  1. sdk-tests: Two tests in tests/sdk/llm/test_model_features.py expected claude-opus-4-5 and claude-opus-4-6 to NOT support extended thinking (expected=False), but the PR added them to EXTENDED_THINKING_MODELS, making them return True. Fixed in 11b0b66 by updating the test parametrization to expect True for all three opus models and adding claude-opus-4-7 test cases.

  2. Review Thread Gate: The review thread was already resolved — this should pass on the next run.

This comment was created by an AI assistant (OpenHands) on behalf of the user.

@openhands-ai

openhands-ai bot commented Apr 17, 2026

Summary

The PR comment asked why CI checks were failing. I investigated and fixed both issues:

Root Cause

  1. sdk-tests failure: The test file tests/sdk/llm/test_model_features.py had claude-opus-4-5 and claude-opus-4-6 listed as models that do not support extended thinking (expected=False), but the PR's code change added them to EXTENDED_THINKING_MODELS, making the feature check return True — causing assert True == False.

  2. Review Thread Gate: Unresolved review threads — already resolved, will pass on next run.

Fix Applied (commit 11b0b66c)

  • Updated test parametrization to expect True for claude-opus-4-5, claude-opus-4-6, and claude-opus-4-7
  • Added anthropic/claude-opus-4-7 as a provider-prefixed positive test case
  • Removed claude-opus-4-5 and claude-opus-4-6 from the "don't support extended thinking" section
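The updated expectations listed above can be sketched as a table of (model, expected) pairs. This is an illustration under stated assumptions, not the contents of tests/sdk/llm/test_model_features.py: `supports_extended_thinking` here is a stand-in built on substring matching, and `failing_cases` is a hypothetical helper. In the real suite each pair would presumably be one `pytest.mark.parametrize` entry.

```python
# Stand-in list mirroring EXTENDED_THINKING_MODELS after the PR.
EXTENDED_THINKING_MODELS: list[str] = [
    "claude-sonnet-4-5",
    "claude-sonnet-4-6",
    "claude-haiku-4-5",
    "claude-opus-4-5",
    "claude-opus-4-6",
    "claude-opus-4-7",
]

# (model identifier, expected result) pairs described in the fix summary.
CASES: list[tuple[str, bool]] = [
    ("claude-opus-4-5", True),
    ("claude-opus-4-6", True),
    ("claude-opus-4-7", True),
    ("anthropic/claude-opus-4-7", True),  # provider-prefixed positive case
    ("claude-3-7-sonnet", False),         # still expected to be unsupported
]


def supports_extended_thinking(model: str) -> bool:
    """Stand-in feature check: case-insensitive substring match."""
    name = model.lower()
    return any(pattern in name for pattern in EXTENDED_THINKING_MODELS)


def failing_cases(cases: list[tuple[str, bool]]) -> list[tuple[str, bool, bool]]:
    """Return (model, expected, actual) for every case whose expectation is wrong."""
    return [
        (model, expected, supports_extended_thinking(model))
        for model, expected in cases
        if supports_extended_thinking(model) != expected
    ]


if __name__ == "__main__":
    print(failing_cases(CASES))                        # updated expectations
    print(failing_cases([("claude-opus-4-6", False)]))  # the outdated expectation
```

The second call reproduces the shape of the original CI failure: an entry that still expects False for claude-opus-4-6 is reported as a mismatch once the model is in the list.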

Checklist

  • Identified the failing checks and root cause
  • Fixed the test expectations to match the code change
  • Ran pre-commit hooks — all passed
  • Ran the specific tests — all 15 passed
  • Committed and pushed to the PR branch
  • Posted explanatory comment on the PR

Conciseness

The change is minimal — only the test file was modified (4 insertions, 2 deletions), which is exactly what was needed. No extraneous changes were made.

@github-actions
Contributor

Coverage

Coverage Report
File | Stmts | Miss | Cover | Missing
openhands-sdk/openhands/sdk/llm/utils/model_features.py | 58 | 1 | 98% | 35
TOTAL | 22584 | 6479 | 71% |


3 participants