Add claude-opus-4-5, claude-opus-4-6 and claude-opus-4-7 to EXTENDED_THINKING_MODELS by juanmichelini · Pull Request #2862 · OpenHands/software-agent-sdk

juanmichelini · 2026-04-16T21:22:26Z

Summary

This PR adds claude-opus-4-5, claude-opus-4-6 and claude-opus-4-7 to the EXTENDED_THINKING_MODELS list in model_features.py.

Problem

The OpenHands SDK currently lists only claude-sonnet-4-5, claude-sonnet-4-6, and claude-haiku-4-5 in EXTENDED_THINKING_MODELS. Claude Opus 4.5, 4.6, and 4.7 also support extended thinking and should be included alongside the other Claude 4 models.

Changes

Added "claude-opus-4-5" to EXTENDED_THINKING_MODELS
Added "claude-opus-4-6" to EXTENDED_THINKING_MODELS
Added "claude-opus-4-7" to EXTENDED_THINKING_MODELS

This follows the existing pattern in PROMPT_CACHE_MODELS, where Opus 4.5, 4.6, and 4.7 are already listed.

Testing

The change is straightforward - it adds model identifiers to an existing list. The model_matches function performs case-insensitive substring matching, so both claude-opus-4-6 and claude-opus-4-7 will correctly match model identifiers like anthropic/claude-opus-4-6-20251120 and anthropic/claude-opus-4-7-20250709.
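
A minimal sketch of the matching behavior described above, assuming a simplified `model_matches` that lowercases both sides and checks substring containment. This is a stand-in written for illustration: the SDK's actual function may normalize names differently, and the order of entries in the list is not taken from the source file.

```python
# Simplified stand-in for the SDK's model_matches; the real function may do more.
EXTENDED_THINKING_MODELS: list[str] = [
    "claude-sonnet-4-5",
    "claude-sonnet-4-6",
    "claude-haiku-4-5",
    "claude-opus-4-5",
    "claude-opus-4-6",
    "claude-opus-4-7",
]


def model_matches(model: str, patterns: list[str]) -> bool:
    """Return True if any pattern occurs in the model name, ignoring case."""
    name = model.strip().lower()
    return any(pattern.lower() in name for pattern in patterns)


if __name__ == "__main__":
    for model in (
        "anthropic/claude-opus-4-6-20251120",
        "anthropic/claude-opus-4-7-20250709",
        "Claude-Opus-4-5",
        "claude-3-7-sonnet",
    ):
        print(model, model_matches(model, EXTENDED_THINKING_MODELS))
```

Because the check is a substring test, provider prefixes and date suffixes do not need their own list entries.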

This PR was created by an AI assistant (OpenHands) on behalf of the user.


Agent Server images for this PR

• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

Variant	Architectures	Base Image	Docs / Tags
java	amd64, arm64	`eclipse-temurin:17-jdk`	Link
python	amd64, arm64	`nikolaik/python-nodejs:python3.13-nodejs22-slim`	Link
golang	amd64, arm64	`golang:1.21-bookworm`	Link

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:0706778-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-0706778-python \
  ghcr.io/openhands/agent-server:0706778-python

All tags pushed for this build

ghcr.io/openhands/agent-server:0706778-golang-amd64
ghcr.io/openhands/agent-server:0706778-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:0706778-golang-arm64
ghcr.io/openhands/agent-server:0706778-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:0706778-java-amd64
ghcr.io/openhands/agent-server:0706778-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:0706778-java-arm64
ghcr.io/openhands/agent-server:0706778-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:0706778-python-amd64
ghcr.io/openhands/agent-server:0706778-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-amd64
ghcr.io/openhands/agent-server:0706778-python-arm64
ghcr.io/openhands/agent-server:0706778-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-arm64
ghcr.io/openhands/agent-server:0706778-golang
ghcr.io/openhands/agent-server:0706778-java
ghcr.io/openhands/agent-server:0706778-python

About Multi-Architecture Support

Each variant tag (e.g., 0706778-python) is a multi-arch manifest supporting both amd64 and arm64
Docker automatically pulls the correct architecture for your platform
Individual architecture tags (e.g., 0706778-python-amd64) are also available if needed

Both models support extended thinking capability and should be included in the EXTENDED_THINKING_MODELS list alongside other Claude 4 models.

github-actions · 2026-04-16T21:22:51Z

Python API breakage checks — ✅ PASSED

Result: ✅ PASSED

Action log

github-actions · 2026-04-16T21:23:01Z

REST API breakage checks (OpenAPI) — ✅ PASSED

Result: ✅ PASSED

Action log

all-hands-bot

🟢 Good taste - Clean addition following existing patterns.

One clarification: I notice claude-opus-4-5 is in PROMPT_CACHE_MODELS (line 113) but not being added to EXTENDED_THINKING_MODELS. Is this intentional (e.g., opus-4-5 doesn't support extended thinking, or wasn't deemed beneficial for agents)? Just want to confirm it's not an oversight.

Risk: 🟢 LOW - Straightforward configuration change adding model support.

all-hands-bot

⚠️ QA Report: PASS WITH ISSUES

The code changes work correctly and achieve the stated goal, but the PR is incomplete - it breaks the existing test suite.

Does this PR achieve its stated goal?

Partially. The code correctly adds claude-opus-4-6 and claude-opus-4-7 to the EXTENDED_THINKING_MODELS list, and the model matching works as expected with provider prefixes and version suffixes. However, the PR breaks the test suite by not updating test expectations, which means it cannot be merged as-is.

Phase	Result
Environment Setup	✅ Clean build, all dependencies installed
CI & Tests	❌ sdk-tests FAILED - 1 test failure due to outdated expectations
Functional Verification	✅ Model matching works correctly for both opus-4-6 and opus-4-7

Functional Verification

Test 1: Baseline (main branch)

Step 1 — Establish baseline (main branch without the changes):

Checked out origin/main and verified the EXTENDED_THINKING_MODELS list:

EXTENDED_THINKING_MODELS: list[str] = [
    "claude-sonnet-4-5",
    "claude-sonnet-4-6",
    "claude-haiku-4-5",
]

Ran test script:

2. Test claude-opus-4-6:
   supports_extended_thinking: False

3. Test claude-opus-4-7:
   supports_extended_thinking: False

This confirms the baseline behavior: on main, the SDK does not report extended thinking support for Opus 4.6 or 4.7.

Step 2 — Apply the PR's changes:

Checked out PR branch add-claude-opus-4-7-extended-thinking which adds:

+ "claude-opus-4-6",
+ "claude-opus-4-7",

Step 3 — Re-run with the fix in place:

Ran test script on PR branch:

1. Check EXTENDED_THINKING_MODELS list:
   ['claude-sonnet-4-5', 'claude-sonnet-4-6', 'claude-haiku-4-5', 'claude-opus-4-6', 'claude-opus-4-7']

2. Test claude-opus-4-6:
   supports_extended_thinking: True
   supports_reasoning_effort: True
   supports_prompt_cache: True

3. Test claude-opus-4-7:
   supports_extended_thinking: True
   supports_reasoning_effort: True
   supports_prompt_cache: True

The code correctly identifies both models as supporting extended thinking.
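
The baseline and PR-branch results above differ only in list membership. The following sketch reproduces that before/after comparison, assuming the flag is derived purely from case-insensitive substring matching against the list. `supports_extended_thinking` and `compare` are hypothetical helpers, not the SDK's API, and `PR_LIST` mirrors the list printed in this report.

```python
# Lists as shown in this QA report: main branch vs. the PR branch at review time.
MAIN_LIST: list[str] = [
    "claude-sonnet-4-5",
    "claude-sonnet-4-6",
    "claude-haiku-4-5",
]
PR_LIST: list[str] = MAIN_LIST + ["claude-opus-4-6", "claude-opus-4-7"]


def supports_extended_thinking(model: str, models: list[str]) -> bool:
    """Hypothetical check: True if any list entry is a substring of the model name."""
    name = model.lower()
    return any(entry.lower() in name for entry in models)


def compare(model: str) -> tuple[bool, bool]:
    """Return the flag as computed against (main list, PR list)."""
    return (
        supports_extended_thinking(model, MAIN_LIST),
        supports_extended_thinking(model, PR_LIST),
    )


if __name__ == "__main__":
    for model in ("claude-opus-4-6", "claude-opus-4-7", "claude-sonnet-4-5"):
        before, after = compare(model)
        print(f"{model}: main={before} pr={after}")
```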

Test 2: Provider Prefixes and Version Suffixes

Tested various model identifier formats:

4. Test with provider prefix (anthropic/claude-opus-4-6):
   supports_extended_thinking: True

5. Test with versioned identifier (anthropic/claude-opus-4-6-20251120):
   supports_extended_thinking: True

6. Test with versioned identifier (anthropic/claude-opus-4-7-20250709):
   supports_extended_thinking: True

The substring matching works correctly across all identifier formats.

Test 3: Existing Models Not Affected

Verified other Claude models still work correctly:

✓ claude-sonnet-4-5: True (expected: True)
✓ claude-sonnet-4-6: True (expected: True)
✓ claude-haiku-4-5: True (expected: True)
✓ claude-opus-4-5: False (expected: False)
✓ claude-3-7-sonnet: False (expected: False)

No regressions in existing model detection.

Test 4: Existing Test Suite

Ran the existing test suite:

$ uv run pytest tests/sdk/llm/test_model_features.py::test_extended_thinking_support -v

Result:

FAILED tests/sdk/llm/test_model_features.py::test_extended_thinking_support[claude-opus-4-6-False]
- assert True == False

The test expects claude-opus-4-6 to return False, but the code now correctly returns True. The test expectations need to be updated.

Issues Found

🔴 Critical: The PR breaks the test suite. Test test_extended_thinking_support[claude-opus-4-6-False] fails because the test expects False but the code now returns True (which is correct behavior).
🟠 Important: Missing test case for claude-opus-4-7 - the test suite should include test cases for the newly added model.

juanmichelini · 2026-04-16T21:31:19Z

🟢 Good taste - Clean addition following existing patterns.

One clarification: I notice claude-opus-4-5 is in PROMPT_CACHE_MODELS (line 113) but not being added to EXTENDED_THINKING_MODELS. Is this intentional (e.g., opus-4-5 doesn't support extended thinking, or wasn't deemed beneficial for agents)? Just want to confirm it's not an oversight.

Risk: 🟢 LOW - Straightforward configuration change adding model support.

good catch!

juanmichelini · 2026-04-16T21:45:57Z

tested on Claude-Opus-4-6

https://eval-monitor-git-legacy-txt-list-openhands.vercel.app/?run=swebench%2Flitellm_proxy-anthropic-claude-opus-4-6%2F24535005032%2F

error still persists in Claude-Opus-4-7

https://openhands-eval-monitor.vercel.app/?run=swebench%2Flitellm_proxy-anthropic-claude-opus-4-7%2F24534824052&days=15

juanmichelini · 2026-04-17T05:51:07Z

@OpenHands can you see why checks are failing?

openhands-ai · 2026-04-17T05:51:29Z

I'm on it! juanmichelini can track my progress at all-hands.dev

The tests previously expected claude-opus-4-5 and claude-opus-4-6 to NOT support extended thinking. Update test parametrization to reflect the model_features.py change and add claude-opus-4-7 test cases. Co-authored-by: openhands <openhands@all-hands.dev>

juanmichelini · 2026-04-17T05:54:03Z

The CI failures were:

sdk-tests: Two tests in tests/sdk/llm/test_model_features.py expected claude-opus-4-5 and claude-opus-4-6 to NOT support extended thinking (expected=False), but the PR added them to EXTENDED_THINKING_MODELS, making them return True. Fixed in 11b0b66 by updating the test parametrization to expect True for all three opus models and adding claude-opus-4-7 test cases.
Review Thread Gate: The review thread was already resolved — this should pass on the next run.

This comment was created by an AI assistant (OpenHands) on behalf of the user.

openhands-ai · 2026-04-17T05:54:46Z

Summary

The PR comment asked why CI checks were failing. I investigated and fixed both issues:

Root Cause

sdk-tests failure: The test file tests/sdk/llm/test_model_features.py had claude-opus-4-5 and claude-opus-4-6 listed as models that do not support extended thinking (expected=False), but the PR's code change added them to EXTENDED_THINKING_MODELS, making the feature check return True — causing assert True == False.
Review Thread Gate: Unresolved review threads — already resolved, will pass on next run.

Fix Applied (commit `11b0b66c`)

Updated test parametrization to expect True for claude-opus-4-5, claude-opus-4-6, and claude-opus-4-7
Added anthropic/claude-opus-4-7 as a provider-prefixed positive test case
Removed claude-opus-4-5 and claude-opus-4-6 from the "don't support extended thinking" section
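
The updated expectations can be sketched as a table of (model, expected) pairs. The real suite uses pytest parametrization in tests/sdk/llm/test_model_features.py. This standalone version uses a plain loop and a hypothetical stand-in for the feature check so that it runs without the SDK installed. The cases are illustrative and are not a copy of the test file.

```python
# Stand-in list reflecting the PR after all three Opus models were added.
EXTENDED_THINKING_MODELS: list[str] = [
    "claude-sonnet-4-5",
    "claude-sonnet-4-6",
    "claude-haiku-4-5",
    "claude-opus-4-5",
    "claude-opus-4-6",
    "claude-opus-4-7",
]


def supports_extended_thinking(model: str) -> bool:
    """Hypothetical feature check: case-insensitive substring match."""
    name = model.lower()
    return any(entry in name for entry in EXTENDED_THINKING_MODELS)


# (model identifier, expected flag) pairs, in the spirit of the updated tests.
CASES: list[tuple[str, bool]] = [
    ("claude-opus-4-5", True),
    ("claude-opus-4-6", True),
    ("claude-opus-4-7", True),
    ("anthropic/claude-opus-4-7", True),
    ("claude-3-7-sonnet", False),
]


def run_cases() -> list[str]:
    """Return a description of each failing case; an empty list means all passed."""
    failures = []
    for model, expected in CASES:
        actual = supports_extended_thinking(model)
        if actual != expected:
            failures.append(f"{model}: expected {expected}, got {actual}")
    return failures


if __name__ == "__main__":
    print(run_cases() or "all cases passed")
```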

Checklist

Identified the failing checks and root cause
Fixed the test expectations to match the code change
Ran pre-commit hooks — all passed
Ran the specific tests — all 15 passed
Committed and pushed to the PR branch
Posted explanatory comment on the PR

Conciseness

The change is minimal — only the test file was modified (4 insertions, 2 deletions), which is exactly what was needed. No extraneous changes were made.

github-actions · 2026-04-17T05:56:23Z

Coverage Report

File	Stmts	Miss	Cover	Missing
openhands-sdk/openhands/sdk/llm/utils
model_features.py	58	1	98%	35
TOTAL	22584	6479	71%

Add claude-opus-4-6 and claude-opus-4-7 to EXTENDED_THINKING_MODELS

88566c3

Both models support extended thinking capability and should be included in the EXTENDED_THINKING_MODELS list alongside other Claude 4 models.

all-hands-bot approved these changes Apr 16, 2026


all-hands-bot reviewed Apr 16, 2026


Comment thread openhands-sdk/openhands/sdk/llm/utils/model_features.py

Update model_features.py

0706778

juanmichelini changed the title ~~Add claude-opus-4-6 and claude-opus-4-7 to EXTENDED_THINKING_MODELS~~ Add claude-opus-4-5 and claude-opus-4-6 and claude-opus-4-7 to EXTENDED_THINKING_MODELS Apr 16, 2026

juanmichelini changed the title ~~Add claude-opus-4-5 and claude-opus-4-6 and claude-opus-4-7 to EXTENDED_THINKING_MODELS~~ Add claude-opus-4-5, claude-opus-4-6 and claude-opus-4-7 to EXTENDED_THINKING_MODELS Apr 16, 2026

juanmichelini enabled auto-merge (squash) April 16, 2026 21:46

juanmichelini disabled auto-merge April 17, 2026 05:46

juanmichelini enabled auto-merge (squash) April 17, 2026 05:46

juanmichelini disabled auto-merge April 17, 2026 05:51

juanmichelini wants to merge 3 commits into main from add-claude-opus-4-7-extended-thinking

3 participants
