Add claude-opus-4-5, claude-opus-4-6 and claude-opus-4-7 to EXTENDED_THINKING_MODELS#2862
juanmichelini wants to merge 3 commits into main
Both models support extended thinking capability and should be included in the EXTENDED_THINKING_MODELS list alongside other Claude 4 models.
Python API breakage checks - ✅ PASSED
REST API breakage checks (OpenAPI) - ✅ PASSED
all-hands-bot
left a comment
🟢 Good taste - Clean addition following existing patterns.
One clarification: I notice claude-opus-4-5 is in PROMPT_CACHE_MODELS (line 113) but not being added to EXTENDED_THINKING_MODELS. Is this intentional (e.g., opus-4-5 doesn't support extended thinking, or wasn't deemed beneficial for agents)? Just want to confirm it's not an oversight.
Risk: 🟢 LOW - Straightforward configuration change adding model support.
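A quick way to spot this kind of gap between the two lists is a set difference. The sketch below is only illustrative: the list contents are excerpts assumed from what this conversation mentions (the real lists in model_features.py may contain more entries), and missing_from is a hypothetical helper, not part of the SDK.

```python
# Assumed excerpts only; the real lists in model_features.py may be longer.
PROMPT_CACHE_MODELS: list[str] = [
    "claude-opus-4-5",
    "claude-opus-4-6",
    "claude-opus-4-7",
]
EXTENDED_THINKING_MODELS: list[str] = [
    "claude-sonnet-4-5",
    "claude-sonnet-4-6",
    "claude-haiku-4-5",
    "claude-opus-4-6",
    "claude-opus-4-7",
]


def missing_from(reference: list[str], target: list[str]) -> list[str]:
    """Return entries of `reference` that do not appear in `target`, sorted."""
    return sorted(set(reference) - set(target))


# Models that are cached but not flagged for extended thinking.
print(missing_from(PROMPT_CACHE_MODELS, EXTENDED_THINKING_MODELS))
# prints ['claude-opus-4-5']
```

Whether a reported entry is an oversight or intentional still has to be decided per model; the check only surfaces the difference.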
all-hands-bot
left a comment
⚠️ QA Report: PASS WITH ISSUES
The code changes work correctly and achieve the stated goal, but the PR is incomplete - it breaks the existing test suite.
Does this PR achieve its stated goal?
Partially. The code correctly adds claude-opus-4-6 and claude-opus-4-7 to the EXTENDED_THINKING_MODELS list, and the model matching works as expected with provider prefixes and version suffixes. However, the PR breaks the test suite by not updating test expectations, which means it cannot be merged as-is.
| Phase | Result |
|---|---|
| Environment Setup | ✅ Clean build, all dependencies installed |
| CI & Tests | ❌ sdk-tests FAILED - 1 test failure due to outdated expectations |
| Functional Verification | ✅ Model matching works correctly for both opus-4-6 and opus-4-7 |
Functional Verification
Test 1: Baseline (main branch)
Step 1 — Establish baseline (main branch without the changes):
Checked out origin/main and verified the EXTENDED_THINKING_MODELS list:
EXTENDED_THINKING_MODELS: list[str] = [
    "claude-sonnet-4-5",
    "claude-sonnet-4-6",
    "claude-haiku-4-5",
]
Ran test script:
2. Test claude-opus-4-6:
supports_extended_thinking: False
3. Test claude-opus-4-7:
supports_extended_thinking: False
This confirms the baseline behavior: on main, the SDK does NOT report extended thinking support for Opus 4.6 and 4.7.
Step 2 — Apply the PR's changes:
Checked out PR branch add-claude-opus-4-7-extended-thinking which adds:
+ "claude-opus-4-6",
+ "claude-opus-4-7",
Step 3 - Re-run with the fix in place:
Ran test script on PR branch:
1. Check EXTENDED_THINKING_MODELS list:
['claude-sonnet-4-5', 'claude-sonnet-4-6', 'claude-haiku-4-5', 'claude-opus-4-6', 'claude-opus-4-7']
2. Test claude-opus-4-6:
supports_extended_thinking: True
supports_reasoning_effort: True
supports_prompt_cache: True
3. Test claude-opus-4-7:
supports_extended_thinking: True
supports_reasoning_effort: True
supports_prompt_cache: True
The code correctly identifies both models as supporting extended thinking.
Test 2: Provider Prefixes and Version Suffixes
Tested various model identifier formats:
4. Test with provider prefix (anthropic/claude-opus-4-6):
supports_extended_thinking: True
5. Test with versioned identifier (anthropic/claude-opus-4-6-20251120):
supports_extended_thinking: True
6. Test with versioned identifier (anthropic/claude-opus-4-7-20250709):
supports_extended_thinking: True
The substring matching works correctly across all identifier formats.
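The behaviour verified above can be sketched as follows. This is a minimal illustration that assumes plain case-insensitive substring matching; matches_any is a hypothetical helper and not the SDK's model_matches, whose actual implementation may differ. The list mirrors the PR-branch output shown in Test 1.

```python
# Copied from the PR-branch output above; a stand-in for the SDK's list.
EXTENDED_THINKING_MODELS: list[str] = [
    "claude-sonnet-4-5",
    "claude-sonnet-4-6",
    "claude-haiku-4-5",
    "claude-opus-4-6",
    "claude-opus-4-7",
]


def matches_any(model: str, patterns: list[str]) -> bool:
    """Return True if any pattern occurs in the model name, ignoring case.

    Hypothetical helper: a plain substring test, so provider prefixes
    ("anthropic/") and version suffixes ("-20251120") do not matter.
    """
    name = model.strip().lower()
    return any(pattern.lower() in name for pattern in patterns)


for candidate in (
    "claude-opus-4-6",
    "anthropic/claude-opus-4-6-20251120",
    "anthropic/claude-opus-4-7-20250709",
    "claude-3-7-sonnet",
):
    print(candidate, matches_any(candidate, EXTENDED_THINKING_MODELS))
```

A substring test is permissive by design: any identifier that merely contains one of the listed names will match, which is what makes prefixed and dated identifiers work without extra entries.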
Test 3: Existing Models Not Affected
Verified other Claude models still work correctly:
✓ claude-sonnet-4-5: True (expected: True)
✓ claude-sonnet-4-6: True (expected: True)
✓ claude-haiku-4-5: True (expected: True)
✓ claude-opus-4-5: False (expected: False)
✓ claude-3-7-sonnet: False (expected: False)
No regressions in existing model detection.
Test 4: Existing Test Suite
Ran the existing test suite:
$ uv run pytest tests/sdk/llm/test_model_features.py::test_extended_thinking_support -v
Result:
FAILED tests/sdk/llm/test_model_features.py::test_extended_thinking_support[claude-opus-4-6-False]
- assert True == False
The test expects claude-opus-4-6 to return False, but the code now correctly returns True. The test expectations need to be updated.
Issues Found
- 🔴 Critical: The PR breaks the test suite. Test test_extended_thinking_support[claude-opus-4-6-False] fails because the test expects False but the code now returns True (which is correct behavior).
- 🟠 Important: Missing test case for claude-opus-4-7 - the test suite should include test cases for the newly added model.
good catch!
tested on Claude-Opus-4-6; error still persists in Claude-Opus-4-7
@OpenHands can you see why checks are failing?
I'm on it! juanmichelini can track my progress at all-hands.dev
The tests previously expected claude-opus-4-5 and claude-opus-4-6 to NOT support extended thinking. Update test parametrization to reflect the model_features.py change and add claude-opus-4-7 test cases. Co-authored-by: openhands <openhands@all-hands.dev>
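The updated expectations described in this commit message might look roughly like the sketch below. It is illustrative only: supports_extended_thinking here is a hypothetical stand-in built on substring matching, the list is assumed to hold the three new Opus entries, and the pairs are written as a plain table of the kind one would feed to a parametrized test rather than the actual contents of test_model_features.py.

```python
# Assumed state of the list after this PR; a stand-in for the SDK's list.
EXTENDED_THINKING_MODELS: list[str] = [
    "claude-sonnet-4-5",
    "claude-sonnet-4-6",
    "claude-haiku-4-5",
    "claude-opus-4-5",
    "claude-opus-4-6",
    "claude-opus-4-7",
]


def supports_extended_thinking(model: str) -> bool:
    """Hypothetical stand-in: case-insensitive substring match on the list."""
    name = model.lower()
    return any(pattern in name for pattern in EXTENDED_THINKING_MODELS)


# (model, expected) pairs reflecting the updated expectations.
EXPECTED: list[tuple[str, bool]] = [
    ("claude-opus-4-5", True),   # previously expected False
    ("claude-opus-4-6", True),   # previously expected False
    ("claude-opus-4-7", True),   # newly added case
    ("anthropic/claude-opus-4-7-20250709", True),
    ("claude-3-7-sonnet", False),
]


def check_expectations() -> list[str]:
    """Return one message per mismatch; an empty list means all cases pass."""
    return [
        f"{model}: expected {expected}, got {supports_extended_thinking(model)}"
        for model, expected in EXPECTED
        if supports_extended_thinking(model) is not expected
    ]


print(check_expectations())
```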
The CI failures were:
This comment was created by an AI assistant (OpenHands) on behalf of the user.
Summary
The PR comment asked why CI checks were failing. I investigated and fixed both issues:
Root Cause
Fix Applied (commit
Summary
This PR adds claude-opus-4-5, claude-opus-4-6 and claude-opus-4-7 to the EXTENDED_THINKING_MODELS list in model_features.py.
Problem
The OpenHands SDK currently only has claude-sonnet-4-5, claude-sonnet-4-6, and claude-haiku-4-5 in the EXTENDED_THINKING_MODELS list. Both Claude Opus 4.6 and 4.7 support extended thinking capability and should be included alongside other Claude 4 models.
Changes
- Add "claude-opus-4-5" to EXTENDED_THINKING_MODELS
- Add "claude-opus-4-6" to EXTENDED_THINKING_MODELS
- Add "claude-opus-4-7" to EXTENDED_THINKING_MODELS
This follows the existing pattern in PROMPT_CACHE_MODELS, where Opus 4.5, 4.6, and 4.7 are all already listed.
Testing
The change is straightforward - it adds model identifiers to an existing list. The model_matches function performs case-insensitive substring matching, so both claude-opus-4-6 and claude-opus-4-7 will correctly match model identifiers like anthropic/claude-opus-4-6-20251120 and anthropic/claude-opus-4-7-20250709.
This PR was created by an AI assistant (OpenHands) on behalf of the user.
Agent Server images for this PR
• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server
Variants & Base Images
• eclipse-temurin:17-jdk
• nikolaik/python-nodejs:python3.13-nodejs22-slim
• golang:1.21-bookworm
Pull (multi-arch manifest)
# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:0706778-python
Run
All tags pushed for this build
About Multi-Architecture Support
0706778-python) is a multi-arch manifest supporting both amd64 and arm64
0706778-python-amd64) are also available if needed