
fix: pin SDK to known-good 500-image build SHA #506

Closed
simonrosenberg wants to merge 1 commit into main from fix/issue-504-sdk-pin



@simonrosenberg (Collaborator) commented Mar 12, 2026

Summary

  • pin the vendored software-agent-sdk submodule to 30819566b56b592cedcd1dc0d503a454f8ca6668
  • restore the last SDK revision already verified by successful end-to-end image builds for this regression
  • keep the benchmark-side code path unchanged so the fix is isolated to the proven regression surface
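A pin like this can be checked mechanically. The following is a minimal sketch, not part of this PR, that parses one line of `git submodule status` output and confirms the submodule sits on the expected commit. The submodule path `vendor/software-agent-sdk` and the helper names are assumptions made for illustration; only the SHA comes from the summary above.

```python
# Sketch only: verify a submodule pin from `git submodule status` output.
# The path used in the example is hypothetical; the SHA is the one named in this PR.
EXPECTED_SHA = "30819566b56b592cedcd1dc0d503a454f8ca6668"

# Status flags git may print before the SHA:
#   " " in sync, "+" checked-out commit differs from the recorded one,
#   "-" not initialized, "U" merge conflict.
STATUS_FLAGS = {" ", "+", "-", "U"}


def parse_status_line(line):
    """Split one `git submodule status` line into (flag, sha, path)."""
    if line[:1] in STATUS_FLAGS:
        flag, rest = line[0], line[1:]
    else:
        # The leading space was stripped by the caller; treat as in sync.
        flag, rest = " ", line
    fields = rest.split()
    return flag, fields[0], fields[1]


def is_pinned(line, expected_sha=EXPECTED_SHA):
    """True when the submodule is in sync and recorded at expected_sha."""
    flag, sha, _path = parse_status_line(line)
    return flag == " " and sha == expected_sha


example = " " + EXPECTED_SHA + " vendor/software-agent-sdk (heads/main)"
print(is_pinned(example))
```

In practice the input line would come from running `git submodule status` in the benchmarks checkout; a `+` flag would indicate that the working tree has drifted from the recorded pin.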

Why this fix

Issue #504 tracks the 500-image build regression that followed the SDK bump bundled into #456. Recent repo history shows:

  • successful full SWE-Bench build: run 22967520137 on SDK 30819566b56b592cedcd1dc0d503a454f8ca6668
  • successful full SWT-Bench build: run 22967516108 on the same SDK SHA
  • later SDK SHAs regress again into much slower or cancelled runs

This branch pins benchmarks back to the last SDK SHA already proven to complete both workflows end to end.
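The selection rule described above (take the newest SDK SHA for which every required workflow completed successfully) can be sketched as follows. This is an illustrative assumption about how one might automate the choice, not the procedure actually used for this PR; the `Run` type, the workflow labels, and the placeholder SHAs in the example are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    sha: str          # SDK commit the workflow ran against
    workflow: str     # hypothetical label, e.g. "swe-bench" or "swt-bench"
    conclusion: str   # e.g. "success", "failure", "cancelled"


def last_known_good(runs, required):
    """Return the newest SHA on which every required workflow succeeded.

    `runs` must be ordered oldest to newest by SDK commit.
    Returns None when no SHA satisfies all required workflows.
    """
    required = set(required)
    passed = {}   # sha -> set of workflows that succeeded on it
    order = []    # SHAs in first-seen (oldest to newest) order
    for run in runs:
        if run.sha not in passed:
            passed[run.sha] = set()
            order.append(run.sha)
        if run.conclusion == "success":
            passed[run.sha].add(run.workflow)
    for sha in reversed(order):
        if required <= passed[sha]:
            return sha
    return None


# Placeholder history: both workflows pass on the older SHA,
# one is cancelled on the newer SHA.
history = [
    Run("older-sha", "swe-bench", "success"),
    Run("older-sha", "swt-bench", "success"),
    Run("newer-sha", "swe-bench", "cancelled"),
    Run("newer-sha", "swt-bench", "success"),
]
print(last_known_good(history, {"swe-bench", "swt-bench"}))
```

Requiring every workflow to succeed on the same SHA is deliberately strict: a SHA on which only one benchmark build completes is not treated as known-good.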

Validation

  • make build
  • uv run pytest tests/test_llm_config.py
  • dispatched full branch workflows:
    • SWE-Bench: 22985173176
    • SWT-Bench: 22985173172

Fixes #504

Pin the vendored software-agent-sdk submodule to 30819566b56b592cedcd1dc0d503a454f8ca6668, which is the last SDK commit verified by successful end-to-end 500-image SWE-Bench and SWT-Bench builds.

Co-authored-by: openhands <openhands@all-hands.dev>
@all-hands-bot (Collaborator) left a comment

Approach is correct (rollback on regression), but the PR description has unfilled template placeholders where evidence should be. See detailed feedback below.

🟡 Verdict: The submodule rollback itself is pragmatically sound, but the PR description needs to be completed before merge. The "known-good SHA" claim and the workflow run links are blank or missing; fill these in to substantiate the rollback justification.

@simonrosenberg (Collaborator, Author) commented

Pinning the SDK to an old version is not an acceptable fix, since it would prevent us from evaluating new SDK versions.


Development

Successfully merging this pull request may close these issues.

Track root cause isolation and fix plan for the 500-image build regression after #456
