fix(ci): keep REST release-based and enforce SDK deprecation runway#2802
fix(ci): keep REST release-based and enforce SDK deprecation runway#2802
Conversation
Add a PR-base OpenAPI comparison to the REST API breakage check so unreleased endpoint removals on main are caught before merge. Co-authored-by: openhands <openhands@all-hands.dev>
Python API breakage checks — ✅ PASSEDResult: ✅ PASSED |
REST API breakage checks (OpenAPI) — ✅ PASSEDResult: ✅ PASSED |
all-hands-bot
left a comment
There was a problem hiding this comment.
Taste Rating: 🟡 Acceptable - Solid, pragmatic solution with one minor inefficiency.
KEY INSIGHT: This closes a real blind spot in the CI checks by comparing against both the PR base (unreleased changes) and published baseline (release compatibility). The refactoring into _check_breaking_changes() is clean and avoids duplication.
VERDICT: ✅ Worth merging - addresses a real problem demonstrated by #2789, with proper tests and graceful error handling.
| ) | ||
| return 0 | ||
| def _normalized_openapi_copy(schema: dict) -> dict: | ||
| return _normalize_openapi_for_oasdiff(json.loads(json.dumps(schema))) |
There was a problem hiding this comment.
🟡 Suggestion: Using json.loads(json.dumps(schema)) for deep copying is inefficient and could lose Python-specific types. Consider using copy.deepcopy(schema) instead:
| return _normalize_openapi_for_oasdiff(json.loads(json.dumps(schema))) | |
| return _normalize_openapi_for_oasdiff(copy.deepcopy(schema)) |
You'd need to add import copy at the top. Not blocking since schemas are likely small, but it's cleaner and more explicit.
all-hands-bot
left a comment
There was a problem hiding this comment.
QA Report: REST API Base Ref Comparison
✅ PASS
All functionality works as described. The PR successfully adds PR-base OpenAPI comparison while maintaining backward compatibility with the existing baseline release checks.
Environment Setup
Status: ✅ Success
$ uv sync --dev
Resolved 401 packages in 1ms
Installed 232 packages in 412msAll dependencies installed successfully.
CI Status
Status: ✅ All passing
$ gh pr view 2802 --json statusCheckRollup
✓ REST API (OpenAPI): SUCCESS
✓ 18 other checks: SUCCESS
- 2 checks: SKIPPED (expected)
- 7 checks: IN PROGRESS
- 0 checks: FAILEDThe critical "REST API breakage checks" workflow that this PR modifies has passed successfully.
Test Results
Status: ✅ All passing (23/23)
$ uv run pytest tests/ci_scripts/test_check_agent_server_rest_api_breakage.py -q
....................... [100%]
23 passed in 0.06sIncluding the 2 new tests added in this PR:
- ✅
test_main_rejects_breakage_against_pr_base_ref - ✅
test_main_warns_when_pr_base_ref_cannot_be_resolved
Pre-commit checks:
$ uv run pre-commit run --files AGENTS.md .github/scripts/check_agent_server_rest_api_breakage.py .github/workflows/agent-server-rest-api-breakage.yml tests/ci_scripts/test_check_agent_server_rest_api_breakage.py
Format YAML files........................................................Passed
Ruff format..............................................................Passed
Ruff lint................................................................Passed
PEP8 style check (pycodestyle)...........................................Passed
Type check with pyright..................................................Passed
Check import dependency rules............................................Passed
Check Tool subclass registration.........................................PassedFunctional Verification
Status: ✅ Verified
Test 1: Dual comparison when base ref is set
$ REST_API_BREAKAGE_BASE_REF="main" uv run --with packaging python .github/scripts/check_agent_server_rest_api_breakage.py
No breaking changes detected against PR base ref origin/main.
No breaking changes detected against baseline release v1.16.1.Result: ✅ Both comparisons execute correctly
Test 2: Single comparison when base ref is not set
$ REST_API_BREAKAGE_BASE_REF="" GITHUB_BASE_REF="" uv run --with packaging python .github/scripts/check_agent_server_rest_api_breakage.py
No breaking changes detected against baseline release v1.16.1.Result: ✅ Gracefully falls back to baseline-only comparison
Test 3: Git ref resolution
The _resolve_git_ref() function correctly tries:
origin/{ref}first (for remote refs)- Falls back to
{ref}(for local refs)
This ensures compatibility with both GitHub Actions context and local testing.
Code Quality Review
Status: ✅ Good
- Clean refactoring: Extracted
_check_breaking_changes()to eliminate code duplication between the two comparison paths - Robust error handling: Uses warnings instead of failures when comparisons can't be performed, avoiding flaky CI
- Clear comparison labels: Each comparison result clearly indicates which baseline it checked against
- Documentation: AGENTS.md accurately documents the new dual-comparison behavior
Verdict
✅ PASS - Change works as described, no regressions detected
The PR successfully:
- ✅ Adds PR-base OpenAPI comparison to catch unreleased breaking changes
- ✅ Maintains existing baseline release comparison for published compatibility
- ✅ Handles edge cases gracefully (missing refs, schema generation failures)
- ✅ Includes comprehensive test coverage
- ✅ Documents the behavior in AGENTS.md
No issues found. Ready to merge.
This reverts commit 86dafea.
Keep the REST API breakage gate release-based and align the Python API breakage policy with the SDK guidance that public removals need a 5-minor-release deprecation runway. Co-authored-by: openhands <openhands@all-hands.dev>
all-hands-bot
left a comment
There was a problem hiding this comment.
Taste Rating: 🟢 Good taste - Clean, focused implementation that enforces a real policy requirement.
KEY INSIGHT: This closes a real enforcement gap by making the documented 5-minor-release deprecation runway programmatically checked rather than manually verified.
VERDICT: ✅ Worth merging - solves the problem demonstrated in #2789 with clean data structures, focused tests, and clear error messages.
all-hands-bot
left a comment
There was a problem hiding this comment.
QA Report: SDK Deprecation Runway Enforcement
✅ VERDICT: PASS
All functionality works as described. The PR successfully implements 5-minor-release deprecation runway enforcement with comprehensive validation and clear error messages.
Environment Setup
✅ Status: Success
$ uv sync --dev
# Dependencies installed successfully
$ uv run pre-commit run --files .github/scripts/check_sdk_api_breakage.py tests/ci_scripts/test_check_sdk_api_breakage.py AGENTS.md openhands-sdk/openhands/sdk/AGENTS.md
Ruff format..............................................................Passed
Ruff lint................................................................Passed
PEP8 style check (pycodestyle)...........................................Passed
Type check with pyright..................................................PassedCI & Test Status
✅ All critical checks passing:
- pre-commit: ✅ SUCCESS
- Python API (SDK breakage check): ✅ SUCCESS
- sdk-tests: ✅ SUCCESS
- tools-tests: ✅ SUCCESS
- agent-server-tests: ✅ SUCCESS
- coverage-report: ✅ SUCCESS
- unresolved-review-threads (not code-quality related)
$ uv run pytest tests/ci_scripts/test_check_sdk_api_breakage.py -v
========================================== 45 passed in 0.30s ==========================================All 45 tests pass, including the 3 new tests for metadata extraction and runway enforcement.
Functional Verification
Test 1: Metadata Extraction from @deprecated
✅ Correctly extracts deprecation metadata from decorators
# Test code with @deprecated decorator
@deprecated(deprecated_in='1.2.0', removed_in='1.7.0')
class OldClass:
pass
class ActiveClass:
@deprecated(deprecated_in='1.3.0', removed_in='1.8.0')
def old_method(self):
passResult:
OldClass:
deprecated_in: 1.2.0
removed_in: 1.7.0
ActiveClass.old_method:
deprecated_in: 1.3.0
removed_in: 1.8.0
Test 2: 5-Minor-Release Runway Calculation
✅ Correctly calculates minimum removal version
DEPRECATION_RUNWAY_MINOR_RELEASES = 5
# Test: deprecated_in='1.0.0' → minimum removed_in='1.5.0'
minimum = _minimum_removed_in('1.0.0')
# Result: '1.5.0' ✅Test 3: Deprecation Schedule Validation
✅ Enforces all policy requirements with clear error messages
| Scenario | deprecated_in | removed_in | current | Result | Error Message |
|---|---|---|---|---|---|
| Valid schedule | 1.0.0 | 1.5.0 | 1.5.0 | ✅ PASS | None |
| Too short runway | 1.0.0 | 1.3.0 | 1.5.0 | ❌ FAIL | "invalid deprecation schedule... require at least 5 minor releases (minimum removed_in: 1.5.0)" |
| Removed too early | 1.0.0 | 1.5.0 | 1.4.0 | ❌ FAIL | "Removed before scheduled removal version 1.5.0. Current version is 1.4.0" |
| Missing metadata | - | - | 1.5.0 | ❌ FAIL | "Removed without prior deprecation. Mark with @deprecated... keep deprecated for 5 minor releases" |
| Missing removed_in | 1.0.0 | None | 1.5.0 | ❌ FAIL | "does not declare removed_in. Public API removals require 5 minor releases of runway" |
Test 4: warn_deprecated Support
✅ Correctly extracts metadata from warn_deprecated() calls
class MyClass:
def old_feature(self):
warn_deprecated(
'MyClass.old_feature',
deprecated_in='1.1.0',
removed_in='1.6.0'
)Extracted metadata:
MyClass.old_feature:
deprecated_in: 1.1.0
removed_in: 1.6.0
Test 5: End-to-End Integration
✅ Full breakage detection with runway enforcement
Scenario A: Remove method at scheduled version (v1.5.0)
Old: @deprecated(deprecated_in='1.0.0', removed_in='1.5.0')
New: method removed
Current version: 1.5.0
Result:
Total breaks: 1
Removal policy errors: 0 ✅
Scenario B: Remove method before scheduled version (v1.4.0)
Old: @deprecated(deprecated_in='1.0.0', removed_in='1.5.0')
New: method removed
Current version: 1.4.0
Result:
Total breaks: 1
Removal policy errors: 1 ❌
Error: "Removed 'MyClass.old_method' before its scheduled removal version 1.5.0"
Test 6: Real SDK Code
✅ Works with actual SDK deprecations
$ # Script successfully found and processed real deprecations:
LLM.safety_settings:
deprecated_in: 1.15.0
removed_in: 1.20.0Test 7: Documentation Accuracy
✅ All documentation updated correctly
- ✅
AGENTS.md: Updated to mention "5 minor releases" requirement - ✅
openhands-sdk/openhands/sdk/AGENTS.md: Policy description updated - ✅ Script docstring: Accurately describes new 5-release runway policy
Issues Found
None - No bugs, regressions, or inconsistencies detected.
Summary
This PR successfully enhances the SDK API breakage checker to enforce a rigorous 5-minor-release deprecation runway policy. The implementation:
- ✅ Adds
DeprecationMetadatadataclass to capturedeprecated_inandremoved_in - ✅ Extracts metadata from both
@deprecated()decorators andwarn_deprecated()calls - ✅ Validates deprecation schedules against 4 criteria:
- Metadata exists
- Both
deprecated_inandremoved_inare declared - Runway is at least 5 minor releases
- Current version has reached
removed_in
- ✅ Provides clear, actionable error messages for violations
- ✅ Updates all relevant documentation
- ✅ Includes comprehensive test coverage (3 new test cases)
- ✅ Maintains backward compatibility with existing deprecations
The change strengthens API stability guarantees and aligns with documented policy. No functional issues or regressions detected.
Summary
removed_inRelease-history notes for PR #2789
VerificationSettings.confirmation_modeandVerificationSettings.security_analyzerfirst appeared in release v1.16.0 via commit5d158246(sdk: centralize programmatic settings schema)v1.16.0) and one patch release (v1.16.1)Testing
uv run pre-commit run --files AGENTS.md openhands-sdk/openhands/sdk/AGENTS.md .github/scripts/check_sdk_api_breakage.py tests/ci_scripts/test_check_sdk_api_breakage.pyuv run pytest tests/ci_scripts/test_check_sdk_api_breakage.py -qThis PR was created by an AI assistant (OpenHands) on behalf of the user.
Agent Server images for this PR
• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server
Variants & Base Images
eclipse-temurin:17-jdknikolaik/python-nodejs:python3.13-nodejs22-slimgolang:1.21-bookwormPull (multi-arch manifest)
# Each variant is a multi-arch manifest supporting both amd64 and arm64 docker pull ghcr.io/openhands/agent-server:c2d6701-pythonRun
All tags pushed for this build
About Multi-Architecture Support
c2d6701-python) is a multi-arch manifest supporting both amd64 and arm64c2d6701-python-amd64) are also available if needed