fix(ci): keep REST release-based and enforce SDK deprecation runway#2802

Open
enyst wants to merge 3 commits into main from openhands/fix-rest-api-base-ref-check

@enyst
Collaborator

@enyst enyst commented Apr 10, 2026

Summary

  • revert the earlier PR-base REST OpenAPI comparison so the REST API breakage gate remains based on the last published release
  • align the SDK Python API breakage policy with the documented requirement that public removals need a deprecation schedule spanning 5 minor releases
  • teach the SDK breakage checker to read deprecation metadata and fail removals that are undeprecated, missing a scheduled removal version, scheduled too early, or removed before their declared removed_in
  • add focused tests for the metadata capture and deprecation-runway enforcement, and update repo memory/docs to match

Release-history notes for PR #2789

  • VerificationSettings.confirmation_mode and VerificationSettings.security_analyzer first appeared in release v1.16.0 via commit 5d158246 (sdk: centralize programmatic settings schema)
  • they are also present in v1.16.1
  • they do not exist in v1.15.0
  • so they have been part of the released public API for 2 releases total: one minor release (v1.16.0) and one patch release (v1.16.1)

Testing

  • uv run pre-commit run --files AGENTS.md openhands-sdk/openhands/sdk/AGENTS.md .github/scripts/check_sdk_api_breakage.py tests/ci_scripts/test_check_sdk_api_breakage.py
  • uv run pytest tests/ci_scripts/test_check_sdk_api_breakage.py -q

This PR was created by an AI assistant (OpenHands) on behalf of the user.


Agent Server images for this PR

GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

| Variant | Architectures | Base Image | Docs / Tags |
|---|---|---|---|
| java | amd64, arm64 | eclipse-temurin:17-jdk | Link |
| python | amd64, arm64 | nikolaik/python-nodejs:python3.13-nodejs22-slim | Link |
| golang | amd64, arm64 | golang:1.21-bookworm | Link |

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:c2d6701-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-c2d6701-python \
  ghcr.io/openhands/agent-server:c2d6701-python

All tags pushed for this build

ghcr.io/openhands/agent-server:c2d6701-golang-amd64
ghcr.io/openhands/agent-server:c2d6701-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:c2d6701-golang-arm64
ghcr.io/openhands/agent-server:c2d6701-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:c2d6701-java-amd64
ghcr.io/openhands/agent-server:c2d6701-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:c2d6701-java-arm64
ghcr.io/openhands/agent-server:c2d6701-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:c2d6701-python-amd64
ghcr.io/openhands/agent-server:c2d6701-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-amd64
ghcr.io/openhands/agent-server:c2d6701-python-arm64
ghcr.io/openhands/agent-server:c2d6701-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-arm64
ghcr.io/openhands/agent-server:c2d6701-golang
ghcr.io/openhands/agent-server:c2d6701-java
ghcr.io/openhands/agent-server:c2d6701-python

About Multi-Architecture Support

  • Each variant tag (e.g., c2d6701-python) is a multi-arch manifest supporting both amd64 and arm64
  • Docker automatically pulls the correct architecture for your platform
  • Individual architecture tags (e.g., c2d6701-python-amd64) are also available if needed

Add a PR-base OpenAPI comparison to the REST API breakage check so unreleased endpoint removals on main are caught before merge.

Co-authored-by: openhands <openhands@all-hands.dev>
@github-actions
Contributor

github-actions bot commented Apr 10, 2026

Python API breakage checks — ✅ PASSED

Result: PASSED

Action log

@github-actions
Contributor

github-actions bot commented Apr 10, 2026

REST API breakage checks (OpenAPI) — ✅ PASSED

Result: PASSED

Action log

Collaborator

@all-hands-bot all-hands-bot left a comment

Taste Rating: 🟡 Acceptable - Solid, pragmatic solution with one minor inefficiency.

KEY INSIGHT: This closes a real blind spot in the CI checks by comparing against both the PR base (unreleased changes) and published baseline (release compatibility). The refactoring into _check_breaking_changes() is clean and avoids duplication.

VERDICT: ✅ Worth merging - addresses a real problem demonstrated by #2789, with proper tests and graceful error handling.

    )
    return 0


def _normalized_openapi_copy(schema: dict) -> dict:
    return _normalize_openapi_for_oasdiff(json.loads(json.dumps(schema)))

🟡 Suggestion: Using json.loads(json.dumps(schema)) for deep copying is inefficient and could lose Python-specific types. Consider using copy.deepcopy(schema) instead:

Suggested change
-    return _normalize_openapi_for_oasdiff(json.loads(json.dumps(schema)))
+    return _normalize_openapi_for_oasdiff(copy.deepcopy(schema))

You'd need to add import copy at the top. Not blocking since schemas are likely small, but it's cleaner and more explicit.
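
To make the "could lose Python-specific types" point concrete, here is a small standalone sketch. The `schema` dictionary is a made-up example, not the real OpenAPI document; a schema that was itself loaded from JSON contains only JSON types, so for that input the two approaches should give the same result.

```python
import copy
import json
from datetime import date

# Hypothetical data, chosen only to show where a JSON round trip differs.
schema = {"codes": {200: "ok"}, "tags": ("a", "b")}

via_json = json.loads(json.dumps(schema))
via_deepcopy = copy.deepcopy(schema)

# The JSON round trip turns int keys into strings and tuples into lists.
json_changed_types = via_json != schema

# deepcopy keeps the types and still produces independent nested objects.
deepcopy_equal = via_deepcopy == schema
deepcopy_is_independent = via_deepcopy["codes"] is not schema["codes"]


def json_copy_fails(value: dict) -> bool:
    """Return True when the value cannot survive json.dumps at all."""
    try:
        json.loads(json.dumps(value))
    except TypeError:
        return True
    return False


# Values such as dates are not JSON serializable by default.
date_breaks_json = json_copy_fails({"released": date(2026, 4, 10)})
```

For plain JSON-shaped input the choice is mostly about clarity of intent, which matches the "not blocking" framing of the suggestion.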

Collaborator

@all-hands-bot all-hands-bot left a comment

QA Report: REST API Base Ref Comparison

✅ PASS

All functionality works as described. The PR successfully adds PR-base OpenAPI comparison while maintaining backward compatibility with the existing baseline release checks.


Environment Setup

Status: ✅ Success

$ uv sync --dev
Resolved 401 packages in 1ms
Installed 232 packages in 412ms

All dependencies installed successfully.


CI Status

Status: ✅ All passing

$ gh pr view 2802 --json statusCheckRollup
✓ REST API (OpenAPI): SUCCESS
✓ 18 other checks: SUCCESS  
- 2 checks: SKIPPED (expected)
- 7 checks: IN PROGRESS
- 0 checks: FAILED

The critical "REST API breakage checks" workflow that this PR modifies has passed successfully.


Test Results

Status: ✅ All passing (23/23)

$ uv run pytest tests/ci_scripts/test_check_agent_server_rest_api_breakage.py -q
.......................                                                    [100%]
23 passed in 0.06s

Including the 2 new tests added in this PR:

  • test_main_rejects_breakage_against_pr_base_ref
  • test_main_warns_when_pr_base_ref_cannot_be_resolved

Pre-commit checks:

$ uv run pre-commit run --files AGENTS.md .github/scripts/check_agent_server_rest_api_breakage.py .github/workflows/agent-server-rest-api-breakage.yml tests/ci_scripts/test_check_agent_server_rest_api_breakage.py
Format YAML files........................................................Passed
Ruff format..............................................................Passed
Ruff lint................................................................Passed
PEP8 style check (pycodestyle)...........................................Passed
Type check with pyright..................................................Passed
Check import dependency rules............................................Passed
Check Tool subclass registration.........................................Passed

Functional Verification

Status: ✅ Verified

Test 1: Dual comparison when base ref is set

$ REST_API_BREAKAGE_BASE_REF="main" uv run --with packaging python .github/scripts/check_agent_server_rest_api_breakage.py
No breaking changes detected against PR base ref origin/main.
No breaking changes detected against baseline release v1.16.1.

Result: ✅ Both comparisons execute correctly

Test 2: Single comparison when base ref is not set

$ REST_API_BREAKAGE_BASE_REF="" GITHUB_BASE_REF="" uv run --with packaging python .github/scripts/check_agent_server_rest_api_breakage.py
No breaking changes detected against baseline release v1.16.1.

Result: ✅ Gracefully falls back to baseline-only comparison
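
The two runs above can be summarized as a small decision function. This is a sketch only: `pr_base_ref` and `planned_comparisons` are hypothetical names, and the precedence of `REST_API_BREAKAGE_BASE_REF` over `GITHUB_BASE_REF` is an assumption, not something the report states. Resolving the ref to `origin/main` is a separate step and is not modeled here.

```python
from typing import List, Mapping, Optional


def pr_base_ref(env: Mapping[str, str]) -> Optional[str]:
    """Pick the first non-empty base ref variable (assumed precedence)."""
    for name in ("REST_API_BREAKAGE_BASE_REF", "GITHUB_BASE_REF"):
        value = env.get(name, "").strip()
        if value:
            return value
    return None


def planned_comparisons(env: Mapping[str, str], baseline_release: str) -> List[str]:
    """List the comparisons to run: optional PR base first, then the release."""
    targets = []
    base = pr_base_ref(env)
    if base is not None:
        targets.append(f"PR base ref {base}")
    # The release baseline is always compared, with or without a base ref.
    targets.append(f"baseline release {baseline_release}")
    return targets
```

In a real script the mapping passed in would typically be `os.environ`.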

Test 3: Git ref resolution

The _resolve_git_ref() function correctly tries:

  1. origin/{ref} first (for remote refs)
  2. Falls back to {ref} (for local refs)

This ensures compatibility with both GitHub Actions context and local testing.
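
A minimal sketch of that lookup order follows. It is not the script's actual `_resolve_git_ref()`: the names `git_ref_exists` and `resolve_git_ref` and the injectable `exists` parameter are assumptions made so the ordering can be exercised without a repository.

```python
import subprocess
from typing import Callable, Optional


def git_ref_exists(ref: str) -> bool:
    """Return True if git can resolve the ref to a commit."""
    result = subprocess.run(
        ["git", "rev-parse", "--verify", "--quiet", f"{ref}^{{commit}}"],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.returncode == 0


def resolve_git_ref(
    ref: str, exists: Callable[[str], bool] = git_ref_exists
) -> Optional[str]:
    """Try origin/{ref} first, then {ref}; return None if neither resolves."""
    for candidate in (f"origin/{ref}", ref):
        if exists(candidate):
            return candidate
    return None
```

Returning `None` instead of raising fits the warn-rather-than-fail behavior described elsewhere in this report: the caller can print a warning and skip the PR-base comparison.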


Code Quality Review

Status: ✅ Good

  • Clean refactoring: Extracted _check_breaking_changes() to eliminate code duplication between the two comparison paths
  • Robust error handling: Uses warnings instead of failures when comparisons can't be performed, avoiding flaky CI
  • Clear comparison labels: Each comparison result clearly indicates which baseline it checked against
  • Documentation: AGENTS.md accurately documents the new dual-comparison behavior

Verdict

✅ PASS - Change works as described, no regressions detected

The PR successfully:

  1. ✅ Adds PR-base OpenAPI comparison to catch unreleased breaking changes
  2. ✅ Maintains existing baseline release comparison for published compatibility
  3. ✅ Handles edge cases gracefully (missing refs, schema generation failures)
  4. ✅ Includes comprehensive test coverage
  5. ✅ Documents the behavior in AGENTS.md

No issues found. Ready to merge.

Keep the REST API breakage gate release-based and align the Python API breakage policy with the SDK guidance that public removals need a 5-minor-release deprecation runway.

Co-authored-by: openhands <openhands@all-hands.dev>
@enyst enyst changed the title from "fix(ci): compare REST API changes to PR base ref" to "fix(ci): keep REST release-based and enforce SDK deprecation runway" on Apr 11, 2026
@github-actions
Contributor

Coverage

Coverage Report

| File | Stmts | Miss | Cover | Missing |
|---|---|---|---|---|
| TOTAL | 22316 | 6468 | 71% | |
report-only-changed-files is enabled. No files were changed during this commit :)

@enyst enyst requested a review from all-hands-bot April 11, 2026 01:59
Collaborator

@all-hands-bot all-hands-bot left a comment

Taste Rating: 🟢 Good taste - Clean, focused implementation that enforces a real policy requirement.

KEY INSIGHT: This closes a real enforcement gap by making the documented 5-minor-release deprecation runway programmatically checked rather than manually verified.

VERDICT: ✅ Worth merging - solves the problem demonstrated in #2789 with clean data structures, focused tests, and clear error messages.

Collaborator

@all-hands-bot all-hands-bot left a comment

QA Report: SDK Deprecation Runway Enforcement

✅ VERDICT: PASS

All functionality works as described. The PR successfully implements 5-minor-release deprecation runway enforcement with comprehensive validation and clear error messages.


Environment Setup

Status: Success

$ uv sync --dev
# Dependencies installed successfully

$ uv run pre-commit run --files .github/scripts/check_sdk_api_breakage.py tests/ci_scripts/test_check_sdk_api_breakage.py AGENTS.md openhands-sdk/openhands/sdk/AGENTS.md
Ruff format..............................................................Passed
Ruff lint................................................................Passed
PEP8 style check (pycodestyle)...........................................Passed
Type check with pyright..................................................Passed

CI & Test Status

All critical checks passing:

  • pre-commit: ✅ SUCCESS
  • Python API (SDK breakage check): ✅ SUCCESS
  • sdk-tests: ✅ SUCCESS
  • tools-tests: ✅ SUCCESS
  • agent-server-tests: ✅ SUCCESS
  • coverage-report: ✅ SUCCESS

⚠️ Expected failures:

  • unresolved-review-threads (not code-quality related)

$ uv run pytest tests/ci_scripts/test_check_sdk_api_breakage.py -v
========================================== 45 passed in 0.30s ==========================================

All 45 tests pass, including the 3 new tests for metadata extraction and runway enforcement.


Functional Verification

Test 1: Metadata Extraction from @deprecated

Correctly extracts deprecation metadata from decorators

# Test code with @deprecated decorator
@deprecated(deprecated_in='1.2.0', removed_in='1.7.0')
class OldClass:
    pass

class ActiveClass:
    @deprecated(deprecated_in='1.3.0', removed_in='1.8.0')
    def old_method(self):
        pass

Result:

OldClass:
  deprecated_in: 1.2.0
  removed_in: 1.7.0
ActiveClass.old_method:
  deprecated_in: 1.3.0
  removed_in: 1.8.0
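
One way such metadata could be read is by walking the syntax tree with the standard `ast` module. The report does not say how the checker does it, so treat this as an illustrative sketch: `extract_decorator_metadata` is a hypothetical helper, and it only handles a bare `@deprecated(...)` name with string keyword arguments.

```python
import ast
from typing import Dict

SAMPLE = '''
@deprecated(deprecated_in='1.2.0', removed_in='1.7.0')
class OldClass:
    pass

class ActiveClass:
    @deprecated(deprecated_in='1.3.0', removed_in='1.8.0')
    def old_method(self):
        pass
'''


def _keyword_strings(call: ast.Call) -> Dict[str, str]:
    """Collect the string-valued deprecated_in / removed_in keywords."""
    return {
        kw.arg: kw.value.value
        for kw in call.keywords
        if kw.arg in ("deprecated_in", "removed_in")
        and isinstance(kw.value, ast.Constant)
        and isinstance(kw.value.value, str)
    }


def extract_decorator_metadata(source: str) -> Dict[str, Dict[str, str]]:
    """Map qualified names to metadata found on @deprecated(...) decorators."""
    found: Dict[str, Dict[str, str]] = {}

    def visit(node: ast.AST, prefix: str) -> None:
        for child in ast.iter_child_nodes(node):
            if not isinstance(
                child, (ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)
            ):
                continue
            name = f"{prefix}.{child.name}" if prefix else child.name
            for dec in child.decorator_list:
                if (
                    isinstance(dec, ast.Call)
                    and isinstance(dec.func, ast.Name)
                    and dec.func.id == "deprecated"
                ):
                    found[name] = _keyword_strings(dec)
            if isinstance(child, ast.ClassDef):
                visit(child, name)  # descend so methods get Class.method names

    visit(ast.parse(source), "")
    return found


metadata = extract_decorator_metadata(SAMPLE)
```

Because the source is only parsed and never executed, `deprecated` does not need to be importable for the extraction to work.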

Test 2: 5-Minor-Release Runway Calculation

Correctly calculates minimum removal version

DEPRECATION_RUNWAY_MINOR_RELEASES = 5

# Test: deprecated_in='1.0.0' → minimum removed_in='1.5.0'
minimum = _minimum_removed_in('1.0.0')
# Result: '1.5.0' ✅
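
The arithmetic behind that result is simply "add 5 to the minor component and reset the patch". The sketch below assumes plain `MAJOR.MINOR.PATCH` strings and uses a hypothetical `minimum_removed_in` helper; the real `_minimum_removed_in` may parse versions differently (for example with the `packaging` library).

```python
DEPRECATION_RUNWAY_MINOR_RELEASES = 5


def minimum_removed_in(deprecated_in: str) -> str:
    """Earliest allowed removal version for a given deprecation version."""
    major, minor, _patch = (int(part) for part in deprecated_in.split("."))
    # The runway is counted in minor releases, so the patch component resets.
    return f"{major}.{minor + DEPRECATION_RUNWAY_MINOR_RELEASES}.0"
```

Under this rule a symbol deprecated in 1.15.0 could be removed in 1.20.0 at the earliest, which is consistent with the `LLM.safety_settings` schedule quoted later in this report.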

Test 3: Deprecation Schedule Validation

Enforces all policy requirements with clear error messages

| Scenario | deprecated_in | removed_in | current | Result | Error Message |
|---|---|---|---|---|---|
| Valid schedule | 1.0.0 | 1.5.0 | 1.5.0 | ✅ PASS | None |
| Too short runway | 1.0.0 | 1.3.0 | 1.5.0 | ❌ FAIL | "invalid deprecation schedule... require at least 5 minor releases (minimum removed_in: 1.5.0)" |
| Removed too early | 1.0.0 | 1.5.0 | 1.4.0 | ❌ FAIL | "Removed before scheduled removal version 1.5.0. Current version is 1.4.0" |
| Missing metadata | - | - | 1.5.0 | ❌ FAIL | "Removed without prior deprecation. Mark with @deprecated... keep deprecated for 5 minor releases" |
| Missing removed_in | 1.0.0 | None | 1.5.0 | ❌ FAIL | "does not declare removed_in. Public API removals require 5 minor releases of runway" |
Test 4: warn_deprecated Support

Correctly extracts metadata from warn_deprecated() calls

class MyClass:
    def old_feature(self):
        warn_deprecated(
            'MyClass.old_feature',
            deprecated_in='1.1.0',
            removed_in='1.6.0'
        )

Extracted metadata:

MyClass.old_feature:
  deprecated_in: 1.1.0
  removed_in: 1.6.0
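
Call-style deprecations can be found in a similar spirit by scanning for `warn_deprecated(...)` call nodes. Again this is only a sketch: `extract_warn_deprecated` is a hypothetical helper, and it assumes the first positional argument is a string literal naming the deprecated symbol, as in the example above.

```python
import ast
from typing import Dict

SAMPLE = '''
class MyClass:
    def old_feature(self):
        warn_deprecated(
            'MyClass.old_feature',
            deprecated_in='1.1.0',
            removed_in='1.6.0'
        )
'''


def extract_warn_deprecated(source: str) -> Dict[str, Dict[str, str]]:
    """Map the name passed to warn_deprecated() to its version keywords."""
    found: Dict[str, Dict[str, str]] = {}
    for node in ast.walk(ast.parse(source)):
        is_target_call = (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == "warn_deprecated"
        )
        if not is_target_call or not node.args:
            continue
        first = node.args[0]
        if not (isinstance(first, ast.Constant) and isinstance(first.value, str)):
            continue  # skip calls whose name is computed at runtime
        found[first.value] = {
            kw.arg: kw.value.value
            for kw in node.keywords
            if kw.arg in ("deprecated_in", "removed_in")
            and isinstance(kw.value, ast.Constant)
        }
    return found


metadata = extract_warn_deprecated(SAMPLE)
```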

Test 5: End-to-End Integration

Full breakage detection with runway enforcement

Scenario A: Remove method at scheduled version (v1.5.0)

Old: @deprecated(deprecated_in='1.0.0', removed_in='1.5.0')
New: method removed
Current version: 1.5.0

Result:
  Total breaks: 1
  Removal policy errors: 0  ✅

Scenario B: Remove method before scheduled version (v1.4.0)

Old: @deprecated(deprecated_in='1.0.0', removed_in='1.5.0')
New: method removed  
Current version: 1.4.0

Result:
  Total breaks: 1
  Removal policy errors: 1  ❌
  Error: "Removed 'MyClass.old_method' before its scheduled removal version 1.5.0"

Test 6: Real SDK Code

Works with actual SDK deprecations

$ # Script successfully found and processed real deprecations:
LLM.safety_settings:
  deprecated_in: 1.15.0
  removed_in: 1.20.0

Test 7: Documentation Accuracy

All documentation updated correctly

  • AGENTS.md: Updated to mention "5 minor releases" requirement
  • openhands-sdk/openhands/sdk/AGENTS.md: Policy description updated
  • ✅ Script docstring: Accurately describes new 5-release runway policy

Issues Found

None - No bugs, regressions, or inconsistencies detected.


Summary

This PR successfully enhances the SDK API breakage checker to enforce a rigorous 5-minor-release deprecation runway policy. The implementation:

  1. ✅ Adds DeprecationMetadata dataclass to capture deprecated_in and removed_in
  2. ✅ Extracts metadata from both @deprecated() decorators and warn_deprecated() calls
  3. ✅ Validates deprecation schedules against 4 criteria:
    • Metadata exists
    • Both deprecated_in and removed_in are declared
    • Runway is at least 5 minor releases
    • Current version has reached removed_in
  4. ✅ Provides clear, actionable error messages for violations
  5. ✅ Updates all relevant documentation
  6. ✅ Includes comprehensive test coverage (3 new test cases)
  7. ✅ Maintains backward compatibility with existing deprecations

The change strengthens API stability guarantees and aligns with documented policy. No functional issues or regressions detected.

3 participants