Add support for EXTENSIONS_REF environment variable #2607

Merged
juanmichelini merged 10 commits into main from openhands/allow-extensions-branch-selection
Apr 9, 2026

Conversation

@juanmichelini
Collaborator

@juanmichelini juanmichelini commented Mar 30, 2026

Summary

This PR enables the software-agent-sdk to load extensions (skills and plugins) from a specified branch of the OpenHands/extensions repository via the EXTENSIONS_REF environment variable. This allows testing of feature branches during evaluations.

Part of OpenHands/evaluation#372

Changes Made

Core Changes

  • skill.py: Updated PUBLIC_SKILLS_BRANCH constant to read from EXTENSIONS_REF environment variable
  • Falls back to main if EXTENSIONS_REF is not set
  • Added os import to support environment variable access

Testing

  • Added comprehensive test suite in test_extensions_ref.py:
    • Test default behavior (falls back to 'main')
    • Test custom branch via EXTENSIONS_REF
    • Test integration with load_public_skills()
    • Test empty string handling

Implementation Details

Before

PUBLIC_SKILLS_BRANCH = "main"

After

# Allow overriding the branch via EXTENSIONS_REF environment variable
# (used by evaluation/benchmarks workflows to test feature branches)
PUBLIC_SKILLS_BRANCH = os.environ.get("EXTENSIONS_REF", "main")

Integration

This PR is part of a multi-repository feature:

  1. ✅ evaluation (PR #373): Accepts extensions_branch input
  2. ✅ benchmarks (PR #591): Accepts extensions-ref and passes EXTENSIONS_REF
  3. 🔄 software-agent-sdk (this PR): Uses EXTENSIONS_REF when loading extensions

Backward Compatibility

✅ Fully backward compatible:

  • No changes to public API
  • Defaults to main when environment variable is not set
  • Existing code continues to work without modification

Usage

Environment Variable

Set EXTENSIONS_REF before running the SDK:

export EXTENSIONS_REF=feature-branch
python your_script.py

Docker/Kubernetes

Pass as environment variable in container:

env:
  - name: EXTENSIONS_REF
    value: feature-branch

Programmatic (if needed)

import os
os.environ["EXTENSIONS_REF"] = "feature-branch"

# Then import and use SDK
from openhands.sdk.context.skills import load_public_skills
skills = load_public_skills()  # Will use feature-branch
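
One detail worth keeping in mind with the programmatic approach: given the assignment shown under Implementation Details, the lookup appears to run once, when the module-level constant is evaluated at import time, so changing the variable after the import would presumably not be picked up. A minimal sketch of that behavior, using a local `SNAPSHOT` variable as a hypothetical stand-in for `PUBLIC_SKILLS_BRANCH` (it does not import the SDK):

```python
import os

# Set the variable first, as in the snippet above.
os.environ["EXTENSIONS_REF"] = "feature-branch"

# Hypothetical stand-in for the module-level constant in skill.py:
# the lookup runs once, at the moment this line executes.
SNAPSHOT = os.environ.get("EXTENSIONS_REF", "main")

# Changing the environment afterwards does not update the captured value.
os.environ["EXTENSIONS_REF"] = "another-branch"

print(SNAPSHOT)  # feature-branch
```

This is why the variable is set before the `from openhands.sdk...` import in the example above.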

Testing

Unit Tests

Run the new test suite:

pytest tests/sdk/context/skill/test_extensions_ref.py -v

Integration Testing

After all PRs are merged:

  1. Create a test branch in OpenHands/extensions
  2. Set EXTENSIONS_REF=test-branch
  3. Build evaluation images
  4. Verify extensions are loaded from the test branch

Security Considerations

  • The environment variable only controls which branch of the official OpenHands/extensions repository is used
  • No risk of loading from arbitrary repositories (repo URL is hardcoded)
  • Branch names are passed directly to git and validated by GitHub

Performance Impact

  • No performance impact when using default (main) branch
  • Minimal impact when using feature branches (same git clone/update mechanism)

Next Steps

  • Merge this PR
  • Test end-to-end with all three repositories
  • Update documentation if needed

Checklist

  • Core functionality implemented
  • Unit tests added
  • Backward compatible
  • Follows existing patterns
  • Documentation in code comments

Agent Server images for this PR

GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

Variant | Architectures | Base Image | Docs / Tags
java | amd64, arm64 | eclipse-temurin:17-jdk | Link
python | amd64, arm64 | nikolaik/python-nodejs:python3.13-nodejs22-slim | Link
golang | amd64, arm64 | golang:1.21-bookworm | Link

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:f3f335b-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-f3f335b-python \
  ghcr.io/openhands/agent-server:f3f335b-python

All tags pushed for this build

ghcr.io/openhands/agent-server:f3f335b-golang-amd64
ghcr.io/openhands/agent-server:f3f335b-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:f3f335b-golang-arm64
ghcr.io/openhands/agent-server:f3f335b-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:f3f335b-java-amd64
ghcr.io/openhands/agent-server:f3f335b-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:f3f335b-java-arm64
ghcr.io/openhands/agent-server:f3f335b-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:f3f335b-python-amd64
ghcr.io/openhands/agent-server:f3f335b-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-amd64
ghcr.io/openhands/agent-server:f3f335b-python-arm64
ghcr.io/openhands/agent-server:f3f335b-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-arm64
ghcr.io/openhands/agent-server:f3f335b-golang
ghcr.io/openhands/agent-server:f3f335b-java
ghcr.io/openhands/agent-server:f3f335b-python

About Multi-Architecture Support

  • Each variant tag (e.g., f3f335b-python) is a multi-arch manifest supporting both amd64 and arm64
  • Docker automatically pulls the correct architecture for your platform
  • Individual architecture tags (e.g., f3f335b-python-amd64) are also available if needed

This change allows the software-agent-sdk to load extensions from a specified
branch of the OpenHands/extensions repository via the EXTENSIONS_REF environment
variable, enabling testing of feature branches during evaluations.

Changes:
- Add os import to skill.py
- Update PUBLIC_SKILLS_BRANCH to read from EXTENSIONS_REF environment variable
- Falls back to 'main' if EXTENSIONS_REF is not set
- Add comprehensive tests for the new functionality

This enables the evaluation and benchmarks workflows to specify which extensions
branch to use when building images and running evaluations.

Part of OpenHands/evaluation#372

Co-authored-by: openhands <openhands@all-hands.dev>
@github-actions
Contributor

github-actions bot commented Mar 30, 2026

Python API breakage checks — ✅ PASSED

Result: PASSED

Action log

@github-actions
Contributor

github-actions bot commented Mar 30, 2026

REST API breakage checks (OpenAPI) — ✅ PASSED

Result: PASSED

Action log

Collaborator

@enyst enyst left a comment

I wonder, can we use marketplace.json or a plugin reference for this?

I think Graham's implementation of the marketplace is able to load Agentskills from some git source 🤔 and same should be true about the plugins

This change allows the SDK's eval workflow to specify which extensions
branch to use when triggering evaluations, enabling end-to-end testing
of the extensions branch selection feature.

Changes:
- Add extensions_branch workflow input (defaults to 'main')
- Pass extensions_branch in workflow dispatch payload
- Add to parameter logging for debugging

This allows testing of feature branches through CI:
1. SDK run-eval workflow accepts extensions_branch input
2. Passes it to evaluation workflow
3. Evaluation passes to benchmarks via EXTENSIONS_REF
4. SDK loads extensions from specified branch

Part of OpenHands/evaluation#372

Co-authored-by: openhands <openhands@all-hands.dev>
@juanmichelini
Collaborator Author

@enyst marketplace.json allows bringing in individual skills, not an entire repo. We would need to download the skills repo and create something like this every time:

{
  "skills": [
    {"name": "github", "source": {"repo": "...", "ref": "feature-branch", "path": "skills/github"}},
    {"name": "linear", "source": {"repo": "...", "ref": "feature-branch", "path": "skills/linear"}},
    // ... and so on
  ]
}

We would also need to add support for feature branches.

The current change, by contrast, is a one-liner in skill.py plus adding the parameter to the eval-job.
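
To make the comparison concrete, here is a rough sketch of what generating such a manifest for every skill might involve. The `build_marketplace` helper is hypothetical, and the manifest shape is copied from the illustrative JSON above rather than from any confirmed marketplace.json schema; the repo value is left elided as in the comment.

```python
import json


def build_marketplace(skill_names, repo, ref):
    """Build a manifest shaped like the illustrative JSON above.

    Hypothetical helper: the real marketplace.json schema may differ.
    """
    return {
        "skills": [
            {
                "name": name,
                "source": {"repo": repo, "ref": ref, "path": f"skills/{name}"},
            }
            for name in skill_names
        ]
    }


manifest = build_marketplace(
    ["github", "linear"],
    repo="...",  # elided in the comment above; left as-is
    ref="feature-branch",
)
print(json.dumps(manifest, indent=2))
```

The list of skill names would still have to come from a checkout of the skills repo, which is the extra step the comment refers to.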

Collaborator

@enyst enyst left a comment

Ah, I see, thank you!

Collaborator Author

✅ Feature Verified - Extensions Branch Selection Working

The extensions_branch parameter feature has been successfully tested and is working correctly.

Test Evidence

Eval Run: https://openhands-eval-monitor.vercel.app/?run=swebench%2Flitellm_proxy-claude-sonnet-4-5-20250929%2F24157514197%2F

Result: Instance scikit-learn__scikit-learn-13439 resolved successfully (✓)

How the Skill Was Loaded

From the results.tar.gz analysis:

  1. test-marker-372 skill loaded from test branch:

    • Source: /root/.openhands/cache/skills/public-skills/skills/test-marker-372/SKILL.md
    • Description: [TEST-BRANCH-372-MARKER] This is a test marker skill to verify extensions branch selection is working
  2. github skill modified to include test marker:

    • Content includes: **🔍 TEST MARKER FOR ISSUE #372: If you see this message, the extensions branch selection feature is working! This skill was loaded from the test branch 'test/extensions-branch-selection-372' instead of main. 🔍**
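
A check like the one described above could be scripted roughly as follows. This is only a sketch: `skills_with_marker` is a hypothetical helper, and the demo builds a throwaway directory instead of reading the real cache (which, per the source path above, would be /root/.openhands/cache/skills/public-skills/skills).

```python
import tempfile
from pathlib import Path

MARKER = "TEST-BRANCH-372-MARKER"


def skills_with_marker(skills_root, marker=MARKER):
    """Return names of skill directories whose SKILL.md contains the marker."""
    found = []
    for skill_md in sorted(Path(skills_root).glob("*/SKILL.md")):
        if marker in skill_md.read_text(encoding="utf-8"):
            found.append(skill_md.parent.name)
    return found


# Demo on a temporary directory that mimics the cache layout.
with tempfile.TemporaryDirectory() as tmp:
    samples = {
        "test-marker-372": f"[{MARKER}] test marker skill",
        "github": "regular skill content",
    }
    for name, text in samples.items():
        skill_dir = Path(tmp) / name
        skill_dir.mkdir()
        (skill_dir / "SKILL.md").write_text(text, encoding="utf-8")
    demo_result = skills_with_marker(tmp)

print(demo_result)  # ['test-marker-372']
```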

Configuration Used

gh workflow run run-eval.yml \
  --repo OpenHands/software-agent-sdk \
  -f sdk_ref=openhands/allow-extensions-branch-selection \
  -f eval_branch=openhands/allow-extensions-branch-selection \
  -f benchmarks_branch=openhands/allow-extensions-branch-selection \
  -f extensions_branch=test/extensions-branch-selection-372 \
  -f allow_unreleased_branches=true \
  -f benchmark=swebench \
  -f eval_limit=1

Confirmed

  • extensions_branch parameter passed correctly
  • EXTENSIONS_REF environment variable was set
  • ✅ SDK loaded extensions from the specified branch
  • ✅ Feature working end-to-end!

This comment was posted by an AI assistant (OpenHands) on behalf of the user.

@juanmichelini juanmichelini marked this pull request as ready for review April 9, 2026 15:42
@juanmichelini juanmichelini enabled auto-merge (squash) April 9, 2026 15:42
Collaborator

@all-hands-bot all-hands-bot left a comment

🟡 Acceptable - Core implementation is solid and pragmatic. The environment variable approach is simple and backward compatible. However, there's a critical bug in the empty string test logic that needs fixing.

Collaborator

@all-hands-bot all-hands-bot left a comment

QA Report: EXTENSIONS_REF Environment Variable Support

Summary

⚠️ PASS WITH ISSUES: The core functionality works correctly — EXTENSIONS_REF successfully controls which branch of the extensions repo is used. However, pre-commit checks fail and there's an edge case issue with empty string handling.


Environment Setup

SUCCESS

make build
# Dependencies installed successfully
# Build complete! Development environment is ready.

CI & Test Status

⚠️ Pre-commit check FAILING

  • pre-commit: Line too long in test file (line 86: 90 > 88 characters)
  • agent-server-tests: Passed
  • tools-tests: Passed
  • check-docstrings: Passed
  • Python API: Passed
  • REST API (OpenAPI): Passed
  • 🔄 sdk-tests, build-binary-and-test: In progress

Unit tests: All 4 new tests pass

uv run pytest tests/sdk/context/skill/test_extensions_ref.py -v
# 4 passed in 0.38s

Functional Verification

Test 1: Default behavior (no EXTENSIONS_REF set)

.venv/bin/python -c "from openhands.sdk.context.skills.skill import PUBLIC_SKILLS_BRANCH; print(PUBLIC_SKILLS_BRANCH)"
# Output: main

PASS: Defaults to 'main' when environment variable is not set.

Test 2: Custom branch via EXTENSIONS_REF

EXTENSIONS_REF=venv-test-branch .venv/bin/python -c "
import os
os.environ['EXTENSIONS_REF'] = 'venv-test-branch'
from openhands.sdk.context.skills.skill import PUBLIC_SKILLS_BRANCH
print(f'PUBLIC_SKILLS_BRANCH: {PUBLIC_SKILLS_BRANCH}')
"
# Output: PUBLIC_SKILLS_BRANCH: venv-test-branch

PASS: Custom branch is correctly used when EXTENSIONS_REF is set.

Test 3: Empty string edge case

EXTENSIONS_REF='' .venv/bin/python -c "
import os
os.environ['EXTENSIONS_REF'] = ''
from openhands.sdk.context.skills.skill import PUBLIC_SKILLS_BRANCH
print(f'PUBLIC_SKILLS_BRANCH: \"{PUBLIC_SKILLS_BRANCH}\"')
"
# Output: PUBLIC_SKILLS_BRANCH: ""

⚠️ ISSUE FOUND: Empty string is NOT treated as fallback to 'main'. This could cause git errors when trying to checkout an empty branch name.

Test 4: Workflow changes

VERIFIED: Workflow correctly accepts extensions_branch input and forwards it to evaluation repo dispatch payload.


Issues Found

🔴 Issue 1: Pre-commit failure (Line 86 in test_extensions_ref.py)

Line too long: 90 characters instead of max 88.

Current code (line 85-86):

assert PUBLIC_SKILLS_BRANCH == "", \
    "Empty EXTENSIONS_REF should result in empty string (os.environ.get behavior)"

Fix option 1 - Multi-line string:

assert PUBLIC_SKILLS_BRANCH == "", (
    "Empty EXTENSIONS_REF should result in empty string "
    "(os.environ.get behavior)"
)

Fix option 2 - Shorter message:

assert PUBLIC_SKILLS_BRANCH == "", \
    "Empty EXTENSIONS_REF returns empty string per os.environ.get semantics"

🟠 Issue 2: Empty string handling (Line 897 in skill.py)

Problem: os.environ.get("EXTENSIONS_REF", "main") returns an empty string when EXTENSIONS_REF="" is explicitly set. This could cause git errors.

Current implementation:

PUBLIC_SKILLS_BRANCH = os.environ.get("EXTENSIONS_REF", "main")

Behavior:

  • EXTENSIONS_REF not set → "main"
  • EXTENSIONS_REF="feature-branch" → "feature-branch"
  • EXTENSIONS_REF="" → "" ⚠️ (empty string, could break git checkout)

Recommended fix:

PUBLIC_SKILLS_BRANCH = os.environ.get("EXTENSIONS_REF") or "main"

This treats empty strings the same as missing values and falls back to 'main'.

If this fix is applied, also update test_extensions_ref_empty_string() to:

  1. Change line 85 to expect "main" instead of ""
  2. Update the docstring (line 76) to match the new behavior
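
The difference between the two lookups in Issue 2 can be sketched as below. Plain dicts stand in for `os.environ`, and the function names `with_default` and `with_or_fallback` are made up for illustration.

```python
def with_default(environ):
    """Current implementation: the default applies only when the key is missing."""
    return environ.get("EXTENSIONS_REF", "main")


def with_or_fallback(environ):
    """Recommended fix: any falsy value (missing or empty) falls back to 'main'."""
    return environ.get("EXTENSIONS_REF") or "main"


cases = {
    "unset": {},
    "set": {"EXTENSIONS_REF": "feature-branch"},
    "empty": {"EXTENSIONS_REF": ""},
}

for label, env in cases.items():
    print(label, repr(with_default(env)), repr(with_or_fallback(env)))
```

Only the "empty" case differs: the first lookup yields an empty string, the second yields "main".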

🟡 Issue 3: Test docstring inconsistency (Line 76 in test_extensions_ref.py)

Docstring says "Empty EXTENSIONS_REF should fall back to 'main'" but the test expects empty string (line 85), which contradicts the docstring.

This is related to Issue 2 above: if you fix the implementation, update this test to match.


Additional Notes

Workflow verification: The .github/workflows/run-eval.yml changes correctly pass extensions_branch to the evaluation repo. Note that this workflow uses EXTENSIONS_BRANCH internally for dispatch logic, while the SDK code expects EXTENSIONS_REF (which will be set by downstream evaluation/benchmarks workflows).


Verdict

⚠️ PASS WITH ISSUES

The feature works as designed — environment variable successfully controls the extensions branch. Functional testing confirms:

  • ✅ Default behavior works (falls back to 'main')
  • ✅ Custom branches work correctly
  • ✅ Workflow integration is correct

However, two issues remain:

  1. Must fix: Pre-commit check failure (line length)
  2. Should fix: Empty string edge case to avoid potential git errors

Both issues are straightforward to fix and do not block the core functionality.

Collaborator

@all-hands-bot all-hands-bot left a comment

Inline code review comments for specific issues:

openhands-agent and others added 5 commits April 9, 2026 18:39
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
The original tests used module reloading which polluted sys.modules for
other tests. Using subprocesses ensures complete isolation.

Co-authored-by: openhands <openhands@all-hands.dev>
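
The subprocess-based isolation described in the commit message above can be sketched roughly as follows. This is not the actual test code from the PR: `branch_in_subprocess` is a hypothetical helper, and the child process evaluates a stand-in expression rather than importing `PUBLIC_SKILLS_BRANCH` from the SDK.

```python
import os
import subprocess
import sys

# Stand-in for the SDK import; a real test would import the constant
# from the SDK inside the child process instead.
CHILD_CODE = 'import os; print(os.environ.get("EXTENSIONS_REF", "main"))'


def branch_in_subprocess(extensions_ref=None):
    """Run the lookup in a fresh interpreter with a controlled environment."""
    env = dict(os.environ)
    env.pop("EXTENSIONS_REF", None)  # start from a known state
    if extensions_ref is not None:
        env["EXTENSIONS_REF"] = extensions_ref
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


print(branch_in_subprocess())  # main
print(branch_in_subprocess("feature-branch"))  # feature-branch
```

Because each call starts a new interpreter, nothing is reloaded or left behind in the parent's sys.modules.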
@github-actions
Contributor

github-actions bot commented Apr 9, 2026

Coverage

Coverage Report

File | Stmts | Miss | Cover | Missing
openhands-sdk/openhands/sdk/context/skills | | | |
   skill.py | 431 | 33 | 92% | 95–96, 251–252, 438–444, 447, 564–567, 811–812, 871–872, 942–943, 1011, 1039, 1062, 1069–1070, 1114–1115, 1121–1122, 1128–1129
TOTAL | 22289 | 6459 | 71% |

@juanmichelini juanmichelini merged commit d2a77ab into main Apr 9, 2026
23 checks passed
@juanmichelini juanmichelini deleted the openhands/allow-extensions-branch-selection branch April 9, 2026 20:13
4 participants