Add support for EXTENSIONS_REF environment variable#2607
Add support for EXTENSIONS_REF environment variable#2607juanmichelini merged 10 commits intomainfrom
Conversation
This change allows the software-agent-sdk to load extensions from a specified branch of the OpenHands/extensions repository via the EXTENSIONS_REF environment variable, enabling testing of feature branches during evaluations. Changes: - Add os import to skill.py - Update PUBLIC_SKILLS_BRANCH to read from EXTENSIONS_REF environment variable - Falls back to 'main' if EXTENSIONS_REF is not set - Add comprehensive tests for the new functionality This enables the evaluation and benchmarks workflows to specify which extensions branch to use when building images and running evaluations. Part of OpenHands/evaluation#372 Co-authored-by: openhands <openhands@all-hands.dev>
Python API breakage checks — ✅ PASSEDResult: ✅ PASSED |
REST API breakage checks (OpenAPI) — ✅ PASSEDResult: ✅ PASSED |
enyst
left a comment
There was a problem hiding this comment.
I wonder, can we use marketplace.json or a plugin reference for this?
I think Graham's implementation of the marketplace is able to load Agentskills from some git source 🤔 and same should be true about the plugins
This change allows the SDK's eval workflow to specify which extensions branch to use when triggering evaluations, enabling end-to-end testing of the extensions branch selection feature. Changes: - Add extensions_branch workflow input (defaults to 'main') - Pass extensions_branch in workflow dispatch payload - Add to parameter logging for debugging This allows testing of feature branches through CI: 1. SDK run-eval workflow accepts extensions_branch input 2. Passes it to evaluation workflow 3. Evaluation passes to benchmarks via EXTENSIONS_REF 4. SDK loads extensions from specified branch Part of OpenHands/evaluation#372 Co-authored-by: openhands <openhands@all-hands.dev>
|
@enyst marketplace.json allows bringing individual skills not entire repo. we would need to download the skills repo, and create something like this everytime: { also we would need to add support for feature branches. current change instead is a oneliner in skill.py and adding the parameter to the eval-job. |
✅ Feature Verified - Extensions Branch Selection WorkingThe Test EvidenceResult: Instance How the Skill Was LoadedFrom the
Configuration Usedgh workflow run run-eval.yml \
--repo OpenHands/software-agent-sdk \
-f sdk_ref=openhands/allow-extensions-branch-selection \
-f eval_branch=openhands/allow-extensions-branch-selection \
-f benchmarks_branch=openhands/allow-extensions-branch-selection \
-f extensions_branch=test/extensions-branch-selection-372 \
-f allow_unreleased_branches=true \
-f benchmark=swebench \
-f eval_limit=1Confirmed
This comment was posted by an AI assistant (OpenHands) on behalf of the user. |
1 similar comment
✅ Feature Verified - Extensions Branch Selection WorkingThe Test EvidenceResult: Instance How the Skill Was LoadedFrom the
Configuration Usedgh workflow run run-eval.yml \
--repo OpenHands/software-agent-sdk \
-f sdk_ref=openhands/allow-extensions-branch-selection \
-f eval_branch=openhands/allow-extensions-branch-selection \
-f benchmarks_branch=openhands/allow-extensions-branch-selection \
-f extensions_branch=test/extensions-branch-selection-372 \
-f allow_unreleased_branches=true \
-f benchmark=swebench \
-f eval_limit=1Confirmed
This comment was posted by an AI assistant (OpenHands) on behalf of the user. |
all-hands-bot
left a comment
There was a problem hiding this comment.
🟡 Acceptable - Core implementation is solid and pragmatic. The environment variable approach is simple and backward compatible. However, there's a critical bug in the empty string test logic that needs fixing.
all-hands-bot
left a comment
There was a problem hiding this comment.
QA Report: EXTENSIONS_REF Environment Variable Support
Summary
EXTENSIONS_REF successfully controls which branch of the extensions repo is used. However, pre-commit checks fail and there's an edge case issue with empty string handling.
Environment Setup
✅ SUCCESS
make build
# Dependencies installed successfully
# Build complete! Development environment is ready.CI & Test Status
- ❌
pre-commit: Line too long in test file (line 86: 90 > 88 characters) - ✅
agent-server-tests: Passed - ✅
tools-tests: Passed - ✅
check-docstrings: Passed - ✅
Python API: Passed - ✅
REST API (OpenAPI): Passed - 🔄
sdk-tests,build-binary-and-test: In progress
Unit tests: All 4 new tests pass
uv run pytest tests/sdk/context/skill/test_extensions_ref.py -v
# 4 passed in 0.38sFunctional Verification
Test 1: Default behavior (no EXTENSIONS_REF set)
.venv/bin/python -c "from openhands.sdk.context.skills.skill import PUBLIC_SKILLS_BRANCH; print(PUBLIC_SKILLS_BRANCH)"
# Output: main✅ PASS: Defaults to 'main' when environment variable is not set.
Test 2: Custom branch via EXTENSIONS_REF
EXTENSIONS_REF=venv-test-branch .venv/bin/python -c "
import os
os.environ['EXTENSIONS_REF'] = 'venv-test-branch'
from openhands.sdk.context.skills.skill import PUBLIC_SKILLS_BRANCH
print(f'PUBLIC_SKILLS_BRANCH: {PUBLIC_SKILLS_BRANCH}')
"
# Output: PUBLIC_SKILLS_BRANCH: venv-test-branch✅ PASS: Custom branch is correctly used when EXTENSIONS_REF is set.
Test 3: Empty string edge case
EXTENSIONS_REF='' .venv/bin/python -c "
import os
os.environ['EXTENSIONS_REF'] = ''
from openhands.sdk.context.skills.skill import PUBLIC_SKILLS_BRANCH
print(f'PUBLIC_SKILLS_BRANCH: \"{PUBLIC_SKILLS_BRANCH}\"')
"
# Output: PUBLIC_SKILLS_BRANCH: ""Test 4: Workflow changes
✅ VERIFIED: Workflow correctly accepts extensions_branch input and forwards it to evaluation repo dispatch payload.
Issues Found
🔴 Issue 1: Pre-commit failure (Line 86 in test_extensions_ref.py)
Line too long: 90 characters instead of max 88.
Current code (line 85-86):
assert PUBLIC_SKILLS_BRANCH == "", \
"Empty EXTENSIONS_REF should result in empty string (os.environ.get behavior)"Fix option 1 - Multi-line string:
assert PUBLIC_SKILLS_BRANCH == "", (
"Empty EXTENSIONS_REF should result in empty string "
"(os.environ.get behavior)"
)Fix option 2 - Shorter message:
assert PUBLIC_SKILLS_BRANCH == "", \
"Empty EXTENSIONS_REF returns empty string per os.environ.get semantics"🟠 Issue 2: Empty string handling (Line 897 in skill.py)
Problem: os.environ.get("EXTENSIONS_REF", "main") returns an empty string when EXTENSIONS_REF="" is explicitly set. This could cause git errors.
Current implementation:
PUBLIC_SKILLS_BRANCH = os.environ.get("EXTENSIONS_REF", "main")Behavior:
EXTENSIONS_REFnot set →"main"✅EXTENSIONS_REF="feature-branch"→"feature-branch"✅EXTENSIONS_REF=""→""⚠️ (empty string, could break git checkout)
Recommended fix:
PUBLIC_SKILLS_BRANCH = os.environ.get("EXTENSIONS_REF") or "main"This treats empty strings the same as missing values and falls back to 'main'.
If this fix is applied, also update test_extensions_ref_empty_string() to:
- Change line 85 to expect
"main"instead of"" - Update the docstring (line 76) to match the new behavior
🟡 Issue 3: Test docstring inconsistency (Line 76 in test_extensions_ref.py)
Docstring says "Empty EXTENSIONS_REF should fall back to 'main'" but the test expects empty string (line 85), which contradicts the docstring.
This is related to Issue #2 — if you fix the implementation, update this test to match.
Additional Notes
✅ Workflow verification: The .github/workflows/run-eval.yml changes correctly pass extensions_branch to the evaluation repo. Note that this workflow uses EXTENSIONS_BRANCH internally for dispatch logic, while the SDK code expects EXTENSIONS_REF (which will be set by downstream evaluation/benchmarks workflows).
Verdict
The feature works as designed — environment variable successfully controls the extensions branch. Functional testing confirms:
- ✅ Default behavior works (falls back to 'main')
- ✅ Custom branches work correctly
- ✅ Workflow integration is correct
However, two issues must be addressed:
- Must fix: Pre-commit check failure (line length)
- Should fix: Empty string edge case to avoid potential git errors
Both issues are straightforward to fix and do not block the core functionality.
all-hands-bot
left a comment
There was a problem hiding this comment.
Inline code review comments for specific issues:
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
The original tests used module reloading which polluted sys.modules for other tests. Using subprocesses ensures complete isolation. Co-authored-by: openhands <openhands@all-hands.dev>
Summary
This PR enables the software-agent-sdk to load extensions (skills and plugins) from a specified branch of the OpenHands/extensions repository via the
EXTENSIONS_REFenvironment variable. This allows testing of feature branches during evaluations.Part of OpenHands/evaluation#372
Changes Made
Core Changes
PUBLIC_SKILLS_BRANCHconstant to read fromEXTENSIONS_REFenvironment variablemainifEXTENSIONS_REFis not setosimport to support environment variable accessTesting
test_extensions_ref.py:Implementation Details
Before
After
Integration
This PR is part of a multi-repository feature:
extensions_branchinputextensions-refand passesEXTENSIONS_REFEXTENSIONS_REFwhen loading extensionsBackward Compatibility
✅ Fully backward compatible:
mainwhen environment variable is not setUsage
Environment Variable
Set
EXTENSIONS_REFbefore running the SDK:export EXTENSIONS_REF=feature-branch python your_script.pyDocker/Kubernetes
Pass as environment variable in container:
Programmatic (if needed)
Testing
Unit Tests
Run the new test suite:
Integration Testing
After all PRs are merged:
EXTENSIONS_REF=test-branchSecurity Considerations
Performance Impact
Next Steps
Checklist
Agent Server images for this PR
• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server
Variants & Base Images
eclipse-temurin:17-jdknikolaik/python-nodejs:python3.13-nodejs22-slimgolang:1.21-bookwormPull (multi-arch manifest)
# Each variant is a multi-arch manifest supporting both amd64 and arm64 docker pull ghcr.io/openhands/agent-server:f3f335b-pythonRun
All tags pushed for this build
About Multi-Architecture Support
f3f335b-python) is a multi-arch manifest supporting both amd64 and arm64f3f335b-python-amd64) are also available if needed