Add support for EXTENSIONS_REF environment variable by juanmichelini · Pull Request #2607 · OpenHands/software-agent-sdk

juanmichelini · 2026-03-30T14:32:26Z

Summary

This PR enables the software-agent-sdk to load extensions (skills and plugins) from a specified branch of the OpenHands/extensions repository via the EXTENSIONS_REF environment variable. This allows testing of feature branches during evaluations.

Part of OpenHands/evaluation#372

Changes Made

Core Changes

skill.py:
- Updated PUBLIC_SKILLS_BRANCH constant to read from the EXTENSIONS_REF environment variable
- Falls back to main if EXTENSIONS_REF is not set
- Added os import to support environment variable access

Testing

Added comprehensive test suite in test_extensions_ref.py:
- Test default behavior (falls back to 'main')
- Test custom branch via EXTENSIONS_REF
- Test integration with load_public_skills()
- Test empty string handling

Implementation Details

Before

PUBLIC_SKILLS_BRANCH = "main"

After

# Allow overriding the branch via EXTENSIONS_REF environment variable
# (used by evaluation/benchmarks workflows to test feature branches)
PUBLIC_SKILLS_BRANCH = os.environ.get("EXTENSIONS_REF", "main")

Integration

This PR is part of a multi-repository feature:

✅ evaluation (PR #373): Accepts extensions_branch input
✅ benchmarks (PR #591): Accepts extensions-ref and passes EXTENSIONS_REF
🔄 software-agent-sdk (this PR): Uses EXTENSIONS_REF when loading extensions

Backward Compatibility

✅ Fully backward compatible:

No changes to public API
Defaults to main when environment variable is not set
Existing code continues to work without modification

Usage

Environment Variable

Set EXTENSIONS_REF before running the SDK:

export EXTENSIONS_REF=feature-branch
python your_script.py

Docker/Kubernetes

Pass it as an environment variable in the container spec:

env:
  - name: EXTENSIONS_REF
    value: feature-branch

Programmatic (if needed)

import os
os.environ["EXTENSIONS_REF"] = "feature-branch"

# Then import and use SDK
from openhands.sdk.context.skills import load_public_skills
skills = load_public_skills()  # Will use feature-branch
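
Because PUBLIC_SKILLS_BRANCH is assigned at module level, the variable has to be set before the SDK module is first imported; changing it afterwards would not affect the already-computed constant. The following is a minimal sketch of that ordering, assuming a stand-in function (`simulate_module_import`, hypothetical) in place of the real import of skill.py:

```python
import os


def simulate_module_import():
    # Hypothetical stand-in for the module-level line in skill.py:
    #   PUBLIC_SKILLS_BRANCH = os.environ.get("EXTENSIONS_REF", "main")
    # The value is computed once, at the moment the module is imported.
    return os.environ.get("EXTENSIONS_REF", "main")


# Set the variable first, then "import".
os.environ["EXTENSIONS_REF"] = "feature-branch"
PUBLIC_SKILLS_BRANCH = simulate_module_import()

# Changing the environment later does not update the constant.
os.environ["EXTENSIONS_REF"] = "another-branch"
print(PUBLIC_SKILLS_BRANCH)  # feature-branch
```

In practice this means the shell or container form above is the safer option, since it cannot be affected by import order.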

Testing

Unit Tests

Run the new test suite:

pytest tests/sdk/context/skill/test_extensions_ref.py -v

Integration Testing

After all PRs are merged:

1. Create a test branch in OpenHands/extensions
2. Set EXTENSIONS_REF=test-branch
3. Build evaluation images
4. Verify extensions are loaded from the test branch

Security Considerations

The environment variable only controls which branch of the official OpenHands/extensions repository is used
No risk of loading from arbitrary repositories (repo URL is hardcoded)
Branch names are passed directly to git and validated by GitHub

Performance Impact

No performance impact when using default (main) branch
Minimal impact when using feature branches (same git clone/update mechanism)

Next Steps

Merge this PR
Test end-to-end with all three repositories
Update documentation if needed

Checklist

Agent Server images for this PR

• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

Variant	Architectures	Base Image	Docs / Tags
java	amd64, arm64	`eclipse-temurin:17-jdk`	Link
python	amd64, arm64	`nikolaik/python-nodejs:python3.13-nodejs22-slim`	Link
golang	amd64, arm64	`golang:1.21-bookworm`	Link

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:f3f335b-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-f3f335b-python \
  ghcr.io/openhands/agent-server:f3f335b-python

All tags pushed for this build

ghcr.io/openhands/agent-server:f3f335b-golang-amd64
ghcr.io/openhands/agent-server:f3f335b-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:f3f335b-golang-arm64
ghcr.io/openhands/agent-server:f3f335b-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:f3f335b-java-amd64
ghcr.io/openhands/agent-server:f3f335b-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:f3f335b-java-arm64
ghcr.io/openhands/agent-server:f3f335b-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:f3f335b-python-amd64
ghcr.io/openhands/agent-server:f3f335b-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-amd64
ghcr.io/openhands/agent-server:f3f335b-python-arm64
ghcr.io/openhands/agent-server:f3f335b-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-arm64
ghcr.io/openhands/agent-server:f3f335b-golang
ghcr.io/openhands/agent-server:f3f335b-java
ghcr.io/openhands/agent-server:f3f335b-python

About Multi-Architecture Support

Each variant tag (e.g., f3f335b-python) is a multi-arch manifest supporting both amd64 and arm64
Docker automatically pulls the correct architecture for your platform
Individual architecture tags (e.g., f3f335b-python-amd64) are also available if needed

This change allows the software-agent-sdk to load extensions from a specified branch of the OpenHands/extensions repository via the EXTENSIONS_REF environment variable, enabling testing of feature branches during evaluations.

Changes:
- Add os import to skill.py
- Update PUBLIC_SKILLS_BRANCH to read from EXTENSIONS_REF environment variable
- Falls back to 'main' if EXTENSIONS_REF is not set
- Add comprehensive tests for the new functionality

This enables the evaluation and benchmarks workflows to specify which extensions branch to use when building images and running evaluations.

Part of OpenHands/evaluation#372

Co-authored-by: openhands <openhands@all-hands.dev>

github-actions · 2026-03-30T14:32:53Z

Python API breakage checks — ✅ PASSED

Result: ✅ PASSED

github-actions · 2026-03-30T14:33:03Z

REST API breakage checks (OpenAPI) — ✅ PASSED

Result: ✅ PASSED

enyst

I wonder, can we use marketplace.json or a plugin reference for this?

I think Graham's implementation of the marketplace is able to load Agentskills from some git source 🤔 and same should be true about the plugins

This change allows the SDK's eval workflow to specify which extensions branch to use when triggering evaluations, enabling end-to-end testing of the extensions branch selection feature.

Changes:
- Add extensions_branch workflow input (defaults to 'main')
- Pass extensions_branch in workflow dispatch payload
- Add to parameter logging for debugging

This allows testing of feature branches through CI:
1. SDK run-eval workflow accepts extensions_branch input
2. Passes it to evaluation workflow
3. Evaluation passes to benchmarks via EXTENSIONS_REF
4. SDK loads extensions from specified branch

Part of OpenHands/evaluation#372

Co-authored-by: openhands <openhands@all-hands.dev>

juanmichelini · 2026-04-01T14:54:07Z

@enyst marketplace.json allows bringing in individual skills, not an entire repo. We would need to download the skills repo and create something like this every time:

{
  "skills": [
    {"name": "github", "source": {"repo": "...", "ref": "feature-branch", "path": "skills/github"}},
    {"name": "linear", "source": {"repo": "...", "ref": "feature-branch", "path": "skills/linear"}},
    // ... and so on
  ]
}

Also, we would need to add support for feature branches.

The current change is instead a one-liner in skill.py plus adding the parameter to the eval-job.
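
To make the cost of the marketplace.json alternative concrete, here is a rough sketch of generating one entry per skill in the shape shown in the comment above. The function name `build_marketplace` and the `<extensions-repo>` placeholder are hypothetical, and the field layout is copied from the illustration in the comment, not from a confirmed marketplace.json schema:

```python
import json


def build_marketplace(skill_names, repo, ref):
    """Build a manifest dict with one entry per skill (hypothetical helper).

    The "skills"/"source"/"repo"/"ref"/"path" layout follows the example
    in the comment; it is an assumption, not a verified schema.
    """
    return {
        "skills": [
            {
                "name": name,
                "source": {"repo": repo, "ref": ref, "path": f"skills/{name}"},
            }
            for name in skill_names
        ]
    }


# "<extensions-repo>" is a placeholder; the comment elides the real value.
manifest = build_marketplace(
    ["github", "linear"], repo="<extensions-repo>", ref="feature-branch"
)
print(json.dumps(manifest, indent=2))
```

Even in this form the list of skill names still has to come from somewhere, which is the extra download step the comment refers to.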

enyst

Ah, I see, thank you!

juanmichelini · 2026-04-08T22:09:48Z

✅ Feature Verified - Extensions Branch Selection Working

The extensions_branch parameter feature has been successfully tested and is working correctly.

Test Evidence

Eval Run: https://openhands-eval-monitor.vercel.app/?run=swebench%2Flitellm_proxy-claude-sonnet-4-5-20250929%2F24157514197%2F

Result: Instance scikit-learn__scikit-learn-13439 resolved successfully (✓)

How the Skill Was Loaded

From the results.tar.gz analysis:

test-marker-372 skill loaded from test branch:
- Source: /root/.openhands/cache/skills/public-skills/skills/test-marker-372/SKILL.md
- Description: [TEST-BRANCH-372-MARKER] This is a test marker skill to verify extensions branch selection is working
github skill modified to include test marker:
- Content includes: **🔍 TEST MARKER FOR ISSUE #372: If you see this message, the extensions branch selection feature is working! This skill was loaded from the test branch 'test/extensions-branch-selection-372' instead of main. 🔍**

Configuration Used

gh workflow run run-eval.yml \
  --repo OpenHands/software-agent-sdk \
  -f sdk_ref=openhands/allow-extensions-branch-selection \
  -f eval_branch=openhands/allow-extensions-branch-selection \
  -f benchmarks_branch=openhands/allow-extensions-branch-selection \
  -f extensions_branch=test/extensions-branch-selection-372 \
  -f allow_unreleased_branches=true \
  -f benchmark=swebench \
  -f eval_limit=1

Confirmed

✅ extensions_branch parameter passed correctly
✅ EXTENSIONS_REF environment variable was set
✅ SDK loaded extensions from the specified branch
✅ Feature working end-to-end!

This comment was posted by an AI assistant (OpenHands) on behalf of the user.
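
The marker check described above could be repeated locally with something like the sketch below. It is an assumption about how one might scan a skills cache, not the analysis that was actually run; `find_marker_skills` and the `other` skill directory are hypothetical, and the demo uses a temporary directory instead of the real cache path.

```python
import tempfile
from pathlib import Path


def find_marker_skills(cache_dir, marker):
    """Return SKILL.md paths under cache_dir whose text contains marker."""
    hits = []
    for skill_file in sorted(Path(cache_dir).rglob("SKILL.md")):
        text = skill_file.read_text(encoding="utf-8", errors="replace")
        if marker in text:
            hits.append(skill_file)
    return hits


# Demo against a throwaway directory that imitates the cache layout.
with tempfile.TemporaryDirectory() as tmp:
    marked = Path(tmp) / "skills" / "test-marker-372"
    plain = Path(tmp) / "skills" / "other"  # hypothetical unmarked skill
    marked.mkdir(parents=True)
    plain.mkdir(parents=True)
    (marked / "SKILL.md").write_text(
        "[TEST-BRANCH-372-MARKER] test skill", encoding="utf-8"
    )
    (plain / "SKILL.md").write_text("ordinary skill", encoding="utf-8")

    found = find_marker_skills(tmp, "TEST-BRANCH-372-MARKER")
    found_names = [path.parent.name for path in found]
    print(found_names)  # ['test-marker-372']
```

Against a real run, the first argument would be the extracted cache directory, for example the public-skills path listed under "How the Skill Was Loaded".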

all-hands-bot

🟡 Acceptable - Core implementation is solid and pragmatic. The environment variable approach is simple and backward compatible. However, there's a critical bug in the empty string test logic that needs fixing.

tests/sdk/context/skill/test_extensions_ref.py

all-hands-bot

QA Report: EXTENSIONS_REF Environment Variable Support

Summary

⚠️ PASS WITH ISSUES: The core functionality works correctly — EXTENSIONS_REF successfully controls which branch of the extensions repo is used. However, pre-commit checks fail and there's an edge case issue with empty string handling.

Environment Setup

✅ SUCCESS

make build
# Dependencies installed successfully
# Build complete! Development environment is ready.

CI & Test Status

⚠️ Pre-commit check FAILING

❌ pre-commit: Line too long in test file (line 86: 90 > 88 characters)
✅ agent-server-tests: Passed
✅ tools-tests: Passed
✅ check-docstrings: Passed
✅ Python API: Passed
✅ REST API (OpenAPI): Passed
🔄 sdk-tests, build-binary-and-test: In progress

Unit tests: All 4 new tests pass

uv run pytest tests/sdk/context/skill/test_extensions_ref.py -v
# 4 passed in 0.38s

Functional Verification

Test 1: Default behavior (no EXTENSIONS_REF set)

.venv/bin/python -c "from openhands.sdk.context.skills.skill import PUBLIC_SKILLS_BRANCH; print(PUBLIC_SKILLS_BRANCH)"
# Output: main

✅ PASS: Defaults to 'main' when environment variable is not set.

Test 2: Custom branch via EXTENSIONS_REF

EXTENSIONS_REF=venv-test-branch .venv/bin/python -c "
import os
os.environ['EXTENSIONS_REF'] = 'venv-test-branch'
from openhands.sdk.context.skills.skill import PUBLIC_SKILLS_BRANCH
print(f'PUBLIC_SKILLS_BRANCH: {PUBLIC_SKILLS_BRANCH}')
"
# Output: PUBLIC_SKILLS_BRANCH: venv-test-branch

✅ PASS: Custom branch is correctly used when EXTENSIONS_REF is set.

Test 3: Empty string edge case

EXTENSIONS_REF='' .venv/bin/python -c "
import os
os.environ['EXTENSIONS_REF'] = ''
from openhands.sdk.context.skills.skill import PUBLIC_SKILLS_BRANCH
print(f'PUBLIC_SKILLS_BRANCH: \"{PUBLIC_SKILLS_BRANCH}\"')
"
# Output: PUBLIC_SKILLS_BRANCH: ""

⚠️ ISSUE FOUND: Empty string is NOT treated as fallback to 'main'. This could cause git errors when trying to checkout an empty branch name.

Test 4: Workflow changes

✅ VERIFIED: Workflow correctly accepts extensions_branch input and forwards it to evaluation repo dispatch payload.

Issues Found

🔴 Issue 1: Pre-commit failure (Line 86 in test_extensions_ref.py)

Line too long: 90 characters instead of max 88.

Current code (lines 85-86):

assert PUBLIC_SKILLS_BRANCH == "", \
    "Empty EXTENSIONS_REF should result in empty string (os.environ.get behavior)"

Fix option 1 - Multi-line string:

assert PUBLIC_SKILLS_BRANCH == "", (
    "Empty EXTENSIONS_REF should result in empty string "
    "(os.environ.get behavior)"
)

Fix option 2 - Shorter message:

assert PUBLIC_SKILLS_BRANCH == "", \
    "Empty EXTENSIONS_REF returns empty string per os.environ.get semantics"

🟠 Issue 2: Empty string handling (Line 897 in skill.py)

Problem: os.environ.get("EXTENSIONS_REF", "main") returns an empty string when EXTENSIONS_REF="" is explicitly set. This could cause git errors.

Current implementation:

PUBLIC_SKILLS_BRANCH = os.environ.get("EXTENSIONS_REF", "main")

Behavior:

EXTENSIONS_REF not set → "main" ✅
EXTENSIONS_REF="feature-branch" → "feature-branch" ✅
EXTENSIONS_REF="" → "" ⚠️ (empty string, could break git checkout)

Recommended fix:

PUBLIC_SKILLS_BRANCH = os.environ.get("EXTENSIONS_REF") or "main"

This treats empty strings the same as missing values and falls back to 'main'.
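
The difference between the two expressions can be seen with plain dictionaries standing in for os.environ. This is a small illustrative sketch; the helper names are hypothetical and not part of the SDK:

```python
def resolve_ref_with_default(env):
    # Same semantics as os.environ.get("EXTENSIONS_REF", "main"):
    # the default is used only when the key is absent.
    return env.get("EXTENSIONS_REF", "main")


def resolve_ref_with_or(env):
    # Same semantics as os.environ.get("EXTENSIONS_REF") or "main":
    # any falsy value, including "", falls back to "main".
    return env.get("EXTENSIONS_REF") or "main"


cases = [{}, {"EXTENSIONS_REF": "feature-branch"}, {"EXTENSIONS_REF": ""}]
for env in cases:
    print(env, repr(resolve_ref_with_default(env)), repr(resolve_ref_with_or(env)))
```

Note that `or` only catches the empty string; a whitespace-only value such as `" "` is truthy and would still be passed through unchanged by both forms.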

If this fix is applied, also update test_extensions_ref_empty_string() to:

Change line 85 to expect "main" instead of ""
Update the docstring (line 76) to match the new behavior

🟡 Issue 3: Test docstring inconsistency (Line 76 in test_extensions_ref.py)

Docstring says "Empty EXTENSIONS_REF should fall back to 'main'" but the test expects empty string (line 85), which contradicts the docstring.

This is related to Issue #2 — if you fix the implementation, update this test to match.

Additional Notes

✅ Workflow verification: The .github/workflows/run-eval.yml changes correctly pass extensions_branch to the evaluation repo. Note that this workflow uses EXTENSIONS_BRANCH internally for dispatch logic, while the SDK code expects EXTENSIONS_REF (which will be set by downstream evaluation/benchmarks workflows).

Verdict

⚠️ PASS WITH ISSUES

The feature works as designed — environment variable successfully controls the extensions branch. Functional testing confirms:

✅ Default behavior works (falls back to 'main')
✅ Custom branches work correctly
✅ Workflow integration is correct

However, two issues must be addressed:

Must fix: Pre-commit check failure (line length)
Should fix: Empty string edge case to avoid potential git errors

Both issues are straightforward to fix and do not block the core functionality.

all-hands-bot

Inline code review comments for specific issues:

tests/sdk/context/skill/test_extensions_ref.py

openhands-sdk/openhands/sdk/context/skills/skill.py

tests/sdk/context/skill/test_extensions_ref.py

github-actions · 2026-04-09T20:13:27Z

Coverage Report •

File	Stmts	Miss	Cover	Missing
openhands-sdk/openhands/sdk/context/skills
skill.py	431	33	92%	95–96, 251–252, 438–444, 447, 564–567, 811–812, 871–872, 942–943, 1011, 1039, 1062, 1069–1070, 1114–1115, 1121–1122, 1128–1129
TOTAL	22289	6459	71%

enyst reviewed Mar 30, 2026

juanmichelini mentioned this pull request Apr 1, 2026

[TEST] Add marker to verify extensions branch selection feature OpenHands/extensions#129

Draft

enyst approved these changes Apr 1, 2026

openhands-agent and others added 2 commits April 1, 2026 17:49

Merge main into openhands/allow-extensions-branch-selection

810a05c

Merge branch 'main' into openhands/allow-extensions-branch-selection

ec22083

Merge branch 'main' into openhands/allow-extensions-branch-selection

490f9f3

juanmichelini marked this pull request as ready for review April 9, 2026 15:42

juanmichelini enabled auto-merge (squash) April 9, 2026 15:42

all-hands-bot reviewed Apr 9, 2026

tests/sdk/context/skill/test_extensions_ref.py (outdated)

tests/sdk/context/skill/test_extensions_ref.py (outdated)

tests/sdk/context/skill/test_extensions_ref.py (outdated)

all-hands-bot reviewed Apr 9, 2026

tests/sdk/context/skill/test_extensions_ref.py (outdated)

openhands-sdk/openhands/sdk/context/skills/skill.py

tests/sdk/context/skill/test_extensions_ref.py (outdated)

openhands-agent and others added 5 commits April 9, 2026 18:39

Fix line too long in test_extensions_ref.py

9af4731

Co-authored-by: openhands <openhands@all-hands.dev>

Fix test to verify branch passed to update_skills_repository

46a057d

Co-authored-by: openhands <openhands@all-hands.dev>

Merge branch 'main' into openhands/allow-extensions-branch-selection

9e88123

Merge branch 'main' into openhands/allow-extensions-branch-selection

d088cfa

Fix test isolation using subprocess to avoid env pollution

41dce05

The original tests used module reloading which polluted sys.modules for other tests. Using subprocesses ensures complete isolation. Co-authored-by: openhands <openhands@all-hands.dev>
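
The subprocess approach mentioned in this commit message can be sketched roughly as follows. This is not the code from test_extensions_ref.py; `branch_seen_by_child` is a hypothetical helper, and the child snippet is a stand-in that evaluates the expression from the PR description instead of importing the SDK:

```python
import os
import subprocess
import sys

# Stand-in for importing the SDK in the child: it evaluates the same
# expression the PR description shows for PUBLIC_SKILLS_BRANCH.
SNIPPET = 'import os; print(os.environ.get("EXTENSIONS_REF", "main"))'


def branch_seen_by_child(extensions_ref=None):
    """Run SNIPPET in a fresh interpreter and return what it printed.

    Each call starts a new process, so nothing is cached in the parent's
    sys.modules and the parent's os.environ is never modified.
    """
    env = {k: v for k, v in os.environ.items() if k != "EXTENSIONS_REF"}
    if extensions_ref is not None:
        env["EXTENSIONS_REF"] = extensions_ref
    result = subprocess.run(
        [sys.executable, "-c", SNIPPET],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


print(branch_seen_by_child())                  # main
print(branch_seen_by_child("feature-branch"))  # feature-branch
```

A real test would replace SNIPPET with an import of PUBLIC_SKILLS_BRANCH from the SDK; the trade-off is slower tests in exchange for not touching shared interpreter state.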

juanmichelini merged commit d2a77ab into main Apr 9, 2026
23 checks passed

juanmichelini deleted the openhands/allow-extensions-branch-selection branch April 9, 2026 20:13
