Add OpenHands issue duplicate automation workflow#2909
Co-authored-by: openhands <openhands@all-hands.dev>
Python API breakage checks: ✅ PASSED
REST API breakage checks (OpenAPI): ✅ PASSED
all-hands-bot
left a comment
✅ QA Report: PASS
All automation workflows, scripts, and tests function correctly. The PR successfully adds OpenHands-powered GitHub issue duplicate detection with proper error handling, comprehensive test coverage, and repo-appropriate token management.
Does this PR achieve its stated goal?
Yes. The PR delivers exactly what it promised:
- Issue duplicate automation workflows: Added issue-duplicate-checker.yml (3 jobs: smoke-clone, issue-check, auto-close) and remove-duplicate-candidate-label.yml (removes label on user activity).
- Runner scripts with regression tests: Both issue_duplicate_check_openhands.py and auto_close_duplicate_issues.py are production-ready with 11 comprehensive regression tests covering edge cases (empty payloads, fenced JSON parsing, deleted users, label filtering).
- PR review workflow updates: Successfully updated pr-review-by-openhands.yml to trigger on synchronize events and use github.token fallback when OPENHANDS_BOT_GITHUB_PAT_PUBLIC is absent.
All scripts handle errors gracefully, workflows use proper permissions and concurrency controls, and the implementation follows GitHub Actions best practices.
| Phase | Result |
|---|---|
| Environment Setup | ✅ make build completed successfully |
| CI & Tests | ✅ 16/16 critical checks pass; 11/11 new tests pass in cross-tests |
| Functional Verification | ✅ Scripts import, workflows valid, logic verified |
Functional Verification
Test 1: Environment Setup and Build
Step 1 — Baseline environment:
Verified repository state on PR branch enyst/issue-duplicate-automation (commit adf2c7b).
Step 2 — Build environment:
Ran make build:
Dependencies installed successfully.
Pre-commit hooks installed successfully.
Build complete! Development environment is ready.
This confirms the development environment sets up correctly with all dependencies.
Step 3 — Verify no regressions:
Environment setup completes without errors, confirming no breaking changes to build process.
Test 2: Regression Test Suite
Step 1 — Run new tests:
Ran uv run pytest tests/cross/test_issue_duplicate_scripts.py -v:
collected 11 items
tests/cross/test_issue_duplicate_scripts.py::test_list_open_issues_filters_by_duplicate_candidate_label PASSED [ 9%]
tests/cross/test_issue_duplicate_scripts.py::test_has_reaction_from_user_ignores_missing_user_ids PASSED [ 18%]
tests/cross/test_issue_duplicate_scripts.py::test_find_latest_auto_close_comment_returns_latest_candidate PASSED [ 27%]
tests/cross/test_issue_duplicate_scripts.py::test_close_issue_as_duplicate_leaves_label_until_requests_succeed PASSED [ 36%]
tests/cross/test_issue_duplicate_scripts.py::test_keep_open_due_to_newer_comments_removes_candidate_label PASSED [ 45%]
tests/cross/test_issue_duplicate_scripts.py::test_parse_agent_json_handles_single_line_fenced_json PASSED [ 54%]
tests/cross/test_issue_duplicate_scripts.py::test_normalize_result_promotes_actionable_duplicates PASSED [ 63%]
tests/cross/test_issue_duplicate_scripts.py::test_normalize_result_lowercases_classification PASSED [ 72%]
tests/cross/test_issue_duplicate_scripts.py::test_poll_start_task_retries_after_empty_payload PASSED [ 81%]
tests/cross/test_issue_duplicate_scripts.py::test_poll_conversation_retries_after_empty_items PASSED [ 90%]
tests/cross/test_issue_duplicate_scripts.py::test_poll_conversation_raises_on_failed_status PASSED [100%]
11 passed, 5 warnings in 0.05s
Interpretation: All edge cases are properly handled:
- Label filtering (duplicate-candidate)
- Deleted user handling (missing user IDs)
- Comment parsing and metadata extraction
- Error handling during close operations
- JSON parsing with fences
- Result normalization and validation
- Retry logic for empty API responses
- Failure detection in conversation polling
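The "JSON parsing with fences" case above can be illustrated with a small sketch. This is not the PR's parser: `parse_fenced_json` is a hypothetical name, and the assumption is that the agent sometimes wraps its JSON answer in a Markdown code fence, possibly on a single line.

```python
import json
import re

# Build the fence marker programmatically so this example can itself
# live inside a fenced block.
FENCE = "`" * 3

# Hypothetical pattern: optional "json" language tag, then the payload.
_FENCED = re.compile(
    r"^" + FENCE + r"(?:json)?\s*(.*?)\s*" + FENCE + r"$",
    re.DOTALL,
)


def parse_fenced_json(text: str) -> dict:
    """Parse a JSON object that may be wrapped in a Markdown code fence."""
    stripped = text.strip()
    match = _FENCED.match(stripped)
    if match:
        # Keep only what sits between the opening and closing fence.
        stripped = match.group(1)
    result = json.loads(stripped)
    if not isinstance(result, dict):
        raise ValueError("expected a JSON object")
    return result
```

Both the single-line form (fence, tag, and payload on one line) and the usual multi-line form go through the same path, which is the behavior the test name suggests.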
Test 3: Pre-commit Validation
Step 1 — Run pre-commit checks:
Ran pre-commit on all changed files:
Format YAML files........................................................Passed
Ruff format..............................................................Passed
Ruff lint................................................................Passed
PEP8 style check (pycodestyle)...........................................Passed
Type check with pyright..................................................Passed
Check import dependency rules............................................Passed
Check Tool subclass registration.........................................Passed
Interpretation: Code meets all quality standards (formatting, linting, type checking).
Test 4: Script Functionality
Step 1 — Verify scripts can be imported:
import scripts.issue_duplicate_check_openhands
import scripts.auto_close_duplicate_issues
# ✓ Both scripts import successfully
Step 2 — Test error handling:
Ran auto-close script without GITHUB_TOKEN:
error: GITHUB_TOKEN environment variable is required
RuntimeError: GITHUB_TOKEN environment variable is required
Interpretation: Scripts have proper command-line interfaces, import cleanly, and gracefully handle missing credentials.
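For reference, the missing-credential behavior shown above could be produced by a check along these lines. This is a minimal sketch, not the script's actual code: `require_github_token` is a hypothetical helper, and only the error message is taken from the output above.

```python
import os
from collections.abc import Mapping


def require_github_token(env: Mapping[str, str] = os.environ) -> str:
    """Return the GitHub token or fail fast with a clear message."""
    token = env.get("GITHUB_TOKEN", "").strip()
    if not token:
        # Same wording as the error observed in the QA run.
        raise RuntimeError("GITHUB_TOKEN environment variable is required")
    return token
```

Accepting the environment as a parameter keeps the check testable without mutating `os.environ`.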
Test 5: Workflow Structure Validation
Step 1 — Validate YAML syntax:
import yaml
yaml.safe_load(open('.github/workflows/issue-duplicate-checker.yml'))
# ✓ issue-duplicate-checker.yml is valid YAML
yaml.safe_load(open('.github/workflows/remove-duplicate-candidate-label.yml'))
# ✓ remove-duplicate-candidate-label.yml is valid YAML
Step 2 — Validate workflow structure:
Verified programmatically:
✓ Workflow structure validation passed
- 3 jobs: smoke-clone, issue-duplicate-check, auto-close-duplicates
- 3 triggers: issues (opened), schedule (daily 9am), workflow_dispatch
- Permissions: contents:read, issues:write
- Token fallback: OPENHANDS_BOT_GITHUB_PAT_PUBLIC || github.token
Interpretation: Workflows are syntactically valid, have correct structure, proper permissions, and use the token fallback pattern consistently.
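The report does not show how the structure was "verified programmatically", so the following is only a sketch of one way to do it on an already-parsed workflow. `validate_workflow` is a hypothetical helper, and it assumes permissions are declared at the top level of the workflow (they may equally be set per job).

```python
def validate_workflow(
    workflow: dict,
    expected_jobs: set[str],
    expected_permissions: dict[str, str],
) -> list[str]:
    """Return a list of human-readable problems (empty means OK)."""
    problems: list[str] = []

    jobs = workflow.get("jobs") or {}
    missing = expected_jobs - set(jobs)
    if missing:
        problems.append(f"missing jobs: {sorted(missing)}")

    # PyYAML follows YAML 1.1, where a bare `on` key loads as boolean True,
    # so look under both keys.
    triggers = workflow.get("on", workflow.get(True))
    if not triggers:
        problems.append("no triggers defined")

    permissions = workflow.get("permissions") or {}
    for scope, level in expected_permissions.items():
        if permissions.get(scope) != level:
            problems.append(f"permission {scope} is not {level}")

    return problems
```

Feeding it the result of `yaml.safe_load` together with the three job names and the `contents: read` / `issues: write` pair listed above would reproduce the check described in this step.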
Test 6: CI Integration
Step 1 — Check CI status:
Ran gh pr checks 2909:
16 passing checks:
✓ cross-tests (1m19s) - includes all 11 new regression tests
✓ pre-commit (1m15s)
✓ sdk-tests (12s)
✓ tools-tests (10s)
✓ agent-server-tests (10s)
✓ workspace-tests (10s)
✓ Python API breakage checks (21s)
✓ REST API breakage checks (37s)
✓ build-binary-and-test ubuntu (2m41s)
✓ Check docstrings (7s)
✓ Check package versions (7s)
✓ Deprecation deadlines (14s)
... and 4 more
Interpretation: All critical checks pass. The new tests run successfully in CI via the cross-tests job. Remaining pending checks are Docker builds and coverage reporting, which don't affect functionality verification.
Test 7: PR Review Workflow Changes
Step 1 — Verify changes:
Checked diff of .github/workflows/pr-review-by-openhands.yml:
- Added synchronize to trigger types ✓
- Added synchronize condition to job if-clause ✓
- Changed github-token from OPENHANDS_BOT_GITHUB_PAT_PUBLIC to OPENHANDS_BOT_GITHUB_PAT_PUBLIC || github.token ✓
Interpretation: The workflow now triggers on follow-up commits (synchronize) and gracefully falls back to default GitHub token when the bot PAT is unavailable.
Issues Found
None.
all-hands-bot
left a comment
Comprehensive automation feature with solid structure. Found critical test coverage gaps and several error handling improvements needed before merging. Most issues are straightforward fixes.
all-hands-bot
left a comment
Good automation feature with solid structure. Found several critical error handling gaps that need to be fixed before merge. Most previously reported issues have been resolved ✅
all-hands-bot
left a comment
✅ QA Report: PASS WITH ISSUES
The PR successfully delivers OpenHands-powered issue duplicate automation with comprehensive test coverage and proper token handling. All core functionality works as implemented, but the PR description contains an outdated claim.
Does this PR achieve its stated goal?
Partially Yes. The PR delivers the core goal ("add OpenHands-powered GitHub issue duplicate workflow") with robust implementation and comprehensive tests. However, the PR description claims to "update pr-review-by-openhands.yml to rerun on pull_request.synchronize" — this feature was tested in commit f140353, then intentionally removed in c604d38 with the message "chore: stop auto-review on every PR push". The final implementation only includes the token fallback (OPENHANDS_BOT_GITHUB_PAT_PUBLIC || github.token), not the synchronize trigger.
The core automation features work correctly: duplicate detection via OpenHands Cloud, auto-close after a waiting period, and label removal on activity. All 35 tests pass, covering edge cases for empty payloads, fenced JSON parsing, deleted user handling, and label filtering.
| Phase | Result |
|---|---|
| Environment Setup | ✅ make build completed, venv created |
| CI & Tests | ✅ 35/35 tests pass, pre-commit clean, 22 CI checks SUCCESS |
| Functional Verification | ✅ Scripts import, workflows valid, smoke-clone works |
Functional Verification
Test 1: Environment Setup and Test Suite
Baseline: Clean repository at commit ef21d98
Ran make build:
Checking uv version...
uv version 0.11.7 meets requirements
Setting up OpenHands V1 development environment...
Installing dependencies with uv sync --dev...
Resolved 402 packages in 1ms
Creating virtual environment at: .venv
This confirms the environment builds successfully.
Test execution:
Ran uv run pytest tests/cross/test_issue_duplicate_scripts.py -v:
35 passed, 5 warnings in 0.10s
All tests pass, covering:
- Empty polling payloads (test_poll_start_task_retries_after_empty_payload)
- Fenced JSON parsing (test_parse_agent_json_handles_single_line_fenced_json)
- Deleted user safety (test_has_reaction_from_user_ignores_missing_user_ids)
- Label filtering (test_list_open_issues_filters_by_duplicate_candidate_label)
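The "empty polling payloads" case can be pictured with a sketch like the one below. It is an illustration only: `poll_until_ready` and the status strings `"ready"` and `"failed"` are assumptions, not the OpenHands Cloud API contract.

```python
import time
from collections.abc import Callable
from typing import Optional


def poll_until_ready(
    fetch: Callable[[], Optional[dict]],
    *,
    max_attempts: int = 5,
    interval_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll `fetch` until it reports readiness, tolerating empty payloads."""
    for attempt in range(max_attempts):
        payload = fetch()
        if payload:  # None or {} means "nothing yet": retry instead of failing
            status = str(payload.get("status", "")).lower()
            if status == "failed":
                raise RuntimeError("task reported a failed status")
            if status == "ready":
                return payload
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    raise TimeoutError(f"no ready payload after {max_attempts} attempts")
```

Injecting `sleep` lets a test run the retry path instantly, which is likely how a sub-second suite such as the one above stays fast.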
Pre-commit validation:
Ran uv run pre-commit run --files ...:
Format YAML files........................................................Passed
Ruff format..............................................................Passed
Ruff lint................................................................Passed
PEP8 style check (pycodestyle)...........................................Passed
Type check with pyright..................................................Passed
All code quality checks pass.
Test 2: Workflow YAML Validation
Ran python -c "import yaml; yaml.safe_load(open('.github/workflows/issue-duplicate-checker.yml'))" && echo "✓ valid YAML":
✓ issue-duplicate-checker.yml is valid YAML
✓ remove-duplicate-candidate-label.yml is valid YAML
✓ pr-review-by-openhands.yml is valid YAML
All workflow files are syntactically valid.
Test 3: Script Import and Execution
Ran python scripts/issue_duplicate_check_openhands.py --help:
usage: issue_duplicate_check_openhands.py [-h] --repository REPOSITORY --issue-number ISSUE_NUMBER ...
Ran python scripts/auto_close_duplicate_issues.py --help:
usage: auto_close_duplicate_issues.py [-h] --repository REPOSITORY [--close-after-days CLOSE_AFTER_DAYS] [--dry-run]
Both scripts have proper CLI interfaces and can be imported without errors.
Test 4: Smoke Clone Functionality
Simulated the smoke-clone workflow:
git clone --depth 1 https://github.com/OpenHands/OpenHands.git /tmp/test-OpenHands
git clone --depth 1 https://github.com/OpenHands/agent-sdk.git /tmp/test-agent-sdk
✓ Both repositories clone successfully, confirming the smoke test workflow will work post-merge.
Test 5: Token Fallback Verification
Checked the pr-review-by-openhands.yml diff:
- github-token: ${{ secrets.OPENHANDS_BOT_GITHUB_PAT_PUBLIC }}
+ github-token: ${{ secrets.OPENHANDS_BOT_GITHUB_PAT_PUBLIC || github.token }}
The token fallback is correctly implemented. Verified it appears in all three workflow files:
- .github/workflows/issue-duplicate-checker.yml (lines 102, 197, 354)
- .github/workflows/remove-duplicate-candidate-label.yml (line 27)
- .github/workflows/pr-review-by-openhands.yml (line 56)
Test 6: Synchronize Trigger Investigation
Checked commit history:
f140353c chore: verify PR review synchronize trigger
c604d386 chore: stop auto-review on every PR push
Inspected c604d38:
- pull_request:
- types: [opened, ready_for_review, synchronize, labeled, review_requested]
+ pull_request:
+ types: [opened, ready_for_review, labeled, review_requested]
The synchronize trigger was added for testing, then intentionally removed. The PR description is outdated.
Issues Found
- 🟡 Minor: PR description claims "update pr-review-by-openhands.yml to rerun on pull_request.synchronize" but this feature was intentionally removed in commit c604d38. The description should be updated to reflect the final implementation (token fallback only, no synchronize trigger).
Note: The removal of the synchronize trigger is a valid design decision (avoiding expensive reviews on every push), not a bug. The issue is the outdated PR description, not the implementation.
all-hands-bot
left a comment
Good automation feature with solid structure and comprehensive test coverage. Found a few issues that should be addressed before merging.
all-hands-bot
left a comment
✅ QA Report: PASS WITH ISSUES
All core automation functionality works as designed — workflows are valid, scripts handle edge cases correctly, and 37/37 tests pass.
Does this PR achieve its stated goal?
Yes — the PR successfully adds OpenHands-powered issue duplicate automation to the SDK repo. The two workflows (issue-duplicate-checker.yml, remove-duplicate-candidate-label.yml), two Python scripts (issue_duplicate_check_openhands.py, auto_close_duplicate_issues.py), and comprehensive test suite are all present and functional. However, the PR description contains a factual error about the pr-review-by-openhands.yml changes (see Issues below).
| Phase | Result |
|---|---|
| Environment Setup | ✅ make build succeeded, dependencies installed |
| CI & Tests | ✅ 37/37 tests pass, 15/15 core CI checks pass, pre-commit clean |
| Functional Verification | ✅ Workflows valid, scripts executable, edge cases covered |
Functional Verification
Test 1: Environment Setup
Baseline: Fresh checkout of PR branch c6f9660
Ran setup:
cd /home/runner/work/software-agent-sdk/software-agent-sdk/pr-repo
make build
Result: Build succeeded, virtual environment created at .venv, 402 packages installed.
Test 2: Test Suite Execution
Step 1 — Verify tests exist and are comprehensive
Ran:
uv run pytest tests/cross/test_issue_duplicate_scripts.py -v
Result:
37 passed, 5 warnings in 0.11s
Test coverage includes:
- Pagination edge cases (empty responses, multi-page)
- Error handling (URLError, JSON decode failures, HTTP 404)
- Deleted user handling (missing user IDs in reactions)
- Veto logic (author thumbs-down, newer comments)
- Auto-close logic (old duplicates, malformed data)
- JSON parsing (fenced blocks, plain JSON, invalid JSON)
- OpenHands API integration (polling, timeouts, fallbacks)
This confirms comprehensive test coverage.
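To make the "pagination edge cases (empty responses, multi-page)" item concrete, here is a minimal sketch of a page loop that stops on an empty or short page. `iter_pages` and the `fetch_page` callback are hypothetical; the real scripts may page through the GitHub API differently.

```python
from collections.abc import Callable, Iterator


def iter_pages(
    fetch_page: Callable[[int], list[dict]],
    *,
    per_page: int = 100,
    max_pages: int = 50,
) -> Iterator[dict]:
    """Yield items page by page until a page comes back empty or short."""
    for page in range(1, max_pages + 1):
        items = fetch_page(page)
        if not items:
            return  # empty response: nothing more to read
        yield from items
        if len(items) < per_page:
            return  # short page: this was the last one
```

The `max_pages` cap is a safety bound so a misbehaving endpoint cannot keep the loop running indefinitely.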
Test 3: Pre-commit Validation
Step 1 — Run pre-commit on changed files
Ran:
uv run pre-commit run --files \
.github/workflows/issue-duplicate-checker.yml \
.github/workflows/remove-duplicate-candidate-label.yml \
.github/workflows/pr-review-by-openhands.yml \
scripts/issue_duplicate_check_openhands.py \
scripts/auto_close_duplicate_issues.py \
tests/cross/test_issue_duplicate_scripts.py
Result:
Format YAML files........................................................Passed
Ruff format..............................................................Passed
Ruff lint................................................................Passed
PEP8 style check (pycodestyle)...........................................Passed
Type check with pyright..................................................Passed
All code quality checks pass.
Test 4: Workflow YAML Validation
Step 1 — Parse YAML files
Ran Python YAML parser on all three workflow files:
import yaml
for file in ['.github/workflows/issue-duplicate-checker.yml',
             '.github/workflows/remove-duplicate-candidate-label.yml',
             '.github/workflows/pr-review-by-openhands.yml']:
    yaml.safe_load(open(file))
Result: All three workflows parse successfully, with no syntax errors.
Test 5: Script CLI Interface
Step 1 — Verify scripts have help and proper argument parsing
Ran:
python3 scripts/issue_duplicate_check_openhands.py --help
python3 scripts/auto_close_duplicate_issues.py --help
Result: Both scripts display proper help text with required arguments:
- issue_duplicate_check_openhands.py: requires --repository, --issue-number; optional --output, --poll-interval-seconds, --max-wait-seconds
- auto_close_duplicate_issues.py: requires --repository; optional --close-after-days, --dry-run
Scripts compile cleanly and have well-structured CLIs.
Test 6: CI Status Check
Step 1 — Check PR CI status
Ran:
gh pr checks 2909 --repo OpenHands/software-agent-sdk
Result:
- ✅ 15 successful checks (sdk-tests, tools-tests, workspace-tests, agent-server-tests, cross-tests, pre-commit, API breakage checks, etc.)
- ⏳ 10 pending checks (Agent Server builds still running)
- ❌ 1 failing check: Review Thread Gate/unresolved-review-threads (expected; this is the review thread gate, not a code issue)
All core functionality checks pass.
Issues Found
- 🟡 Documentation Inconsistency: The PR description claims to "update pr-review-by-openhands.yml to rerun on pull_request.synchronize", but commit c604d386 ("stop auto-review on every PR push") removed the synchronize trigger after testing. Only the token fallback (secrets.OPENHANDS_BOT_GITHUB_PAT_PUBLIC || github.token) was kept. The PR description should be updated to reflect the actual changes.
all-hands-bot
left a comment
Solid automation feature with comprehensive implementation. Found several blocking issues that need to be addressed before merge: canonical issue validation, bot comment filtering, and missing test coverage for core helper functions.
all-hands-bot
left a comment
✅ QA Report: PASS
All functional verification complete. The PR successfully adds OpenHands-powered issue duplicate detection and auto-close workflows with comprehensive test coverage and proper error handling.
Does this PR achieve its stated goal?
Yes. The PR's goal was to "add issue duplicate check + duplicate-candidate auto-close workflows for this repo." The implementation delivers:
- Issue duplicate checker workflow that uses OpenHands Cloud to analyze new issues for duplicates and posts informative comments with candidate matches
- Auto-close workflow that runs daily to close aged duplicate candidates after a 3-day waiting period, with veto logic for author thumbs-down and new activity
- Label removal workflow that removes the duplicate-candidate label when users engage with the issue
- Two production-ready Python scripts with proper error handling, pagination, and edge-case coverage
- 55 regression tests covering empty payloads, malformed data, deleted users, bot filtering, and the complete workflow
- PR review workflow enhancement with token fallback logic
The automation infrastructure is production-ready and will activate after merge when the workflow files appear on the default branch.
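The auto-close rule described above (a 3-day waiting period, with vetoes for an author thumbs-down and for new activity) can be summarized as a small decision function. This is a sketch under assumptions: `should_auto_close` and its parameters are hypothetical names, and how the script actually detects vetoes is not shown in this report.

```python
from datetime import datetime, timedelta, timezone


def should_auto_close(
    flagged_at: datetime,
    now: datetime,
    *,
    close_after_days: int = 3,
    author_vetoed: bool = False,
    has_newer_comments: bool = False,
) -> bool:
    """Decide whether a duplicate candidate is old enough and unvetoed."""
    if author_vetoed or has_newer_comments:
        # Any sign of disagreement or fresh activity keeps the issue open.
        return False
    return now - flagged_at >= timedelta(days=close_after_days)


# Example: flagged four days ago, no veto.
_now = datetime(2024, 1, 10, tzinfo=timezone.utc)
_flagged = _now - timedelta(days=4)
print(should_auto_close(_flagged, _now))
```

Keeping the decision pure (no network calls) is what makes it cheap to cover with regression tests.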
| Phase | Result |
|---|---|
| Environment Setup | ✅ make build completed, dependencies installed |
| CI & Tests | ✅ 55/55 new tests pass, pre-commit clean, all required CI checks passing |
| Functional Verification | ✅ Scripts executable, dry-run successful, workflows valid |
Functional Verification
Test 1: New test suite execution
Ran the complete test suite for duplicate automation scripts:
uv run pytest tests/cross/test_issue_duplicate_scripts.py -v
Result:
55 passed, 5 warnings in 0.41s
Interpretation: All tests pass, covering:
- Pagination for issues, comments, and reactions
- Error handling for invalid timestamps, malformed JSON, HTTP errors
- Bot comment filtering (GitHub Apps, [bot] suffix, all-hands-bot)
- Author veto detection (thumbs down reactions)
- Newer comment detection (removes duplicate-candidate label)
- Auto-close logic with 3-day waiting period
- OpenHands Cloud conversation polling and response parsing
- Edge cases: deleted users, empty payloads, non-numeric IDs
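The bot comment filtering listed above (GitHub Apps, [bot] suffix, all-hands-bot) might look roughly like the sketch below. `is_bot_comment` and `human_comments` are hypothetical names; the sketch relies on GitHub comment payloads carrying a `user` object with `login` and `type`, and it treats a deleted user (a null `user`) as non-bot, which is an assumption rather than the PR's documented behavior.

```python
def is_bot_comment(
    comment: dict,
    extra_bot_logins: frozenset = frozenset({"all-hands-bot"}),
) -> bool:
    """Return True when a comment was authored by automation."""
    user = comment.get("user") or {}  # deleted users come back as null
    login = str(user.get("login") or "").lower()
    if user.get("type") == "Bot":  # GitHub Apps
        return True
    if login.endswith("[bot]"):
        return True
    return login in extra_bot_logins  # plain accounts used as bots


def human_comments(comments: list) -> list:
    """Keep only comments that should count as user activity."""
    return [c for c in comments if not is_bot_comment(c)]
```

Filtering this way matters for the "newer comment" check: the bot's own duplicate notice should not count as activity that cancels the auto-close.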
Test 2: Script executability verification
Checked both scripts have proper CLI interfaces:
python scripts/issue_duplicate_check_openhands.py --help
python scripts/auto_close_duplicate_issues.py --help
Result: Both scripts display proper usage information with required arguments:
- issue_duplicate_check_openhands.py: requires --repository, --issue-number; supports --output, --poll-interval-seconds, --max-wait-seconds
- auto_close_duplicate_issues.py: requires --repository; supports --close-after-days, --dry-run
Interpretation: Scripts are properly structured as command-line tools with argparse.
Test 3: Auto-close dry-run execution
Ran auto-close script in dry-run mode against real repository:
python scripts/auto_close_duplicate_issues.py --repository OpenHands/software-agent-sdk --close-after-days 3 --dry-run
Result:
{
"repository": "OpenHands/software-agent-sdk",
"results": []
}
Interpretation: Script successfully:
- Connected to GitHub API with GITHUB_TOKEN
- Queried for issues with "duplicate-candidate" label
- Found no matching issues (expected, since workflow not yet deployed)
- Returned valid JSON output
- Completed without errors in dry-run mode
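The dry-run behavior above follows a common CLI pattern, sketched here. The flags mirror the usage line shown earlier in this report; everything else (`build_parser`, `run`, the `close_issue` callback, the per-issue result fields, and the default of 3 days) is an assumption for illustration.

```python
import argparse
import json
from collections.abc import Callable, Iterable


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="auto_close_duplicate_issues.py")
    parser.add_argument("--repository", required=True)
    parser.add_argument("--close-after-days", type=int, default=3)
    parser.add_argument("--dry-run", action="store_true")
    return parser


def run(
    argv: list,
    candidates: Iterable[int],
    close_issue: Callable[[str, int], None],
) -> str:
    """Process candidates; in dry-run mode, report without mutating."""
    args = build_parser().parse_args(argv)
    results = []
    for number in candidates:
        if not args.dry_run:
            close_issue(args.repository, number)  # the only side effect
        action = "would-close" if args.dry_run else "closed"
        results.append({"issue": number, "action": action})
    return json.dumps({"repository": args.repository, "results": results}, indent=2)
```

With no candidates, the output has the same shape as the run above: a repository name and an empty `results` list.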
Test 4: Pre-commit validation
Ran pre-commit on all changed files:
uv run pre-commit run --files .github/workflows/issue-duplicate-checker.yml .github/workflows/remove-duplicate-candidate-label.yml .github/workflows/pr-review-by-openhands.yml scripts/issue_duplicate_check_openhands.py scripts/auto_close_duplicate_issues.py tests/cross/test_issue_duplicate_scripts.py
Result:
Format YAML files........................................................Passed
Ruff format..............................................................Passed
Ruff lint................................................................Passed
PEP8 style check (pycodestyle)...........................................Passed
Type check with pyright..................................................Passed
Check import dependency rules............................................Passed
Check Tool subclass registration.........................................Passed
Interpretation: All code quality checks pass:
- YAML files properly formatted
- Python code passes Ruff linting and formatting
- PEP8 compliant
- Type hints validated by pyright
- Import dependencies correct
Test 5: Workflow YAML validation
Verified workflow files are syntactically valid:
- Checked issue-duplicate-checker.yml structure: 3 jobs (smoke-clone, issue-duplicate-check, auto-close-duplicates)
- Checked remove-duplicate-candidate-label.yml structure: 1 job with proper bot filtering
- Verified trigger conditions, permissions, environment variables
- Confirmed token fallback logic: secrets.OPENHANDS_BOT_GITHUB_PAT_PUBLIC || github.token
Result: All workflows are syntactically valid and follow GitHub Actions best practices.
Interpretation: Workflows will activate correctly after merge.
Test 6: CI status check
Checked PR CI status:
gh pr view 2909 --json statusCheckRollup
Result:
- ✅ Pre-commit checks: SUCCESS
- ✅ SDK tests: SUCCESS
- ✅ Tools tests: SUCCESS
- ✅ Agent server tests: SUCCESS
- ✅ Workspace tests: SUCCESS
- ✅ Cross tests: SUCCESS (includes new duplicate automation tests)
- ✅ Docstrings: SUCCESS
- ✅ API breakage checks: SUCCESS
- ⏳ Build jobs: IN_PROGRESS (Docker image builds, not blocking)
Interpretation: All required checks pass. The cross-tests suite includes the new 55 duplicate automation tests.
Issues Found
None.
Summary: This PR is production-ready. The duplicate detection automation is well-tested, handles edge cases correctly, and follows the repository's code quality standards. After merge, the workflow can be verified in smoke-clone mode as documented in the test instructions.
all-hands-bot
left a comment
Comprehensive automation feature with solid structure. Found critical workflow issue and several logic/test gaps that must be addressed before merge.
@OpenHands /codereview
I'm on it! enyst can track my progress at all-hands.dev
enyst
left a comment
🔴 Needs improvement
[CRITICAL ISSUES]
- [.github/workflows/issue-duplicate-checker.yml:205-442] Wrong-repo mutation in the manual issue-check path: the workflow accepts a repository input and passes it into the Python analyzer, but every GitHub REST call in the Post duplicate overlap notice step still uses context.repo.owner/repo. That means a manual workflow_dispatch run targeting OpenHands/agent-sdk will analyze an agent-sdk issue and then validate the canonical issue, add/remove labels, and post the duplicate notice on OpenHands/software-agent-sdk issue #<issue_number> instead. This is a real correctness bug, not a hypothetical edge case: the workflow UI explicitly invites cross-repo use by exposing repository and defaulting it to OpenHands/agent-sdk.
  The simplest fix is to pick one model and use it consistently:
  - Repo-local only: remove the cross-repo repository input and keep the whole workflow bound to github.repository, or
  - Actually cross-repo: thread the target owner/repo into the GitHub Script step and use that owner/repo for all issue/label/comment reads and writes.
[RISK ASSESSMENT]
- [Overall PR] ⚠️ Risk Assessment: 🟡 MEDIUM
  The bug is limited to the manual issue-check path, but when it happens it causes user-visible writes to the wrong repository’s issues, labels, and comments.
VERDICT:
❌ Needs rework: The workflow’s read path is repo-aware, but its write path is still hard-wired to the current repository.
KEY INSIGHT:
The analysis step and the side-effect step disagree about which repository they are operating on.
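One way to keep the two steps from disagreeing is to resolve the target once and derive every path from it. The workflow step in question is a GitHub Script (JavaScript) step, so the Python below is only an illustration of the idea; `RepoTarget` is a hypothetical type, and the paths follow the public GitHub REST layout for issues, comments, and labels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RepoTarget:
    """Single source of truth for which repository gets read and written."""

    owner: str
    repo: str

    @classmethod
    def parse(cls, slug: str) -> "RepoTarget":
        owner, sep, repo = slug.strip().partition("/")
        if not sep or not owner or not repo or "/" in repo:
            raise ValueError(f"expected 'owner/repo', got {slug!r}")
        return cls(owner, repo)

    def issue_path(self, number: int) -> str:
        return f"/repos/{self.owner}/{self.repo}/issues/{number}"

    def comments_path(self, number: int) -> str:
        return f"{self.issue_path(number)}/comments"

    def labels_path(self, number: int) -> str:
        return f"{self.issue_path(number)}/labels"
```

Passing one `RepoTarget` to both the analyzer and the side-effect code makes it impossible for reads and writes to land on different repositories, whichever of the two models the author picks.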
This review was created by an AI assistant (OpenHands) on behalf of the user.
Addressed the wrong-repo manual issue-check concern in d029322 by removing the input entirely and binding the duplicate-check workflow to for analysis, comments, labels, and auto-close behavior. The workflow is now intentionally repo-local only, which also matches the least-privilege direction we discussed.

This comment was created by an AI assistant (OpenHands) on behalf of the user.
Clarifying the previous note: this is addressed in d029322 by removing the
This comment was created by an AI assistant (OpenHands) on behalf of the user.
enyst
left a comment
🔴 Needs improvement
[CRITICAL ISSUES]
- [scripts/issue_duplicate_check_openhands.py:297-299, 321-324, 529-540] Secret exposure in failure paths: the error handling still dumps raw OpenHands payload dicts into exception strings (json.dumps(item) / {start_task} / {ready_task}). The same conversation payload type is later read for session_api_key at lines 551-557, so a failed or malformed conversation response can leak that ephemeral session credential into public GitHub Actions logs. Don’t log whole OpenHands payloads here; redact sensitive fields or log only the specific keys/status values you need for debugging.
- [.github/workflows/issue-duplicate-checker.yml:91-92, 199, 449-450; .github/workflows/remove-duplicate-candidate-label.yml:29] Duplicate automation still prefers a broad PAT over github.token: after the repo-local fix, these workflows only need same-repo issue/comment/label/close operations, which fit the declared issues: write / contents: read job permissions. Preferring OPENHANDS_BOT_GITHUB_PAT_PUBLIC unnecessarily widens the blast radius of automation triggered by untrusted public issue text. For the duplicate-check / label-removal / auto-close workflows, use github.token directly unless you can point to a concrete capability gap.
- [.github/workflows/issue-duplicate-checker.yml:201-223, 327-363] Untrusted LLM output is posted verbatim: the workflow verifies the canonical issue number, but it still renders the model-provided summary, candidate.title, and candidate.url directly into the public duplicate notice. Because the input prompt comes from arbitrary issue text, a prompt-injected report can steer the model into publishing external links, misleading candidate titles, or @mentions in a maintainer-branded bot comment. Derive displayed candidate metadata from GitHub after validating issue numbers, and treat free-form model summary text as untrusted output that needs escaping or tighter constraints before posting.
[TESTING GAPS]
- There is strong branch coverage around parsing and reaction handling, but nothing currently proves the new failure-path redaction behavior or candidate-metadata validation because neither safeguard exists yet.
- The PR description evidence is now stale: it still says the targeted suite passed with 37 passed and claims branch workflow_dispatch is impossible, but the suite is now 76 tests and we have already run smoke-clone successfully on this PR branch. Update the evidence so reviewers aren’t validating against outdated runtime facts.
[RISK ASSESSMENT]
- [Overall PR] ⚠️ Risk Assessment: 🟡 MEDIUM
  The repo-local routing and review-thread fixes are good, but this is still AI-driven issue automation running on untrusted public inputs with write access to issues. The remaining trust-boundary gaps are: possible OpenHands session credential leakage into Actions logs, over-broad GitHub token preference, and public bot comments built from unvalidated model output.
VERDICT:
❌ Needs rework: the core workflow logic is much better now, but the security boundary between untrusted issue text, OpenHands responses, and maintainer-authorized GitHub side effects is still too loose.
KEY INSIGHT:
The correctness bugs are mostly gone; the remaining problems are trust-boundary bugs — which tokens and which model output are allowed to cross from random issue text into durable repo mutations.
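For the secret-exposure item, the redaction the review asks for could be as small as the sketch below. Only the `session_api_key` field name comes from the review; `redact`, `describe_payload`, and the placeholder text are hypothetical, and a real fix might instead log only an allowlist of keys.

```python
import json

# Field named in the review; extend with any other credential-bearing keys.
SENSITIVE_KEYS = frozenset({"session_api_key"})


def redact(value):
    """Return a copy of `value` with sensitive fields masked, recursively."""
    if isinstance(value, dict):
        return {
            key: "[REDACTED]" if key in SENSITIVE_KEYS else redact(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact(item) for item in value]
    return value


def describe_payload(payload: dict) -> str:
    """Render a payload for error messages without leaking credentials."""
    return json.dumps(redact(payload), sort_keys=True)
```

An exception message would then embed `describe_payload(item)` rather than `json.dumps(item)`, so a malformed conversation response can still be debugged from the Actions log without exposing the session credential.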
This review was created by an AI assistant (OpenHands) on behalf of the user.
@OpenHands see this review comment: #2909 (review) Keep the PAT and keep the LLM Output posted verbatim. Those are fine. Fix the rest.
I'm on it! enyst can track my progress at all-hands.dev
Addressed the remaining items from review in bd4cdf3.
Local verification rerun:
Branch CI is running on the new commit;
This comment was created by an AI assistant (OpenHands) on behalf of the user.
all-hands-bot
left a comment
✅ QA Report: PASS
This PR successfully adds OpenHands issue duplicate automation workflows with proper token handling and comprehensive test coverage.
Does this PR achieve its stated goal?
Yes. The PR delivers on all stated objectives:
- Issue duplicate check workflow: The new .github/workflows/issue-duplicate-checker.yml provides three distinct operational modes (smoke-clone, issue-check, auto-close) with proper concurrency controls and error handling.
- Token fallback strategy: All three workflows (issue-duplicate-checker, remove-duplicate-candidate-label, and pr-review-by-openhands) now use ${{ secrets.OPENHANDS_BOT_GITHUB_PAT_PUBLIC || github.token }}, ensuring the automation works even when the custom PAT is unavailable.
- Production-ready scripts: Both scripts/issue_duplicate_check_openhands.py and scripts/auto_close_duplicate_issues.py have robust CLI interfaces, proper error handling, and graceful degradation when environment variables are missing.
- Comprehensive test coverage: 77 regression tests in tests/cross/test_issue_duplicate_scripts.py cover edge cases including empty polling payloads, fenced JSON parsing, deleted-user-safe reactions, label-filtered scanning, and malformed payload handling.
| Phase | Result |
|---|---|
| Environment Setup | ✅ make build completed successfully |
| CI Status | ✅ All checks passed (sdk-tests, cross-tests, pre-commit, etc.) |
| Functional Verification | ✅ Scripts validated, workflows tested |
Functional Verification
Test 1: Script CLI Interfaces
Verified both scripts have proper help output:
Ran python scripts/issue_duplicate_check_openhands.py --help:
usage: issue_duplicate_check_openhands.py [-h] --repository REPOSITORY --issue-number ISSUE_NUMBER [--output OUTPUT] [--poll-interval-seconds POLL_INTERVAL_SECONDS] [--max-wait-seconds MAX_WAIT_SECONDS]
✓ Script provides clear CLI interface with all required parameters
Ran python scripts/auto_close_duplicate_issues.py --help:
usage: auto_close_duplicate_issues.py [-h] --repository REPOSITORY [--close-after-days CLOSE_AFTER_DAYS] [--dry-run]
✓ Script provides proper CLI with dry-run option
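For reference, the usage line above is the kind of output an argparse setup like the following would produce. This is a minimal sketch, not the script's actual code: the flag names come from the help output, while build_parser, the help strings, and the default of 3 days are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Flag names mirror the usage line; defaults and help text are assumed.
    parser = argparse.ArgumentParser(prog="auto_close_duplicate_issues.py")
    parser.add_argument("--repository", required=True, help="owner/name of the repository")
    parser.add_argument("--close-after-days", type=int, default=3, help="assumed default")
    parser.add_argument("--dry-run", action="store_true", help="report actions without applying them")
    return parser


# Parse an explicit argument list so the sketch runs without a real command line.
args = build_parser().parse_args(["--repository", "OpenHands/software-agent-sdk", "--dry-run"])
```

Passing store_true for --dry-run keeps the destructive path opt-out by omission only, which matches how the flag is used in the dry-run test.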
Test 2: Environment Variable Validation
Verified scripts handle missing credentials correctly:
Ran auto_close_duplicate_issues.py without GITHUB_TOKEN:
error: GITHUB_TOKEN environment variable is required
RuntimeError: GITHUB_TOKEN environment variable is required
✓ Script properly validates required environment variables
Test 3: Dry-Run Mode
Verified auto-close script executes in dry-run mode without making changes:
Ran python scripts/auto_close_duplicate_issues.py --repository OpenHands/software-agent-sdk --close-after-days 3 --dry-run:
{
"repository": "OpenHands/software-agent-sdk",
"results": []
}
✓ Script runs successfully in dry-run mode, returns expected JSON output format
Test 4: Workflow YAML Validation
Verified all workflow files are syntactically valid:
python -c "import yaml; yaml.safe_load(open('.github/workflows/issue-duplicate-checker.yml').read()); print('✓')"
# ✓ issue-duplicate-checker.yml is valid YAML
python -c "import yaml; yaml.safe_load(open('.github/workflows/remove-duplicate-candidate-label.yml').read()); print('✓')"
# ✓ remove-duplicate-candidate-label.yml is valid YAML
python -c "import yaml; yaml.safe_load(open('.github/workflows/pr-review-by-openhands.yml').read()); print('✓')"
# ✓ pr-review-by-openhands.yml is valid YAML
✓ All workflow files parse correctly
Test 5: Token Fallback Implementation
Verified token fallback pattern in all workflows:
.github/workflows/issue-duplicate-checker.yml (line 92):
GITHUB_TOKEN: ${{ secrets.OPENHANDS_BOT_GITHUB_PAT_PUBLIC || github.token }}
.github/workflows/issue-duplicate-checker.yml (line 450, auto-close job):
GITHUB_TOKEN: ${{ secrets.OPENHANDS_BOT_GITHUB_PAT_PUBLIC || github.token }}
.github/workflows/remove-duplicate-candidate-label.yml (line 29):
github-token: ${{ secrets.OPENHANDS_BOT_GITHUB_PAT_PUBLIC || github.token }}
.github/workflows/pr-review-by-openhands.yml (line 56):
github-token: ${{ secrets.OPENHANDS_BOT_GITHUB_PAT_PUBLIC || github.token }}
✓ Token fallback correctly implemented in all four workflow locations
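The || operator in a GitHub Actions expression yields its first truthy operand, and an unset secret evaluates to an empty string, so the expression falls through to github.token. A rough Python analogue of that selection is sketched below, assuming both values were exposed as environment variables; resolve_github_token is a hypothetical helper, not something taken from the scripts.

```python
from collections.abc import Mapping
from typing import Optional


def resolve_github_token(env: Mapping[str, str]) -> Optional[str]:
    """Prefer the bot PAT when it is non-empty, else the default token."""
    # Hypothetical helper mirroring `secrets.PAT || github.token`:
    # `or` skips both a missing key (None) and an empty string.
    return env.get("OPENHANDS_BOT_GITHUB_PAT_PUBLIC") or env.get("GITHUB_TOKEN") or None
```

The empty-string case matters because that is what a job sees when the secret is not configured for the repository or the triggering event.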
Test 6: CI Status
Checked CI status:
gh pr checks 2909
Key results:
- ✅ cross-tests: pass (1m20s) — validates the 77 new regression tests
- ✅ sdk-tests: pass
- ✅ tools-tests: pass
- ✅ workspace-tests: pass
- ✅ agent-server-tests: pass
- ✅ pre-commit: pass
- ✅ Python API: pass
- ✅ check-pr-artifacts: pass
✓ All essential CI checks passed
Test 7: Test Coverage
Verified comprehensive test suite:
grep -c "^def test_" tests/cross/test_issue_duplicate_scripts.py
# 77
✓ 77 regression tests covering edge cases per PR description (empty polling payloads, fenced JSON parsing, deleted-user-safe reactions, label-filtered issue scanning, malformed payload handling, agent-final-response path)
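As an illustration of the fenced JSON case named above, a parser can strip an optional Markdown code fence before decoding the model's reply. This is a sketch only: parse_model_json is a hypothetical name, and the scripts may handle this differently.

```python
import json
import re

FENCE = "`" * 3  # built at runtime so this example does not contain a literal fence
FENCE_RE = re.compile(FENCE + r"(?:json)?\s*(.*?)\s*" + FENCE, re.DOTALL)


def parse_model_json(text: str) -> dict:
    """Decode a JSON object that may be wrapped in a Markdown code fence."""
    match = FENCE_RE.search(text)
    candidate = match.group(1) if match else text.strip()
    payload = json.loads(candidate)  # raises json.JSONDecodeError on malformed input
    if not isinstance(payload, dict):
        raise ValueError("expected a JSON object")
    return payload
```

Rejecting non-object payloads up front keeps a stray list or string from reaching the code that decides on labels and comments.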
Issues Found
None.
all-hands-bot
left a comment
Solid automation implementation with comprehensive test coverage (77 tests). Found 1 critical security issue and 3 important bugs that should be fixed before merge.
@OpenHands Read this #2909 (review) and the unsolved inline review threads. Note that I already left comments inline too, in reply to the bot. Read all of them. Then fix according to the replies too.
I'm on it! enyst can track my progress at all-hands.dev
Co-authored-by: openhands <openhands@all-hands.dev>
This comment was marked as duplicate.
all-hands-bot
left a comment
🟢 Good taste - Solid automation implementation with comprehensive test coverage and proper error handling.
Verdict: ✅ Worth merging
This PR successfully adds OpenHands-powered issue duplicate detection automation with:
- Comprehensive test coverage: 2271 lines of tests covering edge cases (malformed data, network errors, pagination, bot filtering, deleted users, veto mechanisms, etc.)
- Robust error handling: Scripts handle all failure modes gracefully (per-issue errors don't abort batch, network failures are caught, API errors are properly wrapped)
- Proper validation: Repository format validation, canonical issue validation, pagination limits
- Security: Token handling via environment variables, no secret exposure in error paths, input validation
- Production-ready: Concurrency controls, timeouts, manual veto mechanism (👎 reaction), 3-day waiting period for auto-close
All previous review concerns have been thoroughly addressed. The code follows repo conventions (no # type: ignore, proper type safety, clear error messages), and the evidence section demonstrates both local and live GitHub verification.
[RISK ASSESSMENT]
- [Overall PR]
⚠️ Risk Assessment: 🟢 LOW
This is new automation functionality that does not affect agent behavior, prompts, tools, or evaluation harness. The workflows can be disabled if issues arise. The auto-close mechanism has multiple safety mechanisms (3-day delay, veto reactions, manual override). Comprehensive test coverage and extensive review history provide confidence in correctness.
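The delay and veto safeguards described above can be pictured as a single predicate. A minimal sketch, assuming the script knows when the duplicate-candidate label was applied and which reactions the bot comment has received; should_auto_close and its parameters are hypothetical names, and the real script may define or order these checks differently.

```python
from datetime import datetime, timedelta, timezone

VETO_REACTION = "-1"  # the GitHub REST API's content value for the 👎 reaction


def should_auto_close(
    labeled_at: datetime,
    reactions: list[str],
    now: datetime,
    close_after_days: int = 3,
) -> bool:
    """Return True only when the waiting period has elapsed and nobody vetoed."""
    if VETO_REACTION in reactions:
        return False  # a 👎 reaction blocks the auto-close
    return now - labeled_at >= timedelta(days=close_after_days)
```

Checking the veto first means a late 👎 still wins even after the waiting period has passed.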
KEY INSIGHT:
Excellent example of thorough feature development - the 2271-line test suite (nearly 2x the production code) demonstrates commitment to reliability, and the iterative review process systematically addressed every edge case.
all-hands-bot
left a comment
✅ QA Report: PASS
Automated duplicate issue detection and management workflows verified successfully.
Does this PR achieve its stated goal?
Yes. This PR successfully adds automated duplicate issue detection and management infrastructure. The workflows correctly orchestrate OpenHands Cloud API calls for duplicate analysis, handle GitHub issue labeling and auto-closing, and implement proper error handling and validation as specified in the updated AGENTS.md guidance.
| Phase | Result |
|---|---|
| Environment Setup | ✅ make build completed successfully, venv created |
| CI Status | ✅ 31/34 checks passing (1 pending: this QA, 2 skipped) |
| Functional Verification | ✅ Scripts executable, workflows valid, integration points verified |
Functional Verification
Test 1: Script CLI Interfaces
Verification:
Ran both scripts with --help to verify they execute correctly:
$ uv run python scripts/issue_duplicate_check_openhands.py --help
usage: issue_duplicate_check_openhands.py [-h] --repository REPOSITORY --issue-number ISSUE_NUMBER [--output OUTPUT] ...
Start an OpenHands Cloud conversation that checks a GitHub issue for duplicates.
$ uv run python scripts/auto_close_duplicate_issues.py --help
usage: auto_close_duplicate_issues.py [-h] --repository REPOSITORY [--close-after-days CLOSE_AFTER_DAYS] [--dry-run]
Auto-close issues previously flagged as duplicate candidates.
Result: ✅ Both scripts run and provide proper help output.
Test 2: Input Validation
Verification:
Tested repository format validation:
$ uv run python scripts/auto_close_duplicate_issues.py --repository "invalid repo name" --dry-run
error: Invalid repository format: invalid repo name
ValueError: Invalid repository format: invalid repo name
Tested required environment variable enforcement:
$ unset GITHUB_TOKEN && uv run python scripts/auto_close_duplicate_issues.py --repository "OpenHands/software-agent-sdk" --dry-run
error: GITHUB_TOKEN environment variable is required
RuntimeError: GITHUB_TOKEN environment variable is required
Result: ✅ Scripts correctly validate inputs per AGENTS.md guidance (repository pattern, required env vars).
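The two failures above correspond to checks along the following lines. This is a minimal sketch: the regular expression is the REPOSITORY_PATTERN quoted in this review and the error messages match the observed output, but validate_repository and require_github_token are hypothetical helper names.

```python
import re
from collections.abc import Mapping

REPOSITORY_PATTERN = re.compile(r"^[a-zA-Z0-9_.-]+/[a-zA-Z0-9_.-]+$")


def validate_repository(repository: str) -> str:
    """Reject anything that is not a plain owner/name pair before any API call."""
    if not REPOSITORY_PATTERN.match(repository):
        raise ValueError(f"Invalid repository format: {repository}")
    return repository


def require_github_token(env: Mapping[str, str]) -> str:
    """Fail fast when the token is missing or empty."""
    token = env.get("GITHUB_TOKEN")
    if not token:
        raise RuntimeError("GITHUB_TOKEN environment variable is required")
    return token
```

Validating before the first request keeps a malformed repository string from ever being interpolated into an API path.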
Test 3: Workflow YAML Syntax
Verification:
Validated workflow files parse correctly:
$ python3 -c "import yaml; yaml.safe_load(open('.github/workflows/issue-duplicate-checker.yml')); print('Valid ✓')"
Valid ✓
$ python3 -c "import yaml; yaml.safe_load(open('.github/workflows/remove-duplicate-candidate-label.yml')); print('Valid ✓')"
Valid ✓
Result: ✅ All workflow YAML files are syntactically valid.
Test 4: Workflow Integration Points
Verified workflow calls scripts with correct arguments:
- issue-duplicate-checker.yml (lines 96-99): Calls issue_duplicate_check_openhands.py with --repository, --issue-number, --output from GitHub context variables
- issue-duplicate-checker.yml (lines 453-455): Calls auto_close_duplicate_issues.py with --repository, --close-after-days and pipes output to summary file
- Token fallback implemented: ${{ secrets.OPENHANDS_BOT_GITHUB_PAT_PUBLIC || github.token }} (lines 92, 199, 450)
- Validation step checks OPENHANDS_API_KEY before execution (lines 79-86)
Result: ✅ Workflows correctly integrate with scripts and implement proper env var handling.
Test 5: Smoke Clone Workflow
Verification:
Tested the smoke-clone workflow step locally:
$ git clone --depth 1 "https://github.com/OpenHands/software-agent-sdk.git" /tmp/test-smoke-clone
Cloning into '/tmp/test-smoke-clone'...
Remote: Total 1302 (delta 81), reused 603 (delta 54)
Receiving objects: 100% (1302/1302), 2.28 MiB | 6.53 MiB/s, done.
Smoke clone test: SUCCESS
Result: ✅ Smoke-clone workflow step executes successfully.
Test 6: AGENTS.md Guidance Implementation
Verified scripts implement the documented guidance:
- Repository validation: Both scripts use REPOSITORY_PATTERN = re.compile(r"^[a-zA-Z0-9_.-]+/[a-zA-Z0-9_.-]+$") and validate before API calls
- Per-issue error handling: auto_close_duplicate_issues.py catches RuntimeError per-issue and continues batch processing:
  except RuntimeError as exc:
      print(f"Error processing issue #{issue_number}: {exc}", file=sys.stderr)
      summary.append({"issue_number": issue_number, "action": "failed", "error": str(exc)})
- app_conversation_id paths: Correctly unquoted in f-string interpolation (line 381): f"/api/v1/conversation/{app_conversation_id}/events/search?limit={EVENT_SEARCH_LIMIT}"
Result: ✅ Scripts correctly implement all guidance from AGENTS.md update.
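The per-issue handling quoted above amounts to a loop that records a failure and moves on to the next issue. A minimal sketch follows; the except body is the one cited in this report, while process_issues, the process_one callback, and the shape of the success entry are hypothetical stand-ins for the script's real control flow.

```python
import sys
from collections.abc import Callable, Iterable


def process_issues(
    issue_numbers: Iterable[int],
    process_one: Callable[[int], str],
) -> list[dict]:
    """Run process_one for each issue; one failure must not abort the batch."""
    summary: list[dict] = []
    for issue_number in issue_numbers:
        try:
            action = process_one(issue_number)  # hypothetical per-issue step
            summary.append({"issue_number": issue_number, "action": action})
        except RuntimeError as exc:
            print(f"Error processing issue #{issue_number}: {exc}", file=sys.stderr)
            summary.append({"issue_number": issue_number, "action": "failed", "error": str(exc)})
    return summary
```

Keeping the try block inside the loop is what lets later issues proceed after an earlier one fails, and the summary still records which issue failed and why.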
Issues Found
None.
Why
We prototyped an OpenHands-powered GitHub issue duplicate workflow in enyst/playground and want the SDK repo to use the same automation pattern with repo-appropriate token handling.
Summary
pr-review-by-openhands.yml to fall back to github.token when OPENHANDS_BOT_GITHUB_PAT_PUBLIC is absent, without auto-running the AI review on every follow-up commit
Issue Number
N/A
How to Test
- make build.
- uv run pytest tests/cross/test_issue_duplicate_scripts.py.
- uv run pre-commit run --files .github/workflows/issue-duplicate-checker.yml .github/workflows/remove-duplicate-candidate-label.yml .github/workflows/pr-review-by-openhands.yml scripts/issue_duplicate_check_openhands.py scripts/auto_close_duplicate_issues.py tests/cross/test_issue_duplicate_scripts.py.
- all-hands-bot explicitly instead of relying on pull_request.synchronize.
- .github/workflows/issue-duplicate-checker.yml in smoke-clone mode from the default branch.
Evidence
Local verification
Observed results:
77 passed
Live GitHub verification
Observed results:
- pull_request.synchronize is intentionally not enabled for the AI reviewer in this repo; fresh AI review is now requested explicitly via reviewer request instead of on every push
- issue-duplicate-checker.yml has already been dispatched successfully from this PR branch in smoke-clone mode; the successful workflow-dispatch runs are visible in Actions history for enyst/issue-duplicate-automation
Video/Screenshots
None.
Type
Notes
OPENHANDS_API_KEY for OpenHands Cloud and OPENHANDS_BOT_GITHUB_PAT_PUBLIC || github.token for GitHub access in this repo.
Agent Server images for this PR
• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server
Variants & Base Images
- eclipse-temurin:17-jdk
- nikolaik/python-nodejs:python3.13-nodejs22-slim
- golang:1.21-bookworm
Pull (multi-arch manifest)
# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:47d216c-python
Run
All tags pushed for this build
About Multi-Architecture Support
47d216c-python) is a multi-arch manifest supporting both amd64 and arm64
47d216c-python-amd64) are also available if needed