Add instruction to verify PR status before pushing #2696

Merged
enyst merged 1 commit into main from add-pr-status-check-instruction
Apr 3, 2026

xingyaoww (Collaborator) commented Apr 3, 2026

Adds a new line to the <PULL_REQUESTS> section of the system prompt instructing the agent to verify that an existing PR is still open before pushing to its branch. If the PR has been closed or merged, the agent should create a new branch and open a new PR instead.

This prevents the agent from pushing commits to branches of already-closed/merged PRs.
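
The check described above could be sketched roughly as follows. This is an illustration only, not the prompt text or any agent code added by this PR. It assumes the GitHub CLI (`gh`) is installed and authenticated, and the names `get_pr_state` and `decide_push_action` are hypothetical.

```python
import json
import subprocess


def get_pr_state(pr_number: int) -> str:
    """Return the PR state reported by the GitHub CLI: OPEN, CLOSED, or MERGED.

    Assumes `gh` is installed, authenticated, and run inside the repository.
    """
    result = subprocess.run(
        ["gh", "pr", "view", str(pr_number), "--json", "state"],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)["state"]


def decide_push_action(pr_state: str) -> str:
    """Map a PR state to the action the instruction asks for (hypothetical helper)."""
    if pr_state.upper() == "OPEN":
        # The PR is still open, so pushing to its branch is fine.
        return "push-to-existing-branch"
    # CLOSED or MERGED: do not reuse the old branch.
    return "create-new-branch-and-pr"
```

Keeping the decision separate from the `gh` call makes the rule easy to check without network access; an agent would call `decide_push_action(get_pr_state(n))` before any `git push`.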


Agent Server images for this PR

GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

| Variant | Architectures | Base Image | Docs / Tags |
|---------|---------------|------------|-------------|
| java | amd64, arm64 | eclipse-temurin:17-jdk | Link |
| python | amd64, arm64 | nikolaik/python-nodejs:python3.13-nodejs22-slim | Link |
| golang | amd64, arm64 | golang:1.21-bookworm | Link |

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:3715f62-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-3715f62-python \
  ghcr.io/openhands/agent-server:3715f62-python

All tags pushed for this build

ghcr.io/openhands/agent-server:3715f62-golang-amd64
ghcr.io/openhands/agent-server:3715f62-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:3715f62-golang-arm64
ghcr.io/openhands/agent-server:3715f62-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:3715f62-java-amd64
ghcr.io/openhands/agent-server:3715f62-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:3715f62-java-arm64
ghcr.io/openhands/agent-server:3715f62-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:3715f62-python-amd64
ghcr.io/openhands/agent-server:3715f62-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-amd64
ghcr.io/openhands/agent-server:3715f62-python-arm64
ghcr.io/openhands/agent-server:3715f62-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-arm64
ghcr.io/openhands/agent-server:3715f62-golang
ghcr.io/openhands/agent-server:3715f62-java
ghcr.io/openhands/agent-server:3715f62-python

About Multi-Architecture Support

  • Each variant tag (e.g., 3715f62-python) is a multi-arch manifest supporting both amd64 and arm64
  • Docker automatically pulls the correct architecture for your platform
  • Individual architecture tags (e.g., 3715f62-python-amd64) are also available if needed

Instructs the agent to check if an existing PR is still open before
pushing to its branch. If the PR has been closed or merged, the agent
should create a new branch and open a new PR instead.

Co-authored-by: openhands <openhands@all-hands.dev>

github-actions bot (Contributor) commented Apr 3, 2026

Python API breakage checks — ✅ PASSED

Result: PASSED

Action log

xingyaoww marked this pull request as ready for review April 3, 2026 18:01

github-actions bot (Contributor) commented Apr 3, 2026

REST API breakage checks (OpenAPI) — ✅ PASSED

Result: PASSED

Action log

all-hands-bot (Collaborator) left a comment

🟡 Acceptable with eval verification needed

This prompt change solves a real problem (agents pushing to closed/merged PR branches) but requires benchmark verification per repo policy. The instruction is clear and pragmatic—just needs confirmation it doesn't cause unintended behavior.

xingyaoww added the run-eval-50 label (Runs evaluation on 50 SWE-bench instances) Apr 3, 2026

github-actions bot (Contributor) commented Apr 3, 2026

Evaluation Triggered

all-hands-bot (Collaborator) commented

🎉 Swebench Evaluation Complete

Evaluation: 23956613410-claude-son
Model: litellm_proxy/claude-sonnet-4-5-20250929
Dataset: benchmark (test)
Commit: 574398690ef6dcefbfc2a6d44adf2a5455b8fbf6
Timestamp: 26-04-03-20-50
Triggered by: @xingyao Wang

📊 Results

  • Total instances: 500
  • Submitted instances: 50
  • Resolved instances: 37
  • Unresolved instances: 13
  • Empty patch instances: 0
  • Error instances: 0
  • Eval limit: 50
  • Success rate: 37/50 (74.0%)
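
As a sanity check on the numbers above, the success rate can be recomputed from the reported counts. The sketch below also adds a 95% Wilson score interval to show how wide the uncertainty is on a 50-instance sample. The interval is an illustration added here and is not produced by the evaluation tooling; the function names are hypothetical.

```python
import math


def success_rate(resolved: int, submitted: int) -> float:
    """Fraction of submitted instances that were resolved."""
    return resolved / submitted


def wilson_interval(resolved: int, submitted: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 for roughly 95%)."""
    p = resolved / submitted
    denom = 1 + z**2 / submitted
    center = (p + z**2 / (2 * submitted)) / denom
    half = z * math.sqrt(p * (1 - p) / submitted + z**2 / (4 * submitted**2)) / denom
    return center - half, center + half


rate = success_rate(37, 50)
low, high = wilson_interval(37, 50)
print(f"success rate: {rate:.1%}")
print(f"approx. 95% interval: {low:.1%} to {high:.1%}")
```

With only 50 instances the interval spans well over twenty percentage points, so a run like this can catch large regressions but not small ones.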

🔗 Links

Full Archive

enyst (Collaborator) left a comment

Thank you! Yes, I've had to specify this myself, or the agent pushes blindly.

On a side note, sometimes I wonder if GitHub PRs belong in the system prompt... maybe a question for another day! This makes sense now.

enyst merged commit 4ad68fd into main Apr 3, 2026
51 checks passed
enyst deleted the add-pr-status-check-instruction branch April 3, 2026 22:19

enyst (Collaborator) commented Apr 3, 2026

37/50 looks OK.
