Merged
42 changes: 29 additions & 13 deletions .github/run-eval/ADDINGMODEL.md
@@ -240,29 +240,44 @@ cd .github/run-eval
MODEL_IDS="your-model-id" GITHUB_OUTPUT=/tmp/output.txt python resolve_model_config.py
```

## Step 6: Run Integration Tests (Required Before PR)
## Step 6: Create Draft PR

**Mandatory**: Integration tests must pass before creating PR.
Push your branch and create a draft PR. Note the PR number returned - you'll need it for the integration tests.
Collaborator

🟡 Suggestion: You say "create a draft PR" but don't show how. Add the command:

`gh pr create --draft --title "Add model: your-model-id" --body "Fixes #<issue-number>"`

Then users can actually capture the PR number from the output.

Collaborator

🟡 Minor: Step 6 instructs users to "Note the PR number returned" but doesn't show the command to create a draft PR.

Current state: Users might create the PR via web UI where the number isn't explicitly "returned".

Suggestion: Consider adding the actual command for clarity:

Suggested change
Push your branch and create a draft PR. Note the PR number returned - you'll need it for the integration tests.
Push your branch and create a draft PR:
```bash
gh pr create --draft --title "Add your-model-id" --body "WIP: Adding new model"
# Note the PR number returned - you'll need it for the integration tests
```

This would make the workflow fully CLI-based and consistent with Steps 7 and 9. However, this is acceptable as-is since developers working on model additions likely know how to create PRs.
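
To capture the number on the command line, a minimal sketch follows. It assumes `gh pr create` prints the new PR's URL, ending in `/pull/<number>`, on success. The helper name `pr_number_from_url` is hypothetical and not part of the repository.

```bash
# Hypothetical helper: extract the trailing PR number from a PR URL.
# Assumes the URL ends in /pull/<number>, as printed by `gh pr create`.
pr_number_from_url() {
  local url="${1%/}"         # drop one trailing slash, if present
  local number="${url##*/}"  # keep everything after the last slash
  case "$number" in
    ''|*[!0-9]*)
      echo "not a PR URL: $1" >&2
      return 1
      ;;
  esac
  printf '%s\n' "$number"
}

# Usage (not executed here; requires an authenticated gh CLI):
#   pr_url="$(gh pr create --draft --title "Add your-model-id" --body "WIP: Adding new model")"
#   pr_number="$(pr_number_from_url "$pr_url")"
```

Rejecting non-numeric input means an unexpected `gh` output fails early instead of passing a bad value on to the later steps.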


### Via GitHub Actions
## Step 7: Run Integration Tests

1. Push branch: `git push origin your-branch-name`
2. Navigate to: https://github.com/OpenHands/software-agent-sdk/actions/workflows/integration-runner.yml
3. Click "Run workflow"
4. Configure:
- **Branch**: Select your branch
- **model_ids**: `your-model-id`
- **Reason**: "Testing model-id"
5. Wait for completion
6. **Save run URL** - required for PR description
Trigger integration tests on your PR branch:

```bash
gh workflow run integration-runner.yml \
-f model_ids=your-model-id \
-f reason="Testing new model from PR #<pr-number>" \
-f issue_number=<pr-number> \
--ref your-branch-name
```

Results will be posted back to the PR as a comment.
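
Because the PR number appears twice in the command above, an unfilled `<pr-number>` placeholder is easy to miss. The sketch below only prints the command for review rather than running it. It uses the flags shown in Step 7 and nothing else. The function name `integration_test_command` is hypothetical.

```bash
# Hypothetical helper: print the Step 7 command for a model, PR number, and branch.
# Nothing is executed; the output can be reviewed and then pasted into a shell.
integration_test_command() {
  local model_id="$1"
  local pr_number="$2"
  local branch="$3"
  case "$pr_number" in
    ''|*[!0-9]*)
      echo "PR number must be numeric: '$pr_number'" >&2
      return 1
      ;;
  esac
  printf 'gh workflow run integration-runner.yml -f model_ids=%s -f reason="Testing new model from PR #%s" -f issue_number=%s --ref %s\n' \
    "$model_id" "$pr_number" "$pr_number" "$branch"
}

# Example: prints the full command for PR 42 on your-branch-name.
integration_test_command your-model-id 42 your-branch-name
```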

### Expected Results

- Success rate: 100% (or 87.5% if vision test skipped)
- Duration: 5-10 minutes per model
- Tests: 8 total (basic commands, file ops, code editing, reasoning, errors, tools, context, vision)
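
The two success rates follow from the test count: 8 of 8 passing is 100%, and 7 of 8 is 87.5%. The sketch below reproduces that arithmetic. It assumes a skipped vision test is counted as not passing, and the function name `success_rate` is hypothetical.

```bash
# Hypothetical helper: success rate as a percentage with one decimal place.
success_rate() {
  local passed="$1"
  local total="$2"
  awk -v p="$passed" -v t="$total" 'BEGIN { printf "%.1f\n", (p * 100) / t }'
}

success_rate 8 8   # all eight tests pass: prints 100.0
success_rate 7 8   # vision test skipped: prints 87.5
```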

## Step 7: Create PR
## Step 8: Fix Issues and Rerun (if needed)

If tests fail, see [Common Issues](#common-issues) below. After fixing:

1. Push the fix: `git add . && git commit && git push`
2. Rerun integration tests with the same command from Step 7 (using the same PR number)

## Step 9: Mark PR Ready

When tests pass, mark the PR as ready for review:

```bash
gh pr ready <pr-number>
```

### Required in PR Description

Collaborator

🟡 Suggestion: Since workflow results are now auto-posted as PR comments (line 263), is the manual "Integration test results" link still required in the PR description? Either:

  1. Update this section to say the run URL is optional (workflow posts it automatically), or
  2. Clarify that contributors should still add it manually for visibility

Right now it's unclear whether this is redundant with the automated comment.

@@ -379,3 +394,4 @@ Fixes #[issue-number]
- Recent model additions: #2102, #2153, #2207, #2233, #2269
- Common issues: #2147 (hangs), #2137 (parameters), #2110 (vision), #2233 (variants), #2193 (preflight)
- Integration test workflow: `.github/workflows/integration-runner.yml`
- Integration tests can be triggered via: `gh workflow run integration-runner.yml --ref <branch>`