Commit 9309b05

Change eval_limit from choice to free-form string input
The choice dropdown only allowed specific values (1, 50, 100, 200, 500), making it annoying to debug with custom instance counts (e.g., 10). Change to a string input so any positive integer can be entered.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent 2d3d96d commit 9309b05

2 files changed, +3 -9 lines changed

.agents/skills/run-eval.md

Lines changed: 1 addition & 1 deletion
@@ -32,7 +32,7 @@ curl -X POST \
 
 **Key parameters:**
 - `benchmark`: `swebench`, `swebenchmultimodal`, `gaia`, `swtbench`, `commit0`, `multiswebench`
-- `eval_limit`: `1`, `50`, `100`, `200`, `500`
+- `eval_limit`: Any positive integer (e.g., `1`, `10`, `50`, `200`)
 - `model_ids`: See `.github/run-eval/resolve_model_config.py` for available models
 - `benchmarks_branch`: Use feature branch from the benchmarks repo to test benchmark changes before merging
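
For illustration only, the sketch below builds a request body that passes a custom `eval_limit` such as `10`. It assumes the `curl -X POST` call in the skill file targets GitHub's workflow dispatch endpoint, which takes a `ref` and an `inputs` object. The helper name `build_dispatch_payload` and the `main` ref are hypothetical and do not appear in this commit.

```python
import json


def build_dispatch_payload(benchmark: str, eval_limit: int, ref: str = "main") -> dict:
    """Build a workflow_dispatch request body (assumed shape: 'ref' plus 'inputs')."""
    if eval_limit < 1:
        raise ValueError("eval_limit must be a positive integer")
    return {
        "ref": ref,
        # eval_limit is now a string input, so it is sent as text rather than
        # picked from a fixed list of options.
        "inputs": {"benchmark": benchmark, "eval_limit": str(eval_limit)},
    }


payload = build_dispatch_payload("swebench", 10)
print(json.dumps(payload))
```

The value is converted with `str()` because the input is declared as `type: string` after this change.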

.github/workflows/run-eval.yml

Lines changed: 2 additions & 8 deletions
@@ -32,16 +32,10 @@ on:
         default: false
         type: boolean
       eval_limit:
-        description: Number of instances to run
+        description: Number of instances to run (any positive integer)
         required: false
         default: '1'
-        type: choice
-        options:
-          - '1'
-          - '100'
-          - '50'
-          - '200'
-          - '500'
+        type: string
       model_ids:
         description: Comma-separated model IDs to evaluate. Must be keys of MODELS in resolve_model_config.py. Defaults to first model in that
           dict.
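
A `type: string` input accepts arbitrary text, so the "any positive integer" constraint is no longer enforced by the dropdown. This commit does not show whether the workflow validates the value downstream. As a minimal sketch of such a check, assuming a hypothetical helper `parse_eval_limit` that is not part of the repository:

```python
import re


def parse_eval_limit(raw: str) -> int:
    """Parse a free-form eval_limit string and return it as a positive integer.

    Raises ValueError for empty, non-numeric, negative, fractional, or zero input.
    """
    text = raw.strip()
    # Accept ASCII digits only; this rejects signs, decimals, and empty strings.
    if not re.fullmatch(r"[0-9]+", text):
        raise ValueError(f"eval_limit must be a positive integer, got {raw!r}")
    value = int(text)
    if value < 1:
        raise ValueError(f"eval_limit must be at least 1, got {raw!r}")
    return value


print(parse_eval_limit("10"))
```

Failing early with a clear message avoids launching an evaluation run with a value such as `abc` or `0`.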
