multi-env evals config #734
base: main
Cursor Bugbot has reviewed your changes and found 1 potential issue.
        raw_multi_env_config = [{"env_id": env_id} for env_id in env_ids]
    else:
        # single-eval env
        raw_multi_env_config = [{"env_id": args.env_id_or_path}]
Missing TOML path gives confusing module-not-found error
Low Severity
When a user provides a path ending in .toml but the file doesn't exist (e.g., typo in path like config/debug.toml instead of configs/debug.toml), is_toml_config returns False because Path.is_file() fails. The code then falls through to treating the path as an environment ID, causing a confusing "module not found" error instead of "TOML config file not found". Since no valid environment ID would end in .toml, paths with this extension should check for file existence and give a clear error message when missing.
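One way to address this is sketched below. It is only an illustration of the suggested behavior, not the PR's actual code: the helper name classify_env_arg and its return values are hypothetical, and the real is_toml_config may have a different signature.

```python
from pathlib import Path


def classify_env_arg(env_id_or_path: str) -> str:
    """Return "toml" for an existing TOML file, "env_id" otherwise.

    Hypothetical helper: it raises a clear error when the argument looks
    like a TOML path but the file is missing, instead of falling through
    to the environment-ID branch.
    """
    path = Path(env_id_or_path)
    if path.suffix == ".toml":
        # No valid environment ID ends in .toml, so a missing file is an error.
        if not path.is_file():
            raise FileNotFoundError(f"TOML config file not found: {path}")
        return "toml"
    return "env_id"
```

With this check, a typo such as config/debug.toml fails immediately with a file-not-found message rather than a module-not-found error.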
Description
This PR implements evaluating multiple environments in parallel via vf-eval. For more details, check the updated docs.
This PR is mainly concerned with the config system. Cosmetic updates will be shipped separately, e.g., see #735.
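As a rough illustration of what running evaluations in parallel can look like, here is a minimal asyncio sketch. The functions evaluate_env and evaluate_all and the env IDs are placeholders invented for this example; they are not the functions added by this PR.

```python
import asyncio


async def evaluate_env(env_id: str) -> dict:
    # Placeholder for a real evaluation; the sleep simulates I/O-bound work.
    await asyncio.sleep(0.01)
    return {"env_id": env_id, "status": "done"}


async def evaluate_all(env_ids: list[str]) -> list[dict]:
    # gather() runs all coroutines concurrently and returns results
    # in the same order as the inputs.
    return await asyncio.gather(*(evaluate_env(e) for e in env_ids))


results = asyncio.run(evaluate_all(["env-a", "env-b"]))
```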
Examples
By default, we still evaluate a single env with no changes to the interface.
To configure multi-environment evaluation, specify a comma-separated list of env IDs.
Note that all environments use their default configuration. Since CLI arguments apply to all environments, values can only be changed for all environments at once. For more fine-grained configurability, see below.
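The comma-separated form can be pictured as the following sketch, which mirrors the list comprehension quoted in the review diff. The function name build_raw_multi_env_config and the whitespace handling are assumptions for illustration, and the env IDs are placeholders.

```python
def build_raw_multi_env_config(env_id_or_path: str) -> list[dict[str, str]]:
    """Split a comma-separated argument into one config entry per env ID."""
    # Strip whitespace and skip empty pieces (e.g. from a trailing comma).
    env_ids = [part.strip() for part in env_id_or_path.split(",") if part.strip()]
    return [{"env_id": env_id} for env_id in env_ids]


single = build_raw_multi_env_config("env-a")
multi = build_raw_multi_env_config("env-a,env-b")
```

A single env ID yields a one-element list, so both cases can share the same downstream code path.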
To configure multi-environment evaluation with (potentially) different arguments for each environment, specify a path to a TOML config file.
Type of Change
Testing
uv run pytest locally.
Checklist
Additional Notes
Note
Introduces parallel multi-environment evaluation and a more flexible CLI.
env_id_or_path now accepts a single env ID, a comma-separated list, or a TOML file; per-env settings resolve with precedence: TOML > CLI > env defaults > global
MultiEvalConfig and run_multi_evaluation() execute all envs concurrently; refactors single-run flow and centralizes result printing/performance reporting
is_toml_config() and load_toml_config() with validation; simplifies print_results and moves event loop lag monitoring to multi-run; reduces lag monitor log level to debug
print_results from EvalConfig; retains existing flags/behavior for single-env runs
configs/evals/debug.toml
Written by Cursor Bugbot for commit c4d690d. This will update automatically on new commits.
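The precedence order stated in the summary (TOML > CLI > env defaults > global) can be sketched as a simple first-match lookup. This is an assumption-laden illustration: resolve_setting, the dict-based sources, and the num_examples key are hypothetical and do not reflect how the PR actually stores its settings.

```python
def resolve_setting(key, toml_cfg, cli_args, env_defaults, global_defaults):
    """Return the first non-None value for key, checking sources in
    precedence order: TOML > CLI > env defaults > global."""
    for source in (toml_cfg, cli_args, env_defaults, global_defaults):
        if source.get(key) is not None:
            return source[key]
    raise KeyError(key)


# Hypothetical setting name, used only for illustration.
global_defaults = {"num_examples": 5}
from_toml = resolve_setting("num_examples", {"num_examples": 20}, {"num_examples": 10}, {}, global_defaults)
from_cli = resolve_setting("num_examples", {}, {"num_examples": 10}, {}, global_defaults)
from_global = resolve_setting("num_examples", {}, {}, {}, global_defaults)
```

Treating None as "not set" lets an unset CLI flag fall through to the environment or global default.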