feat: add parameter sweep command for variant generation and ranking #118

Open
claytonlin1110 wants to merge 5 commits into llmsresearch:main from claytonlin1110:feat/cli-parameter-sweep

@claytonlin1110
Contributor

@claytonlin1110 commented Mar 25, 2026

Summary

Closes #119

  • add new paperbanana sweep CLI command to run a cartesian sweep across providers, models, iterations, and optimization/auto-refine modes
  • add paperbanana/core/sweep.py with structured variant planning, CSV axis parsing, ranking, and summary helpers
  • persist sweep outputs under sweep_<id>/variant_<id>/ and write sweep_report.json with per-variant status, runtime, and ranked results
  • include dry-run mode to preview planned variants without API calls
  • add tests for sweep core helpers and CLI dry-run / validation behavior
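
As a rough illustration of the cartesian planning and CSV axis parsing described above, a minimal sketch follows. The names `parse_csv_axis`, `Variant`, and `plan_variants`, the `variant_001`-style ids, and the de-duplication of repeated axis values are assumptions made for this example, not the actual contents of paperbanana/core/sweep.py. A single `optimize` flag stands in for the optimization/auto-refine axes.

```python
import itertools
from dataclasses import dataclass


def parse_csv_axis(raw: str) -> list[str]:
    """Split a comma-separated flag value into trimmed, unique items (order kept)."""
    seen: dict[str, None] = {}
    for item in raw.split(","):
        item = item.strip()
        if item:
            seen.setdefault(item, None)
    return list(seen)


@dataclass(frozen=True)
class Variant:
    # Hypothetical variant record; field names are illustrative only.
    variant_id: str
    provider: str
    model: str
    iterations: int
    optimize: bool


def plan_variants(providers, models, iterations, optimize_modes):
    """Build one Variant per element of the cartesian product of the axes."""
    combos = itertools.product(providers, models, iterations, optimize_modes)
    return [
        Variant(f"variant_{i:03d}", p, m, n, o)
        for i, (p, m, n, o) in enumerate(combos, start=1)
    ]
```

Planning is pure (no API calls), so a dry run can presumably print the planned list and stop. Two providers, one model, two iteration counts, and two optimize modes would give 2 x 1 x 2 x 2 = 8 variants.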

Motivation

One diagram depends on many settings (providers, models, iterations, optimize/auto-refine). Changing flags and comparing runs by hand is slow and easy to lose track of. A sweep runs those combinations in one go, writes a single ranked report, and supports dry-run so you can plan before spending API quota.

@claytonlin1110
Contributor Author

@dippatel1994 Would you please review this?

@claytonlin1110
Contributor Author

@dippatel1994 please review

@claytonlin1110
Contributor Author

@dippatel1994 Any update about this feature?

Member

@dippatel1994 left a comment

CI passes, nice work. A few things to fix:

  1. Settings constructed per-variant inside the loop: load_dotenv() should be called once before the loop, and Settings should be built once, then copied with model_copy(update=overrides) per variant. Currently the YAML is re-parsed on every iteration.

  2. Missing --pdf-pages option — Every other command that accepts --input with PDF support also exposes --pdf-pages. The sweep command calls load_methodology_source(input_path) without it.

  3. No non-dry-run test — Only --dry-run and validation are tested. Add a test that mocks the pipeline and verifies the sweep report structure (status, ranked results, timing). test_ablate_retrieval_writes_report is a good template.
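
The build-once, copy-per-variant pattern from point 1 can be sketched as below. To keep the example runnable with only the standard library, it uses a frozen dataclass and `dataclasses.replace` as a stand-in. With a pydantic v2 model, the analogous call is the `model_copy(update=overrides)` named in the review. The `Settings` fields and the `build_variant_settings` helper are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Settings:
    # Hypothetical fields; the real Settings class is not shown in this thread.
    provider: str = "default-provider"
    model: str = "default-model"
    iterations: int = 1


def build_variant_settings(base: Settings, overrides_per_variant: list[dict]) -> list[Settings]:
    """Derive one Settings object per variant from a single shared base.

    The base is constructed once by the caller (after any one-time setup such as
    loading environment variables); each variant is a cheap copy with overrides.
    """
    return [replace(base, **overrides) for overrides in overrides_per_variant]


base = Settings()  # built once, outside the loop
variant_settings = build_variant_settings(
    base,
    [{"model": "model-a"}, {"model": "model-b", "iterations": 3}],
)
```

The base object is never mutated, so one variant's overrides cannot leak into the next.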

Non-blocking: The quality proxy formula (100 - 12.5 * suggestions) is undocumented and fragile — consider at minimum documenting it in --help. Also missing --budget and --auto-download-data flags for parity with generate/batch.
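
For reference, the quality proxy quoted above works out as in the following sketch. Only the formula comes from the comment. The function names and the ranking helper are assumptions, and whether the PR clamps the score is not stated in this thread.

```python
def quality_proxy(suggestions: int) -> float:
    """Score a variant as 100 - 12.5 * suggestions, exactly as quoted in the review."""
    return 100 - 12.5 * suggestions


def rank_by_proxy(suggestion_counts: dict[str, int]) -> list[tuple[str, float]]:
    """Hypothetical helper: sort variants by proxy score, best first."""
    scored = [(name, quality_proxy(count)) for name, count in suggestion_counts.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

With this formula, zero suggestions scores 100, eight suggestions scores 0, and anything beyond eight goes negative unless the caller clamps it.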

@claytonlin1110
Copy link
Copy Markdown
Contributor Author

Thank you for your feedback, @dippatel1994
Just updated the PR

Member

@dippatel1994 left a comment

All 3 points addressed. Settings built once with model_copy per variant, --pdf-pages added, non-dry-run test added. CI green. LGTM.
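
A non-dry-run test of the kind mentioned here might look roughly like the sketch below, which mocks the pipeline and checks status, ranking, and timing in the written report. It is self-contained for illustration: `run_sweep` and the report keys are hypothetical stand-ins, not the PR's actual code or its test. Only the sweep_report.json file name comes from the PR description.

```python
import json
import tempfile
import time
from pathlib import Path
from unittest.mock import Mock


def run_sweep(variants, pipeline, out_dir):
    """Hypothetical sweep loop: run each variant, record status and runtime."""
    results = []
    for name in variants:
        started = time.perf_counter()
        try:
            score, status = pipeline(name), "ok"
        except Exception:
            score, status = None, "failed"
        results.append({
            "variant": name,
            "status": status,
            "score": score,
            "runtime_s": time.perf_counter() - started,
        })
    # Rank only the variants that completed, highest score first.
    ranked = sorted(
        (r for r in results if r["status"] == "ok"),
        key=lambda r: r["score"],
        reverse=True,
    )
    report = {"variants": results, "ranked": [r["variant"] for r in ranked]}
    path = Path(out_dir) / "sweep_report.json"
    path.write_text(json.dumps(report))
    return path


def test_sweep_writes_report():
    # The mock returns a score, raises, then returns a score: no API calls made.
    pipeline = Mock(side_effect=[50.0, RuntimeError("boom"), 87.5])
    with tempfile.TemporaryDirectory() as tmp:
        report = json.loads(run_sweep(["v1", "v2", "v3"], pipeline, tmp).read_text())
    assert pipeline.call_count == 3
    assert [v["status"] for v in report["variants"]] == ["ok", "failed", "ok"]
    assert report["ranked"] == ["v3", "v1"]
    assert all(v["runtime_s"] >= 0 for v in report["variants"])
    return report
```

Mixing a failure into the mocked results checks that one failed variant is recorded without aborting the rest of the sweep.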

@claytonlin1110
Contributor Author

Thanks @dippatel1994
Is this ready to be merged?

Development

Successfully merging this pull request may close these issues.

[Feature]: Add CLI parameter sweep to compare provider/model and pipeline settings

2 participants