Problem or motivation
Problem
Tuning PaperBanana (VLM/image providers and models, refinement iterations, --optimize, --auto) means many manual runs and informal comparison of outputs/run_* folders. There is no single artifact that lists what was tried and how variants ranked.
Proposal
Add a paperbanana sweep command that:
- Takes one --input + --caption and sweeps over configurable axes (e.g. providers, models, iterations, optimize/auto).
- Writes each variant under sweep_<id>/variant_<nnn>/ and produces sweep_report.json with per-variant status, timing, and a ranked summary.
- Supports --dry-run to preview the variant matrix without API calls.
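To make the variant matrix concrete, here is a minimal sketch of how comma-separated axes could expand into a Cartesian product and be previewed as a dry run would. The names parse_axis and build_variants, the axis keys, and the provider values are all hypothetical placeholders, not existing PaperBanana APIs or provider names.

```python
from __future__ import annotations

from itertools import product


def parse_axis(raw: str) -> list[str]:
    """Split a comma-separated axis value, dropping blank entries."""
    values = [v.strip() for v in raw.split(",") if v.strip()]
    if not values:
        raise ValueError(f"axis has no values: {raw!r}")
    return values


def build_variants(
    axes: dict[str, str], max_variants: int | None = None
) -> list[dict[str, str]]:
    """Expand the axes into their Cartesian product, one dict per variant."""
    names = list(axes)
    parsed = [parse_axis(axes[name]) for name in names]
    variants = [dict(zip(names, combo)) for combo in product(*parsed)]
    # Hypothetical cap, mirroring the proposed --max-variants option.
    return variants if max_variants is None else variants[:max_variants]


# Placeholder axis values; real provider names would come from the CLI.
variants = build_variants({"provider": "provider_a,provider_b", "iterations": "1,3"})

# Dry-run style preview: list the planned variants without calling any API.
for index, variant in enumerate(variants, start=1):
    print(f"variant_{index:03d}", variant)
```

Two axes with two values each yield four variants. Because the variant count grows multiplicatively with each added axis, a cap such as --max-variants is worth having.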
Proposed solution
- Add paperbanana sweep: one input + caption; axes as comma-separated lists (providers, models, iterations, optimize/auto); optional --max-variants and --dry-run.
- Add paperbanana/core/sweep.py: build the variant matrix (Cartesian product), parse values, rank successes, summarize.
- Per variant: merge the variant's values into Settings, set output_dir to sweep_<id>/variant_<nnn>/, run PaperBananaPipeline.generate(), and record timing plus a simple score derived from the final critic suggestions; failed variants get an error field instead.
- Output: sweep_report.json with results and ranked list; dry-run writes planned variants only, no API calls.
- Tests: sweep helpers + CLI dry-run / bad axis input.
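The ranking and report steps above could look roughly like the sketch below. The scoring rule (fewer remaining critic suggestions scores higher), the field names, and the sample results are all assumptions made for illustration; the issue does not define the score formula or the report schema.

```python
from __future__ import annotations

import json


def score_from_suggestions(suggestions: list[str]) -> float:
    """Assumed heuristic: fewer remaining critic suggestions is better."""
    return 1.0 / (1.0 + len(suggestions))


def summarize(results: list[dict]) -> dict:
    """Keep every result, and rank only the successful variants."""
    successes = [r for r in results if r["status"] == "success"]
    # Highest score first; faster run breaks ties.
    ranked = sorted(successes, key=lambda r: (-r["score"], r["seconds"]))
    return {"results": results, "ranked": [r["variant"] for r in ranked]}


# Made-up sample data, only to show the shape of a report.
results = [
    {"variant": "variant_001", "status": "success", "seconds": 42.0,
     "score": score_from_suggestions(["fix legend", "align arrows"])},
    {"variant": "variant_002", "status": "success", "seconds": 55.0,
     "score": score_from_suggestions([])},
    {"variant": "variant_003", "status": "failed", "seconds": 3.0,
     "error": "provider timeout"},
]

report = summarize(results)
print(json.dumps(report, indent=2))
```

Failed variants stay in results with their error field but are excluded from the ranked list, so the report still records everything that was tried.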
Area
Pipeline / agents
Alternatives considered
No response
Willingness to contribute