Add plan runner mode (run multiple benchmarks unattended) #92

etiennedi · 2026-01-06T13:58:09Z

Add Plan Runner for Automated Benchmarking

This PR adds plan_runner.py, a new automation tool for running benchmarks across multiple configurations and comparing control vs candidate branches.

What's New

Plan Runner (plan_runner.py) automates the pattern of:

- Ingesting data on a specified branch
- Running query benchmarks on both control and candidate branches
- Generating visualizations comparing the results
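
As a rough illustration of the pattern above, the sketch below expands one run into an ordered list of steps. The function name `expand_run` and the `(action, branch)` tuple shape are hypothetical and not taken from plan_runner.py; the sketch only shows the ordering (ingest once, query on each branch, then visualize).

```python
from typing import List, Tuple


def expand_run(ingest_branch: str, control: str, candidate: str) -> List[Tuple[str, str]]:
    """Return (action, target) steps for one run, in execution order.

    Hypothetical helper: the real plan_runner.py may structure this differently.
    """
    # Data is ingested once, on the branch chosen for ingestion.
    steps = [("ingest", ingest_branch)]
    # The same query benchmark is then run on both branches.
    for branch in (control, candidate):
        steps.append(("query", branch))
    # Finally, the two result sets are compared in a visualization.
    steps.append(("visualize", f"{control} vs {candidate}"))
    return steps
```

For example, `expand_run("main", "main", "my-feature")` yields one ingest step, two query steps, and one visualize step.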

All configuration is defined in a YAML plan file that specifies:

- Control and candidate branches to compare
- Global parameters shared across all runs
- Individual run configurations with specific parameters
- Optional async indexing per run
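
To make the plan structure more concrete, here is a minimal sketch of how global and per-run parameters might be combined. Every key name (`control_branch`, `candidate_branch`, `global_parameters`, `runs`, `async_indexing`) and every parameter name is a placeholder; plan.yml.example is the authoritative reference for the actual schema. The sketch also assumes that per-run values override global ones and that async indexing defaults to off, neither of which is stated in this PR.

```python
from typing import Any, Dict, List

# Hypothetical plan, shown as the dict a YAML loader might return.
# Key and parameter names are placeholders, not the real schema.
EXAMPLE_PLAN: Dict[str, Any] = {
    "control_branch": "main",
    "candidate_branch": "my-feature",
    "global_parameters": {"distance": "cosine", "limit": 10},
    "runs": [
        {"name": "baseline", "parameters": {"efConstruction": 128}},
        {"name": "async", "parameters": {"limit": 100}, "async_indexing": True},
    ],
}


def resolve_runs(plan: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Merge global parameters into each run (assumed: run values win)."""
    resolved = []
    for run in plan["runs"]:
        # Later entries in the merged dict override earlier ones.
        params = {**plan.get("global_parameters", {}), **run.get("parameters", {})}
        resolved.append(
            {
                "name": run["name"],
                "parameters": params,
                # Assumed default: async indexing is off unless the run enables it.
                "async_indexing": bool(run.get("async_indexing", False)),
            }
        )
    return resolved
```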

Backward Compatibility

No changes to existing workflows. The benchmarker and visualizer can still be used independently for single runs. This tool is purely additive.

Getting Started

- Copy plan.yml.example to plan.yml
- Adjust branches, parameters, and runs to your needs
- Run: python3 plan_runner.py plan.yml

Use --dry-run to preview what will be executed without running anything.
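
A dry-run flag of this kind is commonly implemented by routing every command through one function that either executes it or only records it. The sketch below shows that approach with `argparse` and `subprocess`; the helper names `parse_args` and `execute` are hypothetical, and the actual implementation in plan_runner.py may differ.

```python
import argparse
import subprocess
from typing import List, Optional


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse the plan path and the --dry-run flag."""
    parser = argparse.ArgumentParser(prog="plan_runner.py")
    parser.add_argument("plan", help="path to the YAML plan file")
    parser.add_argument(
        "--dry-run",
        action="store_true",
        help="print what would be executed without running anything",
    )
    return parser.parse_args(argv)


def execute(commands: List[List[str]], dry_run: bool) -> List[str]:
    """Run each command, or only record it when dry_run is set."""
    log = []
    for cmd in commands:
        line = " ".join(cmd)
        if dry_run:
            # Nothing is executed; the command is only reported.
            log.append(f"[dry-run] {line}")
            continue
        subprocess.run(cmd, check=True)
        log.append(line)
    return log
```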

Features

- Automatic branch switching and Weaviate rebuilding
- Result archiving in results_archive/
- Visualizations in visualizations/ with:
  - Run name as title
  - All parameters as subtitle
  - Branch information at bottom
- Per-run async indexing configuration
- Process cleanup - kills stale Weaviate instances
- Graceful error handling and logging
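
The visualization labelling described above (run name as title, parameters as subtitle, branch information at the bottom) could be assembled along these lines. This is only a sketch: `figure_labels`, its return shape, and the text formatting are assumptions, not the visualizer's actual code.

```python
from typing import Any, Dict


def figure_labels(
    run_name: str, parameters: Dict[str, Any], control: str, candidate: str
) -> Dict[str, str]:
    """Build the three text labels for one comparison figure (hypothetical helper)."""
    # Sort the keys so the subtitle is stable across runs.
    subtitle = ", ".join(f"{key}={value}" for key, value in sorted(parameters.items()))
    return {
        "title": run_name,
        "subtitle": subtitle,
        "footer": f"control: {control} | candidate: {candidate}",
    }
```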