ARCHITECTURE MAP: Read
architecture.mdbefore making structural changes. Keep it updated when you introduce new top-level modules, flows, or dependency directions.
DECISION LOG: When making a significant design or architecture decision, always add an entry to
decisions.mdusing theDEC-NNNformat. Next ID: DEC-018.
ExcelBench is an objective, reproducible benchmark suite that scores Python Excel libraries on feature fidelity (correctness) and optionally performance (speed/memory). It tests 17 features across 12+ adapters using real Excel-generated fixtures as ground truth.
# Install deps
uv sync
# Run tests
uv run pytest
# Run fidelity benchmark (requires pre-built fixtures)
uv run excelbench benchmark --tests fixtures/excel --output results
# Run both xlsx + xls profiles
uv run excelbench benchmark-profiles --output results
# Run performance benchmark
uv run excelbench perf --tests fixtures/excel --output results
# Regenerate reports from existing results.json (no re-run needed)
uv run excelbench report --input results/xlsx/results.json --output results/xlsx
# Generate heatmap PNG + SVG
uv run excelbench heatmap
# Generate combined fidelity + performance dashboard
uv run excelbench dashboard
# Generate single-file interactive HTML dashboard
uv run excelbench html
# Generate fidelity-vs-throughput scatter plots
uv run excelbench scatter
# Lint + typecheck
uv run ruff check
uv run mypyuv sync --extra rust # installs wolfxl from PyPI (pre-built wheel)
uv run python -c "import wolfxl; print(wolfxl.__version__)"For local development of wolfxl itself, see https://github.com/SynthGL/wolfxl.
Note: umya/basic calamine adapters still require local maturin develop from rust/excelbench_rust/.
uv run excelbench generate --output fixtures/excel # .xlsx (xlwings + Excel)
uv run excelbench generate-xls --output fixtures/excel_xls # .xlssrc/excelbench/
cli.py # Typer CLI: generate, benchmark, perf, report
models.py # Core dataclasses: CellValue, CellFormat, BorderInfo, etc.
generator/
base.py # FeatureGenerator protocol
generate.py # xlsx generation orchestrator (xlwings)
generate_xls.py # xls generation (xlwt)
features/ # 17 feature generators (one file per feature)
harness/
runner.py # Benchmark orchestrator: read tests + write verification
adapters/
base.py # ExcelAdapter ABC + ReadOnlyAdapter/WriteOnlyAdapter
__init__.py # Registry: get_all_adapters() with optional-import guards
openpyxl_adapter.py # Reference adapter (R+W, full fidelity)
xlsxwriter_adapter.py # Write-only, full fidelity
calamine_adapter.py # python-calamine (read-only)
... # 12 more adapters (see registry)
rust_*.py / umya_*.py # Rust/PyO3 adapters (require wolfxl._rust)
perf/
runner.py # Performance measurement (wall/cpu/rss, phase breakdown)
renderer.py # Perf results to markdown/CSV/JSON
results/
renderer.py # Fidelity results to markdown/CSV/JSON
html_dashboard.py # Single-file interactive HTML dashboard generator
scatter.py # Fidelity-vs-throughput scatter plots (PNG/SVG)
rust/excelbench_rust/ # Local-only PyO3 crate for ExcelBench-specific backends
src/lib.rs # Module entry + build_info()
src/calamine_backend.rs # Basic calamine read bindings (no styles)
src/umya_backend.rs # umya-spreadsheet R+W bindings
# WolfXL (standalone — https://github.com/SynthGL/wolfxl)
# Installed from PyPI: `uv sync --extra rust` or `pip install wolfxl`
# Core backends: calamine-styled (read), rust_xlsxwriter (write), XlsxPatcher (modify)
fixtures/
excel/ # Canonical .xlsx fixtures (git-tracked, Excel-generated)
excel_xls/ # Canonical .xls fixtures
throughput_xlsx/ # Scale fixtures for perf benchmarks
tests/ # pytest + pytest-cov
- Create
src/excelbench/harness/adapters/<name>_adapter.py - Subclass
ExcelAdapter(orReadOnlyAdapter/WriteOnlyAdapter) - Implement required abstract methods (see
base.py) - Add optional-import guard + registration in
__init__.py→get_all_adapters() - Add dependency to
pyproject.tomlif needed
- Create generator in
src/excelbench/generator/features/<name>.py(subclassFeatureGenerator) - Register in
features/__init__.pyandgenerate.py - Add read/write handling in
harness/runner.py - Add adapter methods to
base.pyprotocol + implement in each adapter - Regenerate fixtures:
uv run excelbench generate --output fixtures/excel
- 3 = all tests pass (complete fidelity)
- 2 = >=80% pass
- 1 = >=50% pass
- 0 = <50% pass
- Per-test importance weights:
basic(must-pass) vsedge(bonus)
- Primary: Excel via xlwings (write verification)
- Fallback: openpyxl (CI/headless)
- Performance mode skips oracle entirely
- Python 3.11+, type hints required on function signatures
- Package manager:
uv(lockfile:uv.lock) - Linter:
ruff(line-length 100, rules: E/F/I/N/W/UP) - Type checker:
mypy --strict(subset of files; expand incrementally) - Test runner:
pytest -v --cov=excelbench - Build backend: hatchling (pure Python); Rust via maturin (separate crate)
- Commit style: conventional commits (
feat/fix/refactor/test/docs/chore)
- Fidelity: Tier 0 (3) + Tier 1 (6) + Tier 2 (8, incl. pivot_tables=N/A) = 17 features tested, 16 with scores
- Adapters: 12 Python xlsx + 2 xls + 5 Rust/PyO3 = 19 total
- WolfXL (hybrid): calamine read + rust_xlsxwriter write → R:17/18 + W:17/18 green (S- tier)
- calamine-styled: R:17/18 green (borders=1 diagonal, images=0)
- rust_xlsxwriter: W:17/18 green (images=0)
- WolfXL modify mode:
load_workbook(path, modify=True)→ surgical ZIP patching, 10-14x vs openpyxl - pyumya: R:13/W:15 green (alignment indent, hyperlinks tooltip, images read = upstream)
- Performance: Runner + renderer + throughput dashboard operational; bulk read/write methods on Rust adapters
- Per-cell read (10K, release): WolfXL 995K/s vs openpyxl 284K/s → 3.5x faster
- Per-cell styled read (1K, release): WolfXL 624-742K/s vs openpyxl 131-137K/s → 4-5x faster
- Bulk read (10K, release): WolfXL 1.26M/s vs openpyxl 372K/s → 3.4x faster
- Bulk write (100K, release): WolfXL 1.73M/s vs openpyxl 347K/s → 5.0x faster
- Hybrid optimization: fast formula XML parser + Python cell cache + sheet XML caching
- Rust adapters have
read_sheet_values()andwrite_sheet_values()for bulk ops
- Visualizations: Heatmap (PNG/SVG), combined fidelity+perf dashboard, tier list, scatter plots
- Rust adapters: Built locally via maturin; pyo3 0.24; calamine fork at SynthGL/calamine#styles
- Features:
calamine,rust_xlsxwriter,umya,wolfxl(WolfXL patcher)
- Features:
- CLI commands: generate, generate-xls, benchmark, benchmark-profiles, perf, report, heatmap, dashboard, html, scatter
- CI/CD: GitHub Actions CI (lint/test/benchmark on all pushes) + deploy-dashboard (auto-deploys HTML to Vercel on results/generator changes)
docs/plans/2026-02-04-excelbench-design.md— Original design docdocs/plans/phase1-implementation.md— Phase 1 plan (done)docs/plans/phase3-rust-integration.md— Rust integration plan (mostly done)docs/plans/2026-02-08-performance-benchmarks.md— Perf track designdocs/trackers/library-expansion-tracker.md— Adapter inventory + scoresdocs/trackers/performance-benchmarks.md— Perf implementation trackerdocs/trackers/performance-benchmark-runs.md— Run log
- Dashboard URL: https://excelbench.vercel.app
- Auto-deploy workflow:
.github/workflows/deploy-dashboard.yml- Triggers on pushes to
masterthat changeresults/xlsx/results.json,results/perf/results.json,results/xlsx/*.svg, orsrc/excelbench/results/html_dashboard.py - Also supports manual trigger via
workflow_dispatch - Generates HTML dashboard with
uv run excelbench html --output deploy/index.html, then deploys to Vercel - Requires 3 GitHub secrets:
VERCEL_TOKEN,VERCEL_ORG_ID,VERCEL_PROJECT_ID
- Triggers on pushes to
uv syncclobbers locally-installed Rust extension — always rerunmaturin developafter- xlwings requires Excel installed + macOS accessibility permissions for fixture generation
- Pivot tables need a Windows-generated fixture (macOS Excel limitation)
test_files/is gitignored scratch space;fixtures/excel/is the canonical tracked setresults_dev_*directories are ephemeral benchmark outputs (gitignored)