This file is the high-level map of the ExcelBench codebase: what lives where, the allowed dependency direction, and the main runtime/data flows.
If you are starting a new session:
- Read this file (`architecture.md`) to orient.
- Read `CLAUDE.md` for commands, workflows, and repo conventions.
- Check active trackers under `docs/trackers/` for current status and run history.
ExcelBench has two complementary tracks:
- Fidelity (correctness): "Does this library preserve Excel semantics for feature X?"
- Performance (speed/memory): "How fast is it at feature X or a scaled workload?"
A key design principle is reproducibility:
- Canonical fixtures are generated by real Excel and committed to git.
- Benchmarks produce JSON results as the source of truth, and render all other views from it.
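As a hedged sketch of that principle: a result record is serialized to JSON once, and every other view is derived from the JSON rather than recomputed. The `FeatureResult` fields below are illustrative assumptions, not the repo's actual schema (which is defined by the runner output):

```python
import json
from dataclasses import dataclass, asdict

# Hypothetical result record -- field names are illustrative only.
@dataclass
class FeatureResult:
    library: str
    feature: str
    score: float          # 0.0-1.0 fidelity score
    diagnostics: list[str]

results = [
    FeatureResult("openpyxl", "borders", 0.92, []),
    FeatureResult("wolfxl", "borders", 1.0, []),
]

# results.json is the single source of truth...
payload = json.dumps([asdict(r) for r in results], indent=2)

# ...and every other view (markdown table, CSV, plots) is rendered from it.
loaded = json.loads(payload)
markdown_rows = [
    f"| {r['library']} | {r['feature']} | {r['score']:.2f} |" for r in loaded
]
```

The point of the round-trip is that renderers never see live benchmark objects, only the committed JSON.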
At a high level, ExcelBench is split into five layers:
- Fixtures + generator (Excel as ground truth)
- Fidelity harness (adapters + scoring + diagnostics)
- Performance harness (throughput workloads + best-effort memory)
- Rendering + publishing (markdown/csv + HTML dashboard + plots)
- Optional Rust acceleration (PyO3 extension + Rust-backed adapters)
One-way dependencies only:
- `models.py` defines the core contracts.
- Adapters depend on models.
- Runners depend on adapters and models.
- Renderers depend on runner output schemas.
In practice:

```text
models
  ^
adapters
  ^
harness runner (fidelity)
  ^
results renderer / visualizations

perf runner (performance) -> perf renderer
```
Rust extension (optional) is called by Rust-backed adapters; it should not depend on Python code.
WolfXL is an external PyPI dependency — it does not depend on ExcelBench.
Rule of thumb: keep adapters thin and deterministic. Any cross-library normalization should live in runner utilities, not in adapter implementations.
Most-touched top-level directories:
- `src/excelbench/`
  - `cli.py`: Typer CLI entrypoint (`excelbench ...`)
  - `models.py`: dataclasses/contracts (`CellValue`, `CellFormat`, `BorderInfo`, ...)
  - `generator/`: fixture generation (xlwings + Excel)
  - `harness/`: fidelity benchmark runner + adapters
  - `perf/`: performance runner + renderer
  - `results/`: fidelity result renderers (md/csv) + dashboards/plots
- `fixtures/`
  - `excel/`: canonical `.xlsx` fixtures (git-tracked, Excel-generated)
  - `excel_xls/`: canonical `.xls` fixtures
  - `throughput_xlsx/`: scale fixtures for perf/throughput workloads
- `rust/excelbench_rust/` (optional, local-only)
  - PyO3 crate for ExcelBench-specific Rust backends (umya-spreadsheet, basic calamine)
  - The core WolfXL backends (calamine-styled, rust_xlsxwriter, xlsx patcher) are now in the standalone `wolfxl` package on PyPI (`pip install wolfxl`)
- WolfXL (external dependency, `pip install wolfxl`)
  - Standalone repo: https://github.com/SynthGL/wolfxl
  - Openpyxl-compatible API: `load_workbook`, `Workbook`, `Font`, `PatternFill`, etc.
  - Three modes: read (calamine-styled), write (rust_xlsxwriter), modify (`XlsxPatcher`)
  - Installed as optional dependency: `uv sync --extra rust`
- `tests/`: pytest suites (fidelity + adapter unit tests + visualization smoke tests)
- `docs/`: plans and trackers (treat as source of truth for methodology and run logs)
Important "scratch" conventions:
- `test_files/` is local scratch (gitignored).
- `results_dev_*` directories are local/ephemeral benchmark outputs (often gitignored).
Common starting points by intent:
- Add a new adapter:
  - `src/excelbench/harness/adapters/base.py`
  - `src/excelbench/harness/adapters/__init__.py`
- Add a new scored feature:
  - Generator: `src/excelbench/generator/features/`
  - Harness exercise/scoring: `src/excelbench/harness/runner.py`
- Extend Tier 2/3 OOXML parsing:
  - WolfXL (external): calamine-styled + ooxml utilities live in the wolfxl repo
  - ExcelBench-local Rust: `rust/excelbench_rust/src/` (basic calamine, umya bindings)
- Performance track:
  - Runner: `src/excelbench/perf/runner.py`
  - Renderer: `src/excelbench/perf/renderer.py`
  - Throughput driver: `scripts/run_throughput_dashboard.py`
  - Memory profiling: `scripts/memory_profile.py`
- Dashboards/plots:
  - HTML dashboard: `src/excelbench/results/html_dashboard.py`
  - Scatter plots: `src/excelbench/results/scatter.py`
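For the "add a new adapter" path, a common shape is a base class plus a registry the runner can discover adapters through. The sketch below is a hypothetical minimal version; the real hooks and registration mechanism live in `harness/adapters/base.py` and `harness/adapters/__init__.py`, and the names here are illustrative:

```python
from abc import ABC, abstractmethod
from typing import Any

# Hypothetical base class -- the real one defines the actual
# read/write hooks exercised by the fidelity runner.
class BaseAdapter(ABC):
    name: str

    @abstractmethod
    def read_cell(self, path: str, sheet: str, ref: str) -> Any:
        """Return the cell value as the underlying library reports it."""

# Registry so the runner can enumerate adapters by name.
ADAPTERS: dict[str, type[BaseAdapter]] = {}

def register(cls: type[BaseAdapter]) -> type[BaseAdapter]:
    ADAPTERS[cls.name] = cls
    return cls

@register
class DummyAdapter(BaseAdapter):
    name = "dummy"

    def read_cell(self, path: str, sheet: str, ref: str) -> Any:
        return None  # a real adapter would open `path` with its library
```

A new adapter then only subclasses the base and registers itself; the runner and renderers need no changes.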
Fixture generation:

```text
xlwings -> Excel
  -> writes feature workbooks
  -> writes manifest.json
  -> fixtures committed to git
```

Command: `uv run excelbench generate --output fixtures/excel`
Fidelity benchmark:

```text
fixtures + manifest
  -> runner loads adapters
  -> adapters read/write
  -> oracle verification (Excel via xlwings; fallback openpyxl)
  -> results.json + diagnostics
  -> renderers produce README.md/matrix.csv/plots
```

Command: `uv run excelbench benchmark --tests fixtures/excel --output results`
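The scoring-and-diagnostics step of that flow can be sketched as comparing adapter-reported values against the oracle's values cell by cell. This is a hypothetical simplification; the function name, score formula, and diagnostic format are assumptions, not the runner's actual logic:

```python
# Hypothetical scoring step: compare each adapter-reported value against
# the oracle's value and emit a diagnostic string for every mismatch.
def score_feature(expected: dict, actual: dict) -> tuple[float, list[str]]:
    diagnostics = []
    hits = 0
    for ref, want in expected.items():
        got = actual.get(ref)
        if got == want:
            hits += 1
        else:
            diagnostics.append(f"{ref}: expected {want!r}, got {got!r}")
    score = hits / len(expected) if expected else 1.0
    return score, diagnostics

expected = {"A1": "hello", "B1": 42}   # oracle values
actual = {"A1": "hello", "B1": 41}     # adapter-reported values
score, diags = score_feature(expected, actual)
# score == 0.5, with one diagnostic for B1
```

Diagnostics travel alongside the score in `results.json`, so renderers can show both the headline number and the per-cell failures.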
Performance benchmark:

```text
fixtures + throughput fixtures
  -> perf runner executes workloads (no oracle)
  -> wall/cpu/rss (and optional phase breakdown)
  -> perf/results.json
  -> perf renderer produces markdown/csv
```

Command: `uv run excelbench perf --tests fixtures/excel --output results`
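The wall/cpu/memory measurements in that flow can be sketched with the stdlib alone. Note the memory figure here uses `tracemalloc` as a stand-in: the actual runner reports RSS, which needs a platform-specific source (e.g. `resource.getrusage` on Unix or psutil), so treat this as a hedged approximation of the idea rather than the runner's implementation:

```python
import time
import tracemalloc

# Hypothetical workload timing: wall and CPU time via the stdlib, plus
# best-effort peak memory via tracemalloc (a stand-in for RSS).
def measure(workload):
    tracemalloc.start()
    wall0, cpu0 = time.perf_counter(), time.process_time()
    result = workload()
    wall, cpu = time.perf_counter() - wall0, time.process_time() - cpu0
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return {"wall_s": wall, "cpu_s": cpu, "peak_bytes": peak, "result": result}

stats = measure(lambda: sum(range(100_000)))
```

Because there is no oracle in the perf track, the only outputs are these measurements, written to `perf/results.json` for the renderer.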
- Local: `uv run excelbench html`, `uv run excelbench scatter`, `uv run excelbench heatmap`
- CI: `.github/workflows/deploy-dashboard.yml` auto-builds and deploys the HTML dashboard to Vercel
Update architecture.md when:
- A new top-level module/directory is introduced.
- Dependency direction changes (new allowed imports / new shared utilities).
- A new CLI command or major runner mode is added.
- A new dashboard/output becomes a supported interface.