Architecture Map (Read This Before Editing)

This file is the high-level map of the ExcelBench codebase: what lives where, the allowed dependency direction, and the main runtime/data flows.

If you are starting a new session:

  1. Read this file (architecture.md) to orient yourself.
  2. Read CLAUDE.md for commands, workflows, and repo conventions.
  3. Check active trackers under docs/trackers/ for current status and run history.

Big Picture

ExcelBench has two complementary tracks:

  • Fidelity (correctness): "Does this library preserve Excel semantics for feature X?"
  • Performance (speed/memory): "How fast is it at feature X or a scaled workload?"

A key design principle is reproducibility:

  • Canonical fixtures are generated by real Excel and committed to git.
  • Benchmarks produce JSON results as the source of truth, and render all other views from it.
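The "JSON as source of truth" principle can be sketched as follows. This is an illustrative example only: the dictionary keys (`library`, `scores`) and the `render_markdown` helper are invented for this sketch and are not the real ExcelBench schema or renderer.

```python
# Hypothetical illustration: results.json is canonical, and every other
# view (markdown, csv, plots) is derived from it rather than computed
# independently.  Field names below are invented for this sketch.
import json

results = {
    "library": "openpyxl",
    "scores": {"borders": 1.0, "merged_cells": 0.5},
}

def render_markdown(results: dict) -> str:
    """Derive a markdown table view from the canonical JSON results."""
    lines = ["| Feature | Score |", "| --- | --- |"]
    for feature, score in sorted(results["scores"].items()):
        lines.append(f"| {feature} | {score:.2f} |")
    return "\n".join(lines)

# Round-trip through JSON text, as if the renderer had read results.json.
canonical = json.loads(json.dumps(results))
print(render_markdown(canonical))
```

Because every view is derived from the same JSON, re-running a renderer never changes the recorded results, only their presentation.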

Core Layers

At a high level, ExcelBench is split into five layers:

  1. Fixtures + generator (Excel as ground truth)
  2. Fidelity harness (adapters + scoring + diagnostics)
  3. Performance harness (throughput workloads + best-effort memory)
  4. Rendering + publishing (markdown/csv + HTML dashboard + plots)
  5. Optional Rust acceleration (PyO3 extension + Rust-backed adapters)

Dependency Direction (No Cycles)

One-way dependencies only:

  • models.py defines the core contracts.
  • Adapters depend on models.
  • Runners depend on adapters and models.
  • Renderers depend on runner output schemas.

In practice:

models
  ^
adapters
  ^
harness runner (fidelity)
  ^
results renderer / visualizations

perf runner (performance) -> perf renderer

Rust extension (optional) is called by Rust-backed adapters; it should not depend on Python code.

WolfXL is an external PyPI dependency — it does not depend on ExcelBench.

Rule of thumb: keep adapters thin and deterministic. Any cross-library normalization should live in runner utilities, not in adapter implementations.
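A minimal sketch of that rule, with the dependency direction made explicit. All names here (`AdapterBase`, `FakeAdapter`, `read_cell`, `normalize_value`) are invented for illustration; the real contracts live in models.py and the harness.

```python
# Thin, deterministic adapter; cross-library normalization in the runner.
from dataclasses import dataclass

@dataclass
class CellValue:            # models layer: the shared contract
    raw: object

class AdapterBase:          # adapters depend on models, nothing above them
    def read_cell(self, ref: str) -> CellValue:
        raise NotImplementedError

class FakeAdapter(AdapterBase):
    def read_cell(self, ref: str) -> CellValue:
        # The adapter returns the library's value verbatim -- no cleanup.
        return CellValue(raw="  TRUE ")

def normalize_value(value: CellValue) -> bool:
    # Runner-side utility: normalization lives here, not in the adapter,
    # so every library is compared under identical rules.
    return str(value.raw).strip().lower() == "true"

print(normalize_value(FakeAdapter().read_cell("A1")))  # True
```

Keeping normalization out of adapters means a fix to the comparison rules applies to all libraries at once, instead of being re-implemented (and drifting) per adapter.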

Repo Map (Where Things Live)

Most-touched top-level directories:

  • src/excelbench/

    • cli.py: Typer CLI entrypoint (excelbench ...)
    • models.py: dataclasses/contracts (CellValue, CellFormat, BorderInfo, ...)
    • generator/: fixture generation (xlwings + Excel)
    • harness/: fidelity benchmark runner + adapters
    • perf/: performance runner + renderer
    • results/: fidelity result renderers (md/csv) + dashboards/plots
  • fixtures/

    • excel/: canonical .xlsx fixtures (git-tracked, Excel-generated)
    • excel_xls/: canonical .xls fixtures
    • throughput_xlsx/: scale fixtures for perf/throughput workloads
  • rust/excelbench_rust/ (optional, local-only)

    • PyO3 crate for ExcelBench-specific Rust backends (umya-spreadsheet, basic calamine)
    • The core WolfXL backends (calamine-styled, rust_xlsxwriter, xlsx patcher) are now in the standalone wolfxl package on PyPI (pip install wolfxl)
  • WolfXL (external dependency, pip install wolfxl)

    • Standalone repo: https://github.com/SynthGL/wolfxl
    • Openpyxl-compatible API: load_workbook, Workbook, Font, PatternFill, etc.
    • Three modes: read (calamine-styled), write (rust_xlsxwriter), modify (XlsxPatcher)
    • Installed as an optional dependency: uv sync --extra rust
  • tests/: pytest suites (fidelity + adapter unit tests + visualization smoke tests)

  • docs/: plans and trackers (treat as source of truth for methodology and run logs)

Important "scratch" conventions:

  • test_files/ is local scratch (gitignored).
  • results_dev_* directories are local/ephemeral benchmark outputs (often gitignored).

Key Entry Points (Jump List)

Common starting points by intent:

  • Add a new adapter:

    • src/excelbench/harness/adapters/base.py
    • src/excelbench/harness/adapters/__init__.py
  • Add a new scored feature:

    • Generator: src/excelbench/generator/features/
    • Harness exercise/scoring: src/excelbench/harness/runner.py
  • Extend Tier 2/3 OOXML parsing:

    • WolfXL (external): calamine-styled + ooxml utilities live in wolfxl repo
    • ExcelBench-local Rust: rust/excelbench_rust/src/ (basic calamine, umya bindings)
  • Performance track:

    • Runner: src/excelbench/perf/runner.py
    • Renderer: src/excelbench/perf/renderer.py
    • Throughput driver: scripts/run_throughput_dashboard.py
    • Memory profiling: scripts/memory_profile.py
  • Dashboards/plots:

    • HTML dashboard: src/excelbench/results/html_dashboard.py
    • Scatter plots: src/excelbench/results/scatter.py
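For the "add a new adapter" entry point, the registration pattern can be sketched as below. This is a hypothetical sketch only: the `ADAPTERS` registry, `register` decorator, and `MyLibAdapter` are invented names, not the actual code in harness/adapters/__init__.py.

```python
# Hypothetical sketch: how a new adapter might be registered by name so
# the runner can discover and instantiate it from CLI arguments.
ADAPTERS: dict[str, type] = {}

def register(name: str):
    """Decorator that records an adapter class under a stable name."""
    def wrap(cls):
        ADAPTERS[name] = cls
        return cls
    return wrap

@register("mylib")
class MyLibAdapter:
    def read_cell(self, ref: str):
        return None  # real adapters delegate to the library under test

print(sorted(ADAPTERS))  # ['mylib']
```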

Main Flows

1) Fixture generation (ground truth)

xlwings -> Excel
  -> writes feature workbooks
  -> writes manifest.json
  -> fixtures committed to git

Command: uv run excelbench generate --output fixtures/excel
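The manifest the generator writes might look roughly like the sketch below. The keys (`generated_with`, `features`, `name`, `file`) are invented for illustration; consult the real manifest.json in fixtures/excel for the actual schema.

```python
# Illustrative only: a minimal manifest.json shape a fixture generator
# could emit so the harness knows which workbook covers which feature.
import json

manifest = {
    "generated_with": "Excel via xlwings",
    "features": [
        {"name": "borders", "file": "borders.xlsx"},
        {"name": "merged_cells", "file": "merged_cells.xlsx"},
    ],
}

# Round-trip through JSON text, as if written to and read from disk.
loaded = json.loads(json.dumps(manifest, indent=2))
print([f["file"] for f in loaded["features"]])
```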

2) Fidelity benchmark (correctness)

fixtures + manifest
  -> runner loads adapters
  -> adapters read/write
  -> oracle verification (Excel via xlwings; fallback openpyxl)
  -> results.json + diagnostics
  -> renderers produce README.md/matrix.csv/plots

Command: uv run excelbench benchmark --tests fixtures/excel --output results
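The oracle-comparison step in this flow can be sketched as a simple per-cell match rate. The `score_feature` function and its cell-reference dicts are assumptions for this sketch, not the harness's actual scoring logic in runner.py.

```python
# Hedged sketch: compare adapter output against the Excel oracle and
# produce a fractional score for one feature.
def score_feature(oracle: dict, observed: dict) -> float:
    """Fraction of oracle cells the library reproduced exactly."""
    if not oracle:
        return 0.0
    hits = sum(1 for ref, val in oracle.items() if observed.get(ref) == val)
    return hits / len(oracle)

oracle = {"A1": "x", "A2": 2, "A3": True}
observed = {"A1": "x", "A2": 2, "A3": False}
print(score_feature(oracle, observed))  # 2 of 3 cells match: 0.666...
```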

3) Performance benchmark (speed/memory)

fixtures + throughput fixtures
  -> perf runner executes workloads (no oracle)
  -> wall/cpu/rss (and optional phase breakdown)
  -> perf/results.json
  -> perf renderer produces markdown/csv

Command: uv run excelbench perf --tests fixtures/excel --output results
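The wall/cpu/rss measurements in this flow can be sketched with the standard library alone. The workload here is a stand-in, and the RSS read is deliberately best-effort (the `resource` module is Unix-only), matching the "best-effort memory" caveat above; this is not the perf runner's actual instrumentation.

```python
# Minimal sketch of wall-clock, CPU, and best-effort RSS measurement.
import time

def run_workload():
    # Stand-in for a real read/write workload over a fixture.
    return sum(i * i for i in range(100_000))

wall0, cpu0 = time.perf_counter(), time.process_time()
run_workload()
wall = time.perf_counter() - wall0
cpu = time.process_time() - cpu0

try:
    import resource  # Unix-only; ru_maxrss is KiB on Linux, bytes on macOS
    rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
except ImportError:
    rss = None       # best-effort: unavailable on this platform

print(wall >= 0 and cpu >= 0)     # True
print(rss is None or rss > 0)     # True
```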

4) Publishing

  • Local: uv run excelbench html, uv run excelbench scatter, uv run excelbench heatmap
  • CI: .github/workflows/deploy-dashboard.yml auto-builds and deploys the HTML dashboard to Vercel

Updating This Map

Update architecture.md when:

  • A new top-level module/directory is introduced.
  • Dependency direction changes (new allowed imports / new shared utilities).
  • A new CLI command or major runner mode is added.
  • A new dashboard/output becomes a supported interface.