Architecture Map (Read This Before Editing)

This file is the high-level map of the ExcelBench codebase: what lives where, the allowed dependency direction, and the main runtime/data flows.

If you are starting a new session:

  1. Read this file (architecture.md) to orient yourself.
  2. Read CLAUDE.md for commands, workflows, and repo conventions.
  3. Check active trackers under docs/trackers/ for current status and run history.

Big Picture

ExcelBench has two complementary tracks:

  • Fidelity (correctness): "Does this library preserve Excel semantics for feature X?"
  • Performance (speed/memory): "How fast is it at feature X or a scaled workload?"

A key design principle is reproducibility:

  • Canonical fixtures are generated by real Excel and committed to git.
  • Benchmarks produce JSON results as the source of truth, and render all other views from it.
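The "JSON as source of truth" principle can be sketched as follows. This is an illustrative example only: the dictionary keys (`library`, `scores`) and the `render_markdown` helper are invented for this sketch and are not the real ExcelBench schema or renderer.

```python
# Hypothetical illustration: results.json is canonical, and every other
# view (markdown, csv, plots) is derived from it rather than computed
# independently.  Field names below are invented for this sketch.
import json

results = {
    "library": "openpyxl",
    "scores": {"borders": 1.0, "merged_cells": 0.5},
}

def render_markdown(results: dict) -> str:
    """Derive a markdown table view from the canonical JSON results."""
    lines = ["| Feature | Score |", "| --- | --- |"]
    for feature, score in sorted(results["scores"].items()):
        lines.append(f"| {feature} | {score:.2f} |")
    return "\n".join(lines)

# Round-trip through JSON text, as if the renderer had read results.json.
canonical = json.loads(json.dumps(results))
print(render_markdown(canonical))
```

Because every view is derived from the same JSON, re-running a renderer never changes the recorded results, only their presentation.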

Core Layers

At a high level, ExcelBench is split into five layers:

  1. Fixtures + generator (Excel as ground truth)
  2. Fidelity harness (adapters + scoring + diagnostics)
  3. Performance harness (throughput workloads + best-effort memory)
  4. Rendering + publishing (markdown/csv + HTML dashboard + plots)
  5. Optional Rust acceleration (PyO3 extension + Rust-backed adapters)

Dependency Direction (No Cycles)

One-way dependencies only:

  • models.py defines the core contracts.
  • Adapters depend on models.
  • Runners depend on adapters and models.
  • Renderers depend on runner output schemas.

In practice:

models
  ^
adapters
  ^
harness runner (fidelity)
  ^
results renderer / visualizations

perf runner (performance) -> perf renderer

Rust extension (optional) is called by Rust-backed adapters; it should not depend on Python code.

WolfXL is an external PyPI dependency — it does not depend on ExcelBench.

Rule of thumb: keep adapters thin and deterministic. Any cross-library normalization should live in runner utilities, not in adapter implementations.
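A minimal sketch of that rule, with the dependency direction made explicit. All names here (`AdapterBase`, `FakeAdapter`, `read_cell`, `normalize_value`) are invented for illustration; the real contracts live in models.py and the harness.

```python
# Thin, deterministic adapter; cross-library normalization in the runner.
from dataclasses import dataclass

@dataclass
class CellValue:            # models layer: the shared contract
    raw: object

class AdapterBase:          # adapters depend on models, nothing above them
    def read_cell(self, ref: str) -> CellValue:
        raise NotImplementedError

class FakeAdapter(AdapterBase):
    def read_cell(self, ref: str) -> CellValue:
        # The adapter returns the library's value verbatim -- no cleanup.
        return CellValue(raw="  TRUE ")

def normalize_value(value: CellValue) -> bool:
    # Runner-side utility: normalization lives here, not in the adapter,
    # so every library is compared under identical rules.
    return str(value.raw).strip().lower() == "true"

print(normalize_value(FakeAdapter().read_cell("A1")))  # True
```

Keeping normalization out of adapters means a fix to the comparison rules applies to all libraries at once, instead of being re-implemented (and drifting) per adapter.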

Repo Map (Where Things Live)

Most-touched top-level directories:

  • src/excelbench/

    • cli.py: Typer CLI entrypoint (excelbench ...)
    • models.py: dataclasses/contracts (CellValue, CellFormat, BorderInfo, ...)
    • generator/: fixture generation (xlwings + Excel)
    • harness/: fidelity benchmark runner + adapters
    • perf/: performance runner + renderer
    • results/: fidelity result renderers (md/csv) + dashboards/plots
  • fixtures/

    • excel/: canonical .xlsx fixtures (git-tracked, Excel-generated)
    • excel_xls/: canonical .xls fixtures
    • throughput_xlsx/: scale fixtures for perf/throughput workloads
  • rust/excelbench_rust/ (optional, local-only)

    • PyO3 crate for ExcelBench-specific Rust backends (umya-spreadsheet, basic calamine)
    • The core WolfXL backends (calamine-styled, rust_xlsxwriter, xlsx patcher) are now in the standalone wolfxl package on PyPI (pip install wolfxl)
  • WolfXL (external dependency, pip install wolfxl)

    • Standalone repo: https://github.com/SynthGL/wolfxl
    • Openpyxl-compatible API: load_workbook, Workbook, Font, PatternFill, etc.
    • Three modes: read (calamine-styled), write (rust_xlsxwriter), modify (XlsxPatcher)
    • Installed as an optional dependency: uv sync --extra rust
  • tests/: pytest suites (fidelity + adapter unit tests + visualization smoke tests)

  • docs/: plans and trackers (treat as source of truth for methodology and run logs)

Important "scratch" conventions:

  • test_files/ is local scratch (gitignored).
  • results_dev_* directories are local/ephemeral benchmark outputs (often gitignored).

Key Entry Points (Jump List)

Common starting points by intent:

  • Add a new adapter:

    • src/excelbench/harness/adapters/base.py
    • src/excelbench/harness/adapters/__init__.py
  • Add a new scored feature:

    • Generator: src/excelbench/generator/features/
    • Harness exercise/scoring: src/excelbench/harness/runner.py
  • Extend Tier 2/3 OOXML parsing:

    • WolfXL (external): calamine-styled + ooxml utilities live in wolfxl repo
    • ExcelBench-local Rust: rust/excelbench_rust/src/ (basic calamine, umya bindings)
  • Performance track:

    • Runner: src/excelbench/perf/runner.py
    • Renderer: src/excelbench/perf/renderer.py
    • Throughput driver: scripts/run_throughput_dashboard.py
    • Memory profiling: scripts/memory_profile.py
  • Dashboards/plots:

    • HTML dashboard: src/excelbench/results/html_dashboard.py
    • Scatter plots: src/excelbench/results/scatter.py
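For the "add a new adapter" entry point, the registration pattern can be sketched as below. This is a hypothetical sketch only: the `ADAPTERS` registry, `register` decorator, and `MyLibAdapter` are invented names, not the actual code in harness/adapters/__init__.py.

```python
# Hypothetical sketch: how a new adapter might be registered by name so
# the runner can discover and instantiate it from CLI arguments.
ADAPTERS: dict[str, type] = {}

def register(name: str):
    """Decorator that records an adapter class under a stable name."""
    def wrap(cls):
        ADAPTERS[name] = cls
        return cls
    return wrap

@register("mylib")
class MyLibAdapter:
    def read_cell(self, ref: str):
        return None  # real adapters delegate to the library under test

print(sorted(ADAPTERS))  # ['mylib']
```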

Main Flows

1) Fixture generation (ground truth)

xlwings -> Excel
  -> writes feature workbooks
  -> writes manifest.json
  -> fixtures committed to git

Command: uv run excelbench generate --output fixtures/excel
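The manifest the generator writes might look roughly like the sketch below. The keys (`generated_with`, `features`, `name`, `file`) are invented for illustration; consult the real manifest.json in fixtures/excel for the actual schema.

```python
# Illustrative only: a minimal manifest.json shape a fixture generator
# could emit so the harness knows which workbook covers which feature.
import json

manifest = {
    "generated_with": "Excel via xlwings",
    "features": [
        {"name": "borders", "file": "borders.xlsx"},
        {"name": "merged_cells", "file": "merged_cells.xlsx"},
    ],
}

# Round-trip through JSON text, as if written to and read from disk.
loaded = json.loads(json.dumps(manifest, indent=2))
print([f["file"] for f in loaded["features"]])
```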

2) Fidelity benchmark (correctness)

fixtures + manifest
  -> runner loads adapters
  -> adapters read/write
  -> oracle verification (Excel via xlwings; fallback openpyxl)
  -> results.json + diagnostics
  -> renderers produce README.md/matrix.csv/plots

Command: uv run excelbench benchmark --tests fixtures/excel --output results
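The oracle-comparison step in this flow can be sketched as a simple per-cell match rate. The `score_feature` function and its cell-reference dicts are assumptions for this sketch, not the harness's actual scoring logic in runner.py.

```python
# Hedged sketch: compare adapter output against the Excel oracle and
# produce a fractional score for one feature.
def score_feature(oracle: dict, observed: dict) -> float:
    """Fraction of oracle cells the library reproduced exactly."""
    if not oracle:
        return 0.0
    hits = sum(1 for ref, val in oracle.items() if observed.get(ref) == val)
    return hits / len(oracle)

oracle = {"A1": "x", "A2": 2, "A3": True}
observed = {"A1": "x", "A2": 2, "A3": False}
print(score_feature(oracle, observed))  # 2 of 3 cells match: 0.666...
```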

3) Performance benchmark (speed/memory)

fixtures + throughput fixtures
  -> perf runner executes workloads (no oracle)
  -> wall/cpu/rss (and optional phase breakdown)
  -> perf/results.json
  -> perf renderer produces markdown/csv

Command: uv run excelbench perf --tests fixtures/excel --output results
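The wall/cpu/rss measurements in this flow can be sketched with the standard library alone. The workload here is a stand-in, and the RSS read is deliberately best-effort (the `resource` module is Unix-only), matching the "best-effort memory" caveat above; this is not the perf runner's actual instrumentation.

```python
# Minimal sketch of wall-clock, CPU, and best-effort RSS measurement.
import time

def run_workload():
    # Stand-in for a real read/write workload over a fixture.
    return sum(i * i for i in range(100_000))

wall0, cpu0 = time.perf_counter(), time.process_time()
run_workload()
wall = time.perf_counter() - wall0
cpu = time.process_time() - cpu0

try:
    import resource  # Unix-only; ru_maxrss is KiB on Linux, bytes on macOS
    rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
except ImportError:
    rss = None       # best-effort: unavailable on this platform

print(wall >= 0 and cpu >= 0)     # True
print(rss is None or rss > 0)     # True
```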

4) Publishing

  • Local: uv run excelbench html, uv run excelbench scatter, uv run excelbench heatmap
  • CI: .github/workflows/deploy-dashboard.yml auto-builds and deploys the HTML dashboard to Vercel

Updating This Map

Update architecture.md when:

  • A new top-level module/directory is introduced.
  • Dependency direction changes (new allowed imports / new shared utilities).
  • A new CLI command or major runner mode is added.
  • A new dashboard/output becomes a supported interface.