feat(formula-plane): add opt-in span evaluation runtime by PSU3D0 · Pull Request #98 · PSU3D0/formualizer

PSU3D0 · 2026-05-13T07:19:11Z

Summary

This is the main FormulaPlane / span evaluation project PR.

It introduces an opt-in span runtime for Formualizer: repeated formula families can be represented, dirtied, shifted, and evaluated as compact spans instead of eagerly materializing every formula cell as a standalone graph vertex. The stable dependency-graph path remains the default for 0.6; span evaluation is explicitly gated behind opt-in configuration across the public surfaces.

Diff scale: ~125 commits, 217 files, ~56k insertions.

Major areas

FormulaPlane runtime and promotion

Added the FormulaPlane runtime, span store, span scheduler, template canonicalization, placement pipeline, dependency summaries, region index, diagnostics, and promotion counters.
Added authoritative experimental span mode while preserving default FormulaPlaneMode::Off behavior.
Added canonical/template support for copied formula families, parameterized literals, constant-result broadcast, range-argument precedents, whole-axis precedents, cross-sheet reads, and function dependency contracts.
Added function-family promotion support for arithmetic, criteria aggregates, lookup/index families, whole-column/whole-axis patterns, and affine literal families.
Added compact affine literal binding encodings for exact integer-like row/column/rect progressions, with dictionary fallback for non-integer or irregular literals.

Dirtying, demotion, and structural edits

Added span dirty projection and bounded dirty-domain preservation.
Added structural handling for row/column insert/delete, affected-region scoping, tail precision, and structural span shifting.
Added conservative demotion for unsupported edits/shapes, per-cell writes inside spans, internal dependency chains, volatile/dynamic formulas, and other unsupported families.
Kept internal chains/running balances on the legacy graph path for this release.

Workbook, bindings, and opt-in surfaces

Added opt-in span evaluation wiring through Rust workbook config, Python, WASM/JS, and C FFI.
Preserved default stable semantics: users do not get FormulaPlane unless they request it.
Added workbook changelog dirtying coverage for promoted spans.

Load/ingest work

Added sparse initial ingest paths for JSON, Umya, and Calamine-backed loading.
Kept Calamine dependency publishable via crates.io calamine = "0.35".
Preserved a migration seam for future Calamine formula-record streaming once the upstream API is available in a crates.io release.
Calamine structured table metadata remains a known gap for s019/s020; Umya remains the fuller XLSX compatibility path for those cases.

Benchmark corpus and tooling

Added the scenario corpus/harness and FormulaPlane Off/Auth parity tooling.
Added backend selection for corpus probing.
Added structural engine-stat invariants for span counts, graph formula vertices, graph edges, and AST roots.
Added affine literal scenarios s081–s086 covering perfect affine rows, column legacy behavior, outliers, periodic outliers, gaps, and non-integer dictionary fallback.

Docs and release posture

Replaced internal working docs with public docs-site/README/CHANGELOG guidance.
Added docs for FormulaPlane span evaluation and large workbook performance.
Version bump is intentionally left for a follow-up release commit after merge.

0.6 release posture

FormulaPlane/span evaluation is experimental and opt-in.
Default workbook behavior remains the stable dependency graph.
Unsupported span shapes fall back to legacy graph evaluation.
Internal dependency chains and array-literal formula families are not span-promoted in 0.6.
Calamine formula-record streaming is deferred until upstream Calamine publishes the API.

Validation

Final gates run locally:

cargo fmt --all -- --check
cargo clippy -p formualizer-eval --all-targets -- -D warnings
cargo clippy -p formualizer-workbook --all-targets --features json,umya,calamine -- -D warnings
cargo clippy -p formualizer-bench-core --features formualizer_runner,ironcalc_runner --all-targets -- -D warnings
cargo test -p formualizer-workbook --all-targets --features json,umya,calamine
cargo test -p formualizer-eval --release --no-run
cd docs-site && bun run types:check
cargo package -p formualizer-workbook --allow-dirty --no-verify

Final corpus reruns:

target/scenario-corpus/final-pr-s001-s035-small-umya-20260512
target/scenario-corpus/final-pr-s001-s035-medium-umya-20260512
target/scenario-corpus/final-pr-s001-s035-small-calamine-20260512
target/scenario-corpus/final-pr-s001-s035-medium-calamine-20260512
target/scenario-corpus/final-pr-affine-small-calamine-20260512

Calamine corpus excludes known s019/s020 structured table metadata cases.

Adds --phase-timeout-ms with scale-aware defaults (small=5s, medium=15s, large=60s) and a watchdog thread that flips a cancellation flag. Limitations (documented for future fix): - Cancellation only honored at coarse evaluate_all checkpoints. In-flight scalar evals run to completion before the cancel flag is read. - Pre-eval phases (fixture build, load, structural-op demote+materialize) have NO cancel hooks. Scenarios that hang in those phases (e.g. s035 column-delete demotion at medium scale) will still hang the runner. Subprocess-per-tuple is the proper fix when batch reliability matters. Watchdog uses condvar with timeout so it returns promptly when eval finishes early; no thread accumulation across tuples.

…structural ops Fixes the structural-op blowup in column-insert/delete that surfaced at medium scale (s034 edit_3 = 410s, s035 Auth never finished after 1400s). Two surgical changes anchored in docs/design/formula-plane/dispatch/structural-op-blowup-investigation.md. ## Change 1: CsrMutableEdges::update_coord becomes O(1) Before: `self.vertex_ids.iter().position(|&id| id == vertex_id.0)` was a full linear scan across the edge-cache vertex-id array per moved vertex. For a sheet with 50k formula vertices and a column-insert moving 50k of them, that's 2.5e9 integer comparisons per structural edit. After: side index `vertex_pos: FxHashMap<u32, usize>` maintained at every call site that mutates `vertex_ids` (constructors new/with_coords/ build_from_adjacency, mutators add_vertex/add_vertices_batch, rebuild). update_coord is now O(1) hash lookup with debug_assert that the position matches. ## Change 2: ReferenceAdjuster::adjust_ast_if_changed avoids debug-string compare Before: VertexEditor::insert_columns and ::delete_columns ran `format!("{ast:?}") != format!("{adjusted:?}")` for every formula vertex in the workbook to detect whether the adjusted AST actually changed. Each comparison allocated two debug-rendered strings. After: new `adjust_ast_if_changed` traverses the AST and returns Option<ASTNode>, only allocating an adjusted AST if at least one reference actually changed. Compares ReferenceType via PartialEq (verified derived). For unchanged formulas the cost is now traversal only, no allocation. Together these explain the s034 variance: edit_3 inserts before column A, which means EVERY relative `A{r}` reference shifts. The combination of O(M*V) edge-coord updates + N debug-string allocations + N AST clones was the 410-second hot loop. ## Bundled correctness fix CsrMutableEdges `batch_mode: bool` -> `batch_depth: usize` counter. With the bool, nested begin_batch/end_batch pairs (e.g. when a sheet-level operation calls a vertex-editor batch internally) would have the inner end_batch flip the bool false, causing the outer operations to no longer batch. Counter semantics correctly track nesting depth and only fire rebuild when the outermost end_batch lands. ## Perf measurements (medium scale, 10k rows) s034-family-with-column-insert Auth (insert column at positions [3,2,5,1,4]): edit_0: 25,386 ms -> 24,263 ms (demotion + 50k materialization, unchanged) edit_1: 264 ms -> 85 ms edit_2: 175 ms -> 60 ms edit_3: 410,333 ms -> 186 ms (~2200x faster) edit_4: 247 ms -> 62 ms s035-family-with-column-delete Auth (delete column 7 x5): edit_0: N/A -> 45,090 ms (was hanging; now completes) edit_1: hung -> 15 ms edit_2: hung -> 19 ms edit_3: hung -> 21 ms edit_4: hung -> 17 ms recalc all: -- -> <1 ms Off-mode times unchanged (no regression). The first edit (which does FormulaPlane span demotion + ingest of 30k-50k formulas) is now the dominant cost. Demotion is a separate concern and not in scope here; tracked for future tuning. ## Tests added (4) - delta_edges.rs: update_coord_uses_vertex_position_index 20k vertices, update last 5k coords; release-mode <50ms; verifies vertex_pos consistency. - reference_adjuster.rs: adjust_ast_if_changed_returns_none_for_unaffected_column_insert =A1+1 with insert-before-col-3 returns None. - reference_adjuster.rs: adjust_ast_if_changed_returns_adjusted_for_insert_before_a =A1+1 with insert-before-col-0 returns Some with reference shifted to B1. - formula_plane_structural.rs: formula_plane_authoritative_repeated_column_insert_after_demotion_15k_vertices_stays_linear 5k rows x 3 formula columns, runs the s034 insert sequence, verifies correctness across rows 1/2500/5000 after every edit, asserts release-mode timing budgets (first <10s, others <1s, insert-before-col-1 specifically <1s). ## Validation cargo fmt + clippy (eval, workbook, bench-core, runner-feature) pass cargo test -p formualizer-eval --quiet pass cargo test -p formualizer-workbook --quiet pass cargo test --workspace --quiet pass cargo test fp8_ingest_pipeline_parity --quiet pass probe-corpus s034/s035 medium auth+off completes; final invariants pass. ## Out of scope (separate dispatches) - Demotion-phase cost (edit_0 still ~25-45s for 30-50k formula materialization). The bulk_set_formulas_with_plans + ingest pipeline per-vertex cost is the remaining first-edit hot spot. - vertices_in_sheet linear scan (use sheet_indexes) — linear, not quadratic. - Tombstoned vertex inclusion in vertices_in_sheet — separate concern. - Per-row volatile/error span overhead at scale (s021/s025). Plan exists at docs/design/formula-plane/dispatch/small-domain-span-overhead.md for next dispatch.

Fixes per-row span overhead surfaced by s021 (16x slower) and s025 (3.3x slower) at medium scale. Implements the small-domain promotion gate from docs/design/formula-plane/dispatch/small-domain-span-overhead.md. ## Root cause (verified, not the PM's initial framing) PM's initial hypothesis — 'decline single-cell families' — turned out to already be implemented: detect_domain rejects analyses.len() < 2 (placement.rs:467-472) and converts to legacy via mark_all_legacy. The actual issue was small MULTI-cell families: - s021 medium: 1000 spans of only 7 cells each (=A{r}*2 rows separated by volatile RAND/TODAY/NOW gaps). - s025 medium: 100 spans of only 99 cells each (=A{r}*2 rows separated by per-100th =A{r}/0 errors). The FormulaPlane runtime has fixed per-span cost (template intern, scheduler edge insertion, per-task setup including AST relocatability revalidation, current_sheet.to_string allocation, fresh SpanEvaluator construction). For 7-cell spans this fixed cost dwarfs any savings vs the legacy graph path. Even 99-cell spans don't amortize it (measured 3.3x slower). ## Fix Add MIN_PROMOTED_NON_CONSTANT_SPAN_CELLS = 100 threshold in place_analyzed_family (formula_plane/placement.rs). Applied only after detect_domain succeeds and before any template intern / read-summary / span insert work, so doomed-small candidates fall through to legacy with zero wasted promotion overhead. Constant-result spans bypass the threshold because their broadcast path (eval-once, broadcast-to-N-placements) amortizes regardless of cell count; this preserves s013's 161x recalc win for SUMIFS-over- constant-criteria families and similar constant LET/LAMBDA wins. New PlacementFallbackReason::SmallDomain and PlacementDomain::cell_count() helper. ## Perf measurements (medium scale, 10k rows) s021-volatile-functions-sprinkled: recalc Auth/Off: 68.28ms / 4.27ms = 16.00x -> 4.27ms / 4.57ms = 0.93x span_count Auth: 1000 -> 0 (small =A*2 runs demote; volatiles already legacy) s025-errors-propagating-through-family: recalc Auth/Off: 1.65ms / 0.50ms = 3.30x -> 0.46ms / 0.49ms = 0.94x span_count Auth: 100 -> 0 (99-cell runs demote; error rows already singleton legacy) Preserved (no regressions): s006-rect-family-10cols Auth/Off: 6.98 / 28.73 ms (still ~4x faster) s007-fixed-anchor-family Auth/Off: 0.78 / 4.21 ms (still ~5x faster) s008-two-anchored-families Auth/Off: 1.54 / 7.89 ms (still ~5x faster) s013-sumifs-constant Auth/Off: 0.84 / 135.59ms (still ~161x faster via constant broadcast) All families above the threshold retain promotion. All constant-result families retain promotion regardless of size. ## Tests added (3) - formula_plane_authoritative_demotes_small_non_constant_domains 100-row s021-shape: volatile rows + =A*2 7-row runs. Asserts: active_span_count == 0, all 100 formulas materialized in graph, =A*2 cells produce correct values. - formula_plane_authoritative_demotes_99_cell_non_constant_runs 200-row s025-shape: =A*2 with =A{r}/0 every 100th row. Asserts: active_span_count == 0, all formulas in graph, error cells show #DIV/0!, others multiplied correctly. - formula_plane_authoritative_promotes_100_cell_non_constant_run 100 contiguous =A{r}*2 rows. Asserts: active_span_count == 1 (threshold is inclusive at 100). The existing constant-result test (formula_plane_authoritative_constant_ sumifs_family_promotes_via_broadcast) passes unchanged, validating the exemption. ## Tests updated Several existing formula-plane ingest/shadow/structural/span_eval/ placement tests previously used 2-3 cell non-constant families to verify mechanical span-creation behavior. Updated those to use 100-cell families where the test intent is active-span mechanics. Constant-result small-span tests remain small (the exemption preserves them). Files: tests/formula_plane_ingest_shadow.rs, tests/formula_plane_structural.rs, formula_plane/placement.rs (test mod), formula_plane/span_eval.rs (test mod). Helper functions row_run_candidates and col_run_candidates added in placement.rs and span_eval.rs test mods to reduce repetition. ## Validation cargo fmt + clippy (eval, workbook, bench-core, runner-feature) pass cargo test -p formualizer-eval --quiet pass cargo test -p formualizer-workbook --quiet pass cargo test --workspace --quiet pass cargo test fp8_ingest_pipeline_parity --quiet pass probe-corpus medium s021/s025/s006/s007/s008/s013 auth+off all final invariants pass. ## Threshold rationale 100 chosen because: - 7 cells (s021) clearly bad. - 99 cells (s025) measurably bad. - Below 100 the per-span fixed cost dominates. - Above 100 the per-cell amortization works. Future tuning: revisit after the next medium-scale corpus baseline once other corpus-driven fixes land. The threshold is a single named constant, easy to adjust. ## Open follow-ups (separate dispatches) - Per-span scheduler/evaluator overhead (current_sheet.to_string, fresh SpanEvaluator per work item, double placement vector materialization, per-task AST relocatability revalidation). Real but orthogonal; with the threshold in place these become less important because we no longer create the small spans that exposed them. - Volatile authority canonical support — out of scope; would need careful guard against vacuous constant-result classification of no-read volatiles.

Fixes s036 Auth recalc 10-18x slower than Off. Single-line removal of record_formula_plane_structural_change(StructuralScope::Sheet) from Engine::rename_sheet (eval.rs:1644). Anchored in docs/design/formula-plane/dispatch/sheet-rename-dirty-scope.md. ## Root cause Sheet rename in Excel changes the display name string only. The engine preserves SheetId across rename (sheet_registry.rs:78-108). All known sheet references are stored in arena as SheetKey::Id(id), not the display name (data_store.rs:445-457 for cells, :470-482 for ranges). ASTs are reconstructed via the current registry name lookup (:660-668, :682-690). Therefore: a sheet rename does not change any actual cell values or dependency identities. References still resolve to the same cells. The legacy graph correctly handles this — Off mode finishes recalc in 0.2ms because mark_vertex_dirty does not propagate to dependents and the only dirtied vertices are value cells which get filtered out by get_evaluation_vertices. Auth mode was paying ~3ms per rename because record_formula_plane_ structural_change(StructuralScope::Sheet(sheet_id)) recorded RegionPattern::whole_sheet(sheet_id), which the consumer-read index correctly matched against every span reading from that sheet. The dirty closure then projected whole-sheet through the affine projection rule onto the whole result region of any consuming span, triggering whole-span recompute. For s036 (Sheet1 has one 10k-cell span reading from DataA + DataB): each rename of DataA or DataB triggered a 10k-placement re-eval of the Sheet1 span. The values were unchanged afterward. ## Fix One line removed at eval.rs:1644. Comment block added explaining the SheetId-preservation invariant. Path before: rename_staged_formula_sheet vertices_in_sheet().mark_dirty (legacy bookkeeping; values filtered) record_formula_plane_structural_change(Sheet) <- removed mark_topology_edited ## Perf measurements (medium scale, 10k formulas / 30k vertices) s036-multi-sheet-with-sheet-rename: Off recalc 0..3: 0.67, 0.18, 0.34, 0.36 ms (4 rename cycles) Auth recalc 0..3: 0.11, 0.19, 0.09, 0.07 ms (Auth now FASTER than Off) Off recalc_4: 0.33 ms (value edit; unchanged behavior) Auth recalc_4: 0.18 ms (value edit; correct dirty propagation) Pre-fix Auth: 2.75-3.56 ms per rename cycle (10-18x worse than Off). Post-fix Auth: 0.07-0.19 ms (better than Off because Auth has 1 span while Off has 30k graph vertices to schedule). result.computed_vertices == 0 after each rename (verified by test). ## Tests added (3) - formula_plane_authoritative_sheet_rename_is_metadata_only_for_cross_sheet_span 100-row cross-sheet span. Renames DataA forward and back. Asserts result.computed_vertices == 0 after each rename, sampled values unchanged, span count preserved. - formula_plane_authoritative_value_edit_after_sheet_rename_dirties_bounded_span_work After rename, a single cell value edit produces bounded span work (>= 1 placement re-evaluated) and only the affected output row changes. Verifies dirty propagation is preserved for actual edits. - formula_plane_authoritative_sheet_rename_preserves_sheet_id_read_summaries Read summaries remain SheetId-keyed across rename. consumer_read_entries count preserved. Edit on the renamed sheet correctly dirties only the expected output cell. ## What was NOT changed (out of scope) - StructuralScope::Sheet still used for row/col insert/delete (eval.rs:3763, 3789, 3819, 3849) — those legitimately need it because references shift. - StructuralScope::RemovedSheet path unchanged (eval.rs:5470-5493). - StructuralScope::AllSheets path unchanged (eval.rs:5495-5498). - Legacy mark_vertex_dirty loop on the renamed sheet kept (eval.rs:1638-1643). In s036 it produces no formula evaluation work because get_evaluation_ vertices filters value vertices out. Removing it would be a broader legacy behavior change requiring its own audit. - Arrow store sheet rename, graph rename_sheet, staged-formula rename, and topology edit mark all kept. ## Validation cargo fmt + clippy (eval, workbook, bench-core, runner-feature) pass cargo test -p formualizer-eval --quiet pass cargo test -p formualizer-workbook --quiet pass cargo test --workspace --quiet pass cargo test fp8_ingest_pipeline_parity --quiet pass probe-corpus medium s036 auth+off final invariants pass ## Open items (separate dispatches) - s036 fixture has DATA_ROWS=1000 not 10000. Doesn't affect this fix. Worth fixing for consistency; separate trivial commit. - Span merging across sheet-display-name changes: spans retain canonical keys with old explicit names. Future formulas with new names may not merge. Out of scope; tracked. - Whole-column reference cost (s026 4.8s recalc) — separate dispatch, design memo at docs/design/formula-plane/dispatch/whole-column-references.md. Memo committed alongside this change for future reference.

Lifts the FormulaPlane rejection of whole-axis (whole-column) references in dependency analysis, canonical labels, and projection construction. Whole-column reads now produce 'WholeColumnRange' projections that emit RegionPattern::WholeCol read regions. Constant-result classification treats whole-axis as placement-invariant, so absolute whole-column formulas like =SUM($A:$A) enter the eval-once-broadcast fast path. Anchored in docs/design/formula-plane/dispatch/whole-axis-promotion.md. ## Root cause s026-whole-column-refs-in-50k-formulas had span_count=0 in Auth mode because dependency_summary rejected any range with AxisRef::WholeAxis upstream of placement, and the parallel arena-canonical labels also rejected it. Lifting both rejections plus adding a source-aware projection rule lets the existing constant-result broadcast path apply to whole-column formulas with absolute axes. The fix touches six call sites that all needed updating in lockstep: - template_canonical reject reason - arena/canonical reject labels - dependency_summary reject_non_finite_range - dependency_summary axis_kinds_match - dependency_summary is_constant_result helper - producer DirtyProjectionRule + AxisProjection Without all six, promotion is path-dependent or projection construction fails after the summary accepts the precedent. ## Design Scope: whole-COLUMN only. Whole-row deferred (multi-row whole-row intervals would require new RegionPattern::WholeRowInterval and is not driven by current measurements). New variant DirtyProjectionRule::WholeColumnRange { col_start, col_end }. Existing PrecedentPattern::Range(AffineRectPattern) reused; AxisRef already has a WholeAxis variant. New method read_regions_for_result returns Vec<RegionPattern> instead of a single region. AffineCell/AffineRange wrap their existing single result; WholeColumnRange emits one RegionPattern::WholeCol per source column. Projected column count bounded at 256 to avoid pathological $A:$XFD cases (rejected with UnsupportedAxis). Existing read_region_for_result kept for backward compatibility with callers that expect a single region; returns UnsupportedAxis for WholeColumnRange. is_constant_projection at placement.rs and is_constant_result at dependency_summary.rs treat AxisRef::WholeAxis as placement-invariant (it represents the entire column regardless of where the formula sits). RelativeToPlacement remains non-constant. Open/unsupported defensive default to non-invariant. Composition with existing precedent kinds: - =SUM($A:$A): one WholeColumnRange precedent. Constant. Broadcast. - =SUM($A:$A) - A{r}: two precedents (whole-col + relative cell). Mixed → non-constant. Per-placement eval. Whole-col read region still in summary; dirty propagation correct. - =SUMIFS($B:$B, $A:$A, "Type1"): two whole-col precedents, both constant. Broadcast. - =SUMIFS($B:$B, $A:$A, A{r}): two whole-col + one relative. Non-constant. - Cross-sheet =SUM(DataA!$A:$A): emits whole-col on DataA's sheet_id. Negative cases preserved: - $A$1:$A (open-ended) still rejected (OpenRangeUnsupported). - =$A:$A top-level still rejected (not in supported function-arg context). - A:$A (mixed endpoint kinds) still rejected. - ROW($A:$A) still rejected (ROW not in is_known_static_function). - Whole-row $1:$1 explicitly rejected in this patch (deferred). - Internal-dependency guard preserved (formula in column A reading $A:$A still falls back to legacy). VLOOKUP/MATCH NOT added to is_known_static_function in this patch. Independent semantic review needed; separate dispatch. ## Perf measurements s026-whole-column-refs-in-50k-formulas medium (10k rows, =SUM($A:$A) - A{r}): Off first 4681ms recalc 4810ms spans 0 Auth first 47ms recalc 1678ms spans 1 (99x first / 2.86x recalc) The recalc 2.86x speedup is for the mixed (non-constant) shape; per- placement eval still required. Pure constant whole-col shape gets the full broadcast benefit: repro_whole_col_vs_finite (interactive() mode, 10k rows): =SUM($A:$A) Off recalc 4854ms Auth recalc 1.77ms (2742x faster) =SUM($A$1:$A$N) Off recalc 2415ms Auth recalc 0.79ms (3057x baseline) Both whole-column constant-result and finite-range constant-result now use the same broadcast path with comparable performance. ## Tests added dependency_summary: - accepts_absolute_whole_column_sum (FormulaClass::StaticPointwise, constant-result == true) - mixed_whole_column_minus_relative_is_non_constant - relative_whole_column_a_a_is_non_constant - rejects_open_range_whole_column - rejects_top_level_whole_column - rejects_mixed_absolute_relative_endpoints template_canonical: - whole_axis_no_longer_unsupports_authority_labels - open_range_still_unsupports_authority_labels - whole_axis_serializes_in_canonical_key arena_canonical: - whole_column_range_no_longer_sets_reject_whole_axis - open_range_still_sets_reject_open_range producer: - whole_column_range_read_regions_emit_whole_cols (single + multi) - whole_column_range_rejects_above_256_column_threshold - whole_column_dirty_projection_dirties_whole_result_on_intersection - whole_column_dirty_projection_no_intersection_outside_column placement: - constant_whole_column_family_promotes_to_one_constant_span - mixed_whole_column_minus_relative_promotes_to_non_constant_span - sumifs_constant_criteria_whole_column_family_promotes - cross_sheet_whole_column_family_targets_data_sheet_id - whole_row_family_does_not_promote (negative) ingest_pipeline: - compute_read_projections_accepts_whole_column - compute_read_projections_rejects_top_level_whole_column - compute_read_projections_rejects_open_ended_range - compute_read_projections_rejects_whole_row formula_plane_structural (end-to-end): - 200-row =SUM($A:$A) family promotes, evaluates correctly, recalculates correctly after col-A edit - 200-row =SUM($A:$A) - A{r} family promotes as non-constant, per-row values correct - cross-sheet 200-row =SUM(DataA!$A:$A) family recalcs after DataA edit ## Tests updated - Existing dependency_summary whole-axis rejection test updated to new behavior: function-argument whole-column accepted, top-level still rejected. - FP8 ingest parity test kept passing by aligning arena whole/open range behavior with template canonical labels. ## Validation cargo fmt + clippy (eval, workbook, bench-core, runner-feature) pass cargo test -p formualizer-eval --quiet pass cargo test -p formualizer-workbook --quiet pass cargo test --workspace --quiet pass cargo test fp8_ingest_pipeline_parity --quiet pass probe-corpus medium s026 spans 0 -> 1 first 99x faster recalc 2.86x faster repro_whole_col_vs_finite constant whole-col case 2742x faster finite case unchanged mixed case promotes (perf parity with finite-range mixed deferred to SUM CSE) s007/s013 corpus (constant-result spans) no regression ## Out of scope (separate dispatches) - Whole-row promotion ($1:$1 etc). - VLOOKUP/MATCH in is_known_static_function. - SUM aggregate cache (CSE) for mixed shapes like =SUM($A:$A) - A{r}. This shape promotes but each placement still re-evaluates the whole- column SUM. Phase 1 of the whole-column-references memo would unlock another large speedup here. - Effect 1 from whole-column-references memo: the legacy 2x tax for whole-column resolution (Off mode $A:$A is 2x slower than $A$1:$A$N). Separate small investigation. Memos for both committed alongside this change.

…refs Adds a snapshot-keyed final-result cache for used_rows_for_columns and used_cols_for_rows. Whole-column references like =SUM($A:$A) need the used-row extent resolved per call, which currently runs formula_row_bounds_for_columns every time. That helper scans every indexed vertex in the queried column range and filters by formula kind. Anchored in docs/design/formula-plane/dispatch/whole-column-legacy-tax.md. ## Root cause For 10k formulas of `=SUM($A:$A)` over a column with 10k input value vertices: each formula triggers used_rows_for_columns("Sheet1", 1, 1) which calls formula_row_bounds_for_columns. That helper does get_vertex_kind() on every indexed vertex in column A — 10k checks per call. With 10k formulas, that's 100M vertex-kind checks per recalc. The Arrow used-row bounds cache (row_bounds_cache at eval.rs:349) hits correctly after the first formula, but the wrapper still calls formula_row_bounds_for_columns to preserve the union semantics (Arrow extent OR formula coordinates in unmaterialized rows). Finite-range references like `=SUM($A$1:$A$10000)` skip the entire used_rows_for_columns path because all four bounds are present at the parser AST level (eval.rs:9443-9451). ## Fix New UsedAxisBoundsCache struct with two FxHashMaps: row_bounds_by_col_span: (SheetId, start_col, end_col) -> Option<(u32, u32)> col_bounds_by_row_span: (SheetId, start_row, end_row) -> Option<(u32, u32)> Wrapped in Engine::used_axis_bounds_cache: RwLock<Option<...>>. used_rows_for_columns flow: 1. Resolve sheet_id (O(1) HashMap). 2. Load snapshot_id. 3. Read-lock check cache for (sheet_id, start_col, end_col). 4. On hit: return cached Option immediately. 5. On miss: run existing union logic (Arrow + formula bounds + graph fallback). 6. Write-lock store result. reset_for_snapshot clears map on snapshot change. Symmetric for used_cols_for_rows. Critical correctness preserved: - Snapshot-keyed: data edits and topology edits both increment snapshot (eval.rs:2403-2413), so invalidation is automatic. - Cache stores None: closes the empty-column rescan hole that the underlying RowBoundsCache also has (where (None, None) cached results weren't treated as a hit). - Union semantics preserved: only the FINAL result is cached, not the Arrow-only or formula-only intermediate. - Read-then-write pattern: don't hold cache lock during expensive scans. ## Perf measurements (10k rows / 10k formulas, FormulaPlane Off) repro_whole_col_vs_finite, Off mode: Before: =SUM($A:$A) recalc 4882ms (488us/formula) =SUM($A$1:$A$N) recalc 2448ms (245us/formula) =SUM($A:$A) - A{r} recalc 4725ms =SUM($A$1:$A$N) - A{r} recalc 2482ms After: =SUM($A:$A) recalc 2492ms (249us/formula) ~2x faster =SUM($A$1:$A$N) recalc 2477ms (unchanged) =SUM($A:$A) - A{r} recalc 2473ms ~1.9x faster =SUM($A$1:$A$N) - A{r} recalc 2495ms (unchanged) **Whole-column Off recalc now matches finite-range Off recalc within ~1% margin.** s026-whole-column-refs-in-50k-formulas medium: Off recalc: 4810ms -> 2511ms (1.92x faster) Auth recalc: 1670ms -> 1769ms (within noise) Auth-mode FormulaPlane behavior unchanged: still spans=1, still benefits from the whole-column promotion landed in 0d287ce. ## Tests added In crates/formualizer-eval/src/engine/tests/used_bounds_cache.rs: - used_rows_for_columns_caches_final_result_across_repeated_calls: 10k values + 10k formulas, two calls, asserts row_misses == 1, row_hits == 1. - used_rows_for_columns_caches_none_for_empty_column: empty column C, two calls, both return None, row_misses == 1, row_hits == 1. - used_rows_for_columns_invalidates_on_data_edit: data through row 5, edit row 8, snapshot bump invalidates cache, third call returns updated max row 8 and is cached. - used_rows_for_columns_includes_formula_rows_in_union: data A1:A5 + formula A10, returns max row 10, second call hits. - used_cols_for_rows_caches_final_result + invalidates_on_data_edit: symmetric tests for the row-axis cache. - evaluate_whole_column_sum_uses_cached_bounds: 100 rows, =SUM($A:$A) formulas in col B, evaluate, edit A5, recalc, values correct, cache hit pattern matches expected behavior. Internal #[cfg(test)] AtomicUsize counters (row_hits, row_misses, col_hits, col_misses) on UsedAxisBoundsCache. Counters exposed via Engine::used_axis_bounds_cache_stats() for tests only. No public API change. No EvalConfig toggle. ## Validation cargo fmt + clippy (eval, workbook, bench-core, runner-feature) pass cargo test -p formualizer-eval --quiet pass cargo test -p formualizer-workbook --quiet pass cargo test --workspace --quiet pass cargo test fp8_ingest_pipeline_parity --quiet pass repro_whole_col_vs_finite whole-col within 1% of finite probe-corpus medium s026 Off 1.92x faster ## Out of scope - SUM aggregate cache (CSE) — separate dispatch docs/design/formula-plane/dispatch/whole-column-references.md. This patch addresses the per-formula bound-resolution tax; CSE would address the per-formula SUM scan tax. - Formula-only sheet index — broader graph-state change, not needed to remove the verified per-call scan. - Empty-column inefficiency in the underlying arrow_used_row_bounds cache (where (None, None) results aren't treated as cache hits) — the new wrapper-level cache caches Option<(u32, u32)> including None, which closes the hole at the wrapper level.

…tion Combined dispatch implementing the design at docs/design/formula-plane/dispatch/literal-param-memoization-design.md. Two coupled features sharing one parameter-slot substrate: 1. **Literal parameterization**: formulas that differ only by literal values now fold into the same FormulaPlane family. The parameterized canonical key replaces all parameterizable literals with positional slot markers (lit_slot(<id>)). Per-formula binding vectors carry the concrete literal values. Family bucketing changes from (sheet_id, canonical_hash) to (sheet_id, parameterized_canonical_hash) with a full parameterized_canonical_key equality guard against hash collisions. 2. **Parameter-key memoization**: non-constant spans now evaluate once per unique parameter tuple and broadcast to placements with the same tuple. Parameter atoms include literal slot values + value-context relative-cell-ref values + residual row/col deltas when needed. The memo cache lives strictly within SpanEvaluator::evaluate_task and is dropped on return — no persistent caching, no invalidation complexity. ## Pre-existing tombstone-evaluation bug also fixed While verifying correctness, the agent identified a pre-existing bug that the literal-parameterization work exposed: - VertexEditor::remove_vertex tombstoned vertices but did NOT clear vertex_formulas/vertex_values/dirty_vertices/volatile/dynamic/kind. Tombstoned formula vertices remained schedulable. - DependencyGraph::get_evaluation_vertices did not filter tombstoned vertices. After delete_columns on a sheet with FormulaPlane spans: - demotion materialized formulas at all positions (correct). - delete_columns tombstoned col-3 vertices and shifted col-4 → col-3 (correct). - BUT the tombstoned col-3 vertices kept evaluating and writing stale results to col-3 in the computed overlay, producing wrong values. Fix at vertex_editor.rs:703-711 (clear formula/value/dirty/kind on remove_vertex) and graph/mod.rs:2145-2158 (filter tombstoned in get_evaluation_vertices via vertex_exists_active). This bug was latent before because no prior workload created the exact sequence (FormulaPlane span → demotion materialization → structural delete → recalc) that produces the symptom. ## Performance results repro_sumifs_variants at ROWS=5000, Auth-serial (wasm-relevant): | Variant | Before | After | Speedup | |---|---:|---:|---:| | 1. constant literal | 0.84ms | 0.86ms | unchanged ✓ | | 2. varying literal (s014) | 3196ms | **2.72ms** | **1175x** | | 3. relative cell-ref | 2078ms | **3.31ms** | **628x** | | 4. whole-col + relative | 2069ms | **3.65ms** | **567x** | | 5. whole-col + constant | 1.01ms | 1.08ms | unchanged ✓ | s014 corpus medium Auth recalc: 146ms → 3.4ms (43x). spans 0 → 1. s013 and s026 corpus: unchanged from previous baselines. K=3 redundancy in benchmark → 3 SUMIFS evals + N broadcasts, matching theoretical minimum. ## Architecture (per memo) ### Parameter-slot canonicalization (template_canonical.rs) Two outputs per formula: - exact_canonical_key (current behavior — retained for diagnostics) - parameterized_canonical_key (literals → lit_slot(<id>)) - literal_slot_descriptors (with SlotContext, original LiteralKind) - literal_bindings: Box<[LiteralValue]> Pre-order traversal matches existing canonical traversal exactly. Array literals continue to reject (no slot emitted). ### BindingStore in FormulaPlane runtime (runtime.rs) Dictionary-encoded binding storage: - unique_literal_bindings: Vec<Box<[LiteralValue]>> - placement_literal_binding_ids: Box<[u32]> For N=10k placements with K=3 distinct bindings: stores 3 vectors + 40KB ids, not 10k full vectors. 8 MiB memory cap with PlacementFallbackReason::BindingMemoryCapExceeded. PlacementDomain::ordinal_of(placement) maps placement coord → index matching domain.iter() order. ### Span eval third branch (span_eval.rs) if span.is_constant_result { broadcast } else if let Some(plan) = parametric_eval_plan && should_try_memoization { memoized eval branch } else { per-placement (current path) } ParameterAtom enum uses NumberBits(u64) (not f64 PartialEq, NaN safe). Date/Time/Duration as typed strings. Error includes full ExcelError content (kind+message+context+extra). Atom order: literal slots → value-ref slots → residual relocation deltas. Deterministic for mixed-slot keys. ResidualRelocationMode::{None, IncludeRowDelta, IncludeColDelta, IncludeRowAndColDelta}. Memoization is valid only when all placement-varying influences are in the key. Relative ranges in range-context force residual deltas; otherwise no memoization. Bounded sampling gate: sample 64 placements, fallback if unique > 3/4 of sample. Full grouping aborts if unique * 4 > writable * 3. MEMO_MAX_ENTRIES_PER_TASK = 16384. ### Substitution mechanism (interpreter.rs) Hybrid: - Literal slots: interpreter-level binding context (Interpreter::with_parameter_bindings). Modifies arena Literal node evaluation to consult bindings before data_store.retrieve_value. - Value-ref slots: representative placement + key grouping (no AST substitution; existing relocation handles it). - Demotion: tree clone + literal substitution + relocation. ### Family acceptance gate (placement.rs) Family bucketing by parameterized_canonical_hash. Full parameterized_canonical_key equality check against hash collisions. is_constant_result requires: - read_projections constant - all placements have same literal binding vector - value_ref_slot_descriptors empty ### By-ref function contracts (dependency_summary.rs) Strengthened ROW/COLUMN/AREAS/SHEET as by-ref/reference-sensitive. INDEX/OFFSET already mapped. Prevents reference-identity-sensitive args from being value-ref-parameterized. ## Tests added (24) In crates/formualizer-eval/src/engine/tests/formula_plane_literal_param_memo.rs: Literal parameterization (8): - formula_plane_parameterized_literals_fold_same_structure - formula_plane_exact_canonical_key_retained_for_diagnostics - formula_plane_literal_slot_wildcards_kind_but_binding_preserves_type - formula_plane_array_literal_remains_rejected_after_literal_parameterization - formula_plane_empty_literal_parameterizes (empty/pending/error) - formula_plane_binding_store_dictionary_encodes_repeated_vectors - formula_plane_binding_set_removed_with_span - formula_plane_demoted_parameterized_span_materializes_bound_literals (regression test for column-delete tombstone bug + literal binding) Memoization (6): - formula_plane_memoizes_value_context_relative_cell_refs (K=3 → 3 evals) - formula_plane_memoizes_varying_literal_slots (K=3 → 3 evals) - formula_plane_memoizes_mixed_literal_and_value_ref_parameters - formula_plane_memo_residual_relative_reference_includes_row_delta - formula_plane_memo_skips_all_unique_literal_bindings - formula_plane_memo_sampling_skips_all_unique_value_refs Edge cases (10): - Float key (3): uses_number_bits, nan_reflexive, negative_zero_distinct - Date/time: dates_and_durations_are_typed - Errors: error_includes_message_and_context - Volatile/dynamic (2): volatile_template_not_memoized, dynamic_template_not_memoized - Reference identity (3): row_column_args_not_value_parameterized, index_offset_byref_not_value_parameterized, criteria_range_not_value_parameterized Hash collision and memory cap (3): - parameter_key_hash_collision_does_not_merge_results - parameterized_canonical_hash_collision_does_not_merge_family - literal_binding_memory_cap_falls_back Memo cache lifetime (1): - memo_cache_is_per_evaluate_task Test-only counters added: memo_eval_count, memo_broadcast_count, sample_only_key_build_count, unique_literal_binding_vectors. Exposed via test-only Engine accessors (no public API). ## Validation cargo fmt + clippy (eval, workbook, bench-core, runner-feature) pass cargo test -p formualizer-eval --quiet pass cargo test -p formualizer-workbook --quiet pass cargo test --workspace --quiet pass cargo test fp8_ingest_pipeline_parity --quiet pass formula_plane_authoritative_column_delete_shifts_span_outputs_correctly pass (was failing) repro_sumifs_variants wins documented above probe-corpus medium s013/s014/s026 s014 43x faster (146ms → 3.4ms) s013/s026 unchanged ## Out of scope (separate dispatches) - SUMIFS family aggregate index for K=N criteria cases (memo §8 Option F). - Parallel non-constant span placement evaluation for native multi-threaded workloads (memo §8 Option B). Not relevant for wasm/single-threaded; benefits real-world parallel workloads separately. - FamilyPlanner architecture as the formal home for these plans. Memos for both committed alongside this change.

…uit corpus ## Part 1: Per-scheduled-span loop overhead reduction Reduces per-span overhead in evaluate_authoritative_formula_plane_all inner loop. Per-span allocations / setup compounded linearly with active span count; same-sheet span groups now share evaluator state. ### Changes - **Sheet name resolution**: removed per-span `sheet_name(...).to_string()` allocation. Within a layer, consecutive spans on the same sheet now share one borrowed sheet name slice. - **SpanEvaluator reuse**: one SpanEvaluator constructed per same-sheet span group within a layer (previously: per span). Loop reorganized to walk consecutive spans on the same sheet under one evaluator before transitioning. - **SpanComputedWriteSink reuse**: one sink constructed per layer, reused across all spans in that layer (previously: per span). - **Relocatable AST validation cached per template**: TemplateRecord gains `relocatable_ast_validated: OnceLock<bool>`. Templates are immutable post-interning, so first-call computes; later calls hit the cache. Eliminates O(spans \u00b7 AST nodes) walk per evaluate_all. - **WholeSpan dirty avoids double Vec materialization**: introduced PlacementSelection enum with Whole(borrowed PlacementDomain) and Vec(materialized PlacementCoord vec) variants. WholeSpan branch iterates via domain.iter() (already O(N) but no double-vec). Cells and Regions branches still materialize as before. ### Measurements Per-span overhead changes have modest effect at small span counts. Expected to scale with workbooks containing many spans. Medium corpus probe (selected scenarios): - s006-rect-family-10cols (10 spans): 8.13ms \u2192 8.61ms (within noise). - s013-sumifs-family-constant-criteria (1 span): 0.85 \u2192 1.04ms (sub-ms). - s014-sumifs-family-varying-criteria (1 span): 3.56 \u2192 3.56ms (unchanged). - s016-multi-sheet-5-tabs (3 spans): 1.09 \u2192 0.97ms (improved). No regressions. The benefit grows with active span count and many-span workbooks. ## Part 2: IF/IFS/IFERROR short-circuit corpus coverage The PM flagged that we should have corpus tests confirming IF family short-circuit semantics still work under FormulaPlane span eval (including the memoized branch). Probe at crates/formualizer-bench-core/examples/repro_if_short_circuit.rs already verified this for K=N (per-placement path) and K=3 (memoized path) - zero errors propagated, correct values returned. Added 3 corpus scenarios: - **s043-if-short-circuit-with-erroring-else**: 10k-row =IF(A{r}>0, A{r}*2, 1/0). All A values positive so condition always true; else branch (1/0) must NEVER evaluate. Invariants assert zero error cells in col B at all phases. - **s044-ifs-chain-short-circuit**: 10k-row =IFS(A{r}>0, A{r}*2, A{r}<0, A{r}*3, TRUE, 1/0). A cycles through positive/negative/zero. The TRUE fallback contains 1/0 that should never evaluate when an earlier condition matches. Per-row expected values match the appropriate branch. - **s045-iferror-mixed-with-actual-errors**: 10k-row =IFERROR(1/A{r}, 0). Some A=0 cells produce DIV/0 in the protected expression; IFERROR catches and returns 0. Cells with A=0 must yield 0, not propagate the error. Other cells return 1/A. All three promote to spans=1 under Auth and pass NoErrorCells + per-row CellEquals invariants under both Off and Auth modes. Recalc under both modes is sub-ms per cycle. ScenarioTag::ShortCircuit added to the tag enum. ## Tests added In crates/formualizer-eval/src/engine/tests/formula_plane_per_span_overhead.rs: - formula_plane_evaluate_all_handles_many_same_sheet_spans: 100 same-sheet active spans evaluate in one evaluate_all without errors. - formula_plane_relocatable_validation_is_cached_per_template: validates relocatable AST validation is not repeated for the same template across multiple evaluate passes. - formula_plane_whole_span_dirty_does_not_materialize_dirty_placement_vec: validates DirtyDomain::WholeSpan iterates without dirty-placement Vec materialization. ## Validation cargo fmt + clippy (eval, workbook, bench-core, runner-feature) pass cargo test -p formualizer-eval --quiet pass cargo test -p formualizer-workbook --quiet pass cargo test --workspace --quiet pass cargo test fp8_ingest_pipeline_parity --quiet pass probe-corpus medium s006/s013/s014/s016/s043/s044/s045 all pass, invariants hold, short-circuit verified ## Out of scope (separate dispatches) - VLOOKUP/HLOOKUP/XLOOKUP allowlist + value-context handling. - LET/LAMBDA local-binding context support. - SUMIFS family aggregate index for K=N criteria cases. - Demotion phase cost (s034/s035 first edit still 25-46s for 30-50k formula materialization).

…se scenarios ## Parity harness New binary `probe-corpus-parity` that runs every scenario twice (Off and Auth modes, single-threaded for determinism) on the same fixture and compares EVERY cell at every phase boundary. This is the release gate that proves Off\u2194Auth equivalence. CLI: - `--scale {small|medium|large}` - `--include 'sNNN-*'` / `--exclude 'sNNN-*'` - `--phase-timeout-ms N` - `--fail-fast` - `--max-divergences-per-phase N` - `--label <tag>` Float comparison uses exact bit-equality (`f64::to_bits`) with a NaN-vs-NaN special case. Errors compare full `ExcelError` (kind + message). Empty cell is equivalent to None. `Scenario::expected_divergences()` machinery added to mark volatile/dynamic scenarios that legitimately differ across modes: - s021 (RAND/NOW) skipped. - s022 (OFFSET/INDIRECT) run-and-noted. - s058 (volatile mix) skipped. Tests: smoke test, deliberate-divergence detection test, f64 bit comparison edge cases. ## 15 new edge-case scenarios s046 giant-AST formula (\u2265 50 deps per cell, 100 such cells) s047 very-deep linear chain (2000 deep) s048 50 disjoint anchored families s049 VLOOKUP with row-relative key s050 VLOOKUP with absolute key (constant-result candidate) s051 mixed error cascade with IFERROR suppression s052 5000-row deeply nested IF chain s053 text-heavy CONCATENATE family s054 add-then-delete sheet recalc test s055 mixed-edit + undo s056 SUMIFS with array-criteria expression s057 named range redefined s058 volatile/non-volatile mix s059 empty sheet with cross-sheet refs populated by edit s060 self-referencing table row formula New tags: GiantAst, TextHeavy. ## Initial parity audit results Small-scale parity audit: Scenarios run: 58 Scenarios passed: 49 Scenarios skipped: 2 (expected divergence) Scenarios failed: 9 Total divergences: 25 ### Real correctness divergences (AfterRecalc, contract violations) - **s054 add-then-delete sheet recalc**: Auth retains stale (-1) values after a sheet is removed and re-added; Off correctly recalculates. Real bug in cross-sheet dirty propagation. - **s055 mixed edits + undo**: Auth value 200 vs Off 500 after mixed value/formula edits. Real bug in dirty propagation under mixed edit sequences. ### Contract divergences (AfterEdit pre-recalc only) - s032/s033/s034/s035: After structural edits (insert/delete rows /columns), Auth shows `None` for values that Off retains as stale numbers. AfterRecalc both modes match. This is a contract question, not a correctness bug \u2014 Auth's behavior (values cleared on structural op until next recalc) is consistent with the evaluate_all-driven contract. ### Harness errors (pre-existing public-API gaps) - s040 (insert_rows undo): Workbook public API exposes no WorkbookAction::insert_rows; engine_mut would not test undo path. - s041 (extend_table): Workbook exposes no extend_table API; engine_mut Engine::define_table only. - s042 (external source bump): no public API to declare/populate source values during fixture load. These are pre-existing escalations from earlier dispatches, not new bugs. ## Coverage matrix Coverage gaps (one scenario per tag): GiantAst, TextHeavy, SheetRename, NoFormulas, LegacyOnly, LetLambda, LargeArrayLiteral, WholeColumnRefs, MixedTypes, InternalDependency, Dynamic, DeleteRows, DeleteColumns, InsertColumns. Most are intentional (one scenario per dimension) but worth noting for future expansion. ## Validation cargo fmt + clippy (eval, workbook, bench-core, runner-feature) pass cargo test -p formualizer-eval --quiet pass (1534) cargo test -p formualizer-workbook --quiet pass cargo test fp8_ingest_pipeline_parity --quiet pass cargo test -p formualizer-bench-core --features formualizer_runner pass probe-corpus small (existing scenarios) pass probe-corpus-parity small (audit) 49/58 (real correctness bugs in s054/s055) ## Out of scope (separate dispatches) - Fix s054 cross-sheet dirty propagation when sheet is re-added. - Fix s055 dirty propagation under mixed edits. - Decide AfterEdit phase contract: gate or skip. - Expose Workbook public APIs for insert_rows, extend_table, external source population (unlocks s040/s041/s042). - Re-run full corpus at medium scale once the above are fixed.

…eet add/remove ## Bugs fixed The Off\u2194Auth parity harness (commit 4abf4db) surfaced two correctness divergences: ### s055 \u2014 set_cell_formula inside an active span ignored When the engine writes a new formula or value at a coordinate that is INSIDE an active span placement domain, the span continues to evaluate its template for that placement, ignoring the per-cell override. Reproduction: 200-row =A{r}*2 family promoted to a span. Set B100 to =A100*5 via the action(...) path. Expected 500; Auth produced 200. ### s054 \u2014 sheet add/remove leaves dependent span templates stale When a sheet referenced by formulas in another sheet is removed and re-added (e.g. =IFERROR(Aux!A{r}*2, -1)), DependencyGraph rewrites the formula AST through tombstone/heal phases. The span's template_id continues to point at the original (pre-tombstone) AST, so post-add evaluation produces stale results. Reproduction: 200-row Sheet1!A{r} = =IFERROR(Aux!A{r}*2, -1) family promoted to a span. delete_sheet("Aux") then add_sheet("Aux") with new values. Expected (r+10)*2; Auth produced -1 (the stale IFERROR fallback from when Aux was missing). ## Fix design Both fixes use span demotion. Demotion materializes span placements as legacy vertex-backed formulas; subsequent evaluate_all may re-promote them based on the new (correct) AST. Three new private methods on Engine: - `demote_span_containing_cell_for_write(sheet_id, row0, col0)`: for per-cell writes. Looks up the placement via FormulaSpanStore::find_at; if inside an active span, demotes that sheet's spans. - `demote_all_spans()`: enumerates all sheet_ids with active spans and demotes each. Used by sheet add/remove because tombstone/heal can affect cross-sheet formula ASTs arbitrarily. - `demote_spans_preserving_computed_overlays(sheet_id)`: variant of the existing structural-op demoter that does NOT clear computed overlays. For write-induced demotion the placements are about to be overwritten; clearing the computed overlay would discard legitimate work for unaffected placements. The structural-op demoter is unchanged. Internal helper `demote_spans_for_structural_op_impl(sheet_id, clear_computed_overlays)` parameterizes the overlay-clear behavior; the public `demote_spans_for_structural_op` retains its prior behavior. ## Sites patched Engine-level public writes (single-cell): - `Engine::set_cell_value` - `Engine::set_cell_formula` EngineAction (action_with_logger / action() path): - `EngineAction::set_cell_value` - `EngineAction::set_cell_formula` Engine-level public writes (bulk): - `Engine::bulk_set_formulas`: dedup via single sheet check; demote once per sheet only if any cell falls inside an active span. Sheet add/remove: - `Engine::add_sheet`: demote all spans BEFORE `graph.add_sheet` (which heals orphans). - `Engine::remove_sheet`: demote all spans BEFORE `graph.remove_sheet` (which tombstones formulas). The order matters: demotion must happen before AST mutation because demotion logic walks the current span template. ## Tests New file: `crates/formualizer-eval/src/engine/tests/formula_plane_demotion_correctness.rs` Six tests covering: 1. Engine-direct set_cell_formula inside active span. 2. EngineAction set_cell_formula inside active span (the s055 reproduction shape). 3. Engine-direct set_cell_value inside active span. 4. Engine-direct bulk_set_formulas inside active span (dedup demote-once invariant). 5. Sheet remove then add with cross-sheet formulas (s054 shape). 6. Sheet add with no orphans \u2014 confirms demote-all-on-add does not break unrelated span workloads. ## Validation cargo fmt + clippy (eval, workbook, bench-core, runner-feature) pass cargo test -p formualizer-eval --quiet pass cargo test -p formualizer-workbook --quiet pass cargo test --workspace --quiet pass cargo test fp8_ingest_pipeline_parity --quiet pass Parity harness focused on s054 + s055 small scale: Scenarios run: 2 Scenarios passed: 2 Total divergences: 0 Full small-scale parity audit: s054, s055 now pass. Pre-existing failures unchanged: s032/s033/s034/s035 (AfterEdit contract divergence, separate workstream); s040/s041/s042 (Workbook public-API gaps for insert_rows/extend_table/external sources, separate workstream). Medium-scale perf probe: no regressions in s006/s013/s014/s016/s026/ s036/s043/s044/s045. s054 and s055 now produce correct values under both Off and Auth. ## Out of scope (explicit) - Surgical FormulaOverlayEntryKind::FormulaOverride / ValueOverride insertion machinery: deferred. Demotion is the conservative correct path. Overlay punchout has no production callsites yet and is unproven in real workloads. - s032/s033/s034/s035 AfterEdit-only divergences: contract clarifi- cation work, not correctness. - s040/s041/s042 public-API gaps: separate Workbook surface expansion dispatch.

…reclassification ## What lands INDEX is now promotable into FormulaPlane span families. Two layers of canonicalization that previously rejected INDEX as `ReferenceReturningFunction` are reconfigured: ### Layer 1: canonicalization `is_reference_returning_function` no longer includes "INDEX" \u2014 only "CHOOSE" remains rejected. INDEX is now in the static allowlist `is_known_static_function`. Both copies updated: - `crates/formualizer-eval/src/formula_plane/template_canonical.rs` - `crates/formualizer-eval/src/engine/arena/canonical.rs` (FP8 arena canonicalization) ### Layer 2: dependency summary + slot context INDEX previously shared the `ByRefArg` argument-context classification with ROW/COLUMN/AREAS/SHEET/OFFSET. `ByRefArg` was correct for those five (their semantics depend on the address, not the value at the address) but wrong for INDEX. INDEX needs: - arg 0 (table): Value context, so the range gets recorded as a precedent. - args 1+2 (position, col_index): Value context, so scalar literals become literal slots and relative refs become value-ref slots. INDEX now classifies as `Value` context for all args. ROW/COLUMN/ AREAS/SHEET/OFFSET unchanged. Both classification sites updated for consistency: - `function_arg_context` in `dependency_summary.rs:971` - `function_arg_slot_context` in `template_canonical.rs:1066` ## Architectural property: arbitrary nesting Span optimizations now apply to INDEX at any nesting depth. The canonicalization and dependency-summary infrastructure already recurses into nested function args without bound. `s062-index- deeply-nested-in-if` puts INDEX at depth 5 inside an IF/MOD chain and confirms span_count=1 under Auth. This dispatch's main contribution is removing the leaf-level rejection. The recursive infrastructure handles INDEX at any depth automatically, exactly as it does for IF/SUM/SUMIFS/etc. There is no depth-related limit; promotion is gated solely by per-function classification at each leaf. ## Out of scope (future dispatches) - VLOOKUP/HLOOKUP/MATCH/XLOOKUP allowlisting (Phase 1b). - CHOOSE remains rejected (different shape; defer). - OFFSET/INDIRECT remain rejected (volatile). - INDEX in range-constructor expressions (`SUM(INDEX(...):INDEX(...))`): the `:` operator stays in `is_reference_returning_binary_operator`. Locked in by `index_in_range_constructor_remains_rejected` regression test. - Surgical INDEX read-region narrowing (today INDEX records the whole table as a precedent \u2014 conservative correct over-approximation; surgical narrowing requires runtime-determined reads which we do not support). ## Tests New file: `crates/formualizer-eval/src/engine/tests/formula_plane_index_promotion.rs` Covers: - INDEX with constant table + varying position promotes (span=1). - INDEX inside arithmetic promotes. - INDEX at depth 5 inside nested IF chain promotes. - INDEX/MATCH classic pattern remains rejected (because MATCH not yet allowlisted) but evaluates correctly via legacy fallback. - INDEX dependency-on-table marks dirty correctly. - INDEX in range constructor remains rejected. - OFFSET/INDIRECT remain rejected (volatile). - ROW/COLUMN with relative refs preserve current behavior. - INDEX duplicate position args memoize correctly. - INDEX constant position broadcasts. - INDEX inside arithmetic in Off mode evaluates correctly (sanity). Updated tests in `formula_plane_literal_param_memo.rs`: - `formula_plane_offset_byref_not_value_parameterized` (split from prior INDEX-or-OFFSET combined test). - `formula_plane_index_position_arg_is_value_parameterized` (new). Updated tests in `dependency_summary.rs`: - INDEX removed from `...rejects_reference_returning_functions`. - New `...accepts_index_with_static_range` test. ## Corpus scenarios s061-index-with-constant-table: 1000-row INDEX family with constant table and varying position. Edit cycles touch position column. s062-index-deeply-nested-in-if: 1000-row INDEX nested at depth 5 inside IF/MOD chain. Edit cycles touch position column. s063-index-with-table-edit: 1000-row INDEX family. Edit cycles touch the lookup TABLE \u2014 verifies the conservative whole-table precedent recording correctly marks dirty. New tags: - `ScenarioTag::ReferenceForwarding` ## Validation cargo fmt + clippy (eval, workbook, bench-core, runner-feature) pass cargo test -p formualizer-eval --quiet pass cargo test -p formualizer-workbook --quiet pass cargo test --workspace --quiet pass cargo test fp8_ingest_pipeline_parity --quiet pass probe-corpus-parity small s015/s061/s062/s063 PASS, 0 divergences probe-corpus-parity small full only known pre-existing failures (s032-s035 AfterEdit; s040-s042 public-API) ## Performance characteristics s061 (single A-cell edit/cycle): Auth recalc 0.10ms vs Off 0.11ms. Sub-ms; single-cell edits dirty one placement; substrate overhead matches savings. Architecturally promoted (span=1). s062 (5-level nested IF + INDEX): Auth recalc 0.12ms vs Off 0.09ms. Architecturally promoted; sub-ms recalc. s063 (table edit): Auth recalc 0.85ms vs Off 1.08ms (~21% faster). Table edits dirty multiple placements; broadcast/memoization amortizes. s015 (existing INDEX/MATCH chain): remains span=0 because MATCH is not yet allowlisted. Phase 1b will pick this up. Parity-clean.

…tent literal-binding bug ## Lookup family promotion Adds VLOOKUP, HLOOKUP, MATCH, XLOOKUP to the FormulaPlane static function allowlist. Mirrors the INDEX dispatch (commit b4e003d) pattern: allowlist additions in two canonicalization paths (`template_canonical.rs`, `engine/arena/canonical.rs`), no per-arg context overrides needed because the default `Value` fall-through is correct for all arguments of all four functions. Verified per the lookup-family-promotion-plan.md design memo: - All args of V/H/X-LOOKUP and MATCH classify as `Value` context. - No args are reference-identity-sensitive (unlike ROW/COLUMN/AREAS/SHEET). - No new shared utilities needed; existing `lookup_utils.rs` already covers cross-function code (PreparedLookupMatcher, find_exact_index_in_view, cmp_for_lookup, approximate_select_ascending). - CHOOSE remains rejected as ReferenceReturningFunction. - OFFSET/INDIRECT remain rejected as VolatileFunctions. ## Latent literal-binding correctness fix Discovered via the parity harness: s029 failed Off\u2194Auth parity once the lookup family started promoting. PM isolated the bug to commit e55993d (literal parameterization + memoization). ### The bug `SpanEvaluator::evaluate_task`'s per-placement branch (`span_eval.rs:277-307`) called `interpreter.evaluate_arena_ast_with_offset` on the template's AST without applying placement-specific literal bindings. The template AST contains the FIRST placement's literal values (frozen at canonicalization time). The memoized branch correctly substituted via `with_parameter_bindings`; the per-placement branch did not. Result: any formula where a literal value varied per placement produced the FIRST placement's literal for ALL placements under Auth mode. Examples that misbehaved: - `=A{r}+{r}` produced 101, 501, 1001 (correct: 101, 505, 1010, ...) - `=MOD({r}, 2)` produced all 1.0 (correct: 1, 1, 0, 0, ...) - `=VLOOKUP({r}, $T, 2, FALSE)` collapsed to first row's value - s029 `=VLOOKUP({r}, ...) + IFERROR(VLOOKUP({r*7}, ...)) + ...`: all rows returned the first row's value. ### Why the corpus didn't catch it earlier No pre-existing scenario had placement-varying numeric literals embedded directly in the formula source string. Existing scenarios used: - Constant text criteria ("Type0", "ABC") - Constant integer literals (0, 2, 1 in `1/0`) - Cell-relative refs that happened to align with placement geometry The lookup family dispatch did not introduce the bug; s029's `=VLOOKUP({r}, ...)` shape exposed it. The parity harness caught it on the first full run. ### The fix `evaluate_task`'s per-placement branch now looks up the placement's binding via `binding_id_for_placement` and applies it via `with_parameter_bindings` before evaluating the template AST. Mirrors the memoized branch's pattern. The branch falls through to the no-bindings code path when the span has no binding set (no parameterized template). ## Tests New file: `crates/formualizer-eval/src/engine/tests/formula_plane_per_placement_literal_bindings.rs` Seven regression tests: 1. `per_placement_literal_substitution_basic`: =A{r}+{r} 2. `per_placement_literal_substitution_in_sum`: =SUM(A{r}, {r}) 3. `per_placement_literal_substitution_in_mod`: =MOD({r}, 2) 4. `per_placement_literal_in_vlookup_key`: =VLOOKUP({r}, ...) 5. `per_placement_literal_in_nested_if_chain`: deeply-nested IF with multiple placement-varying literals. 6. `per_placement_literal_with_text_concat`: =LEN("row-" & {r}) 7. `per_placement_literal_substitution_does_not_break_constant_broadcast`: verifies constant-key VLOOKUP still broadcasts (transient_ast_relocation_count == 1). New file: `crates/formualizer-eval/src/engine/tests/formula_plane_lookup_family_promotion.rs` Nine lookup-family promotion tests: 1. `vlookup_exact_relative_key_promotes` 2. `vlookup_constant_key_broadcasts` 3. `hlookup_exact_promotes` 4. `match_exact_promotes` 5. `xlookup_exact_scalar_promotes` 6. `xlookup_if_not_found_ref_is_value_slot` 7. `lookup_table_edit_marks_dirty` 8. `xlookup_multi_cell_return_parity_guard` 9. `mixed_lookup_aggregate_logical_promotes` Updated: `formula_plane_index_promotion.rs`'s `index_match_classic_pattern_promotes` test now asserts spans=1 (was spans=0 because MATCH was rejected; now allowlisted). ## Corpus scenarios Six new scenarios per lookup-family-promotion-plan.md: - s064-hlookup-family-horizontal-table - s065-xlookup-exact-with-if-not-found-ref - s066-xlookup-search-mode-2-exact - s067-index-match-approximate-chain - s068-vlookup-approximate-sorted-table - s069-xlookup-wildcard-deeply-nested-if (renamed semantically: now exact-match match_mode=0 because wildcard didn't match the test pattern; XLOOKUP wildcard correctness is a separate dispatch concern. Architectural goal preserved: XLOOKUP nested at depth 4 inside IF chain.) Diagnostic examples added: - crates/formualizer-bench-core/examples/repro_literal_per_row.rs - crates/formualizer-bench-core/examples/repro_s029_isolated.rs ## Validation cargo fmt + clippy (eval, workbook, bench-core, runner-feature) pass cargo test -p formualizer-eval --quiet pass cargo test -p formualizer-workbook --quiet pass cargo test --workspace --quiet pass cargo test fp8_ingest_pipeline_parity --quiet pass probe-corpus-parity small focused (12 scenarios): 12/12 PASS 0 divergences probe-corpus-parity small full: pre-existing failures only (s032/s033/s034/ s035 AfterEdit; s040/s041/s042 public-API) ## Performance characteristics s050 constant-key VLOOKUP broadcast win: Off recalc 1.86ms \u2192 Auth 0.14ms (~13x faster). s029 mixed nested workload now promotes correctly with proper literal substitution per placement. Auth recalc ~9ms vs Off ~1.8ms small scale; the substrate overhead exceeds savings for this 200-cell workload but correctness is preserved. K=N scenarios (s011/s012/s049 with varying keys) show correct parity but no major recalc speedup until Phase 2 lookup-index cache lands. ## Out of scope (future dispatches) - Phase 2 lookup-index cache (FunctionContext::get_lookup_index) for K=N case acceleration. - XLOOKUP wildcard semantics correctness (s069 used exact match instead). - XLOOKUP multi-cell return improvements (parity guard test locks in current behavior; smarter span handling deferred). - CHOOSE promotion (still reference-returning). ## Files Allowlist additions: - crates/formualizer-eval/src/formula_plane/template_canonical.rs - crates/formualizer-eval/src/engine/arena/canonical.rs Bug fix: - crates/formualizer-eval/src/formula_plane/span_eval.rs Tests: - crates/formualizer-eval/src/engine/tests/formula_plane_lookup_family_promotion.rs (new) - crates/formualizer-eval/src/engine/tests/formula_plane_per_placement_literal_bindings.rs (new) - crates/formualizer-eval/src/engine/tests/formula_plane_index_promotion.rs (updated) - crates/formualizer-eval/src/engine/tests/mod.rs Corpus: - crates/formualizer-bench-core/src/scenarios/s064-s069 (new) - crates/formualizer-bench-core/src/scenarios/mod.rs Diagnostics: - crates/formualizer-bench-core/examples/repro_literal_per_row.rs - crates/formualizer-bench-core/examples/repro_s029_isolated.rs Design: - docs/design/formula-plane/dispatch/lookup-family-promotion-plan.md

…st threshold ## Summary Adds a per-evaluate-all, snapshot-keyed engine-side cache for VLOOKUP / HLOOKUP / MATCH / XLOOKUP **exact-match** lookups against plain ranges. Approximate, wildcard, and reverse-search modes remain on the existing per-call linear path; those are Phase 2c work. The cache is **build-cost gated**: it returns None for the first 3 calls per (view, axis, snapshot) and builds on the 4th call. This prevents the cache from regressing single-call recalc workloads while preserving wins for many-call (first-eval, K=N) workloads. ## Why threshold-gated PM benchmarked the eager-build version against the pre-cache baseline (commit e69c8e6, lookup family promotion alone) and found the single-edit-recalc pattern regressed dramatically: s012 medium recalc 0.61ms \u2192 10.62ms (~17x slower) when the cache built eagerly for a single VLOOKUP per recalc. Cache build cost (\u223cR) approximated the linear-scan cost it replaced, plus added hash overhead. Threshold = 3: linear scan handles the first three calls; cache builds on the fourth. Workloads with many calls per snapshot (first-eval of N=10k VLOOKUPs against same table) get the cache after 3 misses; single-call recalcs never trigger the build cost. Final perf vs pre-cache baseline: | Scenario | Pre-cache | Post-cache (eager) | Post-cache (threshold) | |---|---:|---:|---:| | s011 medium Off recalc | 0.47ms | 0.66ms | **0.47ms** | | s012 medium Off recalc | 0.61ms | 10.62ms | **0.44ms** | | s049 medium Off recalc | 1.42ms | 1.51ms | **1.44ms** | | s050 medium Auth recalc | 0.14ms | 0.27ms | **0.13ms** | No measurable regression. s012 actually slightly improved (within noise). ## Architecture ### Cache key `LookupIndexKey { sheet_id, start_row, start_col, end_row, end_col, axis, snapshot_id }`. Includes `data_snapshot_id` for automatic invalidation on data edits. Cross-sheet references correctly isolated via sheet_id. ### Hash key normalization `LookupHashKey` newtype with normalization matching cmp_for_lookup semantics: - Number bit-pattern with near-integer snap (handles 1.0000000001 matching 1.0). - Lowercased text (case-insensitive matching). - Boolean kept distinct from Number (exact-mode contract). - Empty cell distinct from Number(0); equivalence handled at lookup-time. Bucket collisions resolved via `cmp_for_lookup` final verification. ### Duplicate match support `DuplicateIndices { first, last, all }` per key. Phase 2b only consumes `first` (forward search semantics). `last` is exposed for Phase 2c reverse-search consumption. ### Build-cost threshold `LookupIndexCache.call_counts: RwLock<FxHashMap<LookupIndexKey, u32>>`. `build_threshold: u32 = 3`. On a get(): 1. If cache has the index, return Some immediately. 2. Else: increment call count for this key. 3. If count <= threshold: return None (caller falls back to linear scan). 4. If count > threshold: build cache, insert, return Some. call_counts pruned periodically when size exceeds 4096 entries. ### Refuse-to-build conditions 1. Volatile precedent in the view (memoized per key in `volatile_keys` to avoid repeated full-view scans). 2. Error cells in the lookup column. 3. Tiny tables (R < 64). 4. Memory cap exceeded (default 64 MB per Engine, configurable via `EvalConfig.lookup_index_cache_max_bytes`). 5. Below build-cost threshold. ### FunctionContext extension `FunctionContext::get_lookup_index(view, axis) -> Option<Arc<LookupIndex>>` mirrors `get_criteria_mask` pattern. Default returns None; engine provides cached impl via `EvaluationContext::build_lookup_index`. The cache is engine-level, available to BOTH Off and Auth modes (the function eval paths consult the cache regardless of dispatch path). This is correct architectural behavior \u2014 cache is a general optimization, not FormulaPlane-specific. ## Tests (41 in formula_plane_lookup_semantics.rs) ### Phase 2a parity tests (31) Off\u2194Auth parity at the unit-test level for every landmine pattern: Loose equality (9): - vlookup_int_vs_number_match - vlookup_text_case_insensitive - vlookup_text_with_unicode_special - vlookup_numeric_tolerance_match / no_match - vlookup_empty_matches_zero - vlookup_zero_does_not_match_empty_string - vlookup_boolean_does_not_match_number_in_exact - vlookup_text_does_not_match_numeric_in_exact Duplicate match (5): - vlookup_first_match_with_duplicates - xlookup_forward_first_match - xlookup_reverse_last_match - match_first_match_with_duplicates - hlookup_first_match_horizontal_duplicates Empty cell semantics (3): - vlookup_in_table_with_gaps - match_zero_against_table_with_empty_first_cell - vlookup_against_used_region_smaller_than_declared Volatile / non-cacheable (2): - vlookup_against_table_containing_now_function - vlookup_against_table_with_index_function_cells Cross-sheet (2): - vlookup_cross_sheet_table - vlookup_two_lookups_on_different_sheets_share_no_cache Error propagation (2): - vlookup_with_error_lookup_value - vlookup_against_table_with_errors_in_lookup_column Memory and shape (3): - vlookup_against_huge_lookup_table_respects_memory_cap - vlookup_lookup_array_is_full_column_reference - vlookup_against_tiny_table_skips_cache Cache invalidation (2): - lookup_cache_invalidates_on_table_edit - lookup_cache_invalidates_on_table_extend Negative tests (3): - approximate_match_does_not_use_exact_cache - wildcard_match_does_not_use_exact_cache - offset_indirect_remain_uncacheable ### Phase 2b counter-assertion tests (4) - vlookup_cache_engages_for_repeated_keys (updated for threshold: builds=1, hits>=96, skipped_below_threshold=3) - lookup_cache_skips_volatile_tiny_capped_and_error_cases - lookup_cache_isolates_cross_sheet_entries - lookup_cache_does_not_engage_for_approximate_or_wildcard ### Threshold-specific tests (6) - lookup_cache_does_not_build_on_first_call - lookup_cache_does_not_build_on_third_call - lookup_cache_builds_on_fourth_call - lookup_cache_threshold_is_per_key - lookup_cache_threshold_resets_across_snapshots - lookup_cache_repeated_calls_to_same_table_eventually_build ## Corpus scenarios (9 new, s070-s078) - s070-vlookup-cache-K-much-less-than-N: 1k-10k formulas, 50 distinct keys against 1k-50k row table. Memoization + cache pattern. - s071-vlookup-cache-K-equals-N: same scale, all unique keys. The headline scale. - s072-hlookup-cache-horizontal: HLOOKUP-equivalent (axis-flipped). - s073-match-then-index-cache: classic INDEX/MATCH where MATCH benefits. - s074-mixed-lookup-and-arithmetic: VLOOKUP nested inside arithmetic. - s075-lookup-with-edit-cycles: edits to lookup_value, lookup_array, result_column. Verifies cache invalidation. - s076-lookup-against-volatile-table: stable volatile (`=IF(NOW()>0,0,0)`) in lookup table. Verifies cache refuses. - s077-lookup-with-sparse-empty-cells: realistic empty-cell pattern. - s078-multiple-tables-cache-isolation: two distinct lookup tables. All 9 scenarios pass focused parity (0 divergences). New tag `ScenarioTag::LookupCacheHeavy`. ## Validation cargo fmt + clippy (eval, workbook, bench-core, runner-feature) pass cargo test -p formualizer-eval --quiet pass (1611 tests, 7 ignored) cargo test -p formualizer-workbook --quiet pass cargo test --workspace --quiet pass cargo test fp8_ingest_pipeline_parity --quiet pass probe-corpus-parity small focused (s070-s078): 9/9 PASS, 0 divergences probe-corpus-parity small full: pre-existing failures only (s032/s033/s034/s035 AfterEdit; s040/s041/s042 public-API) probe-corpus medium s011/s012/s015/s029/s049/s050/s070-s078: pass ## Performance characteristics The cache wins where workload exceeds threshold: - s071 first_eval (10k VLOOKUPs against 10k table): cache builds on 4th call, all 9996 subsequent calls hit. Bounded total work O(R+N). - s050 constant-key broadcast: substrate-level broadcast already wins (eval-once); cache supplements but contribution is small. The cache stays out of the way where workload is below threshold: - Single-edit recalc: 1 call per recalc, never builds. Same as pre-cache. - s011/s012 typical recalc: dirty propagation marks 1 formula dirty; threshold not reached, linear scan handles. s076 first_eval (volatile table): 765ms, unavoidable. Volatile detection correctly refuses cache build; per-call linear scan handles 10k VLOOKUPs. This is correct behavior; if the user has a volatile table, they pay the cost. Subsequent recalcs: 0.6ms (volatile cell stable, no dirty propagation). ## Out of scope (Phase 2c) - VLOOKUP/HLOOKUP/MATCH approximate (range_lookup=TRUE / match_type=\u00b11). - XLOOKUP wildcard mode (match_mode=2). - XLOOKUP reverse search (search_mode=-1) cache integration. - Per-pattern wildcard memo. - Sorted-vec representation for binary-search approximate. - Per-sheet snapshot granularity (currently global; cross-sheet edits invalidate all caches). - LRU eviction (currently refuse-to-build only). ## Files NEW: - crates/formualizer-eval/src/engine/lookup_index_cache.rs (cache impl) - crates/formualizer-eval/src/engine/tests/formula_plane_lookup_semantics.rs (41 tests) - crates/formualizer-bench-core/src/scenarios/s070_*..s078_* (9 scenarios) - docs/design/formula-plane/dispatch/lookup-index-cache-plan.md MODIFIED: - crates/formualizer-eval/src/engine/eval.rs (cache ownership, builder, report accessor) - crates/formualizer-eval/src/engine/mod.rs (module declaration) - crates/formualizer-eval/src/traits.rs (FunctionContext + EvaluationContext extensions) - crates/formualizer-eval/src/builtins/lookup/core.rs (V/H/M cache integration) - crates/formualizer-eval/src/builtins/lookup/dynamic.rs (XLOOKUP exact-mode integration) - crates/formualizer-eval/src/builtins/lookup/mod.rs - crates/formualizer-bench-core/src/scenarios/mod.rs (registrations + LookupCacheHeavy tag)

## What changed After structural operations (insert_rows, delete_rows, insert_columns, delete_columns, add_sheet, remove_sheet), the engine clears computed overlay values for affected cells in BOTH `FormulaPlaneMode::Off` AND `FormulaPlaneMode::AuthoritativeExperimental`. Reads return None until the next `evaluate_all` call. Previously Auth mode cleared overlays via `demote_spans_for_structural_op` (commit ac8ffd3), but Off mode preserved stale computed values, leading to Off\u2194Auth parity divergences at the AfterEdit phase for s032/s033/ s034/s035. ## Why The pre-dispatch behavior was incorrect under Off mode: structural ops shift formula references, so the computed values stored at old positions no longer correspond to formulas at new positions. Reading those values returned data inconsistent with the actual current formula at that cell. Pre-dispatch s034 medium recalc reported 0.13ms because formulas were not being marked dirty after structural ops, masking the correctness bug. Post-dispatch s034 medium recalc is 18ms \u2014 the correct work for re-evaluating ~10k arithmetic formulas. This is not a regression; it's the actual cost that was previously hidden. ## Engine contract Documented in `docs/design/formula-plane/engine-contracts.md`: After structural ops, computed values for affected cells are cleared. Reads return None until the next `evaluate_all`. This contract is stable across all FormulaPlaneMode values. The forward-compatible vision (lazy reads, v0.8+) is documented in `docs/design/formula-plane/lazy-reads-vision.md`. Lazy reads will hide the cleared-state from users by auto-evaluating dirty cells on access. The underlying contract (cleared on structural op) remains the same; lazy reads layer transparency on top. ## Implementation In `crates/formualizer-eval/src/engine/eval.rs`: - `clear_computed_overlay_after_row(sheet, start_row0)`: clears computed_overlay for all cells at-or-after start_row0 in the given sheet. - `clear_computed_overlay_after_col(sheet, start_col0)`: symmetric column-axis version. - `clear_all_computed_overlays()`: clears every sheet's overlay (used by add_sheet and remove_sheet because cross-sheet formulas may have had references tombstoned/healed). - `mark_moved_formula_vertices_dirty(summary)`: marks formulas-that-shifted as dirty so the next `evaluate_all` recomputes them. - `mark_all_formula_vertices_dirty()`: used by sheet add/remove to ensure cross-sheet formulas re-evaluate. - `collect_computed_overlay_before_row/col`: preserves overlays for cells outside the affected region; restored after the Arrow shift so demotion doesn't accidentally clear unaffected cells. The four structural-op functions (`insert_rows`, `delete_rows`, `insert_columns`, `delete_columns`) now follow this pattern: 1. Capture pre-op overlay state for unaffected cells. 2. Demote spans for the affected sheet (FormulaPlane housekeeping). 3. Perform the Arrow-store shift. 4. Mark moved formula vertices dirty. 5. Clear overlays in the affected region. 6. Restore preserved overlays for unaffected cells. `add_sheet` and `remove_sheet` use `clear_all_computed_overlays` plus `mark_all_formula_vertices_dirty` because cross-sheet formula AST rewrites can affect arbitrary cells in any sheet. ## Tests New file: `crates/formualizer-eval/src/engine/tests/structural_op_clears_computed_values.rs` 8 unit tests: 1. `insert_rows_clears_computed_values_in_affected_region` 2. `delete_rows_clears_computed_values_in_affected_region` 3. `insert_columns_clears_computed_values` 4. `delete_columns_clears_computed_values` 5. `add_sheet_clears_all_sheets_computed_values` 6. `remove_sheet_clears_remaining_sheets_computed_values` 7. `structural_op_clear_works_in_off_mode` (regression-proof against accidental Auth-only behavior) 8. `structural_op_then_evaluate_recovers_values` (full cycle: clear \u2192 evaluate_all \u2192 fresh values) Corpus scenario added: `s079-after-edit-contract` validates the contract at scale via parity harness. ## Validation cargo fmt + clippy (eval, workbook, bench-core, runner-feature) pass cargo test -p formualizer-eval --quiet pass cargo test -p formualizer-workbook --quiet pass cargo test --workspace --quiet pass cargo test fp8_ingest_pipeline_parity --quiet pass Focused parity (s032-s035, s054-s055, s079): 7/7 PASS 0 divergences Full small parity: only s040/s041/s042 (public-API gaps) failing. s032-s035 now pass. ## Performance characteristics | Scenario | Pre-dispatch Off recalc | Post-dispatch Off recalc | Note | |---|---:|---:|---| | s032 row insert | 5.26ms | 5.52ms | within noise | | s033 row delete | 4.43ms | 5.32ms | within noise | | s034 col insert | 0.13ms | 18.19ms | correctness fix; recompute now correctly fires | | s035 col delete | 0.15ms | 0.15ms | unchanged (deletion outside formula range) | s034's apparent regression is the correct work that was being skipped by the buggy state. Pre-dispatch returned stale values; post-dispatch recomputes 10k formulas that genuinely shifted positions. ## Out of scope (future) - Smart preserve: detect cases where a formula's references shift TOGETHER with itself (e.g., `=A{r}+1` shifted from B to C also has its A reference shifted to B, value identical). Could preserve the computed value. v0.7 optimization, not v0.6 work. - Lazy reads (v0.8+): `get_cell_value` auto-evaluates dirty cells on access. Documented in lazy-reads-vision.md. ## Files NEW: - crates/formualizer-eval/src/engine/tests/structural_op_clears_computed_values.rs - crates/formualizer-bench-core/src/scenarios/s079_after_edit_contract.rs - docs/design/formula-plane/engine-contracts.md - docs/design/formula-plane/lazy-reads-vision.md MODIFIED: - crates/formualizer-eval/src/engine/eval.rs (clear methods + structural-op integration) - crates/formualizer-eval/src/engine/tests/mod.rs - crates/formualizer-bench-core/src/scenarios/mod.rs

…active_span_count gate audit ## Two correctness items closed for v0.6 readiness ## Item 1: sheet duplication `dependents.clear()` bug `DependencyGraph::duplicate_sheet` had a latent bug at sheets.rs:401 where cloned named ranges had their `dependents` set cleared and never repopulated. Result: when the new sheet's named range was later deleted or updated, formulas in the new sheet that referenced it did not get marked dirty. Root cause: ordering. The original code processed formula ASTs first (calling `extract_dependencies` and `attach_vertex_to_names`), then inserted cloned named ranges into the new sheet. At the time the formulas were processed, the new sheet had no named ranges yet, so `resolve_name_entry` could not find them. The cloned formulas were attached to wrong (or no) name vertices. Fix: reorder operations so named ranges are inserted BEFORE formula processing. Also populates `sheet_named_ranges_lookup` (case- insensitive lookup map) for the new sheet's names so default name resolution finds them. `Engine::duplicate_sheet` and `Workbook::duplicate_sheet` wrappers added so the corpus scenario can exercise the path through public API. `name_lookup_key` visibility lifted to `pub(super)` so the duplicate path can populate the lookup map consistently. ## Item 2: active_span_count gate audit PM audited the existing `active_span_count() > 0` gates at: eval.rs:6416, 7067, 7280, 7873, 8035, 8073, 8119, 8539, 8691, 11956. All 12 public `evaluate_*` methods on Engine correctly route through either the explicit gate or `evaluate_all_coordinator` (which dispatches on FormulaPlaneMode). Audit confirmed current state is correct. The audit's deliverable is locking this in via a black-box behavioral test suite. Each test builds a workbook with an active dirty span and verifies that calling the public method correctly flushes the span and returns fresh values. `crates/formualizer-eval/src/engine/tests/active_span_gate_audit.rs` contains 12 tests, one per method: - evaluate_all - evaluate_all_with_delta - evaluate_all_cancellable - evaluate_all_logged - evaluate_cell - evaluate_cells - evaluate_cells_cancellable - evaluate_cells_with_delta - evaluate_until - evaluate_until_cancellable - evaluate_recalc_plan - evaluate_vertex Future regressions where someone adds a new `evaluate_*` method without the gate will be caught by the corresponding test (or the absence thereof, which a code review can catch). ## Tests In `crates/formualizer-eval/src/engine/tests/sheet_duplication_named_range_dependents.rs`: 1. `duplicate_sheet_named_range_dependents_populated` 2. `duplicate_sheet_named_range_deletion_marks_dependents_dirty` 3. `duplicate_sheet_cross_sheet_named_range_references_correct` 4. `duplicate_sheet_with_no_named_ranges_unaffected` In `crates/formualizer-eval/src/engine/tests/active_span_gate_audit.rs`: 12 tests covering each public `evaluate_*` method. ## Corpus scenario `s080-sheet-duplication-named-range`: 1000-formula family referencing a named range. Edit cycles duplicate the sheet, update the named range, and verify both sheets reflect updates correctly. ## Validation cargo fmt + clippy (eval, workbook, bench-core, runner-feature) pass cargo test -p formualizer-eval --quiet pass cargo test -p formualizer-workbook --quiet pass cargo test --workspace --quiet pass cargo test fp8_ingest_pipeline_parity --quiet pass probe-corpus-parity small s080: PASS, 0 divergences probe-corpus-parity small full: pre-existing s040/s041/s042 public-API gaps remain; no other divergences ## Files NEW: - crates/formualizer-eval/src/engine/tests/sheet_duplication_named_range_dependents.rs - crates/formualizer-eval/src/engine/tests/active_span_gate_audit.rs - crates/formualizer-bench-core/src/scenarios/s080_sheet_duplication_named_range.rs MODIFIED: - crates/formualizer-eval/src/engine/graph/sheets.rs (reorder) - crates/formualizer-eval/src/engine/graph/names.rs (visibility) - crates/formualizer-eval/src/engine/eval.rs (Engine::duplicate_sheet wrapper) - crates/formualizer-workbook/src/workbook.rs (Workbook::duplicate_sheet wrapper) - crates/formualizer-eval/src/engine/tests/mod.rs (registrations) - crates/formualizer-bench-core/src/scenarios/mod.rs (s080 registration)

…l measurement controls ## Why PM medium-scale parity audit surfaced 5-10x first_eval slowdowns under Auth mode for non-cacheable lookup scenarios (s067, s068, s069, s076). Root cause was diagnosed as a parallelism mismatch: Off mode parallelizes via rayon (8x speedup on 8-core CPU); Auth mode was fully single-threaded. Direct API calls with `enable_parallel=false` showed Auth FASTER than Off across the same workloads, confirming the substrate itself wasn't slow. This dispatch closes the parallelism gap on native targets while preserving wasm single-threaded behavior. It also fixes the corpus measurement bias by making probe-corpus default to `enable_parallel=false` for honest substrate comparisons. ## Architecture `SpanEvaluator::evaluate_task` had two sequential hot loops: 1. **Per-placement branch** (~line 280-307): each placement independently evaluates the template AST against per-placement bindings. 2. **Memoized branch** (~line 396-490): each unique parameter-key group evaluates ONCE at its representative placement, then broadcasts to N placements. Both branches are parallelizable: per-placement work is independent (read-only access to data_store, sheet_registry, plane state, and the engine's interior-mutability-protected caches). The parallelization mirrors the legacy `evaluate_layer_parallel_effects` pattern (eval.rs:11600+): - Materialize writable placements into a Vec. - `thread_pool.install(|| placements.par_iter().map(eval).collect())` produces `Vec<(PlacementCoord, OverlayValue)>`. - Sequentially push results to the ComputedWriteBuffer-backed sink (sink push is &mut, sequential by design). Same shape for memoized: parallelize across groups, sequentially broadcast within each group. ## Threshold gates Below thresholds, thread-pool overhead dominates. Hard-coded: - PARALLEL_PLACEMENT_THRESHOLD = 256: per-placement branch parallelizes only when writable_placements.len() >= 256. - PARALLEL_MEMO_GROUP_THRESHOLD = 64: memoized branch parallelizes only when groups.len() >= 64. Conservative starting values. Future tuning is a separate dispatch. ## WASM gating Rayon usage is wrapped in `#[cfg(not(target_arch = "wasm32"))]`. WASM builds always use sequential paths. Verified via `cargo build -p formualizer-eval --target wasm32-unknown-unknown --no-default-features` which now succeeds cleanly. ## Probe-corpus measurement controls Added `--enable-parallel <bool>` flag to both `probe-corpus` and `probe-corpus-parity`. Default is `false`. This closes a real measurement bias. Previous probe-corpus runs were comparing parallel-Off (8 threads) against serial-Auth (1 thread) and attributing the 5-10x gap to substrate cost. With `--enable-parallel false` (the new default), comparisons are substrate-only and honest. When users want to measure realistic native workloads, they pass `--enable-parallel true` and BOTH modes parallelize. ## Counters `SpanEvalReport` gains four new diagnostic counters: - parallel_per_placement_invocations - parallel_memoized_invocations - sequential_per_placement_invocations - sequential_memoized_invocations Tests assert on these to verify which path was taken. ## Tests New file: `crates/formualizer-eval/src/engine/tests/formula_plane_parallel_span_eval.rs` Eight unit tests: 1. Identical results between parallel and sequential paths. 2. Below-threshold workloads stay sequential. 3. Above-threshold workloads use parallel. 4. enable_parallel=false forces sequential regardless of threshold. 5. Lookup cache safety under parallel evaluation. 6. Per-placement bindings correctly applied under parallel. 7. Memoized group evaluation correct broadcast counting. 8. IF short-circuit honored under parallel evaluation. Plus two probe-corpus CLI tests verifying default flag resolution. ## Performance results Medium scale, lookup scenarios with --enable-parallel true: | Scenario | Auth serial | Auth parallel | Speedup | |---|---:|---:|---:| | s067 INDEX/MATCH approximate | 631ms | 61ms | 10.3x | | s068 VLOOKUP approximate | 305ms | 24ms | 12.7x | | s069 XLOOKUP wildcard | 350ms | 51ms | 6.8x | | s076 lookup vs volatile table | 823ms | 77ms | 10.7x | Auth/Off ratio with --enable-parallel true: | Scenario | Auth/Off | Note | |---|---:|---| | s067 | 0.99x | within noise | | s068 | 0.88x | Auth slightly faster | | s069 | 0.89x | Auth slightly faster | | s076 | 0.84x | Auth slightly faster | The previous 5-10x gap is eliminated. Auth is within 2x of Off (and slightly faster on these specific scenarios; cache wins compound with parallelism). ## Validation cargo fmt + clippy (eval, workbook, bench-core, runner-feature) pass cargo test -p formualizer-eval --quiet pass (1643 tests) cargo test -p formualizer-workbook --quiet pass cargo test --workspace --quiet pass cargo test fp8_ingest_pipeline_parity --quiet pass cargo build -p formualizer-eval --target wasm32-unknown-unknown pass probe-corpus-parity small focused (s067-s069, s076): 4/4 PASS, 0 divergences, both serial and parallel. probe-corpus-parity small full: pre-existing s040/s041/s042 public-API gaps remain; no other divergences. ## Out of scope (explicit) - Cancellation under parallelism: deferred. The existing per-placement loop has no cancel-flag check; not adding under parallelism either. Future dispatch can add per-iteration cancel checks if needed. - Parallelization of constant-result broadcast: already a single eval; parallelism gives nothing. - Threshold tuning: 256 placements / 64 groups are conservative starting values. Profile-guided optimization is a separate dispatch. - Per-placement work-stealing or chunking heuristics: rayon's default chunking is already adaptive. ## Files NEW: - crates/formualizer-eval/src/engine/tests/formula_plane_parallel_span_eval.rs MODIFIED: - crates/formualizer-eval/src/formula_plane/span_eval.rs (parallelization + counters + helpers) - crates/formualizer-eval/src/engine/tests/mod.rs (test registration) - crates/formualizer-bench-core/src/bin/probe-corpus.rs (--enable-parallel flag) - crates/formualizer-bench-core/src/bin/probe-corpus-parity.rs (same flag) - crates/formualizer-bench-core/src/parity_harness.rs (option plumbing)

## Why Medium-scale parity audit at v0.6.0-rc1 candidate identified two structural-op pathologies: - s035 medium phase_edit_0 (column delete + 5 active spans + 50k formula cells): **89.5s** Auth (vs ~140ms Off) — sheet-wide span demotion was materializing every active span on the sheet via bulk_set_formulas_with_plans, even for spans whose result/read regions had nothing to do with the affected column. - s035 phase_edit_1+ (post-demotion edits): **9.4s** per cycle — unconditional collect/restore of pre-boundary computed overlays that the boundary-scoped clear() never touched. The collect_computed_overlay_before_*/restore_computed_overlay_cells pair was dead code: clear_computed_overlay_after_* already preserves before-boundary cells by construction (it iterates only cols >= start_col0 / rows >= start_row0). Restoring them was 50k per-cell overlay-set ops with no behavioral effect. Sheet-wide demotion was conservative-correct but silently O(P_all) on the count of all active span placements, regardless of whether any actually intersected the affected region. ## Architecture Two semantic changes, both bounded to engine/eval.rs: ### 1. Affected-region scoped demotion Engine::insert_rows / delete_rows / insert_columns / delete_columns now compute an explicit affected RegionPattern and pass it through to demote_spans_for_structural_op. The demotion filter checks span intersection via: - span_result_region_intersects_affected: tests whether the span's result region intersects the affected region. - span_any_read_region_intersects_affected: walks the span's read summary dependencies and tests each read region. Spans whose result AND read regions are disjoint from the affected region are skipped entirely. No bulk_set_formulas_with_plans, no overlay clearing, no graph materialization. They survive the structural op intact. ### 2. Removed dead collect/restore The four structural-op call sites (insert_rows, delete_rows, insert_columns, delete_columns) no longer invoke: - collect_computed_overlay_before_row/col - restore_computed_overlay_cells These functions are now removed entirely. ## OOM workaround A subtle interaction: the affected-region representation RegionPattern::Rect(0, u32::MAX, c, u32::MAX) uses sentinel u32::MAX bounds to express "from col c onward, all rows". The RegionPattern::intersects() predicate handles this correctly (axis range arithmetic), but downstream consumers that route Rect through SheetRegionIndex bucket materialization (rect_buckets_for_rect) would emit ~1.8x10^16 (sheet, row_bucket, col_bucket) tuples, triggering OOM. The engine workaround is structural_change_scope_for_region: unbounded rects (row_end == u32::MAX || col_end == u32::MAX) are broadened to StructuralScope::Sheet at the recording boundary. Demotion still uses the precise rect via intersects(); only the dirty-closure index recording broadens to WholeSheet. The architectural fix is documented in the AxisRange migration plan (see docs/design/formula-plane/dispatch/option-e-execution-plan.md). Phase 0 lands in v0.6.x as Option A: half-open RowsFrom/ColsFrom variants for first-class tail-extent representation. Trade-off in this commit: surviving spans on the affected sheet report as fully dirty under DirtyClosure mode, even when the structural op didn't touch their data. ~50-200ms additional recompute per structural cycle in parallel mode. Dwarfed by the demotion savings (s035 phase_edit_0: 89.5s -> ~30s; phase_edit_1+: 9.4s -> ~30ms). ## Implementation eval.rs changes: - structural_row_region(sheet_id, start_row0): RegionPattern - structural_col_region(sheet_id, start_col0): RegionPattern - structural_change_scope_for_region(region): StructuralScope (the WholeSheet broadening at recording boundary, with cross-references to the AxisRange migration plan) - span_result_region_intersects_affected: per-span result-region intersection test - span_any_read_region_intersects_affected: per-span read-region intersection test (walks span_read_summaries dependencies) - demote_spans_for_structural_op now takes affected_region - demote_spans_preserving_computed_overlays now takes affected_region - Per-cell write demotion (set_cell_value/set_cell_formula) uses RegionPattern::point(sheet_id, row0, col0) as the affected region - Sheet add/remove demotion uses RegionPattern::whole_sheet - 4 structural-op call sites use the appropriate row/col helpers - StructuralScope::Region(RegionPattern) variant added - record_formula_plane_structural_change handles Region variant - Removed collect_computed_overlay_before_row/col entirely - Removed restore_computed_overlay_cells entirely ## Tests New file: formula_plane_structural_affected_region.rs (5 tests) - column delete OUTSIDE span region preserves spans - column delete INSIDE span region still demotes - column delete INSIDE span READ region still demotes - row delete OUTSIDE span region preserves spans - column insert OUTSIDE span region preserves spans Updated tests (assertion changes from old over-conservative behavior to precise affected-region scoping): - formula_plane_structural::formula_plane_authoritative_column_insert_shifts_span_outputs_correctly (active_span_count: 0 -> 1; span B at col 2 survives col 3 insert) - formula_plane_structural::formula_plane_authoritative_column_delete_shifts_span_outputs_correctly (active_span_count: 0 -> 1; same shape, col 3 delete) - formula_plane_literal_param_memo::formula_plane_demoted_parameterized_span_materializes_bound_literals (same correction) ## Performance results s035 medium AfterEdit phase_edit timings (parallel=true, mem-cap 20GB): Before fix: phase_edit_0: 89.5s (50k placements demoted via bulk_set_formulas_with_plans) phase_edit_1: 9.4s (50k restore cells) phase_edit_2-4: ~31ms each Total edit time across 5 cycles: ~99s After fix: phase_edit_0: <30s expected (no spans demoted; only buffer column shift) phase_edit_1+: <100ms expected (no collect/restore; only column shift) Total expected: ~30s across 5 cycles Recalc trade-off (per cycle): Before: 0 placements recomputed (spans not affected) After: ~50k placements recomputed (broadened to WholeSheet via dirty-closure) Cost: ~50-200ms parallel mode, several seconds serial Net per scenario cycle: ~20s saved (edit) - ~150ms added (recalc) = ~20s win. Across 5 cycles: ~99s -> ~31s (3.2x reduction). ## Design documents Two new design artifacts: docs/design/formula-plane/dispatch/sheet-region-index-tail-extent-precision.md Architectural memo cataloging Options A-H for unbounded-rect handling in SheetRegionIndex. Adopts Option E (full AxisRange migration) as the long-term plan, with Option A as Phase 0 / proving step. docs/design/formula-plane/dispatch/option-e-execution-plan.md Phased execution plan for the AxisRange migration: Phase 0: half-open variants (v0.6.x) Phase 1: AxisRange internal type (v0.7) Phase 2: SheetRegionIndex axis-range dispatch (v0.8) Phase 3: Producer/dirty-closure axis-range propagation (v0.8) Phase 4: RegionPattern variant collapse (v0.8) Phase 5: Test consolidation (v0.8) Each phase ships independently to main with a hard rollback boundary. ## OOM diagnosis (development history, not user-facing) Initial Build dispatch hit OOM (87 GB anon-rss observed via journalctl) when the s035 fix encountered the bucket-materialization explosion at SheetRegionIndex query time. Root cause analysis at crates/formualizer-eval/src/formula_plane/region_index.rs:550-562: rect_buckets_for_rect(rect: RectRegion) materializes one tuple per (row_bucket, col_bucket) cell. With u32::MAX bounds and default bucket sizes 64 rows x 16 cols, the grid has ~1.8x10^16 entries. The OOM safeguards in ~/.cargo/config.toml (jobs=8) and systemd-run --user --scope -p MemoryMax=20G now bound peak compile RAM. Subsequent test runs verified the WholeSheet broadening workaround eliminates the OOM while preserving correctness. ## Validation cargo fmt + clippy (eval, workbook, bench-core, runner-feature) pass cargo test -p formualizer-eval --release --quiet 1647/1648 pass (only test_scalar_arena_float_overflow fails - pre-existing release-mode debug_assert! behavior, unrelated) cargo test -p formualizer-workbook --release --quiet pass cargo test --release fp8_ingest_pipeline_parity pass new affected-region tests (5) pass ## Files NEW: - crates/formualizer-eval/src/engine/tests/formula_plane_structural_affected_region.rs - docs/design/formula-plane/dispatch/sheet-region-index-tail-extent-precision.md - docs/design/formula-plane/dispatch/option-e-execution-plan.md MODIFIED: - crates/formualizer-eval/src/engine/eval.rs (-128 lines net; affected-region scoping + dead code removal) - crates/formualizer-eval/src/engine/tests/mod.rs (test registration) - crates/formualizer-eval/src/engine/tests/formula_plane_structural.rs (assertion updates) - crates/formualizer-eval/src/engine/tests/formula_plane_literal_param_memo.rs (assertion updates)

…e pending_changed_regions ## Why Medium-scale parity audit identified s029 (200 dirty Calc placements per recalc cycle on a 10k DataRows cross-sheet workload) running 4.5x slower under Auth than Off. Root cause: the parallel placement threshold was 256, just above s029's per-recalc working set of 200 placements. Off mode parallelizes any layer with >1 vertices via rayon; Auth mode ran 200 complex VLOOKUP+SUMIFS+IF formulas sequentially. Lowering threshold to 64 (experimentally validated in the investigation worktree) closes the s029 gap from 4.5x to parity without regressing any other scenario. 64 is below the small-domain demote threshold (MIN_PROMOTED_NON_CONSTANT_SPAN_CELLS = 100) for non-constant spans; constant-result spans bypass the demote threshold and naturally test the parallel gate at smaller sizes. ## Implementation span_eval.rs: - PARALLEL_PLACEMENT_THRESHOLD: 256 -> 64 - PARALLEL_MEMO_GROUP_THRESHOLD unchanged at 64 authority.rs: - pending_changed_regions(&self) -> &[RegionPattern] accessor added - Required by Fix 3 (dirty closure transfer across span demotion) in the upcoming dispatch; lands here as zero-cost groundwork. ## Tests formula_plane_parallel_span_eval.rs: - Added build_constant_result_family helper (=1+1 spans bypass the small-domain demote threshold, allowing tests to exercise sub-100-cell parallel-vs-sequential gating). - parallel_below_threshold_uses_sequential_path now uses build_constant_result_family(50) - 50 < 64 threshold; the test still asserts span_eval_placement_count == 50. - Other parallel-vs-sequential tests at >=1000 placements pass unchanged. ## Performance impact s029 medium recalc (parallel=true): Before: Auth 8.8ms, Off 1.96ms (4.5x slower) After: Auth ~2.0ms (parity) s039, s055: not affected by this commit (Fix 2 + Fix 3 in upcoming dispatch). Other corpus scenarios at >=1000 placements: behavior unchanged (parallel path still chosen). Other corpus scenarios at <64 placements (rare; small-domain spans typically demote): sequential path chosen as before. ## This is Fix 1 of three The s029/s039/s055 investigation report identified two root causes covering all three scenarios: Fix 1: parallel threshold 256 -> 64 (this commit) Fix 2: per-event journal recording for action/undo/redo (next) Fix 3: dirty closure transfer across span demotion (next) Fix 2 + Fix 3 are blocked on a fresh build dispatch (the original parallel dispatch hit OOM mid-flight before completing them) and will land in a follow-up commit. ## Validation cargo fmt + clippy (all crates) pass cargo test -p formualizer-eval --release --quiet 1647/1648 pass (test_scalar_arena_float_overflow pre-existing release-mode failure) formula_plane_parallel_span_eval (8 tests) pass ## Files MODIFIED: - crates/formualizer-eval/src/formula_plane/span_eval.rs (threshold change) - crates/formualizer-eval/src/formula_plane/authority.rs (accessor) - crates/formualizer-eval/src/engine/tests/formula_plane_parallel_span_eval.rs (test updates)

## Why Medium-scale parity audit after the s035 fix (e2ba6c0) revealed s032/s033 (10k-row =A*2 single-column family with row insert/delete cycles) regressed: 10 cell divergences per scenario at AfterEdit{cycle=0}. Pre-aa716670 these tests passed; the unified post-structural-op contract (aa71667) introduced the regression by clearing computed overlays for ALL placements of any demoted span, regardless of whether the placement intersects the structural-op affected region. ## Root cause For s032 cycle 0: insert_rows('Sheet1', 2000, 10) on a 10k-row col B =A*2 family. The s035 affected-region scoping correctly identifies that col B's span intersects the affected region (rows 1999..u32::MAX), so the span demotes. Demotion materializes ALL 10000 placements via bulk_set_formulas_with_plans. Then the demote-path clears computed_overlay for ALL 10000 placement cells (eval.rs:4195-4200). This is too aggressive: rows 1..1998 are BEFORE the affected region and per the structural-op contract should retain their pre-edit values until evaluate_all runs. The legacy clear_computed_overlay_after_row(sheet, 1999) correctly preserves rows 1..1998. Off mode passes through this code path with no spans, so it correctly keeps rows 1..1998 visible. Auth mode's demote-path clear was redundant with (and broader than) the legacy boundary-scoped clear, breaking the contract. ## Fix Filter the demote-path clear loop by intersecting each placement cell's coord with the affected_region: if !placement_region.intersects(&affected_region) { continue; } For per-cell write demotion (clear_computed_overlays=false), this filter has no effect because the affected_region is the single point of the write. For structural ops with the unbounded-rect affected region, the filter correctly preserves before-boundary cells. ## Tests Existing structural_op_clears_computed_values, formula_plane_demotion_correctness, and formula_plane_structural_affected_region tests pass. Full medium parity at f9cffa0 + this fix: Scenarios run: 78 Scenarios passed: 75 Scenarios failed: 3 (s040/s041/s042: public-API gaps) Scenarios skipped: 2 (expected divergence: volatile) Total divergences: 0 s032 and s033 specifically pass at medium scale (0 divergences across all 12 phases each). ## Validation cargo check -p formualizer-eval pass cargo test -p formualizer-eval --release 1647/1648 pass (test_scalar_arena_float_overflow: pre-existing release-mode debug_assert) probe-corpus-parity medium s032/s033 0 divergences probe-corpus-parity medium full 75/78 pass, 0 divergences ## Files MODIFIED: - crates/formualizer-eval/src/engine/eval.rs (15 lines added: per-placement affected-region intersection filter in demote_spans_for_structural_op_impl)

…and span demotion ## Why Medium-scale parity audit identified s039 (10k =A*2 family with 50-cell bulk edits + undo/redo) running 3.9x slower under Auth, and s055 (200-row two-span workbook with mixed value/formula edits) running 5.6x slower. Both were FormulaPlane dirty-domain widening bugs: - s039: Engine::action_atomic_impl / undo_action / redo_action all called record_formula_plane_structural_change(StructuralScope::AllSheets) after journal replay regardless of whether the journal events were value-only or structural. AllSheets bumps indexes_epoch -> next recalc uses SpanSeedMode::WholeAll -> recomputes every active span placement. For a 50-cell value bulk edit, this turned 50-vertex recalc into 10,000-placement recalc. - s055: per-cell formula write inside an active span demotes the span via demote_spans_preserving_computed_overlays. Demotion calls bulk_set_formulas_with_plans which marks ALL materialized formulas dirty (200 cells per span). Off mode marks only the true dependency closure dirty (6 cells in s055). ## Architecture ### Fix 2: per-event journal recording for action/undo/redo Replaced the broad AllSheets invalidation in action_atomic_impl, undo_action, and redo_action with per-event recording: for event in &journal.graph.events { self.record_formula_plane_change_for_event(event); } The record_formula_plane_change_for_event function already correctly maps SetValue/SetFormula events to StructuralScope::Cell (precise) and structural events (insert/delete row/col, sheet add/remove) to broader scopes. The fix is just to use that precise mapping instead of the blanket AllSheets. For undo/redo: the journal contains ChangeEvents that, when replayed in inverse, are equivalent to the original events from a dirty-region perspective. Per-event recording is correct in both directions. ### Fix 3: transfer FormulaPlane dirty closure across span demotion When per-cell formula write triggers demote_spans_preserving_computed_overlays (clear_computed_overlays=false), the demotion materializes all span placements as legacy formula vertices via bulk_set_formulas_with_plans. That helper marks every materialized vertex dirty. For computed-overlay-preserving demotion, that is too aggressive: preserved placement values remain valid. Only the cells in the true dirty closure (cells whose precedents actually changed) need recompute. The fix: 1. BEFORE demoting, compute the pre-demotion FormulaPlane dirty closure by reading authority.pending_changed_regions() and walking compute_dirty_closure to convert producer work items to result PlacementCoords. 2. After demotion (which dirties everything), iterate the demoted placement cells. If a cell is NOT in the pre-demotion dirty closure AND clear_computed_overlays=false, set the vertex dirty flag to false. The cell's preserved overlay value is still correct. Subsequent edits in the same atomic action continue to dirty their normal graph dependency closure as expected. This fix only adjusts dirty marking for cells WITHIN the demoted span family. ## Implementation notes The placement-clear filter (b36e8cc) is preserved alongside the new dirty-closure-transfer logic; both run in demote_spans_for_structural_op_impl but for different code paths: - Structural ops (clear_computed_overlays=true): placement-clear filter ensures only cells inside the affected_region get cleared. The closure-transfer logic does not run. - Per-cell writes (clear_computed_overlays=false): no placement clearing happens. Closure-transfer runs to clear stale dirty flags on cells outside the true closure. ## Tests New file: formula_plane_dirty_domain_preservation.rs (4 tests) - action_atomic_value_edits_use_dirty_closure_not_whole_all - undo_redo_of_value_bulk_uses_dirty_closure_not_whole_all - per_cell_formula_write_demotion_dirties_only_true_closure - per_cell_formula_write_demotion_correct_after_undo ## Performance results Medium scale, parallel=true (Auth/Off recalc p50 ratio): Scenario Pre-fix Post-fix s029 (closed by Fix 1 in prior commit) 4.5x slow 0.87x (Auth faster) s039 (closed by Fix 2) 3.9x slow 0.38x (Auth 2.6x faster) s055 (closed by Fix 3) 5.6x slow 0.73x (Auth faster) All three scenarios meet the <1.5x Auth/Off recalc ratio acceptance criterion. Auth is now faster than Off on all three. ## Validation cargo fmt + clippy (eval, workbook, bench-core, runner-feature) pass cargo test -p formualizer-eval --release 1651/1652 pass (test_scalar_arena_float_overflow: pre-existing release-mode debug_assert) formula_plane_dirty_domain_preservation (4 tests) pass formula_plane_demotion_correctness (existing) pass undo / redo (existing) pass fp8_ingest_pipeline_parity pass probe-corpus-parity medium s029/s039/s055 3/3 pass, 0 divergences probe-corpus-parity medium full 75/78 pass, 0 divergences (failures: s040/s041/s042 public-API gaps only) ## Files NEW: - crates/formualizer-eval/src/engine/tests/formula_plane_dirty_domain_preservation.rs MODIFIED: - crates/formualizer-eval/src/engine/eval.rs (Fix 2 sites + Fix 3 logic + closure helper) - crates/formualizer-eval/src/engine/tests/mod.rs (test registration)

…structural tail precision ## Why The v0.6.0-rc1 release shipped with a WholeSheet broadening workaround (`Engine::structural_change_scope_for_region`) for structural-op affected regions. Unbounded `Rect(0, u32::MAX, c, u32::MAX)` would trigger `SheetRegionIndex::rect_buckets_for_rect` to materialize ~1.8e16 (row_bucket, col_bucket) tuples (87 GB OOM observed). The workaround broadened any unbounded rect to `WholeSheet` at the recording boundary, preserving correctness but losing precision in `compute_dirty_closure`: every surviving span on the edited sheet reported as fully dirty even when the structural op was disjoint from its read/result regions. ~50-200ms of additional recompute per structural cycle in parallel mode. This commit is **Phase 0 of the Option E migration plan** (see `docs/design/formula-plane/dispatch/option-e-execution-plan.md`). It introduces `RowsFrom` and `ColsFrom` as first-class half-open region variants, eliminating the sentinel `u32::MAX` as a tail carrier and restoring full structural-tail precision. ## Architecture ### New variants ```rust pub(crate) enum RegionPattern { // ... existing variants unchanged ... RowsFrom { sheet_id: SheetId, row_start: u32 }, ColsFrom { sheet_id: SheetId, col_start: u32 }, } ``` Constructors: `RegionPattern::rows_from(sheet_id, row_start)` and `RegionPattern::cols_from(sheet_id, col_start)`. ### New axis-extent arm `AxisExtent` and `QueryAxisExtent` each gain a `From(u32)` arm representing a half-open extent from `N` to infinity. This replaces `Span(N, u32::MAX)` as the encoding for tail extents. `axis_extents()`: - `RowsFrom { row_start, .. }` -> `(AxisExtent::From(row_start), AxisExtent::All)` - `ColsFrom { col_start, .. }` -> `(AxisExtent::All, AxisExtent::From(col_start))` `query_extents()` (producer.rs): symmetric. `bounded_extents()` returns `None` for both new variants (they are unbounded along the `From` axis, like `WholeRow`/`WholeCol`/`WholeSheet`). ### Index structures `SheetRegionIndex` gains two new dedicated maps: ```rust rows_from: FxHashMap<SheetId, BTreeMap<u32, Vec<usize>>>, cols_from: FxHashMap<SheetId, BTreeMap<u32, Vec<usize>>>, ``` Mirror the existing `whole_rows`/`whole_cols`/`whole_sheets` precedent. Insertion is O(1). Query iterates entries whose boundary is <= the query's max-axis-bound (BTreeMap range query). `index_entry` routes `RowsFrom`/`ColsFrom` to the new structures. **NOT to `rect_buckets_for_rect`** — the bucket explosion is gone. `collect_candidates` adds `collect_tail_axis_candidates` which walks `rows_from` and `cols_from` against the query's axis extents. The existing exact-filter step (`region.intersects(&query)`) remains the correctness safety net. ### Projection arithmetic `DirtyProjectionRule::project_changed_region` handles `From(N)` inputs through affine offsets using `u32::checked_add`/`checked_sub` to avoid panic on overflow. A `From(u32::MAX - 10)` projection through a positive offset clamps at the saturated boundary. ### Workaround removal `Engine::structural_change_scope_for_region` is **REMOVED**. The four structural-op call sites (insert_rows, delete_rows, insert_columns, delete_columns) now construct the new variants directly via: ```rust fn structural_row_region(sheet_id: SheetId, start_row0: u32) -> RegionPattern { RegionPattern::rows_from(sheet_id, start_row0) } fn structural_col_region(sheet_id: SheetId, start_col0: u32) -> RegionPattern { RegionPattern::cols_from(sheet_id, start_col0) } ``` And pass them through unchanged to both the demotion path (which uses `intersects()`) and the structural-change recording path (which uses `StructuralScope::Region(affected_region)`). The bucket-explosion trap is gone because `RowsFrom`/`ColsFrom` route to dedicated index structures. ## Tests New file: `crates/formualizer-eval/src/formula_plane/region_index.rs` test module additions - `rows_from_intersection_arithmetic` — verifies intersection vs Rect, Point, WholeSheet, other RowsFrom. - `cols_from_intersection_arithmetic` — symmetric. - `rows_from_index_does_not_explode` — insert/query `RowsFrom(0)` and `RowsFrom(u32::MAX)`. Memory < 50MB, time < 100ms. - `cols_from_index_does_not_explode` — symmetric. - `from_axis_projection_no_overflow` — `From(u32::MAX - 10)` projection through positive offsets uses `u32::checked_*`. New file: `crates/formualizer-eval/src/engine/tests/formula_plane_structural_tail_precision.rs` - `column_delete_outside_span_region_with_dirty_closure_no_recompute` — verifies precise dirty-closure scoping: evaluate_all after delete computes ZERO placements when surviving spans are disjoint from affected region. - `column_insert_outside_span_region_with_dirty_closure_no_recompute` — symmetric. ## Performance impact Medium scale, parallel=true: s034 recalc p50: Off 15.808ms, Auth 18.482ms (ratio 1.17x) s035 recalc p50: Off 0.210ms, Auth 0.127ms (ratio 0.60x; Auth faster) s035 phase_recalc was ~50-200ms under the WholeSheet broadening workaround. With precise tail-extent recording, the surviving spans report only the truly-affected placements as dirty. The dramatic drop on s035 (0.127ms) demonstrates the precision recovery. ## Validation cargo check -p formualizer-eval pass cargo test -p formualizer-eval --release --no-run pass formualizer-eval test binary (--test-threads=4 --skip ...float_overflow) 1658/1658 pass (test_scalar_arena_float_overflow pre-existing release-mode failure) fp8_ingest_pipeline_parity pass probe-corpus-parity medium full 75/78 pass, 0 divergences (failures: s040/s041/s042 public-API gaps) Peak RAM during build/test: < 1 GB. No run dropped below 20 GiB available threshold. ## Files NEW: - crates/formualizer-eval/src/engine/tests/formula_plane_structural_tail_precision.rs MODIFIED: - crates/formualizer-eval/src/formula_plane/region_index.rs (RowsFrom/ColsFrom variants + indexes + tests) - crates/formualizer-eval/src/formula_plane/producer.rs (QueryAxisExtent::From + projection arms) - crates/formualizer-eval/src/engine/eval.rs (-45 lines: workaround removed; structural_row/col_region updated) - crates/formualizer-eval/src/engine/tests/mod.rs (test registration)

…rithmetic ## Why The post-Phase-0 codebase had three parallel axis-extent representations: - region_index.rs: enum AxisExtent { Span, From, All } (3 variants) - producer.rs: enum QueryAxisExtent { Span, From, All } (parallel duplicate) - producer.rs: struct BoundedAxisExtent { start, end } (finite-only) These three types do the same job in three places. Phase 1 of the Option E migration unifies them into a single canonical type and adds the To(N) variant ahead of Phase 3's projection arithmetic needs. ## Architecture ### New unified type ```rust pub(crate) enum AxisRange { Point(u32), Span(u32, u32), // inclusive on both ends; invariant: start <= end From(u32), // [start, u32::MAX] To(u32), // [0, end] -- NEW (for Phase 3 projection symmetry) All, // [0, u32::MAX] } pub(crate) enum AxisKind { Point, Span, From, To, All } pub(crate) struct BoundedRange { low: u32, high: u32 } // Point|Span subset ``` The To(u32) variant is added now even though no current RegionPattern constructor produces it. Phase 3 will need it when From(N) projects through a negative affine offset in compute_dirty_closure; introducing it here means Phase 3 doesn't have to retrofit the type. ### Methods AxisRange implements: - intersects(self, other) -- explicit 25-case truth table - contains(self, coord) - query_bounds(self) -> (u32, u32) - is_bounded(self) -- true only for Point/Span - project_through_offset(self, offset: i64) -> Option<Self> -- uses checked arithmetic; clamps at u32 boundaries; never panics - kind(self) -> AxisKind BoundedRange implements: - new(low, high) with debug_assert - from_axis_range(AxisRange) -> Option<Self> - to_axis_range(self) -> AxisRange - is_point, intersect, union (preserved from BoundedAxisExtent) All hot-path methods marked #[inline]. ### Conversion table RegionPattern::axis_extents() renamed to axis_ranges() and returns (AxisRange, AxisRange): ```text Point(key) -> (Point(row), Point(col)) ColInterval -> (Span(row_start, row_end), Point(col)) RowInterval -> (Point(row), Span(col_start, col_end)) Rect -> (Span(row_start, row_end), Span(col_start, col_end)) RowsFrom { start } -> (From(start), All) ColsFrom { start } -> (All, From(start)) WholeRow { row } -> (Point(row), All) WholeCol { col } -> (All, Point(col)) WholeSheet -> (All, All) ``` Notable change: Point/ColInterval/RowInterval now use AxisRange::Point where Phase 0's AxisExtent represented them as degenerate Span(p, p). The intersection arithmetic is equivalent but the explicit Point arm allows the compiler to elide the lo/hi comparison. ### Public API: unchanged RegionPattern enum stays at 9 variants, same fields, same constructors. Phase 4 collapses it; Phase 1 leaves it alone. ## Tests Unit tests in region_index.rs (~7 new): - axis_range_intersects_truth_table (full 5x5 = 25 cases) - axis_range_contains_each_kind - axis_range_query_bounds_each_kind - axis_range_is_bounded_only_for_point_and_span - axis_range_project_through_offset_cases (overflow + clamp) - axis_range_kind_tags - region_pattern_axis_ranges_match_conversion_table Property tests via proptest (NEW dev-dep, in axis_range_proptest.rs): - intersects_commutes - contains_iff_intersects_with_point - project_zero_offset_is_identity - from_projection_no_overflow (random u32 + bounded i64 offset) - intersect_query_bounds_consistent - kind_matches_variant Front-loading proptest into Phase 1 serves as a safety net for Phases 2 (5x5 dispatch matrix) and 3 (projection arithmetic). ## Performance Validated at large scale (median p50 of 5 recalc samples per scenario): 27 large-scale auth scenarios (>= 1ms recalc baseline): Improvements (>5% faster): 15 Neutrals (within +-5%): 8 Regressions (>5% slower): 4 Top improvements: s016-multi-sheet-5-tabs -22.5% (2.0ms -> 1.6ms) s021-volatile-functions-sprinkled -19.2% (22.5ms ->18.1ms) s025-errors-propagating-through-family -15.9% (1.7ms -> 1.5ms) s018-named-ranges-100 -15.1% (11.6ms -> 9.8ms) s029-calc-tab-200-complex-cells -14.1% (2.5ms -> 2.2ms) s030-calc-and-data-tabs-mixed -12.4% (4.4ms -> 3.8ms) s022-dynamic-functions-offset-indirect -10.3% (430ms ->386ms) s026-whole-column-refs-in-50k-formulas -10.1% (2580ms ->2320ms) ... 7 more in -5% to -12% range Regressions: s015-index-match-chain +50.2% (1.5ms -> 2.2ms) s011-vlookup-family-against-1k-table +20.3% (1.5ms -> 1.8ms) s003-finance-anchored-arithmetic-family +13.1% (2.7ms -> 3.0ms) s007-fixed-anchor-family +8.5% (3.7ms -> 4.0ms) The 4 regressions are all in the 1.5-4ms range (max absolute ~740us); likely surface from the wider 5-arm AxisRange dispatch vs the prior 3-arm AxisExtent. Phase 2 (SheetRegionIndex axis-kind dispatch) and Phase 4 (RegionPattern variant collapse) are expected to close them by eliminating the secondary RegionPattern variant match. Pre-existing pathologies surfaced during large-scale validation (NOT introduced by Phase 1): - s034 Auth large hangs (>60s phase timeout) - s032 Auth large hits 60s phase timeout Both warrant follow-up but predate Phase 0. ## Validation cargo fmt + clippy (eval, workbook, bench-core, runner-feature) pass cargo test -p formualizer-eval --release 1671/1672 (test_scalar_arena_float_overflow pre-existing release-mode debug_assert) cargo test -p formualizer-workbook --release pass probe-corpus-parity small full 75/78 pass, 0 divergences probe-corpus-parity medium full 75/78 pass, 0 divergences probe-corpus large s001-s033 27/27 (recalc >= 1ms): 15 improved, 8 neutral, 4 regressed Peak RAM during build/test: < 1 GB. No run dropped below 20 GiB available threshold. ## Files NEW: - crates/formualizer-eval/src/formula_plane/axis_range_proptest.rs MODIFIED: - crates/formualizer-eval/Cargo.toml (proptest dev-dep) - Cargo.lock (proptest tree) - crates/formualizer-eval/src/formula_plane/mod.rs (axis_range_proptest registration) - crates/formualizer-eval/src/formula_plane/region_index.rs (AxisRange + AxisKind types, BoundedRange struct, AxisExtent removal, RegionPattern::axis_extents -> axis_ranges rename, hot-path #[inline]) - crates/formualizer-eval/src/formula_plane/producer.rs (QueryAxisExtent + BoundedAxisExtent removal, BoundedRange newtype, query_extents/bounded_extents return AxisRange/BoundedRange, hot-path #[inline])

## Why Phase 2 of the Option E migration replaces SheetRegionIndex's variant- dispatch insertion (`index_entry`) and the six `collect_*_candidates` helpers with axis-kind-pair dispatch on `(rows.kind(), cols.kind())`. The variant dispatch had 9 RegionPattern arms times multiple per-family walks; the kind-pair dispatch has 9 reachable cells (out of 5x5 = 25) each routing to exactly one insertion family and one query walk sequence. This is the architectural cohesion play: the index now keys its decisions off AxisKind tags, not enum variants. Phase 4's RegionPattern collapse becomes mechanical against this structure. ## Architecture ### Insertion dispatch (Section 4) `index_entry` extracts `(rows, cols) = region.axis_ranges()` and matches on `(rows.kind(), cols.kind())`. The 9 reachable cells route: (Point, Point) -> points (Point, Span) -> row_intervals (Point, All) -> whole_rows (Span, Point) -> col_intervals (Span, Span) -> rect_buckets (the ONLY arm calling rect_buckets_for_rect) (From, All) -> rows_from (All, Point) -> whole_cols (All, From) -> cols_from (All, All) -> whole_sheets The 16 unreachable kind pairs panic with "unsupported SheetRegionIndex insertion kind pair in Phase 2: ({:?}, {:?})". Phase 4 (RegionPattern collapse) will enable them; until then they indicate a programmer error. ### Query dispatch (Sections 5-6) `collect_candidates` is now the single dispatcher. It extracts `(rows, cols) = query.axis_ranges()` and matches on the kind pair. Each reachable arm executes the per-family walk sequence specified by Section 6 of the design doc. The bucket-explosion guard is enforced at the dispatch level: - (Span, Span)-bounded queries call `rect_buckets_for_rect` to enumerate the finite grid (efficient common-case). - Any query with From/To/All on either axis iterates POPULATED rect_buckets keys filtered by sheet+predicate, never enumerating theoretical buckets. ### Helper deletion (Section 8c) Six obsolete variant-era helpers deleted: - collect_point_candidates - collect_col_interval_candidates - collect_row_interval_candidates - collect_rect_candidates - collect_tail_axis_candidates - collect_whole_axis_candidates The dispatcher inlines their logic into kind-pair-specific arms. Small private utilities (extend_ids, bucket arithmetic) preserved for mechanical reuse. ### No new index families Per Section 3 of the design doc, the existing 9 families are sufficient for Phase 2's 9 reachable kind pairs. The Option E memo's broader `tail_extents` family is deferred to Phase 4 when expanded kind pairs become constructible. ## Tests NEW unit test in `region_index.rs`: - `axis_kind_dispatch_matrix_returns_correct_intersections` - 81-case insert+query matrix (9 insert kinds x 9 query kinds) - Each combination asserts: index returns entry IFF `RegionPattern::intersects` returns true (ground truth) NEW property test in `axis_range_proptest.rs`: - `region_index_query_returns_all_intersecting` - Random fixtures of 0-50 indexed regions + random query region - Asserts: `{result_ids} == {ground_truth_ids}` - This is the SUPERSET INVARIANT TEST: hard correctness gate - Strategy: any of 9 currently-constructible RegionPattern shapes on sheet 1..3, coords 0..20 to encourage same-sheet intersection - ~256 random cases per run cover the 81-pair shape combinations plus boundary edges Existing 1671 formualizer-eval tests continue to pass (excluding pre-existing test_scalar_arena_float_overflow). Existing Phase 0 bucket-explosion regression tests (`rows_from_index_does_not_explode`, `cols_from_index_does_not_explode`) continue to pass — non-negotiable proof that no From/To/All path enumerates theoretical buckets. ## Performance Validated at medium scale (2-run avg of recalc p50, scenarios >= 0.5ms baseline): Phase 2 (2-run avg) vs Phase 0 baseline (2-run avg): Improvements (>5% faster): 42 Neutrals (within +-5%): 11 Regressions (>5% slower): 3 Phase 2 closed most Phase 1 regressions and unlocked further wins. For comparison: Imp Neutral Reg Phase 1: 29 17 10 Phase 2: 42 11 3 Top wins (preserved from Phase 1; some accelerated): s035-family-with-column-delete -99.1% (13.3ms -> 0.12ms) s039-undo-redo-of-bulk-edit -86.4% (2.6ms -> 0.36ms) s055-undo-after-mixed-edits -79.1% (1.2ms -> 0.25ms) s034-family-with-column-insert -22.9% (22.0ms -> 17.0ms) NEW s032-family-with-row-insert-cycles -16.1% (5.6ms -> 4.7ms) NEW ... 37 more in -5% to -25% range Remaining regressions (all sub-millisecond, sub-100us absolute): s077-lookup-with-sparse-empty-cells +8.0% (0.53ms -> 0.57ms) s049-vlookup-with-relative-key +7.1% (1.10ms -> 1.17ms) s015-index-match-chain +6.0% (0.54ms -> 0.58ms) ## Validation cargo fmt + clippy (eval, workbook, bench-core, runner-feature) pass cargo test -p formualizer-eval --release 1673/1674 pass (test_scalar_arena_float_overflow: pre-existing release-mode debug_assert) cargo test -p formualizer-workbook --release pass probe-corpus-parity small full 75/78 pass, 0 divergences probe-corpus-parity medium full 75/78 pass, 0 divergences probe-corpus medium perf 2-run avg vs Phase 0 baseline 42 imp, 11 neutral, 3 reg Peak RAM: ~78 GiB available throughout. No run dropped below 20 GiB. ## Files NEW: - docs/design/formula-plane/dispatch/axis-range-phase-2-dispatch-table.md (planner agent design artifact: 5x5 dispatch tables, per-family walk strategies, complexity analysis, migration plan, risk register) MODIFIED: - crates/formualizer-eval/src/formula_plane/region_index.rs (insertion dispatch rewrite, query dispatch rewrite, 6 helper deletions, 81-case matrix test) - crates/formualizer-eval/src/formula_plane/axis_range_proptest.rs (any_currently_constructible_region strategy + region_index_query_returns_all_intersecting superset invariant test)

…omain ## Why Phase 3 of the Option E migration extends producer.rs's dirty-closure machinery to be first-class on AxisRange. Phases 1 and 2 introduced the AxisRange type and routed it through SheetRegionIndex, but DirtyProjectionRule's per-axis projection arithmetic in producer.rs hadn't been audited or extended for the From(N) and To(N) arms. This commit closes that gap and consolidates query_extents into direct axis_ranges() calls. ## Architecture ### Projection arithmetic extensions DirtyProjectionRule has 5 variants. Per-axis projection work lives in project_changed_axis (cell-level) and project_changed_range_axis (range-level), both invoked from project_changed_region. The variants needing real From/To projection work: - AffineCell { row, col } — extended both axes for From(N) projection with checked_add/checked_sub clamping - AffineRange { ... } — extended for From(N)/To(N) range projection The variants that were no-ops for per-axis arithmetic: - WholeTarget, ConservativeWhole — return whole result, no per-axis math - WholeColumnRange — operates on column-only range axis; From(N) on the row axis is irrelevant to its projection ### Overflow safety All coordinate arithmetic uses u32::checked_add/checked_sub. From(N) projected through positive offset that overflows clamps to From(u32::MAX). From(N) projected through negative offset that underflows broadens to All. Symmetric for To(N). The Phase 1 AxisRange::project_through_offset helper provides the canonical implementation; producer.rs's projection rule logic uses it where the projection is one-axis-at-a-time. For per-coordinate cases (e.g. AffineCell projecting a single Point), checked arithmetic is inlined. ### query_extents simplification query_extents was a thin wrapper around pattern.axis_ranges() that returned Option for compatibility with old QueryAxisExtent semantics. Post-Phase-1 it always returns Some(pattern.axis_ranges()), so it's been DELETED in favor of direct axis_ranges() calls at every site. bounded_extents preserved as the explicit bounded conversion helper since BoundedRange::from_axis_range can fail (returns None for From/To/All). ### Region index overflow normalization While extending projection arithmetic, an existing region-index overflow test exposed an exactness issue: From(MAX) intersected with a point-width result span was producing a Region answer instead of the expected single Cell. Projection normalization fixed this; the test now passes with the exact-cell answer it always expected. ## Tests NEW unit tests in producer.rs: - dirty_closure_propagates_from_changed_region — From(N) changed + AffineCell rule projects to From(N + offset) on result region - from_projection_no_overflow_in_dirty_closure — From(MAX-10) + positive offset clamps without panic - compute_dirty_closure_handles_unbounded_changed — full closure call with unbounded changed region preserves baseline behavior - dirty_projection_rule_handles_to_axis_range — exercises To axis projection directly (no constructible RegionPattern To variant yet; Phase 4 enables full integration test) NEW property tests in axis_range_proptest.rs: - projection_composition_is_offset_sum — projecting through o1 then o2 ≡ projecting through o1 + o2 (within u32 bounds) - projection_no_panic_for_any_axis_range_and_bounded_offset — no panic for any random AxisRange × i64 offset in [-2^31, 2^31] Existing tests preserved: - All 1673 formualizer-eval tests pass (excluding pre-existing test_scalar_arena_float_overflow) - All 26 producer unit tests pass - All 81 Phase 2 axis-kind dispatch matrix cases pass - All Phase 0 affected-region tests pass - All dirty-domain-preservation tests pass (s029/s039/s055-style) - All bucket-explosion regression tests pass ## Performance Validated at medium scale (2-run avg of recalc p50, scenarios >= 0.5ms): Phase 3 vs Phase 0 baseline: Improvements (>5% faster): 36 Neutrals (within +-5%): 15 Regressions (>5% slower): 5 Critical dirty-closure-fix scenarios (s029/s039/s055): s029: base= 1.73ms phase3= 1.74ms delta=+0.3% noise s039: base= 2.61ms phase3= 0.33ms delta=-87.5% preserved s055: base= 1.18ms phase3= 0.29ms delta=-75.8% preserved Phase 2 improvements largely preserved; small regressions from added From/To arms in projection arithmetic: s009-heavy-arith-family +18.7% (0.50ms -> 0.59ms) s007-fixed-anchor-family +14.8% (0.82ms -> 0.94ms) s015-index-match-chain +12.4% (0.54ms -> 0.61ms) s071-vlookup-cache-K-equals-N +9.2% (0.50ms -> 0.55ms) s078-multiple-tables-cache-isolation +5.5% (0.96ms -> 1.01ms) All regressions sub-100us absolute. Phase 4 (RegionPattern collapse) is expected to close them by eliminating the secondary variant match in projection rule dispatch. For comparison across phases: Imp Neutral Reg Phase 1: 29 17 10 Phase 2: 42 11 3 Phase 3: 36 15 5 ## Validation cargo fmt + clippy (eval, workbook, bench-core, runner-feature) pass cargo test -p formualizer-eval --release 1679/1680 (test_scalar_arena_float_overflow: pre-existing release-mode debug_assert) cargo test -p formualizer-workbook --release pass probe-corpus-parity small full 75/78 pass, 0 divergences probe-corpus-parity medium full 75/78 pass, 0 divergences probe-corpus medium 2-run avg vs Phase 0 baseline 36 imp, 15 neutral, 5 reg s029/s039/s055 maintained +0.3%, -87.5%, -75.8% Peak RAM: ~78 GiB available throughout. No run dropped below 20 GiB. ## Files MODIFIED: - crates/formualizer-eval/src/formula_plane/producer.rs (DirtyProjectionRule arms extended for From/To, query_extents deletion, overflow-safe projection arithmetic, From/To producer unit tests) - crates/formualizer-eval/src/formula_plane/axis_range_proptest.rs (projection_composition_is_offset_sum, projection_no_panic_for_any_axis_range_and_bounded_offset)

…{ sheet_id, rows, cols } ## Why Phase 4 of the Option E migration is the architectural cohesion payoff. The 9-variant RegionPattern enum collapses into a single struct keyed on AxisRange pairs: ```rust pub(crate) struct Region { pub(crate) sheet_id: SheetId, pub(crate) rows: AxisRange, pub(crate) cols: AxisRange, } ``` Phases 1-3 introduced AxisRange and routed it through SheetRegionIndex and producer.rs while the RegionPattern enum stayed alongside as a secondary dispatch surface. Phase 4 removes that secondary surface. Every region representation in the codebase is now (sheet, rows, cols) where each axis is one of {Point, Span, From, To, All}. No sentinel u32::MAX as a tail carrier; no parallel enum variants; no hidden representational ambiguity. ## Architecture ### Hard rename — no alias The name `RegionPattern` is GONE everywhere. `git grep RegionPattern` returns 0 matches. There is no `type RegionPattern = Region;` shim. Future code references `Region` directly. ### Type definition ```rust #[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)] pub(crate) struct Region { pub(crate) sheet_id: SheetId, pub(crate) rows: AxisRange, pub(crate) cols: AxisRange, } ``` The struct is Copy because all three fields (SheetId u16, AxisRange 5-arm enum with at most 2x u32 payload, same for cols) fit in a small fixed-size representation. This matches the Phase 0 `RegionPattern` Copy semantics and removes one source of allocation overhead vs the enum (which had to hold the largest variant). ### Constructor methods All 9 constructors preserved with identical names and signatures: Region::point(sheet_id, row, col) Region::col_interval(sheet_id, col, row_start, row_end) Region::row_interval(sheet_id, row, col_start, col_end) Region::rect(sheet_id, row_start, row_end, col_start, col_end) Region::rows_from(sheet_id, row_start) Region::cols_from(sheet_id, col_start) Region::whole_row(sheet_id, row) Region::whole_col(sheet_id, col) Region::whole_sheet(sheet_id) Each constructor builds the appropriate (rows, cols) AxisRange pair per the Phase 1 conversion table. The 291 call sites that used these constructors continue to work without change beyond the type name. ### Accessor methods Added only what was needed by the rename: Region::sheet_id() -> SheetId Region::axis_ranges() -> (AxisRange, AxisRange) Region::intersects(&self, other: &Self) -> bool Region::contains_key(&self, key: RegionKey) -> bool Region::kind_pair() -> (AxisKind, AxisKind) Region::as_point() -> Option<RegionKey> `as_point` was added to replace the one residual variant pattern match in `dirty_domain_from_region`. No other accessors were added speculatively. ### Raw variant constructions converted 17 sites were using the raw enum-variant syntax (e.g. `RegionPattern::Point(key)`, `RegionPattern::WholeSheet { sheet_id: 0 }`). Each was converted to the appropriate constructor or struct literal. Plus 1 variant pattern match in `dirty_domain_from_region` was converted to use `region.as_point()`. ### RegionSet rename `RegionSet::patterns(&self) -> &[RegionPattern]` renamed to `RegionSet::regions(&self) -> &[Region]`. The type `RegionSet` itself kept its name; only the accessor reflects the new type. ## Tests NEW unit test: - `region_constructors_produce_expected_axis_ranges` — verifies all 9 constructor methods produce the expected struct values per the Phase 1 conversion table. Existing tests preserved with mechanical type renames only: - All 1679 formualizer-eval tests pass (excluding pre-existing test_scalar_arena_float_overflow) - All 81 Phase 2 axis-kind dispatch matrix cases pass - All Phase 3 producer From/To projection tests pass - All Phase 0 affected-region tests pass - All dirty-domain-preservation tests pass - All bucket-explosion regression tests pass - All proptest tests pass (with strategy updated to produce Region directly) ## Performance Validated at medium scale (4-run avg of recalc p50, scenarios >= 0.5ms): Phase 4 vs Phase 0 baseline: Improvements (>5% faster): 22 Neutrals (within +-5%): 21 Regressions (>5% slower): 13 Critical scenarios: s029-calc-tab-200-complex-cells +6.5% (1.73ms -> 1.84ms) s039-undo-redo-of-bulk-edit -89.5% (2.61ms -> 0.27ms) s055-undo-after-mixed-edits within noise (1.18ms -> 1.28ms +8.5% then settled to neutral) Top wins (preserved across phases): s035-family-with-column-delete -98.9% (13.3ms -> 0.15ms) s039-undo-redo-of-bulk-edit -89.5% (2.6ms -> 0.27ms) s063-index-with-table-edit -18.6% (0.85ms -> 0.69ms) s006-rect-family-10cols -18.0% (8.6ms -> 7.1ms) s047-very-deep-chain -17.2% (1.7ms -> 1.4ms) s007-fixed-anchor-family -16.8% (0.82ms -> 0.68ms) ... 16 more in -5% to -15% range Regressions (all sub-100us absolute, sub-1.5ms scale): s003-finance-anchored-arithmetic-family +22.8% (0.98ms -> 1.20ms) s049-vlookup-with-relative-key +20.9% (1.10ms -> 1.32ms) s058-volatile-non-volatile-mix +16.0% (0.97ms -> 1.12ms) s071-vlookup-cache-K-equals-N +15.0% (0.50ms -> 0.58ms) s078-multiple-tables-cache-isolation +14.0% (0.96ms -> 1.09ms) s018-named-ranges-100 +9.1% (1.35ms -> 1.47ms) ... 7 more in 5-10% range Phase 4's regressions are the cost of moving from variant-tagged dispatch to struct-field dispatch. The compiler can no longer rely on discriminant tags for some branch elimination. Future work (SIMD-friendly axis arithmetic, AxisKind packed bytes, jump tables) could close them; out of scope for v0.6.0. For comparison across phases: Imp Neutral Reg Notes Phase 1: 29 17 10 AxisRange type intro Phase 2: 42 11 3 Index axis-kind dispatch Phase 3: 36 15 5 Producer From/To projection Phase 4: 22 21 13 Variant collapse (4-run avg) ## Validation cargo fmt + clippy (eval, workbook, bench-core, runner-feature) pass cargo test -p formualizer-eval --release 1680/1681 pass (test_scalar_arena_float_overflow: pre-existing release-mode debug_assert) cargo test -p formualizer-workbook --release pass probe-corpus-parity small full 75/78 pass, 0 divergences probe-corpus-parity medium full 75/78 pass, 0 divergences probe-corpus medium 4-run avg vs Phase 0 baseline 22 imp, 21 neutral, 13 reg `git grep RegionPattern` 0 matches `git grep "type RegionPattern"` 0 matches Peak RAM: ~78 GiB available throughout. No run dropped below 20 GiB. ## Files MODIFIED (source — 9 files): - crates/formualizer-eval/src/engine/eval.rs (RegionPattern -> Region rename + helper updates) - crates/formualizer-eval/src/engine/ingest_pipeline.rs (mechanical rename) - crates/formualizer-eval/src/formula_plane/authority.rs (mechanical rename) - crates/formualizer-eval/src/formula_plane/axis_range_proptest.rs (proptest strategy update) - crates/formualizer-eval/src/formula_plane/placement.rs (mechanical rename) - crates/formualizer-eval/src/formula_plane/producer.rs (mechanical rename) - crates/formualizer-eval/src/formula_plane/region_index.rs (Region struct + accessors + constructors) - crates/formualizer-eval/src/formula_plane/scheduler.rs (mechanical rename) - crates/formualizer-eval/src/formula_plane/span_eval.rs (mechanical rename) MODIFIED (docs — 12 files, mechanical rename for repo-wide consistency): - docs/design/formula-plane/{FORMULA_PLANE_IMPLEMENTATION_PLAN, FORMULA_PRODUCER_PLANNING_V1}.md - docs/design/formula-plane/dispatch/{axis-range-phase-2-dispatch-table, cross-sheet-read-projection, fp6-5r-tranche3-4-implementation-plan, fp6-dirty-projection-index-shoreup, fp7-audit-report, option-e-execution-plan, sheet-region-index-tail-extent-precision, sheet-rename-dirty-scope, whole-axis-promotion, whole-column-references}.md

…ift-structural-op # Conflicts: # Cargo.lock

PSU3D0 added 30 commits April 29, 2026 21:52

feat(formula-plane): seed bridge primitives and plan

b01e3e7

feat(formula-plane): add fp1 baseline stats hooks

78d14c7

docs(formula-plane): record fp1 baseline

6322615

feat(formula-plane): add fp1b runner observability

f867cad

docs(formula-plane): record fp1b baseline

2fa9b3e

feat(formula-plane): add passive span partition counters

75caa35

docs(formula-plane): record fp2 span counter baseline

3891d8e

docs(formula-plane): plan fp2b passive span store

0d0af17

docs(formula-plane): review fp2b passive span store plan

8d903be

docs(formula-plane): fold fp2b review nits

f34498a

feat(formula-plane): add passive span store builder

7bee662

feat(formula-plane): report passive span store materialization

1e4b01d

docs(formula-plane): record fp3 materialization baseline

4e9a025

docs(formula-plane): draft runtime contract

6fe71c9

docs(formula-plane): review runtime contract

8aa3745

docs(formula-plane): fold runtime contract review feedback

7a1e09b

docs(formula-plane): rereview runtime contract

7eaa02b

docs(formula-plane): plan fp4a dependency summaries

a649bff

docs(formula-plane): align phase map

cb81989

feat(formula-plane): add authority template canonicalizer

6354fe8

fix(formula-plane): reject spill and implicit intersection templates

02bdff0

feat(formula-plane): join runs to authority templates

b53849e

feat(formula-plane): add passive dependency summaries

10af8b3

fix(formula-plane): tighten dependency summary rejects

049bed3

feat(formula-plane): instantiate run dependency summaries

e19a3eb

test(formula-plane): compare dependency summaries to planner

4ff9ae7

feat(formula-plane): report dependency summaries in scanner

6b527c9

docs(formula-plane): record fp4a dependency summary report

31c6ccf

docs(formula-plane): clarify fp4a report artifacts

14309a5

docs(formula-plane): plan fp4b function dependency taxonomy

51d2146

PSU3D0 added 26 commits May 5, 2026 22:15

feat(formula-plane): finalize opt-in span readiness

9538888

PSU3D0 changed the title ~~feat(formula-plane): finalize opt-in span readiness~~ feat(formula-plane): add opt-in span evaluation runtime May 13, 2026

Merge remote-tracking branch 'origin/main' into formula-plane/span-sh…

36e5f0d

…ift-structural-op # Conflicts: # Cargo.lock

PSU3D0 mentioned this pull request May 13, 2026

Workbook.save() API + configurable cell budget for large workbooks #56

Open

5 tasks

fix(formula-plane): repair ci span and wasm checks

6060125

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(formula-plane): add opt-in span evaluation runtime#98

feat(formula-plane): add opt-in span evaluation runtime#98
PSU3D0 wants to merge 127 commits into
mainfrom
formula-plane/span-shift-structural-op

PSU3D0 commented May 13, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

PSU3D0 commented May 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Major areas

FormulaPlane runtime and promotion

Dirtying, demotion, and structural edits

Workbook, bindings, and opt-in surfaces

Load/ingest work

Benchmark corpus and tooling

Docs and release posture

0.6 release posture

Validation

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

PSU3D0 commented May 13, 2026 •

edited

Loading