Fix recursion limit for enums with many variants#99

Draft
antiguru wants to merge 5 commits into frankmcsherry:master from antiguru:recursion_limit

@antiguru antiguru commented Mar 27, 2026

Summary

Fixes compilation failures for enums with many variants (80+) that hit Rust's type recursion limit due to deeply nested Chain<Chain<Chain<...>>> types in derived as_bytes() implementations.

Changes

  • Balanced chain tree in derive macro: Replace linear chain(chain(chain(a, b), c), d) nesting (O(N) depth) with a balanced binary tree (O(log N) depth). Supports up to 256 variants.

  • #[inline(always)] on Chain::fold: Ensures the balanced chain tree is fully inlined — as_bytes().count() compiles to a single constant for any size enum.

  • size_hint and count on Chain/ChainOne: Propagates exact sizes through the tree, making count() O(1) without needing fold. Eliminates an entire iteration pass in encode.

  • Eliminate panic paths in encode: Restructure the byte-copying loop to use split_at, chunks_exact, and bytemuck::cast_mut. Removes slice_index_fail and copy_from_slice::len_mismatch_fail from generated assembly.

  • Dyn dispatch in encode loop bodies: Route the per-slice closures through &mut dyn FnMut so the inlined iterator tree emits one indirect call per leaf instead of duplicating the closure body at each of the N fold sites. 3× code size reduction with no measurable runtime cost (workload is memcpy-bound).
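The `#[inline(always)]` on `Chain::fold` item above can be illustrated with a minimal sketch. The `Chain` struct and `chain` helper here are hypothetical stand-ins written for this example, not the crate's actual definitions; the point is only that `fold` forwards to each half's own `fold`, so a fully inlined tree of chains can flatten into straight-line code over the leaves.

```rust
/// Hypothetical stand-in for the crate's `Chain`: yields all of `a`, then all of `b`.
struct Chain<A, B> {
    a: Option<A>,
    b: B,
}

/// Hypothetical helper mirroring the `chain(a, b)` calls in the description.
fn chain<A, B>(a: A, b: B) -> Chain<A, B>
where
    A: Iterator,
    B: Iterator<Item = A::Item>,
{
    Chain { a: Some(a), b }
}

impl<A, B> Iterator for Chain<A, B>
where
    A: Iterator,
    B: Iterator<Item = A::Item>,
{
    type Item = A::Item;

    fn next(&mut self) -> Option<Self::Item> {
        if let Some(a) = self.a.as_mut() {
            if let Some(item) = a.next() {
                return Some(item);
            }
            // The first half is exhausted; fall through to the second half.
            self.a = None;
        }
        self.b.next()
    }

    // Forward to each half's own `fold` instead of looping over `next`.
    // With `#[inline(always)]`, nested chains can collapse into their leaves.
    #[inline(always)]
    fn fold<Acc, F>(self, init: Acc, mut f: F) -> Acc
    where
        F: FnMut(Acc, Self::Item) -> Acc,
    {
        let mut acc = init;
        if let Some(a) = self.a {
            acc = a.fold(acc, &mut f);
        }
        self.b.fold(acc, f)
    }
}

fn main() {
    // A small tree of chains over the values 1, 2, 3, 4, 5.
    let it = chain(chain(1u32..3, 3..4), 4..6);
    let sum = it.fold(0u32, |acc, x| acc + x);
    println!("sum = {}", sum); // prints "sum = 15"
}
```

Whether the optimizer actually reduces a given tree to a constant still has to be checked in the emitted assembly, as the verification section does.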

Assembly verification

Includes examples/enum_as_bytes_asm.rs with a 256-variant enum for cargo rustc --example enum_as_bytes_asm --release -- --emit asm inspection.

| Metric (256-variant enum) | Before | After |
| --- | --- | --- |
| Compilation | ❌ recursion limit | ✅ compiles |
| as_bytes().count() | | mov w0, #258; ret |
| encode total code size | | ~12.5K insns |
| encode panic paths | | 0 |
| small_encode (3 variants) | ~300 insns | 167 insns |

Test plan

  • All existing tests pass (cargo test)
  • Round-trip example works (cargo run --release)
  • 256-variant enum compiles and produces correct assembly
  • Benchmarked: no measurable performance regression from dyn dispatch

🤖 Generated with Claude Code

antiguru and others added 5 commits March 27, 2026 07:11
The linear chain pattern `chain(chain(chain(a, b), c), d)` creates
O(N)-deep nested types that hit the compiler's recursion limit for
enums with many variants (e.g. 80+). Replace with a balanced binary
tree that produces O(log N) depth, supporting up to 256 variants.

Also add an asm inspection example for verifying inlining.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
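To make the depth argument in this commit concrete, here is a small sketch that models the shape of the nested `chain(...)` expression rather than the derive macro itself. The `Expr` type and the `linear` and `balanced` functions are hypothetical names introduced for illustration; the real macro emits tokens, but the nesting depth follows the same recurrence.

```rust
/// Expression tree mirroring the shape of nested `chain(a, b)` calls.
enum Expr {
    Leaf(usize),
    Chain(Box<Expr>, Box<Expr>),
}

impl Expr {
    /// Nesting depth: a leaf is 0, a chain is one more than its deeper child.
    fn depth(&self) -> usize {
        match self {
            Expr::Leaf(_) => 0,
            Expr::Chain(l, r) => 1 + l.depth().max(r.depth()),
        }
    }

    /// Collects leaf indices in left-to-right (iteration) order.
    fn leaves(&self, out: &mut Vec<usize>) {
        match self {
            Expr::Leaf(i) => out.push(*i),
            Expr::Chain(l, r) => {
                l.leaves(out);
                r.leaves(out);
            }
        }
    }
}

/// Left-leaning shape: chain(chain(chain(0, 1), 2), 3), for n >= 1 leaves.
fn linear(n: usize) -> Expr {
    let mut acc = Expr::Leaf(0);
    for i in 1..n {
        acc = Expr::Chain(Box::new(acc), Box::new(Expr::Leaf(i)));
    }
    acc
}

/// Balanced shape: split the non-empty range [lo, hi) in half recursively.
fn balanced(lo: usize, hi: usize) -> Expr {
    if hi - lo == 1 {
        Expr::Leaf(lo)
    } else {
        let mid = lo + (hi - lo) / 2;
        Expr::Chain(Box::new(balanced(lo, mid)), Box::new(balanced(mid, hi)))
    }
}

fn main() {
    let n = 256;
    let (lin, bal) = (linear(n), balanced(0, n));

    // Both shapes visit the leaves in the same order, so the byte stream is unchanged.
    let (mut a, mut b) = (Vec::new(), Vec::new());
    lin.leaves(&mut a);
    bal.leaves(&mut b);
    assert_eq!(a, b);

    // prints "linear depth = 255, balanced depth = 8"
    println!("linear depth = {}, balanced depth = {}", lin.depth(), bal.depth());
}
```

For 80 leaves the same functions give depths 79 and 7, which is the gap between exceeding the recursion limit and staying well under it.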
- Mark Chain::fold as #[inline(always)] so the balanced chain tree
  is fully inlined for large enums (256 variants: count compiles to
  a constant, encode has zero residual fold calls).

- Restructure encode's Read 3 loop to eliminate bounds-check panics:
  use split_at instead of manual slicing, chunks_exact instead of
  resize+index for misaligned copies, and bytemuck::cast_mut for the
  remainder. This removes both slice_index_fail and
  copy_from_slice::len_mismatch_fail from the generated assembly,
  reducing big_encode from 43K to 36K instructions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
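The `chunks_exact` part of this restructuring can be sketched in isolation. This is not the PR's `encode` loop: `pack_words` is a hypothetical function, and it uses `u64::from_le_bytes` from the standard library in place of `bytemuck::cast_mut` so the example needs no external crate. It shows how copying whole 8-byte chunks and handling the remainder separately leaves no explicit indexing or `copy_from_slice` in the source.

```rust
use std::convert::TryFrom;

/// Packs `bytes` into little-endian u64 words, zero-padding the final partial word.
/// (Hypothetical helper; the word size and padding rule are assumptions.)
fn pack_words(bytes: &[u8]) -> Vec<u64> {
    let mut out = Vec::with_capacity((bytes.len() + 7) / 8);

    // `chunks_exact(8)` yields only full 8-byte chunks, so no manual index math.
    let mut chunks = bytes.chunks_exact(8);
    for chunk in chunks.by_ref() {
        // The conversion cannot fail for an exact chunk; `if let` avoids an `unwrap`.
        if let Ok(word) = <[u8; 8]>::try_from(chunk) {
            out.push(u64::from_le_bytes(word));
        }
    }

    // Fewer than 8 trailing bytes: copy them element-wise into a zeroed word.
    let rem = chunks.remainder();
    if !rem.is_empty() {
        let mut word = [0u8; 8];
        for (dst, src) in word.iter_mut().zip(rem) {
            *dst = *src;
        }
        out.push(u64::from_le_bytes(word));
    }
    out
}

fn main() {
    // Nine bytes: one full word followed by a one-byte remainder.
    let words = pack_words(&[1, 0, 0, 0, 0, 0, 0, 0, 2]);
    println!("{:?}", words); // prints "[1, 2]"
}
```

Whether the remaining panic paths actually disappear depends on the compiler version and optimization level, so the emitted assembly is the thing to check.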
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
size_hint and count on Chain/ChainOne propagate exact sizes through the balanced chain tree,
making count() O(1) instead of O(N). For a 256-variant enum,
as_bytes().count() compiles to a constant without needing fold.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
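A minimal sketch of how `size_hint` and `count` can propagate through a chain, assuming a hypothetical `Chain` type written for this example (the crate's `Chain` and `ChainOne` may differ in layout). Each method just combines the answers from the two halves, so when every leaf reports its length in O(1), as slice iterators do, the cost is proportional to the tree size and not to the number of elements.

```rust
/// Hypothetical stand-in for the crate's `Chain`: yields all of `a`, then all of `b`.
struct Chain<A, B> {
    a: Option<A>,
    b: B,
}

/// Hypothetical helper mirroring the `chain(a, b)` calls in the description.
fn chain<A, B>(a: A, b: B) -> Chain<A, B>
where
    A: Iterator,
    B: Iterator<Item = A::Item>,
{
    Chain { a: Some(a), b }
}

impl<A, B> Iterator for Chain<A, B>
where
    A: Iterator,
    B: Iterator<Item = A::Item>,
{
    type Item = A::Item;

    fn next(&mut self) -> Option<Self::Item> {
        if let Some(a) = self.a.as_mut() {
            if let Some(item) = a.next() {
                return Some(item);
            }
            self.a = None;
        }
        self.b.next()
    }

    /// Adds the bounds of both halves; the upper bound is lost only on overflow
    /// or if either half does not know its own upper bound.
    fn size_hint(&self) -> (usize, Option<usize>) {
        let (b_lo, b_hi) = self.b.size_hint();
        match &self.a {
            None => (b_lo, b_hi),
            Some(a) => {
                let (a_lo, a_hi) = a.size_hint();
                let hi = match (a_hi, b_hi) {
                    (Some(x), Some(y)) => x.checked_add(y),
                    _ => None,
                };
                (a_lo.saturating_add(b_lo), hi)
            }
        }
    }

    /// Delegates to each half's `count`, which is O(1) for slice iterators.
    fn count(self) -> usize {
        let front = match self.a {
            Some(a) => a.count(),
            None => 0,
        };
        front + self.b.count()
    }
}

fn main() {
    let (a, b, c) = ([1u8, 2, 3], [4u8], [5u8, 6]);
    let it = chain(chain(a.iter(), b.iter()), c.iter());
    assert_eq!(it.size_hint(), (6, Some(6)));
    println!("count = {}", it.count()); // prints "count = 6"
}
```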
The encode loop bodies are identical for every byte slice, but
the fully-inlined Chain fold duplicates them at each of the N
leaves. Route the closures through `&mut dyn FnMut` so the
iterator tree inlines (for efficient traversal) while the
per-slice work exists only once.

For a 256-variant enum: big_encode shrinks from 37K to ~12.5K
instructions, small_encode from 562 to 167, with no measurable
runtime difference (workload is memcpy-bound).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
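The dyn-dispatch idea from this commit can be sketched as follows. `copy_all` is a hypothetical function, not the PR's `encode`; it shows the per-slice body being defined once and handed to the iterator only through a `&mut dyn FnMut` reference, so an inlined traversal has an indirect call at each leaf and no copy of the body.

```rust
/// Appends every slice yielded by `slices` to `out`.
/// (Hypothetical stand-in for an encode loop; the body here is deliberately trivial.)
fn copy_all<'a, I>(slices: I, out: &mut Vec<u8>)
where
    I: Iterator<Item = &'a [u8]>,
{
    // The per-slice work lives in a single closure...
    let mut body = |s: &[u8]| out.extend_from_slice(s);

    // ...which is erased to a trait object before the iterator sees it. What gets
    // inlined at each leaf is then an indirect call, not a copy of the body.
    let body: &mut dyn FnMut(&[u8]) = &mut body;

    // `for_each` is driven by `fold` by default, so a `fold` override applies here.
    slices.for_each(|s| body(s));
}

fn main() {
    let parts: [&[u8]; 3] = [&b"ab"[..], &b""[..], &b"cde"[..]];
    let mut out = Vec::new();
    copy_all(parts.iter().copied(), &mut out);
    println!("{:?}", String::from_utf8_lossy(&out)); // prints "\"abcde\""
}
```

The trade-off is one indirect call per slice. The commit reports no measurable runtime difference for its memcpy-bound workload, which may not hold where the per-item work is very small.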