Decode u64s #78

Merged
frankmcsherry merged 11 commits into master from decode_u64s on Mar 14, 2026

@frankmcsherry
Owner

This is the result of work with Claude Code. The summary below was written by Claude:

  Summary

  Adds a new decode path (decode_u64s + from_u64s) that preserves u64 alignment
  information through the entire pipeline, enabling LLVM to eliminate unused
  tuple fields as dead code.

  The key insight: decode returns &[u8], which discards the alignment guarantee
  that the &[u64] store provides. from_bytes must then re-prove alignment via
  bytemuck::try_cast_slice for every field — a panicking operation that LLVM
  cannot eliminate, forcing O(k) work to access any field of a k-tuple. The new
  path returns (&[u64], u8) pairs (word slice + trailing byte count), making all
  casts infallible and all construction non-panicking. LLVM sees unused fields as
  pure dead code and removes them.
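
The asymmetry described above can be illustrated with a minimal, std-only sketch. It does not use bytemuck or this crate's API; the helper names `try_words`, `as_bytes`, and `as_u32s` are hypothetical. Going from `&[u8]` up to `&[u64]` needs a runtime check that can fail, while going from `&[u64]` down to a narrower type cannot fail, because an 8-byte-aligned pointer is aligned for every smaller power-of-two width.

```rust
/// Fallible direction: a byte slice carries no alignment guarantee, so the
/// pointer and the length both have to be checked at runtime.
fn try_words(bytes: &[u8]) -> Option<&[u64]> {
    let aligned = bytes.as_ptr() as usize % std::mem::align_of::<u64>() == 0;
    if aligned && bytes.len() % 8 == 0 {
        // SAFETY: alignment and length were checked above, and every bit
        // pattern is a valid u64.
        Some(unsafe { std::slice::from_raw_parts(bytes.as_ptr().cast::<u64>(), bytes.len() / 8) })
    } else {
        None
    }
}

/// Infallible direction: u8 has alignment 1, so no check is needed.
fn as_bytes(words: &[u64]) -> &[u8] {
    // SAFETY: the memory is initialized and every bit pattern is a valid u8.
    unsafe { std::slice::from_raw_parts(words.as_ptr().cast::<u8>(), words.len() * 8) }
}

/// Infallible direction: a u64-aligned pointer is also u32-aligned.
fn as_u32s(words: &[u64]) -> &[u32] {
    // SAFETY: alignment 8 implies alignment 4, and the length doubles exactly.
    unsafe { std::slice::from_raw_parts(words.as_ptr().cast::<u32>(), words.len() * 2) }
}
```

Only `try_words` has a failure branch. Code built on the other two helpers has no panic or error path for the optimizer to preserve.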

  Assembly: accessing field 0 of a k-tuple of u64s

  ┌─────┬────────────────────────┬──────────────────────┐
  │  k  │  from_bytes (before)   │  from_u64s (after)   │
  ├─────┼────────────────────────┼──────────────────────┤
  │ 1   │ 77 insns, 8 branches   │ 68 insns, 7 branches │
  ├─────┼────────────────────────┼──────────────────────┤
  │ 3   │ 133 insns, 14 branches │ 68 insns, 7 branches │
  ├─────┼────────────────────────┼──────────────────────┤
  │ 8   │ 273 insns, 29 branches │ 68 insns, 7 branches │
  └─────┴────────────────────────┴──────────────────────┘

  Also removes the EncodeDecode trait and Sequence encoding (superseded by
  Indexed), renames the module to indexed, and adds FromBytes::validate for
  upfront data validation at trust boundaries.

frankmcsherry and others added 8 commits March 14, 2026 14:29
…coding

Adds a new decode path that preserves u64 alignment information through
the entire pipeline, eliminating per-field alignment checks that
bytemuck::try_cast_slice required when going through &[u8].

Key changes:
- decode_u64s: returns (&[u64], u8) pairs instead of &[u8] slices,
  where the u8 indicates valid trailing bytes in the last word
- from_u64s on FromBytes: non-panicking field construction that enables
  LLVM to eliminate unused tuple fields as dead code
- validate/validate_typed: upfront structural and type-compatibility
  checks for encoded data, replacing the implicit panic-on-bad-data
- Remove inspect module (superseded by examples/decode_asm.rs)
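
As a rough illustration of the `(&[u64], u8)` shape mentioned in the list above, the sketch below packs bytes into words and recovers the valid bytes again. The helpers `pack` and `view` are hypothetical, and the convention that a trailing count of 0 means "the last word is fully valid" is an assumption made for this example, not something the commit message states.

```rust
/// Hypothetical helper: pack bytes into u64 words, zero-padding the last word.
/// Returns the words plus the number of valid bytes in the last word
/// (assumed convention: 0 means the last word is completely valid).
fn pack(bytes: &[u8]) -> (Vec<u64>, u8) {
    let words = bytes
        .chunks(8)
        .map(|chunk| {
            let mut buf = [0u8; 8];
            buf[..chunk.len()].copy_from_slice(chunk);
            u64::from_ne_bytes(buf)
        })
        .collect();
    (words, (bytes.len() % 8) as u8)
}

/// Hypothetical helper: view the valid bytes of a (words, tail) pair.
fn view(words: &[u64], tail: u8) -> &[u8] {
    // SAFETY: u8 has alignment 1 and every bit pattern is a valid u8.
    let all = unsafe { std::slice::from_raw_parts(words.as_ptr().cast::<u8>(), words.len() * 8) };
    // Padding bytes at the end of the last word are not part of the value.
    let padding = if tail == 0 { 0 } else { 8usize.saturating_sub(usize::from(tail)) };
    &all[..all.len().saturating_sub(padding)]
}
```

Because the word slice keeps its `u64` type throughout, the byte view is derived from it on demand instead of the other way around.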

Assembly impact for accessing field 0 of a k-tuple of u64s:
  Old (from_bytes): k=3 → 133 insns, k=8 → 273 insns (linear in k)
  New (from_u64s):  k=3 → 68 insns,  k=8 → 68 insns  (constant in k)
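
The linear-versus-constant pattern can be reproduced in miniature with two functions whose assembly can be compared. This is a std-only sketch for k = 3, not the contents of examples/decode_asm.rs; `checked`, `as_bytes`, `field0_from_bytes`, and `field0_from_u64s` are hypothetical names. Both functions build all three fields and read only field 0.

```rust
/// Panicking cast, standing in for a checked `&[u8]` to `&[u64]` conversion.
fn checked(bytes: &[u8]) -> &[u64] {
    assert!(bytes.as_ptr() as usize % 8 == 0 && bytes.len() % 8 == 0, "misaligned");
    // SAFETY: alignment and length were asserted above.
    unsafe { std::slice::from_raw_parts(bytes.as_ptr().cast::<u64>(), bytes.len() / 8) }
}

/// Infallible byte view, used here only to build inputs for the first function.
fn as_bytes(words: &[u64]) -> &[u8] {
    // SAFETY: u8 has alignment 1 and every bit pattern is a valid u8.
    unsafe { std::slice::from_raw_parts(words.as_ptr().cast::<u8>(), words.len() * 8) }
}

#[inline(never)]
pub fn field0_from_bytes(fields: [&[u8]; 3]) -> u64 {
    // Each conversion may panic, and a panic is observable, so the checks for
    // fields 1 and 2 generally have to stay even though their results are unused.
    let tuple = (checked(fields[0]), checked(fields[1]), checked(fields[2]));
    tuple.0.first().copied().unwrap_or(0)
}

#[inline(never)]
pub fn field0_from_u64s(fields: [&[u64]; 3]) -> u64 {
    // No conversion can fail, so constructing fields 1 and 2 has no side
    // effects and the optimizer is free to drop that work.
    let tuple = (fields[0], fields[1], fields[2]);
    tuple.0.first().copied().unwrap_or(0)
}
```

Compiling in release mode and inspecting the two functions (for example with `--emit asm`) is one way to check whether the unused-field work disappears on a given toolchain.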

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…exed

Indexed is now the sole encoding format with inherent methods, so callers
don't need to import a trait. The Sequence format provided no random
access or u64-aligned decoding and is no longer needed.

Renames serialization_neu to indexed now that there is no other
serialization module to distinguish from.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
validate_typed was partly a function of the Indexed format and partly
of the type being decoded. It now lives as FromBytes::validate, which
combines structural and type-compatibility checks using element_sizes.

The Indexed struct's methods were all one-line delegates to free
functions in the indexed module. Removed the struct and inlined
length_in_words/length_in_bytes as free functions. Callers use the
module directly (columnar::bytes::indexed::encode, etc).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
element_sizes is only used internally by validate. Mark it #[doc(hidden)]
and simplify tests to exercise validate directly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
element_sizes is public for implementors to override.
validate is public for callers to use at trust boundaries.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Trimmed from experimental accumulation to a clean comparison of:
- from_bytes + decode (O(k) baseline)
- from_u64s + decode_u64s (O(1) in k via dead code elimination)
- decode_field random access (O(1) in both k and field position)
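
To show why an index makes random access constant-time in the field position, here is a sketch over a made-up layout: one word holding the field count, then one end offset per field, then the field data. This layout and the names `encode_indexed` and `field_at` are assumptions for illustration only and should not be read as the crate's actual Indexed format.

```rust
/// Hypothetical layout: [k, end_0, .., end_{k-1}, data...], offsets in words.
fn encode_indexed(fields: &[&[u64]]) -> Vec<u64> {
    let mut out = vec![fields.len() as u64];
    let mut end = 0u64;
    for field in fields {
        end += field.len() as u64;
        out.push(end);
    }
    for field in fields {
        out.extend_from_slice(field);
    }
    out
}

/// Locate field `i` with two offset reads, independent of `i` and of the
/// number of fields. Returns None on malformed input instead of panicking.
fn field_at(store: &[u64], i: usize) -> Option<&[u64]> {
    let k = usize::try_from(*store.first()?).ok()?;
    if i >= k {
        return None;
    }
    let header = k.checked_add(1)?;
    let offsets = store.get(1..header)?;
    let data = store.get(header..)?;
    let start = if i == 0 { 0 } else { usize::try_from(offsets[i - 1]).ok()? };
    let end = usize::try_from(offsets[i]).ok()?;
    data.get(start..end)
}
```

A purely sequential, length-prefixed layout would instead have to walk past fields 0 through i-1 before reaching field i.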

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The bench and serde benchmarks referenced the removed EncodeDecode
trait and Sequence type. Updated to use the indexed module directly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The enum container struct now uses a single `indexes: Discriminant`
field instead of separate `variant` and `offset` fields. Update the
derive macro's from_u64s to match, and add from_u64s/element_sizes
to the Discriminant FromBytes impl.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
frankmcsherry and others added 3 commits March 14, 2026 15:13
FromBytes::validate now takes &[(&[u64], u8)] matching the from_u64s
input shape, making it composable for nested types. Added
indexed::validate_typed::<T> as the single entry point that combines
structural and type-level validation. Also added from_u64s and
element_sizes for Discriminant.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The obvious name should do the obvious thing: indexed::validate::<T>
does full validation (structural + type compatibility). The structural-
only check is now validate_structure, an implementation detail.
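
A sketch of what an upfront type-compatibility check over `(&[u64], u8)` slices might look like is given below. It is not the crate's `validate` or `element_sizes`: the function names, the error type, and the convention that a trailing count of 0 means a fully valid last word are all assumptions made for this example.

```rust
/// Byte length of a (words, tail) pair under the assumed convention that
/// tail == 0 means every word is fully valid; None if the pair is malformed.
fn byte_len(words: &[u64], tail: u8) -> Option<usize> {
    match (words.len(), tail) {
        (n, 0) => Some(n * 8),
        (n, t) if n > 0 && t < 8 => Some((n - 1) * 8 + usize::from(t)),
        _ => None,
    }
}

/// Hypothetical check: each slice must hold a whole number of elements of the
/// size its field expects. Run once at a trust boundary, before decoding.
fn validate_sizes(slices: &[(&[u64], u8)], element_sizes: &[usize]) -> Result<(), String> {
    if slices.len() != element_sizes.len() {
        return Err(format!("expected {} slices, found {}", element_sizes.len(), slices.len()));
    }
    for (i, (&(words, tail), &size)) in slices.iter().zip(element_sizes).enumerate() {
        let len = byte_len(words, tail)
            .ok_or_else(|| format!("slice {i}: bad trailing byte count {tail}"))?;
        if size == 0 || len % size != 0 {
            return Err(format!("slice {i}: {len} bytes is not a multiple of {size}"));
        }
    }
    Ok(())
}
```

Doing this work once up front is what lets the per-field construction path stay free of panics.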

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@frankmcsherry
Owner Author

The instruction count still grows for fields beyond 0, but I'm saving that for a different PR. Fixing it seems likely to involve taking larger steps through the iterator to remove the sequential dependencies, whereas this PR is mostly about tweaking the codegen to avoid panics.

@frankmcsherry frankmcsherry merged commit 8f254ac into master Mar 14, 2026
6 checks passed
@frankmcsherry frankmcsherry deleted the decode_u64s branch March 14, 2026 20:13
@github-actions github-actions bot mentioned this pull request Mar 14, 2026