Decode u64s by frankmcsherry · Pull Request #78 · frankmcsherry/columnar

frankmcsherry · 2026-03-14T17:27:51Z

The result of work with Claude Code. This summary was written by Claude:

  Summary

  Adds a new decode path (decode_u64s + from_u64s) that preserves u64 alignment
  information through the entire pipeline, enabling LLVM to eliminate unused tuple
   fields as dead code.

  The key insight: decode returns &[u8], which discards the alignment guarantee
  that the &[u64] store provides. from_bytes must then re-prove alignment via
  bytemuck::try_cast_slice for every field — a panicking operation that LLVM
  cannot eliminate, forcing O(k) work to access any field of a k-tuple. The new
  path returns (&[u64], u8) pairs (word slice + trailing byte count), making all
  casts infallible and all construction non-panicking. LLVM sees unused fields as
  pure dead code and removes them.
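The asymmetry described above can be sketched with the standard library alone. This is an illustration, not the crate's code: `bytes_as_words` and `words_as_bytes` are hypothetical names, and the meaning given to the trailing byte count (0 means the last word is fully used, 1 to 7 means only that many bytes of the last word are valid) is an assumption. Going from `&[u8]` to `&[u64]` needs a runtime alignment and length check that can fail, whereas going from `&[u64]` to narrower data cannot fail.

```rust
use std::mem::{align_of, size_of};

/// Fallible direction: reinterpret bytes as u64 words.
/// Returns None when the pointer is misaligned or the length is not a whole
/// number of words. A caller that unwraps this introduces a panic path.
fn bytes_as_words(bytes: &[u8]) -> Option<&[u64]> {
    let aligned = bytes.as_ptr() as usize % align_of::<u64>() == 0;
    let whole = bytes.len() % size_of::<u64>() == 0;
    if aligned && whole {
        // SAFETY: the pointer is aligned for u64, the length covers whole
        // words, every bit pattern is a valid u64, and the lifetime is
        // inherited from `bytes`.
        Some(unsafe {
            std::slice::from_raw_parts(bytes.as_ptr() as *const u64, bytes.len() / 8)
        })
    } else {
        None
    }
}

/// Infallible direction: view words as bytes, trimmed by a trailing count.
/// Assumption (hypothetical semantics): tail == 0 keeps the whole last word,
/// otherwise only the first `tail` bytes of the last word are kept.
fn words_as_bytes(words: &[u64], tail: u8) -> &[u8] {
    // SAFETY: u8 has alignment 1 and the byte length matches the word slice.
    let all: &[u8] =
        unsafe { std::slice::from_raw_parts(words.as_ptr() as *const u8, words.len() * 8) };
    let unused = if tail == 0 || words.is_empty() {
        0
    } else {
        8usize.saturating_sub(tail as usize)
    };
    // `get` rather than indexing, so there is no panic path in the source.
    all.get(..all.len() - unused).unwrap_or(all)
}

fn main() {
    let words: [u64; 2] = [1, 2];
    let bytes = words_as_bytes(&words, 0);
    assert_eq!(bytes.len(), 16);
    assert_eq!(words_as_bytes(&words, 3).len(), 11);
    // An aligned, whole-word byte slice converts back.
    assert_eq!(bytes_as_words(bytes), Some(&words[..]));
    // Shifting by one byte breaks alignment, so the conversion must fail.
    assert_eq!(bytes_as_words(&bytes[1..9]), None);
}
```

Because `words_as_bytes` has no failure case, a call whose result is never used has no observable effect, which is the property the summary credits for letting LLVM drop unused fields.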

  Assembly: accessing field 0 of a k-tuple of u64s

  ┌─────┬────────────────────────┬──────────────────────┐
  │  k  │  from_bytes (before)   │  from_u64s (after)   │
  ├─────┼────────────────────────┼──────────────────────┤
  │ 1   │ 77 insns, 8 branches   │ 68 insns, 7 branches │
  ├─────┼────────────────────────┼──────────────────────┤
  │ 3   │ 133 insns, 14 branches │ 68 insns, 7 branches │
  ├─────┼────────────────────────┼──────────────────────┤
  │ 8   │ 273 insns, 29 branches │ 68 insns, 7 branches │
  └─────┴────────────────────────┴──────────────────────┘

  Also removes the EncodeDecode trait and Sequence encoding (superseded by
  Indexed), renames the module to indexed, and adds FromBytes::validate for
  upfront data validation at trust boundaries.

…coding

Adds a new decode path that preserves u64 alignment information through the entire pipeline, eliminating per-field alignment checks that bytemuck::try_cast_slice required when going through &[u8].

Key changes:
- decode_u64s: returns (&[u64], u8) pairs instead of &[u8] slices, where the u8 indicates valid trailing bytes in the last word
- from_u64s on FromBytes: non-panicking field construction that enables LLVM to eliminate unused tuple fields as dead code
- validate/validate_typed: upfront structural and type-compatibility checks for encoded data, replacing the implicit panic-on-bad-data
- Remove inspect module (superseded by examples/decode_asm.rs)

Assembly impact for accessing field 0 of a k-tuple of u64s:
  Old (from_bytes): k=3 → 133 insns, k=8 → 273 insns (linear in k)
  New (from_u64s): k=3 → 68 insns, k=8 → 68 insns (constant in k)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…exed

Indexed is now the sole encoding format with inherent methods, so callers don't need to import a trait. The Sequence format provided no random access or u64-aligned decoding and is no longer needed.

Renames serialization_neu to indexed now that there is no other serialization module to distinguish from.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

validate_typed was partly a function of the Indexed format and partly of the type being decoded. It now lives as FromBytes::validate, which combines structural and type-compatibility checks using element_sizes. The Indexed struct's methods were all one-line delegates to free functions in the indexed module. Removed the struct and inlined length_in_words/length_in_bytes as free functions. Callers use the module directly (columnar::bytes::indexed::encode, etc). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

element_sizes is only used internally by validate. Mark it #[doc(hidden)] and simplify tests to exercise validate directly. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

element_sizes is public for implementors to override. validate is public for callers to use at trust boundaries. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Trimmed from experimental accumulation to a clean comparison of:
- from_bytes + decode (O(k) baseline)
- from_u64s + decode_u64s (O(1) in k via dead code elimination)
- decode_field random access (O(1) in both k and field position)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

The bench and serde benchmarks referenced the removed EncodeDecode trait and Sequence type. Updated to use the indexed module directly. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

The enum container struct now uses a single `indexes: Discriminant` field instead of separate `variant` and `offset` fields. Update the derive macro's from_u64s to match, and add from_u64s/element_sizes to the Discriminant FromBytes impl. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

FromBytes::validate now takes &[(&[u64], u8)] matching the from_u64s input shape, making it composable for nested types. Added indexed::validate_typed::<T> as the single entry point that combines structural and type-level validation. Also added from_u64s and element_sizes for Discriminant. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

The obvious name should do the obvious thing: indexed::validate::<T> does full validation (structural + type compatibility). The structural-only check is now validate_structure, an implementation detail. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
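As a rough illustration of what a type-compatibility check over the from_u64s input shape might look like, here is a minimal sketch. It is not the crate's implementation: the free function `validate`, its `element_sizes: &[usize]` parameter, the helper `byte_len`, and the error strings are all hypothetical, and the trailing byte count is assumed to mean "0 for a full last word, otherwise 1 to 7 valid bytes in the last word".

```rust
/// Byte length described by a (words, tail) pair, or None if the pair is
/// malformed under the assumed semantics.
fn byte_len(words: &[u64], tail: u8) -> Option<usize> {
    let full = words.len() * 8;
    match tail {
        0 => Some(full),
        1..=7 if !words.is_empty() => Some(full - (8 - tail as usize)),
        _ => None, // tail >= 8, or a partial last word with no words at all
    }
}

/// Hypothetical check: one slice per expected field, and each slice's byte
/// length must be a whole number of elements of the expected size.
fn validate(slices: &[(&[u64], u8)], element_sizes: &[usize]) -> Result<(), String> {
    if slices.len() != element_sizes.len() {
        return Err(format!(
            "expected {} slices, found {}",
            element_sizes.len(),
            slices.len()
        ));
    }
    for (i, (&(words, tail), &size)) in slices.iter().zip(element_sizes).enumerate() {
        let len = byte_len(words, tail)
            .ok_or_else(|| format!("slice {i}: invalid trailing byte count {tail}"))?;
        if size == 0 || len % size != 0 {
            return Err(format!(
                "slice {i}: {len} bytes is not a multiple of element size {size}"
            ));
        }
    }
    Ok(())
}

fn main() {
    let data: [u64; 2] = [0, 0];
    // 16 bytes of u64-sized elements: fine.
    assert!(validate(&[(&data[..], 0)], &[8]).is_ok());
    // 12 bytes cannot hold whole u64 elements, but can hold u32 elements.
    assert!(validate(&[(&data[..], 4)], &[8]).is_err());
    assert!(validate(&[(&data[..], 4)], &[4]).is_ok());
}
```

Running a check of this kind once at a trust boundary is what allows the later per-field construction to stay free of panics.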

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

frankmcsherry · 2026-03-14T19:48:14Z

There is still an increasing number of instructions for fields beyond 0, but saving that for a different PR. It seems likely related to taking larger steps through the iterator, removing the sequential dependencies, whereas this is mostly about tweaking the codegen to avoid panics.

frankmcsherry and others added 8 commits March 14, 2026 14:29

Hide element_sizes as implementation detail of FromBytes::validate

0c45441

Make both element_sizes and validate public on FromBytes

c588292

Update benchmarks to use indexed module instead of removed Sequence

cddc3f7

frankmcsherry force-pushed the decode_u64s branch from b6e9b45 to 7aeea5b March 14, 2026 18:40

frankmcsherry and others added 3 commits March 14, 2026 15:13

Remove unused FromBytes import in test module

a760bd0

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

frankmcsherry merged commit 8f254ac into master Mar 14, 2026
6 checks passed

frankmcsherry deleted the decode_u64s branch March 14, 2026 20:13

github-actions bot mentioned this pull request Mar 14, 2026

chore: release v0.12.0 #77

Merged

frankmcsherry merged 11 commits into master from decode_u64s
