@teh-cmc (Member) commented Dec 2, 2025

LLM summary:

This PR implements encoding and decoding of RRD manifests in footers for the Rerun RRD file format. The changes enable random access to chunks within RRD files by storing metadata about chunk locations and properties in footer manifests.

Key changes:

  • Adds manifest building during encoding, tracking chunk metadata (offsets, sizes, entity paths, etc.)
  • Implements manifest parsing during decoding with transport-to-application conversion
  • Adds CLI support for displaying parsed footers (--footers flag) and recomputing manifests during routing (--recompute-manifests flag)

To which I actually don't have all that much to add.

This PR is basically all the remaining glue so that, whenever one uses our Encoder or one of our Decoder variants, RRD footers and manifests will automagically be computed, injected and serialized/deserialized.

The most important part of this PR is arguably the addition of a footer_roundtrip test, that encodes a recording and then manually decodes all of its chunks directly using the generated RRD manifest, instead of using a Decoder.
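
To illustrate what such a roundtrip exercises, here is a minimal sketch of manifest-driven random access. It is not the actual `footer_roundtrip` test: `ManifestEntry`, `encode` and `read_chunk` are hypothetical names, and the "encoding" is a plain byte concatenation rather than the real RRD format. The only assumption taken from this PR is that a manifest records each chunk's byte offset and size.

```rust
use std::io::{Cursor, Read, Seek, SeekFrom};

/// Hypothetical manifest entry: where one encoded chunk lives in the stream.
#[derive(Debug, Clone, PartialEq)]
struct ManifestEntry {
    entity_path: String,
    byte_offset: u64,
    byte_size: u64,
}

/// Toy encoder: appends each payload and records its location as it goes.
fn encode(chunks: &[(&str, Vec<u8>)]) -> (Vec<u8>, Vec<ManifestEntry>) {
    let mut bytes = Vec::new();
    let mut manifest = Vec::new();
    for (entity_path, payload) in chunks {
        manifest.push(ManifestEntry {
            entity_path: entity_path.to_string(),
            byte_offset: bytes.len() as u64,
            byte_size: payload.len() as u64,
        });
        bytes.extend_from_slice(payload);
    }
    (bytes, manifest)
}

/// Random access: seek straight to a chunk using only its manifest entry.
fn read_chunk<R: Read + Seek>(
    reader: &mut R,
    entry: &ManifestEntry,
) -> std::io::Result<Vec<u8>> {
    reader.seek(SeekFrom::Start(entry.byte_offset))?;
    let mut buf = vec![0u8; entry.byte_size as usize];
    reader.read_exact(&mut buf)?;
    Ok(buf)
}

fn main() -> std::io::Result<()> {
    let chunks = [
        ("/camera", b"aaaa".to_vec()),
        ("/points", b"bb".to_vec()),
        ("/camera", b"cccccc".to_vec()),
    ];
    let (bytes, manifest) = encode(&chunks);
    let mut reader = Cursor::new(bytes);

    // Visit the chunks in reverse order to show that no sequential decode is needed.
    for (entry, (_, expected)) in manifest.iter().zip(chunks.iter()).rev() {
        let payload = read_chunk(&mut reader, entry)?;
        assert_eq!(payload, *expected);
        println!(
            "{} at offset {} ({} bytes)",
            entry.entity_path, entry.byte_offset, entry.byte_size
        );
    }
    Ok(())
}
```

The point of the design is that the reader never touches a decoder: the manifest alone is enough to locate and extract every chunk.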


Part of RRD footers series of PRs:

@teh-cmc added the labels 📉 performance, do-not-merge, include in changelog, 🔩 data model, 🪵 Log & send APIs, and dataplatform on Dec 2, 2025
github-actions bot commented Dec 2, 2025

Web viewer built successfully.

| Commit | Link | Manifest |
|---|---|---|
| 92b97b9 | https://rerun.io/viewer/pr/12048 | +nightly +main |

View image diff on kitdiff.

Note: This comment is updated whenever you push a commit.

github-actions bot commented Dec 2, 2025

Latest documentation preview deployed successfully.

| Commit | Link |
|---|---|
| 92b97b9 | https://landing-7t8x3cba1-rerun.vercel.app/docs |

Note: This comment is updated whenever you push a commit.

@teh-cmc changed the title from "RRD footers 2: encoding/decoding manifests" to "RRD footers 3: encoding/decoding manifests" on Dec 2, 2025
@teh-cmc teh-cmc requested a review from Copilot December 2, 2025 13:41
Copilot AI (Contributor) left a comment

Reviewed changes

Copilot reviewed 17 out of 18 changed files in this pull request and generated 2 comments.

| File | Description |
|---|---|
| docs/content/reference/cli.md | Documents new CLI flags for footer display and manifest recomputation |
| crates/top/rerun/src/commands/stdio.rs | Updates stream reading to return RRD manifests alongside messages |
| crates/top/rerun/src/commands/rrd/route.rs | Adds --recompute-manifests flag to enable footer generation during routing |
| crates/top/rerun/src/commands/rrd/print.rs | Implements --footers flag to display parsed footer manifests |
| crates/top/rerun/src/commands/rrd/merge_compact.rs | Updates to handle new return signature with footer metadata |
| crates/top/rerun/src/commands/rrd/filter.rs | Updates to handle new return signature with footer metadata |
| crates/store/re_log_encoding/tests/snapshots/* | Adds test snapshots for manifest data and schema validation |
| crates/store/re_log_encoding/tests/footer_roundtrip.rs | Comprehensive roundtrip test for footer encoding/decoding |
| crates/store/re_log_encoding/src/rrd/log_msg.rs | Implements Encodable/Decodable traits for RrdManifest |
| crates/store/re_log_encoding/src/rrd/encoder.rs | Implements manifest building during message encoding |
| crates/store/re_log_encoding/src/rrd/decoder/stream.rs | Adds rrd_manifests() method to async decoder |
| crates/store/re_log_encoding/src/rrd/decoder/state_machine.rs | Accumulates manifests during decoding and implements manifest extraction |
| crates/store/re_log_encoding/src/rrd/decoder/iterator.rs | Adds rrd_manifests() method to sync decoder iterator |
| crates/store/re_log_encoding/Cargo.toml | Adds insta test dependency |

@teh-cmc teh-cmc force-pushed the cmc/rrd_footers_2_rrd_manifests branch 2 times, most recently from 964c086 to c1e6776 Compare December 2, 2025 15:51
@teh-cmc teh-cmc force-pushed the cmc/rrd_footers_3_encoding_decoding branch from f1ad3fb to 92b97b9 Compare December 2, 2025 17:14
}

#[test]
fn footer_roundtrip() {
Member Author

Arguably the most important part of this PR.

Member

(really nice test!)

@teh-cmc teh-cmc marked this pull request as ready for review December 2, 2025 17:26
emilk added a commit that referenced this pull request Dec 2, 2025
I'll actually apply it once these PRs are merged:
* #12044
* #12047
* #12048
@zehiko (Member) left a comment

went through all 3 PRs, it was really nicely documented and easy to follow, from the state machine updates through the rrd manifest builder (which I luckily knew a bit about). nice work!

@emilk (Member) left a comment

Yay!

/// RRD manifests are parsed from footers, of which there might be more than one e.g. in the
/// case of concatenated streams.
///
/// This is not cheap: it automatically performs the transport to app level conversion.
Member

Maybe add a re_tracing::profile_function!(); to it then
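
As a rough illustration of why this accessor is costly, the sketch below accumulates one raw manifest per footer and converts all of them on every call. Everything here is hypothetical (`RawManifest`, `Manifest`, `DecoderState`, and the toy 8-bytes-per-chunk encoding); it is not the decoder's actual implementation. The `Instant` timer only marks where a profiling scope such as the suggested `re_tracing::profile_function!()` would sit.

```rust
use std::convert::TryInto;
use std::time::Instant;

/// Hypothetical transport-level manifest: raw bytes as they might come out of a footer.
struct RawManifest(Vec<u8>);

/// Hypothetical application-level manifest.
#[derive(Debug, PartialEq)]
struct Manifest {
    chunk_sizes: Vec<u64>,
}

#[derive(Default)]
struct DecoderState {
    /// One entry per footer seen so far; concatenated streams contribute several.
    raw_manifests: Vec<RawManifest>,
}

impl DecoderState {
    fn on_footer(&mut self, raw: RawManifest) {
        self.raw_manifests.push(raw);
    }

    /// Converts every accumulated manifest on each call, which is why it is not cheap.
    fn manifests(&self) -> Result<Vec<Manifest>, String> {
        // Stand-in for a profiling scope at the top of the function.
        let started = Instant::now();
        let out: Result<Vec<Manifest>, String> =
            self.raw_manifests.iter().map(convert).collect();
        eprintln!(
            "converted {} manifest(s) in {:?}",
            self.raw_manifests.len(),
            started.elapsed()
        );
        out
    }
}

/// Toy transport-to-application conversion: each chunk size is 8 little-endian bytes.
fn convert(raw: &RawManifest) -> Result<Manifest, String> {
    if raw.0.len() % 8 != 0 {
        return Err(format!("truncated manifest: {} bytes", raw.0.len()));
    }
    let chunk_sizes = raw
        .0
        .chunks_exact(8)
        .map(|b| u64::from_le_bytes(b.try_into().expect("chunks_exact yields 8 bytes")))
        .collect();
    Ok(Manifest { chunk_sizes })
}

fn main() {
    let mut state = DecoderState::default();
    // Two footers, as if two streams had been concatenated.
    state.on_footer(RawManifest(42u64.to_le_bytes().to_vec()));
    state.on_footer(RawManifest(Vec::new()));
    let manifests = state.manifests().expect("well-formed manifests");
    assert_eq!(manifests.len(), 2);
}
```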

if !bytes.is_empty() {
    let rrd_footer =
        re_protos::log_msg::v1alpha1::RrdFooter::from_rrd_bytes(&bytes)?;
    _ = rrd_footer; // TODO(cmc): we'll use that in the next PR, promise.
Member

a man of your word 👍

Comment on lines +112 to +116
struct ManifestState {
    /// The accumulated recording IDs of each individual chunk, extracted from their `LogMsg`.
    ///
    /// This will only be used if [`FooterState::recording_id_scope`] is empty.
    recording_ids: Vec<re_log_types::StoreId>,
Member

This will in most cases be the same store id repeated, correct?

Comment on lines +125 to +126
byte_offset: u64,
byte_size: u64,
Member

Can use re_span::Span here

Suggested change
- byte_offset: u64,
- byte_size: u64,
+ bytes: Span<u64>,
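
To show what the suggestion buys, here is a small sketch that bundles offset and size into one value. The `Span` below is a local, hypothetical stand-in with assumed `start` and `len` fields; the real `re_span::Span` API is not shown in this thread and may differ. `ChunkLocation` and `slice_chunk` are likewise illustrative names.

```rust
use std::convert::TryFrom;
use std::ops::Range;

/// Hypothetical stand-in for a span type; not the real `re_span::Span`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Span<Idx> {
    start: Idx,
    len: Idx,
}

impl Span<u64> {
    /// Exclusive end offset, or `None` if `start + len` overflows.
    fn end(&self) -> Option<u64> {
        self.start.checked_add(self.len)
    }

    /// Converts to a `usize` range suitable for slicing an in-memory buffer.
    fn to_range(&self) -> Option<Range<usize>> {
        let start = usize::try_from(self.start).ok()?;
        let end = usize::try_from(self.end()?).ok()?;
        Some(start..end)
    }
}

/// Offset and size travel together instead of as two loose `u64` fields.
struct ChunkLocation {
    bytes: Span<u64>,
}

/// Returns the chunk's bytes, or `None` if the span falls outside `data`.
fn slice_chunk<'a>(data: &'a [u8], loc: &ChunkLocation) -> Option<&'a [u8]> {
    data.get(loc.bytes.to_range()?)
}

fn main() {
    let data = b"headerPAYLOADfooter";

    let loc = ChunkLocation { bytes: Span { start: 6, len: 7 } };
    assert_eq!(slice_chunk(data, &loc), Some(&b"PAYLOAD"[..]));

    // An out-of-bounds span is rejected instead of panicking.
    let bad = ChunkLocation { bytes: Span { start: 15, len: 100 } };
    assert_eq!(slice_chunk(data, &bad), None);
}
```

Keeping the two numbers in one type gives overflow and bounds checks a single home, rather than repeating them wherever an offset and a size are combined.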

// bit weirder on the other hand, but then again this is generally not a new
// problem: we tend to perform Sorbet migrations a bit too aggressively all over
// the place. We really need a layer that sits between the transport and
// application layer where one can access the parsed, unmigrated data.
Member

100%

// Right now, the implemented behavior is that we end up with an empty footer, i.e. there are
// no manifests in it.
// Whether that's the correct behavior is another question, but at least it is defined for now
// and can be changed.
Member

I love these kinds of comments ⭐⭐⭐

@teh-cmc (Member Author) commented Dec 3, 2025

@rerun-bot full-check

github-actions bot commented Dec 3, 2025

4 participants