
Conversation

@frisitano (Collaborator)

Overview

This PR outlines the additional crates, components, interfaces, and data structures that will be used for Milestones 3 and 4, providing a holistic view of how the rollup node should be structured. Concrete implementations will follow in future PRs.

@greged93 (Contributor) left a comment

Just some thoughts related to the codec types and the input to the derivation pipeline.

Comment on lines +5 to +6
/// This is used as input for the derivation pipeline. All data remains in its raw serialized form.
/// The data is then deserialized, enriched and processed in the derivation pipeline.
Contributor

I have thought a little more about this: don't you think we also need to have the calldata decoding inside the pipeline? This way the pipeline receives only the calldata of the commit transaction, decodes the batch header and chunks, then proceeds to decoding the chunk data and fetching blob data for the transactions if needed.

Collaborator Author

I think that the way we should authenticate the batch data is by computing the batch hash in the pipeline. Why do you think we should decode the calldata in the pipeline?

Contributor

So is it because we commit to the batch in the contract via its hash that you would skip decoding the calldata?

Collaborator Author

The public input to the prover is the batch hash, so it's primarily driven by proving requirements. The batch hash acts as the commitment to the batch data.
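
As a rough illustration of what that authentication could look like inside the pipeline (the real preimage layout is codec-version specific, so the function, error type, and field choices here are assumptions, not this PR's design):

use alloy_primitives::{keccak256, B256};

/// Hypothetical error type for the sketch below.
#[derive(Debug)]
pub enum BatchAuthError {
    HashMismatch { expected: B256, got: B256 },
}

/// Sketch: authenticate raw batch data against the on-chain commitment by
/// recomputing its hash inside the pipeline. The actual preimage (header
/// fields, chunk hashes, etc.) is defined by the codec version.
pub fn authenticate_batch(raw_batch_header: &[u8], committed_hash: B256) -> Result<(), BatchAuthError> {
    let computed = keccak256(raw_batch_header);
    if computed == committed_hash {
        Ok(())
    } else {
        Err(BatchAuthError::HashMismatch { expected: committed_hash, got: computed })
    }
}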

Contributor

I've been thinking about this a little bit. I think the most future-proof and maintainable way is to not expose any details about the internals of a Codec version, how to read it (e.g. some calldata is required), or the batch structure (e.g. chunks and the skipped-message bitmap).

Otherwise we will end up with lots of structs that represent different versions of the same thing, e.g. BatchInputDataV1, BatchInputDataV2. This basically leaks implementation details of the pipeline into the rest of the node. It would also require major restructuring if we have a new version where BatchInput does not really apply anymore.

From a high level, the derivation pipeline should just take some L1 inputs and produce PayloadAttributes. What exactly happens in between, whether there are batches, chunks, or bundles, should not matter from the perspective of the rollup node overall.

If pipeline behavior is similar enough, we can add different versions to the same pipeline instance with some type/version switches here and there. But this is already starting to get messy in l2geth.

If they diverge too much, I think it would be best to have a separate pipeline implementation for each specific version. That includes reading calldata and blobs, and creating a "batch" and "chunks" if those are the entities within the specific codec/protocol version.
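
A minimal sketch of what that boundary could look like, assuming the node only ever sees opaque L1 inputs and derived payload attributes (the trait and names below are illustrative, not part of this PR):

/// Hypothetical version-agnostic pipeline boundary: opaque L1 inputs in,
/// payload attributes out. Batches, chunks, and bundles stay internal to the
/// codec-specific implementation.
pub trait DerivationPipeline {
    /// Opaque L1 data this pipeline version knows how to interpret
    /// (commit calldata, blobs, L1 messages, ...).
    type L1Input;
    /// The derived output, e.g. the ScrollPayloadAttributes used elsewhere in this PR.
    type Payload;

    /// Feed new L1 data into the pipeline.
    fn handle_l1_input(&mut self, input: Self::L1Input);
    /// Drain the next derived payload, if one is ready.
    fn next_payload(&mut self) -> Option<Self::Payload>;
}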

Comment on lines +12 to +27
pub struct Batch {
/// The index of the batch.
pub index: u64,
/// The total number of L1 messages popped before this batch.
pub total_l1_messages_popped_before: u64,
/// The hash of the parent batch.
pub parent_hash: B256,
/// The chunks in the batch.
pub chunks: Vec<Chunk>,
/// The hash of the L1 message queue before the batch.
pub prev_l1_message_queue_hash: B256,
/// The hash of the L1 message queue after the batch.
pub post_l1_message_queue_hash: B256,
/// The block commitments in the batch.
pub blocks: Vec<BlockCommitment>,
}
Contributor

I think we need to confirm with @jonastheis, but AFAIU the types in da.go are not used for decoding but for the construction of the blob and/or calldata. I think we should be using DAChunkRawTx for the chunk and daBatchV0, V1, V3, and V7.

Collaborator Author

Good catch. Let's wait for input from @jonastheis, and I will also review it myself.

Contributor

That is correct, the types in da.go are only required for creating batches/blobs, their calldata, and preparation for the provers.

I think that every rollup node should ultimately support batch submission to increase decentralization and provide an easy fail-over in case the centralized operator is down, or as a supporting mechanism for subcommitments (so that a node is able to autonomously commit the data it has seen). For now we can ignore this part though.

From the Codec interface, only the following is required:
[screenshot: the required subset of Codec interface methods]

Some of these methods like DecodeDAChunksRawTx and DecodeTxsFromBlob are already technical debt because they are only needed in older versions. V7 uses DecodeBlob.
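
For illustration, a Rust-side mirror of that decoding-only subset might look roughly like this (names and signatures are assumptions, not the da-codec API):

/// Hypothetical error type for the sketch below.
#[derive(Debug)]
pub struct CodecError;

/// Sketch of the decoding-only subset a derivation pipeline needs from a codec.
pub trait DaCodec {
    /// Decoded representation of a committed batch; version specific.
    type Batch;
    /// Decoded blob payload (the V7-style DecodeBlob path).
    type BlobPayload;

    /// Decode the commit-transaction calldata into a batch.
    fn decode_batch(&self, calldata: &[u8]) -> Result<Self::Batch, CodecError>;
    /// Decode a blob into its payload, replacing the older chunk/tx decoding paths.
    fn decode_blob(&self, blob: &[u8]) -> Result<Self::BlobPayload, CodecError>;
}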

Comment on lines +7 to +41
/// A pipeline for processing batch inputs and producing scroll payloads.
#[derive(Debug)]
pub struct Pipeline;

impl Pipeline {
/// Creates a new [`Pipeline`] instance.
pub fn new() -> Self {
Pipeline
}

/// Handles a batch input.
pub fn handle_batch_input(&mut self, _batch_input: BatchInput) {
// Handle the batch input.
todo!()
}

/// Gets the next scroll payload.
pub fn next(&mut self) -> Option<ScrollPayloadAttributes> {
// Get the next scroll payload.
todo!()
}
}

impl Stream for Pipeline {
type Item = ScrollPayloadAttributes;

fn poll_next(
self: std::pin::Pin<&mut Self>,
cx: &mut std::task::Context<'_>,
) -> std::task::Poll<Option<Self::Item>> {
let this = self.get_mut();

todo!()
}
}
@frisitano (Collaborator Author), Mar 4, 2025

We should refactor this interface as we discussed. @greged93, I have assigned #2 to you. We should redesign it, taking into account the insights provided by @Thegaram. We don't need a full implementation now, but an API skeleton would be good.

Contributor

Sure, I will provide some pseudocode and let Peter review as well.

Contributor

This is more network configuration, no?

Collaborator Author

Yes, we should probably move this.

impl Indexer {
/// Handles an event from the L1.
pub async fn handle_l1_event(&mut self, event: L1Event) {
match event {
Contributor

CommitBatch and L1Message are Scroll-specific events, while the others are general L1 "events". Is this correct?

What about Revert and Finalize events for batches?

@frisitano (Collaborator Author), Mar 6, 2025

Yes, we will need to index Finalize to determine the rollup finality status.
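
For illustration, the indexer's event type could grow along these lines (variant names and fields below are assumptions, not the types in this PR):

use alloy_primitives::B256;

/// Hypothetical shape of the L1 event type, extended with the rollup-contract
/// events discussed here.
#[derive(Debug)]
pub enum L1Event {
    /// A batch was committed on L1; raw commit data is kept opaque here.
    CommitBatch { batch_index: u64, batch_hash: B256 },
    /// A previously committed batch was reverted.
    RevertBatch { batch_index: u64, batch_hash: B256 },
    /// A batch (or bundle) was finalized; drives the rollup finality status.
    FinalizeBatch { batch_index: u64, batch_hash: B256 },
    /// An L1 -> L2 message was enqueued in the message queue.
    L1Message { queue_index: u64, message_hash: B256 },
    /// General L1 chain progress, not Scroll specific.
    NewBlock { number: u64, hash: B256 },
    /// An L1 reorg back to the given block number.
    Reorg { new_head: u64 },
}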

Contributor

I'm not sure it's such a good idea to add the rollup events here. Whenever a new version adds an event, changes an event's intrinsic meaning, or even removes one, we will still need to support all of them here to be able to sync from genesis.

I still think we should try to encapsulate all the derivation specifics as much as possible and push them into a specific derivation pipeline implementation. We could have derivation pipeline implementations for different versions of the protocol/codec.

This way the general node framework stays the same:

  • notify of L1 updates (and maybe supply data in a very general way, but this is a bit tricky as we don't know what exact data is required, so we need to let the derivation pipeline specify which events and data it is interested in without decoding the data; see the sketch below)
  • the derivation pipeline does all the specifics
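
A rough sketch of that idea, assuming plain (address, topic) filters are enough to express the pipeline's interest (the trait below is illustrative, not part of this PR):

use alloy_primitives::{Address, B256};

/// Hypothetical interface: the pipeline declares which L1 logs it cares about
/// as plain (address, topic0) filters, so the node can forward raw, undecoded data.
pub trait L1Subscriber {
    /// Contract addresses and event signatures this pipeline version needs.
    fn l1_filters(&self) -> Vec<(Address, B256)>;

    /// Receive the raw log data for a matching event; decoding stays internal.
    fn on_l1_log(&mut self, address: Address, topics: Vec<B256>, data: Vec<u8>);
}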

/// The withdrawal root of the block commitment.
pub withdraw_root: B256,
/// The prover row consumption of the block commitment.
pub row_consumption: RowConsumption,
Contributor

IIUC, RowConsumption is not needed anymore after EuclidV1.

Collaborator Author

@greged93 keep this in mind as you work on the derivation pipeline.

@frisitano closed this Mar 13, 2025
@frisitano mentioned this pull request Apr 2, 2025