apollo_propeller: add unit validation receiving and spawning #12004

Closed
sirandreww-starkware wants to merge 1 commit into 01-26-apollo_propeller_add_messageprocessor_scaffolding from 01-26-apollo_propeller_add_unit_validation_receiving_and_spawning

Conversation

@sirandreww-starkware
Contributor

@sirandreww-starkware sirandreww-starkware commented Jan 26, 2026

Note

Medium Risk
Introduces new concurrency and thread-pool usage in the message processing/validation path; incorrect handling of the pending validation result or validator handoff could lead to stalled processing or subtle race/logic bugs.

Overview
apollo_propeller now depends on rayon and uses it in MessageProcessor::run to offload CPU-bound shard validation (signature/merkle proof checks) from the Tokio runtime.

MessageProcessor instantiates a per-message UnitValidator, receives (PeerId, PropellerUnit) from unit_rx, and spawns validation work onto the Rayon thread pool, tracking the in-flight job via a oneshot channel (pending_validation) to prevent concurrent validations.
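A minimal sketch of the handoff described above, assuming the validator is moved into the job and sent back with the verdict. It uses only the standard library: `std::thread::spawn` stands in for `rayon::spawn` and `std::sync::mpsc` for the oneshot channel. `Unit`, the fields of `UnitValidator`, and `spawn_validation` are hypothetical names for illustration, not the PR's actual code.

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical stand-ins for the PR's types; the real definitions are not shown in this thread.
#[derive(Debug)]
pub struct Unit(pub u64);

#[derive(Debug)]
pub struct UnitValidator {
    pub validated: usize,
}

impl UnitValidator {
    // Placeholder for the CPU-bound signature / Merkle proof checks.
    pub fn validate(&mut self, unit: &Unit) -> bool {
        self.validated += 1;
        unit.0 % 2 == 0
    }
}

// Moves the validator onto a worker thread and returns a receiver that yields
// the verdict together with the validator, so the caller can reuse it.
pub fn spawn_validation(
    mut validator: UnitValidator,
    unit: Unit,
) -> mpsc::Receiver<(bool, UnitValidator)> {
    let (result_tx, result_rx) = mpsc::channel();
    // Assumption: std::thread::spawn stands in for rayon::spawn in this sketch.
    thread::spawn(move || {
        let ok = validator.validate(&unit);
        // The receiver may already be gone; ignoring the send error is fine here.
        let _ = result_tx.send((ok, validator));
    });
    result_rx
}
```

Holding at most one such receiver at a time is what keeps validations from running concurrently in this model.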

Written by Cursor Bugbot for commit 714e27d. This will update automatically on new commits.

@reviewable-StarkWare

This change is Reviewable

Contributor Author

sirandreww-starkware commented Jan 26, 2026

This stack of pull requests is managed by Graphite. Learn more about stacking.

@sirandreww-starkware sirandreww-starkware self-assigned this Jan 26, 2026
@sirandreww-starkware sirandreww-starkware marked this pull request as ready for review January 26, 2026 14:13

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

@sirandreww-starkware sirandreww-starkware force-pushed the 01-26-apollo_propeller_add_messageprocessor_scaffolding branch from 2516198 to db030c5 Compare February 9, 2026 14:27
@sirandreww-starkware sirandreww-starkware force-pushed the 01-26-apollo_propeller_add_unit_validation_receiving_and_spawning branch from 28acffb to 30f70f2 Compare February 9, 2026 14:27
Collaborator

@ShahakShama ShahakShama left a comment

@ShahakShama reviewed 3 files and all commit messages, and made 6 comments.
Reviewable status: all files reviewed, 7 unresolved discussions (waiting on @dan, @noamsp-starkware, and @sirandreww-starkware).


Cargo.toml line 323 at r3 (raw file):

rand_chacha = "0.3.1"
rand_distr = "0.4.3"
rayon = "1.11.0"

@dan-starkware do you approve?


crates/apollo_propeller/src/message_processor.rs line 78 at r3 (raw file):

        loop {
            tokio::select! {
                _ = sleep_until(deadline) => {

Use tokio timeout instead of select and sleep


crates/apollo_propeller/src/message_processor.rs line 87 at r3 (raw file):

                    let (result_tx, result_rx) = oneshot::channel();
                    let mut validator_moved = validator.take().unwrap();

Why is this unwrap safe? (I think it's not)


crates/apollo_propeller/src/message_processor.rs line 89 at r3 (raw file):

                    let mut validator_moved = validator.take().unwrap();

                    rayon::spawn(move || {

Why do you use rayon and not tokio? Explain both to me and in a comment


crates/apollo_propeller/src/message_processor.rs line 89 at r3 (raw file):

                    let mut validator_moved = validator.take().unwrap();

                    rayon::spawn(move || {

Shouldn't you store the task handle and poll it


crates/apollo_propeller/src/message_processor.rs line 94 at r3 (raw file):

                    });

                    pending_validation = Some(result_rx);

What do you do with pending_validation?

@sirandreww-starkware sirandreww-starkware changed the base branch from 01-26-apollo_propeller_add_messageprocessor_scaffolding to graphite-base/12004 February 19, 2026 10:16
Contributor

@guy-starkware guy-starkware left a comment

@guy-starkware reviewed all commit messages.
Reviewable status: all files reviewed, 6 unresolved discussions (waiting on dan, noamsp-starkware, and sirandreww-starkware).

@sirandreww-starkware sirandreww-starkware force-pushed the 01-26-apollo_propeller_add_unit_validation_receiving_and_spawning branch from 30f70f2 to 400e0a1 Compare February 19, 2026 15:36
@sirandreww-starkware sirandreww-starkware changed the base branch from graphite-base/12004 to 01-26-apollo_propeller_add_messageprocessor_scaffolding February 19, 2026 15:37
Contributor Author

@sirandreww-starkware sirandreww-starkware left a comment

@sirandreww-starkware made 6 comments.
Reviewable status: 1 of 4 files reviewed, 6 unresolved discussions (waiting on dan, guy-starkware, noamsp-starkware, and ShahakShama).


Cargo.toml line 323 at r3 (raw file):

Previously, ShahakShama wrote…

@dan-starkware do you approve?

sent him a slack message, waiting for his approval


crates/apollo_propeller/src/message_processor.rs line 78 at r3 (raw file):

Previously, ShahakShama wrote…

Use tokio timeout instead of select and sleep

addressed this in a previous PR


crates/apollo_propeller/src/message_processor.rs line 87 at r3 (raw file):

Previously, ShahakShama wrote…

Why is this unwrap safe? (I think it's not)

added a comment as to why


crates/apollo_propeller/src/message_processor.rs line 89 at r3 (raw file):

Previously, ShahakShama wrote…

Why do you use rayon and not tokio? Explain both to me and in a comment

added a comment, let's talk f2f about this


crates/apollo_propeller/src/message_processor.rs line 89 at r3 (raw file):

Previously, ShahakShama wrote…

Shouldn't you store the task handle and poll it

Both options here, tokio::task::spawn_blocking and rayon::spawn, hand the closure to a thread pool, which runs it. We get the response through the oneshot channel, and the task is short-lived, so we do not use the handle at all.


crates/apollo_propeller/src/message_processor.rs line 94 at r3 (raw file):

Previously, ShahakShama wrote…

What do you do with pending_validation?

In the next PR I read from it in this same select and I handle a validated unit.

@sirandreww-starkware sirandreww-starkware force-pushed the 01-26-apollo_propeller_add_messageprocessor_scaffolding branch from 5805542 to f15b95a Compare February 19, 2026 16:02
@sirandreww-starkware sirandreww-starkware force-pushed the 01-26-apollo_propeller_add_unit_validation_receiving_and_spawning branch from 400e0a1 to 5bb14f7 Compare February 19, 2026 16:02
@sirandreww-starkware sirandreww-starkware force-pushed the 01-26-apollo_propeller_add_messageprocessor_scaffolding branch from f15b95a to 6a30d55 Compare February 22, 2026 07:37
@sirandreww-starkware sirandreww-starkware force-pushed the 01-26-apollo_propeller_add_unit_validation_receiving_and_spawning branch from 5bb14f7 to 1fe6bbd Compare February 22, 2026 07:37
@sirandreww-starkware sirandreww-starkware force-pushed the 01-26-apollo_propeller_add_unit_validation_receiving_and_spawning branch from 1fe6bbd to 714e27d Compare February 22, 2026 09:15
@sirandreww-starkware sirandreww-starkware force-pushed the 01-26-apollo_propeller_add_messageprocessor_scaffolding branch from 6a30d55 to 0862594 Compare February 22, 2026 09:15
Collaborator

@ShahakShama ShahakShama left a comment

@ShahakShama reviewed 3 files and all commit messages, made 5 comments, and resolved 2 discussions.
Reviewable status: all files reviewed, 7 unresolved discussions (waiting on dan, noamsp-starkware, and sirandreww-starkware).


crates/apollo_propeller/src/message_processor.rs line 87 at r3 (raw file):

Previously, sirandreww-starkware (Andrew Luka) wrote…

added a comment as to why

I don't like this. Let's put a TODO to see if we can avoid this expect


crates/apollo_propeller/src/message_processor.rs line 89 at r3 (raw file):

Previously, sirandreww-starkware (Andrew Luka) wrote…

Both options here, tokio::task::spawn_blocking and rayon::spawn, hand the closure to a thread pool, which runs it. We get the response through the oneshot channel, and the task is short-lived, so we do not use the handle at all.

How will you know if the handler panicked?
And in general we have a convention not to leave detached tasks


crates/apollo_propeller/src/message_processor.rs line 61 at r5 (raw file):

            Arc::clone(&self.tree_manager),
        ));
        let mut pending_validation: Option<oneshot::Receiver<ValidationResult>> = None;

Consider storing all pending validations in a FuturesUnordered instead


crates/apollo_propeller/src/message_processor.rs line 70 at r5 (raw file):

                }

                Some((sender, unit)) = self.unit_rx.recv(), if pending_validation.is_none() => {

What if pending validation is some? I don't know this syntax but it looks like you're taking an item from the stream and discarding it


crates/apollo_propeller/src/message_processor.rs line 93 at r5 (raw file):

                    });

                    pending_validation = Some(result_rx);

Why not just handle the validation here?

Collaborator

@ShahakShama ShahakShama left a comment

@ShahakShama made 4 comments and resolved 2 discussions.
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on dan, noamsp-starkware, and sirandreww-starkware).


crates/apollo_propeller/src/message_processor.rs line 87 at r3 (raw file):

Previously, ShahakShama wrote…

I don't like this. Let's put a TODO to see if we can avoid this expect

Unite both options into one enum
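One way to read this suggestion, sketched under assumptions: replace the `Option<UnitValidator>` / `Option<Receiver>` pair with a single enum, so "validator available" and "validation in flight" are mutually exclusive by construction and no `unwrap`/`expect` is needed. `ValidationState`, its methods, and the toy `validate` are hypothetical names. `std::thread` and `std::sync::mpsc` again stand in for rayon and the oneshot channel.

```rust
use std::sync::mpsc::{self, Receiver, RecvError};
use std::thread;

// Hypothetical stand-in for the PR's validator.
#[derive(Debug)]
pub struct UnitValidator;

impl UnitValidator {
    // Placeholder for the real CPU-bound checks.
    pub fn validate(&mut self, unit: u64) -> bool {
        unit % 2 == 0
    }
}

// Either the validator is here and idle, or it is away on a worker thread and
// we hold the receiver that will hand it back with the verdict.
pub enum ValidationState {
    Idle(UnitValidator),
    Busy(Receiver<(bool, UnitValidator)>),
}

impl ValidationState {
    pub fn is_busy(&self) -> bool {
        matches!(self, ValidationState::Busy(_))
    }

    // Starts a validation if idle. If already busy, the state is returned
    // unchanged and `unit` is dropped, so callers should check `is_busy` first.
    pub fn start(self, unit: u64) -> Self {
        match self {
            ValidationState::Idle(mut validator) => {
                let (result_tx, result_rx) = mpsc::channel();
                thread::spawn(move || {
                    let ok = validator.validate(unit);
                    let _ = result_tx.send((ok, validator));
                });
                ValidationState::Busy(result_rx)
            }
            busy @ ValidationState::Busy(_) => busy,
        }
    }

    // Blocks until the in-flight validation finishes and returns to Idle.
    // Returns Ok((state, None)) if nothing was in flight.
    pub fn finish(self) -> Result<(Self, Option<bool>), RecvError> {
        match self {
            idle @ ValidationState::Idle(_) => Ok((idle, None)),
            ValidationState::Busy(result_rx) => {
                let (ok, validator) = result_rx.recv()?;
                Ok((ValidationState::Idle(validator), Some(ok)))
            }
        }
    }
}
```

Taking `self` by value avoids a placeholder variant; an in-place `&mut self` version would need `std::mem::replace` with some temporary state.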


crates/apollo_propeller/src/message_processor.rs line 89 at r3 (raw file):

Previously, sirandreww-starkware (Andrew Luka) wrote…

added a comment, let's talk f2f about this

Add a TODO to investigate why rayon is better and if that is still true after improvements


crates/apollo_propeller/src/message_processor.rs line 89 at r3 (raw file):

Previously, ShahakShama wrote…

How will you know if the handler panicked?
And in general we have a convention not to leave detached tasks

Add a comment that it's ok to drop the handle because we know the status through the oneshot channel
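A small sketch of why the channel can substitute for the handle, assuming unwinding panics. If the job panics before sending, the sender is dropped during unwinding and the receiver gets an error instead of waiting forever. `run_detached` is a hypothetical helper, and `std::thread` plus `std::sync::mpsc` stand in for the thread pool and the oneshot channel.

```rust
use std::sync::mpsc::{self, RecvError};
use std::thread;

// Runs `job` on a detached thread and reports its verdict through a channel.
pub fn run_detached<F>(job: F) -> Result<bool, RecvError>
where
    F: FnOnce() -> bool + Send + 'static,
{
    let (result_tx, result_rx) = mpsc::channel();
    // The JoinHandle is dropped on purpose, so the thread is detached.
    drop(thread::spawn(move || {
        let verdict = job();
        let _ = result_tx.send(verdict);
    }));
    // If `job` panics, `result_tx` is dropped without sending, and `recv`
    // returns Err(RecvError) rather than blocking indefinitely.
    result_rx.recv()
}
```

Note that rayon's documentation describes a panic inside `rayon::spawn` as going to the pool's panic handler, with aborting the process as the default when none is registered, so whether the receiver-side error path is reachable depends on how the pool is configured.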


crates/apollo_propeller/src/message_processor.rs line 70 at r5 (raw file):

Previously, ShahakShama wrote…

What if pending validation is some? I don't know this syntax but it looks like you're taking an item from the stream and discarding it

Add a comment explaining what this does
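For context: in `tokio::select!`, a branch written as `pattern = future, if condition` is disabled when the condition is false, so its future is not polled and nothing is taken from the channel. The std-only model below illustrates that gating idea with a plain queue. `Processor`, `step`, and `complete` are hypothetical names, and this is not the PR's async code.

```rust
use std::collections::VecDeque;

pub struct Processor {
    pub queue: VecDeque<u64>, // stands in for unit_rx
    pub pending: Option<u64>, // stands in for pending_validation
}

impl Processor {
    // One loop iteration: receive only when nothing is in flight.
    // Returns true if a unit was taken from the queue.
    pub fn step(&mut self) -> bool {
        if self.pending.is_none() {
            if let Some(unit) = self.queue.pop_front() {
                self.pending = Some(unit);
                return true;
            }
        }
        // Gate closed (or queue empty): the queue is left untouched.
        false
    }

    // Marks the in-flight validation as finished, reopening the gate.
    pub fn complete(&mut self) -> Option<u64> {
        self.pending.take()
    }
}
```

While a validation is pending, queued units stay in the queue; they are delayed, not discarded.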

4 participants