apollo_propeller: add unit validation receiving and spawning#12004
Conversation
dd8bb3d to
28acffb
Compare
d07d1b1 to
2516198
Compare
2516198 to
db030c5
Compare
28acffb to
30f70f2
Compare
ShahakShama
left a comment
There was a problem hiding this comment.
@ShahakShama reviewed 3 files and all commit messages, and made 6 comments.
Reviewable status: all files reviewed, 7 unresolved discussions (waiting on @dan, @noamsp-starkware, and @sirandreww-starkware).
Cargo.toml line 323 at r3 (raw file):
rand_chacha = "0.3.1" rand_distr = "0.4.3" rayon = "1.11.0"
@dan-starkware do you approve?
crates/apollo_propeller/src/message_processor.rs line 78 at r3 (raw file):
loop { tokio::select! { _ = sleep_until(deadline) => {
Use tokio timeout instead of select and sleep
crates/apollo_propeller/src/message_processor.rs line 87 at r3 (raw file):
let (result_tx, result_rx) = oneshot::channel(); let mut validator_moved = validator.take().unwrap();
Why is this unwrap safe? (I think it's not)
crates/apollo_propeller/src/message_processor.rs line 89 at r3 (raw file):
let mut validator_moved = validator.take().unwrap(); rayon::spawn(move || {
Why do you use rayon and not tokio? Explain both to me and in a comment
crates/apollo_propeller/src/message_processor.rs line 89 at r3 (raw file):
let mut validator_moved = validator.take().unwrap(); rayon::spawn(move || {
Shouldn't you store the task handle and poll it
crates/apollo_propeller/src/message_processor.rs line 94 at r3 (raw file):
}); pending_validation = Some(result_rx);
What do you do with pending_validation?
guy-starkware
left a comment
There was a problem hiding this comment.
@guy-starkware reviewed all commit messages.
Reviewable status: all files reviewed, 6 unresolved discussions (waiting on dan, noamsp-starkware, and sirandreww-starkware).
30f70f2 to
400e0a1
Compare
db030c5 to
5805542
Compare
sirandreww-starkware
left a comment
There was a problem hiding this comment.
@sirandreww-starkware made 6 comments.
Reviewable status: 1 of 4 files reviewed, 6 unresolved discussions (waiting on dan, guy-starkware, noamsp-starkware, and ShahakShama).
Cargo.toml line 323 at r3 (raw file):
Previously, ShahakShama wrote…
@dan-starkware do you approve?
sent him a slack message, waiting for his approval
crates/apollo_propeller/src/message_processor.rs line 78 at r3 (raw file):
Previously, ShahakShama wrote…
Use tokio timeout instead of select and sleep
addressed this in a previous PR
crates/apollo_propeller/src/message_processor.rs line 87 at r3 (raw file):
Previously, ShahakShama wrote…
Why is this unwrap safe? (I think it's not)
added a comment as to why
crates/apollo_propeller/src/message_processor.rs line 89 at r3 (raw file):
Previously, ShahakShama wrote…
Why do you use rayon and not tokio? Explain both to me and in a comment
added a comment, let's talk f2f about this
crates/apollo_propeller/src/message_processor.rs line 89 at r3 (raw file):
Previously, ShahakShama wrote…
Shouldn't you store the task handle and poll it
Both options here tokio::spawn_blocking and rayon::spawn give the future to a runtime and it starts running, we use the oneshot to get a response and the task is short lived. So we do not use the handler at all.
crates/apollo_propeller/src/message_processor.rs line 94 at r3 (raw file):
Previously, ShahakShama wrote…
What do you do with pending_validation?
In the next PR I read from it in this same select and I handle a validated unit.
5805542 to
f15b95a
Compare
400e0a1 to
5bb14f7
Compare
f15b95a to
6a30d55
Compare
5bb14f7 to
1fe6bbd
Compare
1fe6bbd to
714e27d
Compare
6a30d55 to
0862594
Compare
ShahakShama
left a comment
There was a problem hiding this comment.
@ShahakShama reviewed 3 files and all commit messages, made 5 comments, and resolved 2 discussions.
Reviewable status: all files reviewed, 7 unresolved discussions (waiting on dan, noamsp-starkware, and sirandreww-starkware).
crates/apollo_propeller/src/message_processor.rs line 87 at r3 (raw file):
Previously, sirandreww-starkware (Andrew Luka) wrote…
added a comment as to why
I don't like this. Let's put a TODO to see if we can avoid this expect
crates/apollo_propeller/src/message_processor.rs line 89 at r3 (raw file):
Previously, sirandreww-starkware (Andrew Luka) wrote…
Both options here
tokio::spawn_blockingandrayon::spawngive the future to a runtime and it starts running, we use the oneshot to get a response and the task is short lived. So we do not use the handler at all.
How will you know if the handler panicked?
And in general we have a convention not to leave detached tasks
crates/apollo_propeller/src/message_processor.rs line 61 at r5 (raw file):
Arc::clone(&self.tree_manager), )); let mut pending_validation: Option<oneshot::Receiver<ValidationResult>> = None;
Consider storing all pending validations in a FuturesUnordered instead
crates/apollo_propeller/src/message_processor.rs line 70 at r5 (raw file):
} Some((sender, unit)) = self.unit_rx.recv(), if pending_validation.is_none() => {
What if pending validation is some? I don't know this syntax but it looks like you're taking an item from the stream and discarding it
crates/apollo_propeller/src/message_processor.rs line 93 at r5 (raw file):
}); pending_validation = Some(result_rx);
Why not just handle the validation here?
ShahakShama
left a comment
There was a problem hiding this comment.
@ShahakShama made 4 comments and resolved 2 discussions.
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on dan, noamsp-starkware, and sirandreww-starkware).
crates/apollo_propeller/src/message_processor.rs line 87 at r3 (raw file):
Previously, ShahakShama wrote…
I don't like this. Let's put a TODO to see if we can avoid this expect
Unite both options into one enum
crates/apollo_propeller/src/message_processor.rs line 89 at r3 (raw file):
Previously, sirandreww-starkware (Andrew Luka) wrote…
added a comment, let's talk f2f about this
Add a TODO to investigate why rayon is better and if that is still true after improvements
crates/apollo_propeller/src/message_processor.rs line 89 at r3 (raw file):
Previously, ShahakShama wrote…
How will you know if the handler panicked?
And in general we have a convention not to leave detached tasks
Add a comment that it's ok to drop the handle because we know the status through the oneshot channel
crates/apollo_propeller/src/message_processor.rs line 70 at r5 (raw file):
Previously, ShahakShama wrote…
What if pending validation is some? I don't know this syntax but it looks like you're taking an item from the stream and discarding it
Add a comment what this does

Note
Medium Risk
Introduces new concurrency and thread-pool usage in the message processing/validation path; incorrect handling of the pending validation result or validator handoff could lead to stalled processing or subtle race/logic bugs.
Overview
apollo_propellernow depends onrayonand uses it inMessageProcessor::runto offload CPU-bound shard validation (signature/merkle proof checks) from the Tokio runtime.MessageProcessorinstantiates a per-messageUnitValidator, receives(PeerId, PropellerUnit)fromunit_rx, and spawns validation work onto the Rayon thread pool, tracking the in-flight job via aoneshotchannel (pending_validation) to prevent concurrent validations.Written by Cursor Bugbot for commit 714e27d. This will update automatically on new commits. Configure here.