perf: Use await instead of block_on in native shuffle writer #2937
Which issue does this PR close?
N/A.
Rationale for this change
The shuffle writer was calling futures::executor::block_on() from within an async function running on Tokio's runtime. This blocks the Tokio worker thread and prevents other tasks from running.
Specific Problems
block_on uses thread::park() to block the current Tokio worker thread, preventing Tokio from scheduling other tasks on it
Sources
I am not a Tokio optimization expert, so I'll defer to some sources I read and maybe others can tell me if I've misunderstood:
From Rust Async Book:
From futures-rs source code, block_on internally uses thread::park() to block the current thread when waiting for the future to complete.
From Tokio documentation:
What changes are included in this PR?
Replaced block_on(repartitioner.insert_batch(batch?))? with repartitioner.insert_batch(batch?).await?
This maintains the same sequential behavior (waiting for each batch to complete before getting the next one) but does so cooperatively without blocking the thread.
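To illustrate the difference, here is a minimal std-only sketch, not the actual shuffle writer code. block_on_sketch is a simplified stand-in for futures::executor::block_on (it parks the thread while the future is pending, but omits everything else the real function does). Repartitioner, its rows field, and both write_* functions are hypothetical names made up for this example.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// Waker that unparks the thread that is driving the future.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Simplified stand-in for a blocking executor (an assumption for this
/// sketch): poll the future, and park the current thread while it is pending.
fn block_on_sketch<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(out) => return out,
            // The thread does nothing else until the waker unparks it.
            Poll::Pending => thread::park(),
        }
    }
}

/// Hypothetical repartitioner that only counts the rows it receives.
struct Repartitioner {
    rows: usize,
}

impl Repartitioner {
    async fn insert_batch(&mut self, batch: Vec<i32>) -> Result<(), String> {
        self.rows += batch.len();
        Ok(())
    }
}

/// Before: a blocking call inside an async fn. If insert_batch were ever
/// pending, the calling thread would be parked instead of yielding.
async fn write_with_block_on(
    batches: Vec<Result<Vec<i32>, String>>,
) -> Result<usize, String> {
    let mut repartitioner = Repartitioner { rows: 0 };
    for batch in batches {
        block_on_sketch(repartitioner.insert_batch(batch?))?;
    }
    Ok(repartitioner.rows)
}

/// After: same sequential order, but a pending insert_batch returns
/// control to whichever executor is polling this function.
async fn write_with_await(
    batches: Vec<Result<Vec<i32>, String>>,
) -> Result<usize, String> {
    let mut repartitioner = Repartitioner { rows: 0 };
    for batch in batches {
        repartitioner.insert_batch(batch?).await?;
    }
    Ok(repartitioner.rows)
}
```

Both functions return the same result for the same input. The only difference is what happens to the calling thread while insert_batch is pending. In this sketch insert_batch always completes on the first poll, so the blocking variant never actually parks.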
How are these changes tested?
Existing tests. I'll try to run a benchmark too.