Gene.bordegaray/2025/12/add broadcast exec #279
Open
gene-bordegaray wants to merge 39 commits into datafusion-contrib:main from gene-bordegaray:gene.bordegaray/2025/12/add_broadcast_exec
Commits
a6e01e7 Split channel resolver in two (gabotechs)
fc9bfc8 Simplify WorkerResolverExtension and ChannelResolverExtension (gabotechs)
9e15f2b Add default builder to ArrowFlightEndpoint (gabotechs)
34cf529 Add some docs (gabotechs)
312901d Listen to clippy (gabotechs)
f026e41 Split get_flight_client_for_url in two (gabotechs)
2508e48 Fix conflicts (gabotechs)
f7218b0 Remove unnecessary channel resolver (gabotechs)
b49289a Improve WorkerResolver docs (gabotechs)
793f898 Use one ChannelResolver per runtime (gabotechs)
eaad60f Improve error reporting on client connection failure (gabotechs)
ea4e09a Add a from_session_builder method for constructing an InMemoryChannel… (gabotechs)
33b0cc7 Add ChannelResolver and WorkerResolver default implementations for Arcs (gabotechs)
1aeb719 Make TPC-DS tests use DataFusion test dataset (gabotechs)
e377698 Remove non-working in-memory option from benchmarks (gabotechs)
7a0b296 Remove unnecessary utils folder (gabotechs)
41f90a1 Refactor benchmark folder (gabotechs)
c88058e Rename to prepare_tpch.rs (gabotechs)
b3bdd2b Adapt benchmarks for TPC-DS (gabotechs)
05a30cc Update benchmarks README.md (gabotechs)
0c736fd Fix conflicts (gabotechs)
f9f4439 Use default session state builder (gabotechs)
21e8581 Update benchmarks README.md (gabotechs)
c306c6d add broadcast join (gene-bordegaray)
12512af don't distribute 1 consumer tasks (gene-bordegaray)
8927012 fix analyze tests (gene-bordegaray)
d13a28c dont strictibute a single consumer, use coalesce (gene-bordegaray)
e0e5f50 intriduce broadcast operator that does caching (gene-bordegaray)
ec607b5 Merge branch 'main' into gene.bordegaray/2025/12/add_broadcast_exec (gene-bordegaray)
47d4ab9 refactored distributed planner to contain less broadcast logic and ad… (gene-bordegaray)
eae78c5 fix docs (gene-bordegaray)
7152752 add comment for follow up streaming work (gene-bordegaray)
83bfed2 add comment explaining cache solution for 1->1 task stage collapses (gene-bordegaray)
5d02692 refactor network broadcast to be cleaner (gene-bordegaray)
9669518 add new pass to the annotation (gene-bordegaray)
4a3af9a put broadcast joins behind feature flag (gene-bordegaray)
9bcd287 Merge branch 'main' into gene.bordegaray/2025/12/add_broadcast_exec (gene-bordegaray)
b3cd8e0 ony distribute joins when broadcast enabled (gene-bordegaray)
92b81a6 add benchmark config and update docs (gene-bordegaray)
@@ -38,6 +38,10 @@ struct Cmd {
     /// The bucket name.
     #[structopt(long, default_value = "datafusion-distributed-benchmarks")]
     bucket: String,
+
+    // Turns broadcast joins on.
+    #[structopt(long)]
+    broadcast_joins: bool,
 }
 
 #[tokio::main]
@@ -67,12 +71,15 @@ async fn main() -> Result<(), Box<dyn Error>> {
     let runtime_env = Arc::new(RuntimeEnv::default());
     runtime_env.register_object_store(&s3_url, s3);
 
-    let state = SessionStateBuilder::new()
+    let mut state = SessionStateBuilder::new()
         .with_default_features()
         .with_runtime_env(Arc::clone(&runtime_env))
         .with_distributed_worker_resolver(Ec2WorkerResolver::new())
         .with_physical_optimizer_rule(Arc::new(DistributedPhysicalOptimizerRule))
         .build();
+    if cmd.broadcast_joins {
+        state = state.with_distributed_broadcast_joins_enabled(true)?;
+    }
Comment on lines 78 to +82
Collaborator
Nit: even shorter:
Suggested change
     let ctx = SessionContext::from(state);
 
     let worker = Worker::default().with_runtime_env(runtime_env);
😄 it's funny how you managed to choose three different arguments for the broadcast joins flags in three different places (broadcast_joins, enable_broadcast_joins, broadcast_joins_enabled).
It's a pretty good idea to add this option, I'd probably just choose one flag name in all places, probably broadcast_joins since it's shorter.
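To illustrate the naming point, here is a minimal std-only sketch of what using a single name (broadcast_joins) end to end could look like. Everything in it is a hypothetical stand-in: Cmd::parse, DistributedConfig, with_broadcast_joins and config_from_cmd are not taken from this PR, structopt, or DataFusion, and this is not the reviewer's elided suggestion.

```rust
// Hypothetical stand-ins only: none of these types or methods come from the
// PR, structopt, or DataFusion. They exist to show one consistent flag name.

/// Benchmark CLI options (hypothetical). The field uses the name
/// `broadcast_joins`, the spelling the review comment prefers.
#[derive(Debug, Default, PartialEq)]
pub struct Cmd {
    pub broadcast_joins: bool,
}

impl Cmd {
    /// Parses `--broadcast-joins` from an argument list.
    /// Any other argument is rejected with an error message.
    pub fn parse<I, S>(args: I) -> Result<Self, String>
    where
        I: IntoIterator<Item = S>,
        S: AsRef<str>,
    {
        let mut cmd = Cmd::default();
        for arg in args {
            match arg.as_ref() {
                "--broadcast-joins" => cmd.broadcast_joins = true,
                other => return Err(format!("unknown argument: {other}")),
            }
        }
        Ok(cmd)
    }
}

/// Session-level configuration (hypothetical), using the same name again.
#[derive(Debug, Default, PartialEq)]
pub struct DistributedConfig {
    pub broadcast_joins: bool,
}

impl DistributedConfig {
    /// Builder-style setter named after the same flag.
    pub fn with_broadcast_joins(mut self, broadcast_joins: bool) -> Self {
        self.broadcast_joins = broadcast_joins;
        self
    }
}

/// Wires the CLI option into the configuration by passing the bool through.
pub fn config_from_cmd(cmd: &Cmd) -> DistributedConfig {
    DistributedConfig::default().with_broadcast_joins(cmd.broadcast_joins)
}
```

With one name in the CLI struct, the config struct, and the setter, a search for broadcast_joins finds every place the option is read or written.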