-
Notifications
You must be signed in to change notification settings - Fork 14
Refactor partition groups and add tests #57
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
bf7b224 to
506a7fe
Compare
src/planning.rs
Outdated
| stage_id: stage.stage_id, | ||
| plan_bytes, | ||
| partition_group: partition_group.to_vec(), | ||
| partition_group: (partition_group.start() as u64..partition_group.end() as u64).collect(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please let me know if I can change the DDTask proto to use a range as well.
src/planning.rs
Outdated
| let new_child = Arc::new(PartitionIsolatorExec::new( | ||
| child.clone(), | ||
| partitions_per_worker.unwrap(), // we know it is a Some, here. | ||
| partition_count, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this was a bug before. The isolator seems to want the total number of partitions here.
25fe243 to
0fa3702
Compare
0fa3702 to
0ec389d
Compare
NGA-TRAN
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very impressive first PR, @jayshrivastava
I have a few minor comments to suggest clearer names and more unit tests
b208ab2 to
05d3d2a
Compare
This change adds a new abstraction called a `Grouper` in `distribution_strategy.rs`. This is responsible for grouping partitions into partition groups. The new `distribution_strategy` module can be used to store other splitters and, in the future, distribution strategies like round robin etc. A "partition group" is now a range [starting_partition, ending_partition) instead of a Vec, so we improve space efficiency. In the proto layer, the `DDTask` proto still uses a Vec, so we simply expand the range into a Vec when creating them in `planning.rs::assign_to_workers`. Testing - unit tests in isolator.rs - unit tests for `build_replacement` in `planning.rs` - unit tests for the grouper in `distribution_strategy.rs`
05d3d2a to
6c86f29
Compare
This change adds a new abstraction called a
Grouperindistribution_strategy.rs. This is responsible for grouping partitions into partition groups. The newdistribution_strategymodule can be used to store other splitters and, in the future, distribution strategies like round robin etc.A "partition group" is now a range [starting_partition, ending_partition) instead of a Vec, so we improve space efficiency. In the proto layer, the
DDTaskproto still uses a Vec, so we simply expand the range into a Vec when creating them inplanning.rs::assign_to_workers.This also fixes a bug in the isolator where it may not have reported the right number of partitions.
Testing
build_replacementinplanning.rsdistribution_strategy.rs