Skip to content

Conversation

@gabotechs
Copy link
Collaborator

This PR ships some misc improvements to the public API:

  • Hide some methods and structs that were not supposed to be public
  • Expose some methods, structs and traits that are supposed to be public
  • Reorganize the constructor methods in NetworkShuffleExec, NetworkCoalesceExec and PartitionIsolatorExec so that they are suitable to be public
  • Improve naming, signature and docs for NetworkBoundary methods so that they are suitable to be used publicly

Comment on lines -126 to +132
let result = plan.transform_up(|plan| {
// If this node is a DataSourceExec, we need to wrap it with PartitionIsolatorExec so
// that not all tasks have access to all partitions of the underlying DataSource.
if plan.as_any().is::<DataSourceExec>() {
let node = PartitionIsolatorExec::new_pending(plan);
let result =
plan.transform_up(|plan| {
// If this node is a DataSourceExec, we need to wrap it with PartitionIsolatorExec so
// that not all tasks have access to all partitions of the underlying DataSource.
if plan.as_any().is::<DataSourceExec>() {
let node = PartitionIsolatorExec::new(plan);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very annoying that cargo fmt decided that its time to chance the formatting of the whole thing, when the changes are actually minimal

Comment on lines 86 to 90
pub struct ExecutionTask {
/// The url of the worker that will execute this task. A None value is interpreted as
/// unassigned.
pub url: Option<Url>,
pub(crate) url: Option<Url>,
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔 I wonder if people actually care about accessing the URL here

Comment on lines -69 to +70
┌───── Stage 2 ── Tasks: t0:[p0,p1,p2,p3,p4,p5,p6,p7,p8,p9] t1:[p10,p11,p12,p13,p14,p15,p16,p17,p18,p19] t2:[p20,p21,p22,p23,p24,p25,p26,p27,p28,p29] t3:[p30,p31,p32,p33,p34,p35,p36,p37,p38,p39] t4:[p40,p41,p42,p43,p44,p45,p46,p47,p48,p49] t5:[p50,p51,p52,p53,p54,p55,p56,p57,p58,p59] t6:[p60,p61,p62,p63,p64,p65,p66,p67,p68,p69] t7:[p70,p71,p72,p73,p74,p75,p76,p77,p78,p79] t8:[p80,p81,p82,p83,p84,p85,p86,p87,p88,p89] t9:[p90,p91,p92,p93,p94,p95,p96,p97,p98,p99]
│ RepartitionExec: partitioning=RoundRobinBatch(10), input_partitions=1
┌───── Stage 2 ── Tasks: t0:[p0,p1,p2,p3,p4,p5,p6,p7,p8,p9] t1:[p0,p1,p2,p3,p4,p5,p6,p7,p8,p9] t2:[p0,p1,p2,p3,p4,p5,p6,p7,p8,p9] t3:[p0,p1,p2,p3,p4,p5,p6,p7,p8,p9] t4:[p0,p1,p2,p3,p4,p5,p6,p7,p8,p9] t5:[p0,p1,p2,p3,p4,p5,p6,p7,p8,p9] t6:[p0,p1,p2,p3,p4,p5,p6,p7,p8,p9] t7:[p0,p1,p2,p3,p4,p5,p6,p7,p8,p9] t8:[p0,p1,p2,p3,p4,p5,p6,p7,p8,p9] t9:[p0,p1,p2,p3,p4,p5,p6,p7,p8,p9]
│ RepartitionExec: partitioning=Hash([], 10), input_partitions=1
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change actually seems right, now that we require a hash partitioned node as the input of a NetworkShuffleExec this happens.

@gabotechs gabotechs merged commit f4b94f0 into main Oct 14, 2025
4 checks passed
@gabotechs gabotechs deleted the gabrielmusat/improve-public-api branch October 14, 2025 14:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants