You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Although planner provides O(log n) search complexity for the resource state at a given time and O(log n) search complexity for "the earliest schedulable point where your resource requirement can be satisfied," planner is complex, predisposed to memory mismanagement, and hard to maintain. It provides a large C API that maintains state for the C++ objects it owns, and is unlikely to be used outside Fluxion.
Early testing in 2024 on El Capitan indicated that planner can become a bottleneck on large clusters with a large number of jobs. Unfortunately I don't know where the test data went. A possible cause of the unexpectedly bad scaling is that while the "earliest time" tree indeed performs an O(log n) search, planner must check that the interval starting from the earliest discovered time satisfies the resource request count. That search is linear and involves removing and reinserting nodes in the "earliest time" tree (mt_tree_*):
if (static_cast<int64_t> (at + duration) > ctx->plan->get_plan_end ())
at = -1;
break;
}
}
The overall complexity is bounded below by O(n log n).
Boost.MultiIndex
With the goal of reproducing planner with a simplified code structure, I developed a prototype "planner2" based on Boost.MultiIndex: https://github.com/milroy/flux-sched/tree/planner-v2. After implementing a large number of planner unit tests, I realized that Boost.MultiIndex doesn't support two-dimensional search on composite keys (e.g., a time, resource count pair). The query is multi-dimensional, so there is no way to sort by time and guarantee a total order on resource count (or vice-versa). Put more simply, the current planner provides an augmented red-black tree search, while Boost.MultiIndex does not.
k-d tree
It possible to perform a single, O(log n) search that returns the earliest time at which a given number of resources is available for a desired interval of time. A k-d tree (specifically, 3-D here) provides this capability, while maintaining O(log n) insertion and removal: https://en.wikipedia.org/wiki/K-d_tree. It should be possible to implement the required functionality of planner with a single k-d tree. nanoflann provides a C++ header-only library with a k-d tree: https://github.com/jlblancoc/nanoflann
Boost.icl
Along the course of implementing planner features in "planner2" I discovered that handling searches on open and closed intervals results in iterator edge cases that require complicated gymnastics and conditionals. It would be nice to use an interval library like Boost.icl: https://www.boost.org/doc/libs/latest/libs/icl/doc/html/index.html. Boost.icl could handle interval computations and aggregation, storing a value per interval with the number of free resources, and the max length of time (extending beyond the interval end) for which at least that quantity is available. Then the value can be used together with the interval start time as a 3-D key in the k-d tree.
Handling max intervals to enable a single O(log n) search
Insertion a (time, resource count, max available interval) into Boost.icl would require computing the max length of time for which at least the current request quantity is available. To do so, the insertion must check the next interval to determine if the free count is greater than or equal to the current free count. If so, that max interval is added to the interval length to be inserted. Removals would be more complicated, because multiple earlier intervals could need their max available interval values updated. It would be possible to avoid a linear search for these intervals by embedding start times of max available intervals in each scheduled point time. The additional memory overhead and complexity may be prohibitive.
Proposed work
I propose the following investigation:
Profile the current planner to identify scalability limitations, identify bottlenecks, and understand memory usage.
Create an experimental implementation of a k-d tree-based planner based on nanoflann to determine if all planner functionality can be provided by a single data structure.
Create an experimental scheduled point interval implementation based on Boost.icl and determine if it simplifies the interval computations.
After those items are complete, we can assess whether further work and performance testing is warranted. Note that it may be worthwhile to restructure the current planner if some combination of new data structures can result in a more maintainable planner even without performance or memory-usage improvements.
reacted with thumbs up emoji reacted with thumbs down emoji reacted with laugh emoji reacted with hooray emoji reacted with confused emoji reacted with heart emoji reacted with rocket emoji reacted with eyes emoji
Uh oh!
There was an error while loading. Please reload this page.
-
Motivation
Although planner provides O(log n) search complexity for the resource state at a given time and O(log n) search complexity for "the earliest schedulable point where your resource requirement can be satisfied," planner is complex, predisposed to memory mismanagement, and hard to maintain. It provides a large C API that maintains state for the C++ objects it owns, and is unlikely to be used outside Fluxion.
Early testing in 2024 on El Capitan indicated that planner can become a bottleneck on large clusters with a large number of jobs. Unfortunately I don't know where the test data went. A possible cause of the unexpectedly bad scaling is that while the "earliest time" tree indeed performs an O(log n) search, planner must check that the interval starting from the earliest discovered time satisfies the resource request count. That search is linear and involves removing and reinserting nodes in the "earliest time" tree (
mt_tree_*):flux-sched/resource/planner/c/planner_c_interface.cpp
Lines 165 to 179 in 8e1718d
The overall complexity is bounded below by O(n log n).
Boost.MultiIndexWith the goal of reproducing planner with a simplified code structure, I developed a prototype "planner2" based on
Boost.MultiIndex: https://github.com/milroy/flux-sched/tree/planner-v2. After implementing a large number of planner unit tests, I realized thatBoost.MultiIndexdoesn't support two-dimensional search on composite keys (e.g., a time, resource count pair). The query is multi-dimensional, so there is no way to sort by time and guarantee a total order on resource count (or vice-versa). Put more simply, the current planner provides an augmented red-black tree search, whileBoost.MultiIndexdoes not.k-d tree
It possible to perform a single, O(log n) search that returns the earliest time at which a given number of resources is available for a desired interval of time. A k-d tree (specifically, 3-D here) provides this capability, while maintaining O(log n) insertion and removal: https://en.wikipedia.org/wiki/K-d_tree. It should be possible to implement the required functionality of planner with a single k-d tree.
nanoflannprovides a C++ header-only library with a k-d tree: https://github.com/jlblancoc/nanoflannBoost.iclAlong the course of implementing planner features in "planner2" I discovered that handling searches on open and closed intervals results in iterator edge cases that require complicated gymnastics and conditionals. It would be nice to use an interval library like
Boost.icl: https://www.boost.org/doc/libs/latest/libs/icl/doc/html/index.html.Boost.iclcould handle interval computations and aggregation, storing a value per interval with the number of free resources, and the max length of time (extending beyond the interval end) for which at least that quantity is available. Then the value can be used together with the interval start time as a 3-D key in the k-d tree.Handling max intervals to enable a single O(log n) search
Insertion a (time, resource count, max available interval) into
Boost.iclwould require computing the max length of time for which at least the current request quantity is available. To do so, the insertion must check the next interval to determine if the free count is greater than or equal to the current free count. If so, that max interval is added to the interval length to be inserted. Removals would be more complicated, because multiple earlier intervals could need their max available interval values updated. It would be possible to avoid a linear search for these intervals by embedding start times of max available intervals in each scheduled point time. The additional memory overhead and complexity may be prohibitive.Proposed work
I propose the following investigation:
nanoflannto determine if all planner functionality can be provided by a single data structure.Boost.icland determine if it simplifies the interval computations.After those items are complete, we can assess whether further work and performance testing is warranted. Note that it may be worthwhile to restructure the current planner if some combination of new data structures can result in a more maintainable planner even without performance or memory-usage improvements.
Let me know what you think!
Beta Was this translation helpful? Give feedback.
All reactions