-
Notifications
You must be signed in to change notification settings - Fork 117
Set algorithms, revert to set_a_write algorithm for small sizes #2317
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR reworks the set algorithms in the SYCL backend to dynamically select between different implementation strategies based on input sizes. Key changes include removing the boolean template parameter from the set tag types, updating call sites and helper functions to use non-templated tags, and introducing a threshold switch in __set_op_impl to choose between the write_a_scan and balanced_path algorithms.
Reviewed Changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated no comments.
Show a summary per file
File | Description |
---|---|
test/general/implementation_details/balanced_path_unit_tests.pass.cpp | Removed instantiations with std::true_type from set tag calls in tests |
include/oneapi/dpl/pstl/hetero/dpcpp/unseq_backend_sycl.h | Dropped the _IsOneShot template parameter from set tag types and updated __brick_set_op accordingly |
include/oneapi/dpl/pstl/hetero/dpcpp/parallel_backend_sycl_reduce_then_scan.h | Updated comments and template parameters (renaming to _SetTag) to align with new tag type definitions |
include/oneapi/dpl/pstl/hetero/dpcpp/parallel_backend_sycl_merge.h | Adjusted __parallel_merge overloads to remove extraction of _ExecutionPolicy and use a custom name parameter |
include/oneapi/dpl/pstl/hetero/dpcpp/parallel_backend_sycl_for.h | Updated parallel_for overloads to use an explicit queue parameter and _CustomName for kernel naming |
include/oneapi/dpl/pstl/hetero/dpcpp/parallel_backend_sycl.h | Refactored the __set_op_impl and related functions to support algorithm switching based on input size |
include/oneapi/dpl/pstl/hetero/algorithm_impl_hetero.h | Modified set algorithm patterns to remove the boolean tag instantiation and use the new tag types |
Comments suppressed due to low confidence (3)
include/oneapi/dpl/pstl/hetero/dpcpp/unseq_backend_sycl.h:1230
- Since the _IsOneShot template parameter and the associated __can_write_from_rng2_v flag have been removed from the set tag types, please add a brief comment explaining that the write capability is now determined elsewhere for clarity and maintainability.
struct _IntersectionTag
include/oneapi/dpl/pstl/hetero/dpcpp/parallel_backend_sycl_merge.h:532
- With the introduction of the _CustomName parameter and the new overloads for __parallel_merge, please add inline documentation to clarify when each overload should be used and how _CustomName is derived from the execution policy.
template <typename _ExecutionPolicy, typename _Range1, typename _Range2, typename _Range3, typename _Compare,
include/oneapi/dpl/pstl/hetero/dpcpp/parallel_backend_sycl.h:1332
- The magic number used as a threshold for switching algorithms (1024 * 1024) could benefit from an inline comment that explains its rationale and expected impact on performance, to improve code maintainability.
static constexpr std::size_t __threshold = 1024 * 1024;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only minor comments from me.
0271355
to
25654d0
Compare
b43a8d1
to
8b9fdb7
Compare
22bd0b2
to
39eb107
Compare
I fixed the rebase with main, Now the diff reasonable is reasonable. It was previously clashing with a partial implementation of the partitioning PR which made the diff messy and incorrect. I have found a same-mangled name issue with set_union and set_symmetric_difference I am investigating. |
@danhoeflinger I can help you with this topic. |
I've resolved the same mangled name issue. |
8393391
to
f52a5b6
Compare
include/oneapi/dpl/pstl/hetero/dpcpp/parallel_backend_sycl_reduce_then_scan.h
Outdated
Show resolved
Hide resolved
include/oneapi/dpl/pstl/hetero/dpcpp/parallel_backend_sycl_reduce_then_scan.h
Show resolved
Hide resolved
Signed-off-by: Dan Hoeflinger <[email protected]>
Signed-off-by: Dan Hoeflinger <[email protected]>
Signed-off-by: Dan Hoeflinger <[email protected]>
Signed-off-by: Dan Hoeflinger <[email protected]>
Signed-off-by: Dan Hoeflinger <[email protected]>
Signed-off-by: Dan Hoeflinger <[email protected]>
Signed-off-by: Dan Hoeflinger <[email protected]>
Signed-off-by: Dan Hoeflinger <[email protected]>
Signed-off-by: Dan Hoeflinger <[email protected]>
Signed-off-by: Dan Hoeflinger <[email protected]>
Signed-off-by: Dan Hoeflinger <[email protected]>
Signed-off-by: Dan Hoeflinger <[email protected]>
Signed-off-by: Dan Hoeflinger <[email protected]>
Signed-off-by: Dan Hoeflinger <[email protected]>
Signed-off-by: Dan Hoeflinger <[email protected]>
Signed-off-by: Dan Hoeflinger <[email protected]>
Signed-off-by: Dan Hoeflinger <[email protected]>
Signed-off-by: Dan Hoeflinger <[email protected]>
Signed-off-by: Dan Hoeflinger <[email protected]>
Signed-off-by: Dan Hoeflinger <[email protected]>
Signed-off-by: Dan Hoeflinger <[email protected]>
d901a22
to
653e24a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have done another pass over this PR, and it LGTM assuming green CI.
Signed-off-by: Dan Hoeflinger <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reapproving after bugfix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Relocate the algorithm selection to sycl backend, and use an empirical threshold to decide between reduce_then_scan algorithms set_a_write and partitioned balanced path. --------- Signed-off-by: Dan Hoeflinger <[email protected]>
This PR makes some infrastructure changes to allow a runtime decision based upon number of elements to select the set algorithm selected, in addition to limitations based upon platform and environment.
__parallel_set_op
blocking. (temporarily required for type alignment, see [Draft] RFC future type and keepalives #2300)__parallel_merge_impl
) and "copy" (__parallel_copy_impl
).__set_op_impl
using__consider_write_a_alg
This results in an overall performance improvement, as small numbers of input elements do not perform as well with the balanced path algorithm.
edit: this incorporates a fix from #2386 for the reduce_then_scan temporary buffer for set_a_write