-
Notifications
You must be signed in to change notification settings - Fork 116
Refactor __get_sycl_range to align with sycl #2519
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR refactors __get_sycl_range to align with SYCL runtime semantics for the write access mode. The primary change introduces a _NoInit template parameter to control copy-in behavior, making write mode perform copy-in by default (SYCL-compliant) unless explicitly suppressed.
Key Changes:
- Added
_NoInittemplate parameter to__get_sycl_rangeto control copy-in behavior forwriteaccess mode - Updated
transform_ifpatterns to use properwriteaccess mode instead ofread_writeworkaround - Fixed histogram pattern to use
read_write + no_initinstead ofwriteworkaround
Reviewed Changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
utils_ranges_sycl.h |
Core implementation: added _NoInit parameter, removed unused _Iterator parameter, updated __is_copy_direct_v logic |
algorithm_impl_hetero.h |
Updated all callsites to remove _Iterator parameter; added /*_NoInit=*/true to preserve existing behavior for write mode; fixed transform_if patterns |
numeric_impl_hetero.h |
Updated callsites to remove _Iterator parameter and add /*_NoInit=*/true for write mode |
histogram_impl_hetero.h |
Fixed histogram to use read_write + no_init instead of write workaround; removed _Iterator parameter from callsites |
parallel_backend_sycl.h |
Updated set operation temporary buffers with /*_NoInit=*/true; removed _Iterator parameter |
binary_search_impl.h |
Removed unused _Iterator template parameter from all __get_sycl_range calls |
async_impl_hetero.h |
Updated async operations with /*_NoInit=*/true for write mode |
glue_async_impl.h |
Removed _Iterator parameter from sort_async |
single_pass_scan.h |
Updated scan kernel template with /*_NoInit=*/true |
esimd_radix_sort_dispatchers.h |
Removed _Iterator parameter from radix sort dispatcher |
esimd_radix_sort.h |
Removed _Iterator parameter from all radix sort variants |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
* separated no_init from write * remove unnecessary type specification Signed-off-by: Dan Hoeflinger <[email protected]>
Signed-off-by: Dan Hoeflinger <[email protected]>
Signed-off-by: Dan Hoeflinger <[email protected]>
Signed-off-by: Dan Hoeflinger <[email protected]>
Co-authored-by: Copilot <[email protected]>
3625a7e to
8f0adfc
Compare
Signed-off-by: Dan Hoeflinger <[email protected]>
Signed-off-by: Dan Hoeflinger <[email protected]>
Signed-off-by: Dan Hoeflinger <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 11 out of 11 changed files in this pull request and generated no new comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Signed-off-by: Dan Hoeflinger <[email protected]>
Signed-off-by: Dan Hoeflinger <[email protected]>
Signed-off-by: Dan Hoeflinger <[email protected]>
| //------------------------------------------------------------------------ | ||
|
|
||
| template <typename _BackendTag, typename _ExecutionPolicy, typename _ForwardIterator, typename _Function> | ||
| template <__par_backend_hetero::access_mode __acc_mode = __par_backend_hetero::access_mode::read_write, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking about the template defaults, they make sense for the common case across __pattern_walk 1,2,3. But is having different default template arguments for __pattern_walk1 from the others confusing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it does makes sense to have these defaults (read_write) for __pattern_walk1, partially because this has been the only mode prior to this PR for __pattern_walk1. However, I wouldn't object to simply requiring the access mode always (no default).
We could have the default be write & no_init, like the others, but I would worry a little about that because it changes the semantics of walk1 if no arguments are given.
While looking at this, I realize that uninitialized_fill and uninitialized_value_construct, uninitialized_default_construct also dont need read_write, but currently use it because they go through a wrapper of pattern_walk1. I should probably propagate these template options up to all the wrappers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(this is done)
|
Probably we forgot to change somehow the template <typename _BackendTag, typename _ExecutionPolicy, typename _ForwardIterator, typename _T>
auto
__pattern_fill_async(__hetero_tag<_BackendTag> __tag, _ExecutionPolicy&& __exec, _ForwardIterator __first,
_ForwardIterator __last, const _T& __value)
{
return __pattern_walk1_async(
__tag, ::std::forward<_ExecutionPolicy>(__exec),
__par_backend_hetero::make_iter_mode<__par_backend_hetero::access_mode::write>(__first),
__par_backend_hetero::make_iter_mode<__par_backend_hetero::access_mode::write>(__last),
fill_functor<_T>{__value});
} |
|
One more consideration. template <typename _BackendTag, typename _ExecutionPolicy, typename _ForwardIterator, typename _T>
_ForwardIterator
__pattern_fill(__hetero_tag<_BackendTag> __tag, _ExecutionPolicy&& __exec, _ForwardIterator __first,
_ForwardIterator __last, const _T& __value)
{
__pattern_walk1<__par_backend_hetero::access_mode::write, /*_NoInit=*/true>(
__tag, ::std::forward<_ExecutionPolicy>(__exec),
__par_backend_hetero::make_iter_mode<__par_backend_hetero::access_mode::write>(__first),
__par_backend_hetero::make_iter_mode<__par_backend_hetero::access_mode::write>(__last),
fill_functor<_T>{__value});
return __last;
}So we have two iterators initialized in |
(check mangled output past range) Signed-off-by: Dan Hoeflinger <[email protected]>
Signed-off-by: Dan Hoeflinger <[email protected]>
Signed-off-by: Dan Hoeflinger <[email protected]>
Signed-off-by: Dan Hoeflinger <[email protected]>
Signed-off-by: Dan Hoeflinger <[email protected]>
This reverts commit 283d1fc.
Signed-off-by: Dan Hoeflinger <[email protected]>
Signed-off-by: Dan Hoeflinger <[email protected]>
|
@akukanov @SergeyKopienko @mmichel11 |
Signed-off-by: Dan Hoeflinger <[email protected]>
Signed-off-by: Dan Hoeflinger <[email protected]>
| }; | ||
|
|
||
| template <sycl::access::mode AccMode, typename _Iterator = void> //TODO: _Iterator is not used and should be removed | ||
| template <sycl::access::mode AccMode, bool _NoInit = false> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| template <sycl::access::mode AccMode, bool _NoInit = false> | |
| template <sycl::access::mode AccMode = __par_backend_hetero::access_mode::read, bool _NoInit = false> |
In this case we able to replace
auto __keep1 = oneapi::dpl::__ranges::__get_sycl_range<__par_backend_hetero::access_mode::read>();to
auto __keep1 = oneapi::dpl::__ranges::__get_sycl_range(); // or __get_sycl_range{} to highlight that this is instance of structureanywhere.
This is inly my suggestion, I don't insist.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My preference would be to keep access modes explicit for clarity in the individual patterns, but if others agree, I could possibly be convinced.
SergeyKopienko
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Signed-off-by: Dan Hoeflinger <[email protected]>
Signed-off-by: Dan Hoeflinger <[email protected]>
Signed-off-by: Dan Hoeflinger <[email protected]>
Signed-off-by: Dan Hoeflinger <[email protected]>
Signed-off-by: Dan Hoeflinger <[email protected]>
Signed-off-by: Dan Hoeflinger <[email protected]>
Signed-off-by: Dan Hoeflinger <[email protected]>
Signed-off-by: Dan Hoeflinger <[email protected]>
Signed-off-by: Dan Hoeflinger <[email protected]>
Signed-off-by: Dan Hoeflinger <[email protected]>
Align
__get_sycl_rangewith SYCL runtime behavior forwriteaccess modeFixes #1272
Summary
This PR aligns
__get_sycl_rangewith SYCL semantics by addingno_initproperty support and makingwritemode perform copy-in by default (consistent with SYCL standard). It also optimizes write-only algorithms and fixes access mode workarounds.Except for a single positive functional change discussed below, this PR should be purely refactoring, and not change the behavior of any copies of data in our out of kernels. Subsequent PRs will cover the functional changes which were originally in this PR.
Key Changes
Core Implementation
bool _NoInit = falsetemplate parameter to__get_sycl_range__is_copy_direct_vto makewritemode copy-in by default unlessno_initis specified_Iteratortemplate parameterwritemode callsites to useno_init=true, preserving current behaviorPattern API Enhancements
Added
_NoInittemplate parameters to__pattern_walk1/2/3(and access mode for__pattern_walk1), enabling fine-grained copy-in control over access modes for output sequences. Removed (unsupported with vector) access modes for input sequences, they must be read without NoInit.One small functional change
unique: copy back changed fromread_writefor both input to defaultedread. This results in a removal of an unnecessary copy of the input buffer back to the host after the kernel. This functional change remains, because it depends upon the__pattern_walkinfrastructure, and we would like to make the inputs to these patterns requireread without no_init.