Add iota kernel #1946
Reuse that function call in sorting code-base where argsort is used.
Deleted rendered PR docs from intelpython.github.com/dpctl, latest should be updated shortly. 🤞
Array API standard conformance tests for dpctl=0.19.0dev0=py310h93fe807_336 ran successfully.
Array API standard conformance tests for dpctl=0.19.0dev0=py310h93fe807_337 ran successfully.
Array API standard conformance tests for dpctl=0.19.0dev0=py310h93fe807_338 ran successfully.
29d7198 to f1b2045
f1b2045 to 51ead2b
Array API standard conformance tests for dpctl=0.19.0dev0=py310h93fe807_339 ran successfully.
Array API standard conformance tests for dpctl=0.19.0dev0=py310h93fe807_340 ran successfully.
Until it is passed over to the host function, and
unique_ptr's ownership is released.
Also reduced allocation sizes, where too much was being
allocated.
Introduce smart_malloc_device, etc.
The smart_malloc_device<T>(count, q) function makes a USM allocation
and returns a unique_ptr<T, USMDeleter> which owns the
allocation. The function throws an exception (std::runtime_error)
if the USM allocation is not successful.
Introduce async_smart_free.
This function is intended to replace the use of host_task submissions
to manage deallocation of USM temporaries.
The usage is as follows:
```
// returns unique_ptr that owns the USM allocation
auto alloc_owner = smart_malloc_device<T>(count, q);
// get raw pointer for use in kernels
T *data = alloc_owner.get();
[..SNIP..]
// submit a host_task that frees the allocation; once the
// host task has been submitted successfully, the unique_ptr
// is released and ownership of the USM allocation is
// transferred to that host task
sycl::event ht_ev =
    async_smart_free(q,
                     dependent_events,
                     alloc_owner);
[...SNIP...]
```
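The ownership hand-off described above can be illustrated without a SYCL runtime. The following is only a host-side analogy, not the dpctl implementation: `HostDeleter`, `smart_malloc_host`, `DeferredTasks` and `async_smart_free_host` are hypothetical names, `std::malloc`/`std::free` stand in for USM allocation, and a vector of callbacks stands in for host_task submission. The point it shows is that the unique_ptr keeps ownership until the deferred task has been queued, so an exception before that point cannot leak the allocation.

```cpp
#include <cstddef>
#include <cstdlib>
#include <functional>
#include <memory>
#include <stdexcept>
#include <vector>

// Hypothetical host-only stand-in for USMDeleter: frees with std::free.
struct HostDeleter {
    void operator()(void *p) const { std::free(p); }
};

// Hypothetical analog of smart_malloc_device<T>(count, q):
// allocate storage for `count` elements or throw std::runtime_error.
template <typename T>
std::unique_ptr<T, HostDeleter> smart_malloc_host(std::size_t count) {
    // request at least one byte so that a null result always means failure
    void *p = std::malloc(count ? count * sizeof(T) : 1);
    if (p == nullptr) {
        throw std::runtime_error("allocation failed");
    }
    return std::unique_ptr<T, HostDeleter>(static_cast<T *>(p));
}

// Stand-in for a queue of submitted host tasks.
using DeferredTasks = std::vector<std::function<void()>>;

// Hypothetical analog of async_smart_free: queue a task that frees the
// allocation, and release the unique_ptr only after queuing succeeded.
template <typename T>
void async_smart_free_host(DeferredTasks &tasks,
                           std::unique_ptr<T, HostDeleter> &owner) {
    T *raw = owner.get();
    // if push_back throws, `owner` still owns the allocation: no leak
    tasks.push_back([raw]() { HostDeleter{}(raw); });
    // the queued task now owns the allocation
    owner.release();
}
```

After `async_smart_free_host(tasks, owner)` returns, `owner` is empty and the memory stays valid until the queued task runs, which mirrors the intent stated in the commit message.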
bbb55f1 to da3fbcc
Array API standard conformance tests for dpctl=0.19.0dev0=py310h93fe807_341 ran successfully.
Array API standard conformance tests for dpctl=0.19.0dev0=py310h93fe807_338 ran successfully.
Replaced three duplicates of the same kernel with calls to this function.
Array API standard conformance tests for dpctl=0.19.0dev0=py310h93fe807_339 ran successfully.
Array API standard conformance tests for dpctl=0.19.0dev0=py310h93fe807_340 ran successfully.
Factored out map_back_impl, which projects indexing from a flat index to a row-wise index. Removed dead code excluded by a preprocessor conditional.
Array API standard conformance tests for dpctl=0.19.0dev0=py310h93fe807_341 ran successfully.
Ping @AlexanderKalistratov
Replaced it with a hand-written implementation of ceil_log2(n),
such that n <= (decltype(n){1} << ceil_log2(n)) is true for all
positive values of `n` in the range.
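The stated property can be checked with a small sketch. This is one possible way to write such a function, not necessarily the code in this PR: the name `ceil_log2_sketch` is hypothetical. It assumes `n >= 1` and that `T{1} << k` does not overflow `T` for the values used.

```cpp
#include <cstddef>
#include <cstdint>

// Returns the smallest k such that n <= (T{1} << k), assuming n >= 1.
// Counts how many bits are needed to represent n - 1.
template <typename T>
constexpr std::uint32_t ceil_log2_sketch(T n) {
    std::uint32_t k = 0;
    T m = n - 1;
    while (m > 0) {
        m >>= 1;
        ++k;
    }
    return k;
}
```

For example, `ceil_log2_sketch(std::size_t{1024})` is 10 and `ceil_log2_sketch(std::size_t{1025})` is 11, so the inequality `n <= (decltype(n){1} << ceil_log2_sketch(n))` holds in both cases.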
Array API standard conformance tests for dpctl=0.19.0dev0=py310h93fe807_342 ran successfully.
Add check of computed against expected indices
Array API standard conformance tests for dpctl=0.19.0dev0=py310h93fe807_344 ran successfully.
One asserts that at least one unique pointer is specified. The other asserts that the specified arguments are unique pointers with USMDeleter.
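The two compile-time checks could look roughly like the sketch below. It is an assumption about the shape of such checks, not the code in this PR: `USMDeleterStub`, `is_owned_usm_ptr`, `all_owned` and `require_usm_owners` are hypothetical names, and the stub deleter does nothing so that the example compiles without SYCL.

```cpp
#include <memory>
#include <type_traits>

// Hypothetical stand-in for USMDeleter so the sketch builds without SYCL.
struct USMDeleterStub {
    void operator()(void *) const {}
};

// Trait: true only for std::unique_ptr<T, USMDeleterStub>.
template <typename T> struct is_owned_usm_ptr : std::false_type {};
template <typename T>
struct is_owned_usm_ptr<std::unique_ptr<T, USMDeleterStub>> : std::true_type {};

// Conjunction of the trait over a parameter pack.
template <typename... Ts> struct all_owned : std::true_type {};
template <typename T, typename... Ts>
struct all_owned<T, Ts...>
    : std::integral_constant<bool, is_owned_usm_ptr<T>::value &&
                                       all_owned<Ts...>::value> {};

// Variadic entry point carrying both static assertions.
template <typename... Ptrs>
void require_usm_owners(Ptrs &...) {
    static_assert(sizeof...(Ptrs) > 0,
                  "at least one unique pointer must be specified");
    static_assert(all_owned<Ptrs...>::value,
                  "arguments must be unique pointers with the USM deleter");
}
```

Calling `require_usm_owners()` with no arguments, or with a `std::unique_ptr<int>` that uses the default deleter, fails to compile with the corresponding message.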
Array API standard conformance tests for dpctl=0.19.0dev0=py310h93fe807_346 ran successfully.
I suggest we exclude these failing
ndgrigorian
left a comment
This LGTM. We can merge this into the topk branch and drop the test file PR, then remove the commit that adds test_top_k_largest_1d_radix_i1.
This PR builds on top of feature/topk branch.
It adds `iota_impl` in new `sort_utils.hpp` file, and uses it in `merge_sort.hpp`, `radix_sort.hpp` and `topk.hpp`.
It also fixes a possible USM allocation leak in exception handling.
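For readers unfamiliar with why sorting code needs an iota step, here is a host-side sketch of the idea. It does not reproduce `iota_impl`, which is a device kernel whose signature is not shown in this conversation. The function name `argsort_sketch` is hypothetical, and `std::iota` plays the role of the kernel: it fills an index array with 0, 1, 2, ..., which is then sorted by the values the indices refer to.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Returns the indices that would sort `values` in ascending order.
std::vector<std::size_t> argsort_sketch(const std::vector<int> &values) {
    std::vector<std::size_t> idx(values.size());
    // iota step: idx = {0, 1, 2, ..., n - 1}
    std::iota(idx.begin(), idx.end(), std::size_t{0});
    // sort indices by the values they point to; stable to keep ties in order
    std::stable_sort(idx.begin(), idx.end(),
                     [&values](std::size_t a, std::size_t b) {
                         return values[a] < values[b];
                     });
    return idx;
}
```

With the input `{30, 10, 20}` the function returns `{1, 2, 0}`. Merge sort, radix sort and top-k all need this index initialization, which is the stated reason for sharing a single kernel among them.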