-
Notifications
You must be signed in to change notification settings - Fork 99
feat: Add ak.sort() for CUDA backend
#3750
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from 2 commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,65 @@ | ||
| # BSD 3-Clause License; see https://github.com/scikit-hep/awkward/blob/main/LICENSE | ||
|
|
||
| from __future__ import annotations | ||
|
|
||
| from awkward._nplikes.cupy import Cupy | ||
|
|
||
| # Cache for cuda.compute availability | ||
| _cuda_compute_available: bool | None = None | ||
|
|
||
|
|
||
| def is_available() -> bool: | ||
| global _cuda_compute_available | ||
|
|
||
| if _cuda_compute_available is not None: | ||
| return _cuda_compute_available | ||
|
|
||
| try: | ||
| import cuda.compute # noqa: F401 | ||
|
|
||
| _cuda_compute_available = True | ||
| except ImportError: | ||
| _cuda_compute_available = False | ||
|
|
||
| return _cuda_compute_available | ||
|
|
||
|
|
||
| def segmented_sort( | ||
| toptr, | ||
| fromptr, | ||
| length, | ||
| offsets, | ||
| offsetslength, | ||
| parentslength, | ||
| ascending, | ||
| stable, | ||
| ): | ||
| from cuda.compute import SortOrder, segmented_sort | ||
|
|
||
| cupy_nplike = Cupy.instance() | ||
| cp = cupy_nplike._module | ||
|
|
||
| # Ensure offsets are int64 as expected by segmented_sort | ||
| if offsets.dtype != cp.int64: | ||
| offsets = offsets.astype(cp.int64) | ||
|
|
||
| num_segments = offsetslength - 1 | ||
| num_items = int(offsets[-1]) if len(offsets) > 0 else 0 | ||
|
|
||
| start_offsets = offsets[:-1] | ||
| end_offsets = offsets[1:] | ||
|
|
||
| order = SortOrder.ASCENDING if ascending else SortOrder.DESCENDING | ||
|
|
||
| segmented_sort( | ||
| fromptr, # d_in_keys | ||
| toptr, # d_out_keys | ||
| None, # d_in_values (not sorting values, just keys) | ||
| None, # d_out_values | ||
| num_items, # num_items | ||
| num_segments, # num_segments | ||
| start_offsets, # start_offsets_in | ||
| end_offsets, # end_offsets_in | ||
| order, # order (ASCENDING or DESCENDING) | ||
| None, # stream (use default stream) | ||
| ) |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -494,7 +494,6 @@ | |
| assert virtual_numpyarray.is_all_materialized | ||
|
|
||
|
|
||
| @pytest.mark.xfail(reason="awkward_sort is not implemented") | ||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @shwina - we should probably keep the virtual arrays out of POC. Our design goal is to support delayed (lazy) evaluation of operations on virtual arrays.
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I agree with keeping virtual arrays out of this for now, I think adding this later is an easy step as we can just explicitly " Just to clarify: Virtual arrays are not for delaying kernels/operations. It's only to lazify the IO when using It's not the goal of virtual arrays to create a symbolic representation of the program. That doesn't exist in awkward so far. It was what we've done over in https://github.com/dask-contrib/dask-awkward at the level of "highlevel Arrays". In principle it's interesting to have this though, but then it should likely be at the level of buffers and kernels that act on those.
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Looking at the ci errors btw for virtual arrays, the tests seem to be hitting this cupy/cupy#9089 which has been fixed but is to be out in cupy 14. I think if it wasn't for this problem, they would probably pass too. So yeah...they can't be enabled until cupy 14 is out at least. |
||
| def test_numpyarray_sort(numpyarray, virtual_numpyarray): | ||
| assert not virtual_numpyarray.is_any_materialized | ||
| assert ak.array_equal( | ||
|
|
@@ -1224,7 +1223,6 @@ | |
| assert virtual_array.is_all_materialized | ||
|
|
||
|
|
||
| @pytest.mark.xfail(reason="awkward_sort is not implemented") | ||
| def test_listoffsetarray_sort(listoffsetarray, virtual_listoffsetarray): | ||
| assert not virtual_listoffsetarray.is_any_materialized | ||
| assert ak.array_equal( | ||
|
|
@@ -2256,11 +2254,10 @@ | |
| assert virtual_array.is_all_materialized | ||
|
|
||
|
|
||
| @pytest.mark.xfail(reason="awkward_sort is not implemented") | ||
| def test_listarray_sort(listarray, virtual_listarray): | ||
| assert not virtual_listarray.is_any_materialized | ||
| assert ak.array_equal( | ||
| ak.sort(virtual_listarray), | ||
|
Check failure on line 2260 in tests-cuda/test_3459_virtualarray_with_cuda.py
|
||
| ak.sort(listarray), | ||
| ) | ||
| assert virtual_listarray.is_any_materialized | ||
|
|
@@ -3356,7 +3353,6 @@ | |
| assert virtual_recordarray.is_any_materialized | ||
|
|
||
|
|
||
| @pytest.mark.xfail(reason="awkward_sort is not implemented") | ||
| def test_recordarray_sort_x_field(recordarray, virtual_recordarray): | ||
| # Test sort on the x field (ListOffsetArray) | ||
| assert not virtual_recordarray.is_any_materialized | ||
|
|
@@ -3370,7 +3366,6 @@ | |
| assert virtual_recordarray.is_any_materialized | ||
|
|
||
|
|
||
| @pytest.mark.xfail(reason="awkward_sort is not implemented") | ||
| def test_recordarray_sort_y_field(recordarray, virtual_recordarray): | ||
| # Test sort on the y field (NumpyArray) | ||
| assert not virtual_recordarray.is_any_materialized | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps we should prioritize CCCL kernels for GPU operations, with CuPy serving as a fallback when CCCL is unavailable?