feat: Add ak.sort() for CUDA backend
#3750
base: main
Conversation
The documentation preview is ready to be viewed at http://preview.awkward-array.org.s3-website.us-east-1.amazonaws.com/PR3750
ianna left a comment:
@shwina - this is great! Thanks for implementing it. Let’s keep it open and discuss.
```python
assert virtual_numpyarray.is_all_materialized
...
@pytest.mark.xfail(reason="awkward_sort is not implemented")
```
@shwina - we should probably keep the virtual arrays out of the POC. Our design goal is to support delayed (lazy) evaluation of operations on virtual arrays, which ensures that array transformations are represented symbolically until they are explicitly materialized.
I agree with keeping virtual arrays out of this for now; adding it later is an easy step, since we can just explicitly "maybe_materialize" when calling CCCL kernels.
Just to clarify: virtual arrays are not for delaying kernels/operations. They only lazify the IO when using ak.from_buffers to create an ak.Array. Once the buffer has been delivered by the IO source, everything is eager (there's no symbolic representation of the program or any compute graph here).
It's not the goal of virtual arrays to create a symbolic representation of the program; that doesn't exist in awkward so far. It's what we've done over in https://github.com/dask-contrib/dask-awkward at the level of high-level Arrays. In principle it would be interesting to have, but then it should likely live at the level of buffers and the kernels that act on them.
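A minimal sketch of the "materialize, then call the kernel" step described above; `maybe_materialize` and `run_cuda_kernel` are hypothetical names for illustration, not awkward's actual internals:

```python
def maybe_materialize(buffer):
    # A virtual buffer exposes materialize(); concrete buffers
    # are passed through unchanged.
    return buffer.materialize() if hasattr(buffer, "materialize") else buffer


def sort_buffer(buffer, run_cuda_kernel):
    # Eager CCCL kernels need concrete device memory, so any virtual
    # buffer is materialized immediately before the kernel is invoked.
    data = maybe_materialize(buffer)
    return run_cuda_kernel(data)
```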
Looking at the CI errors for virtual arrays, btw, the tests seem to be hitting cupy/cupy#9089, which has been fixed but will only ship in CuPy 14. I see
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
I think if it weren't for this problem, they would probably pass too. So they can't be enabled until CuPy 14 is out, at least.
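For reference, a minimal NumPy repro of that error class (an illustration, not the actual failing test):

```python
import numpy as np

a = np.array([1, 2, 3])
try:
    bool(a)  # coercing a multi-element array to bool is ambiguous
except ValueError as err:
    print(err)
    # The truth value of an array with more than one element is ambiguous.
    # Use a.any() or a.all()
```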
```python
else:
    raise AssertionError(f"CuPyKernel not found: {index!r}")
...
# CuPy kernel not found, try cuda.compute as fallback
```
Perhaps we should prioritize CCCL kernels for GPU operations, with CuPy serving as a fallback when CCCL is unavailable?
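A minimal sketch of that priority order; only the cuda.compute module name comes from this PR, and the `CCCL_KERNELS` registry plus `resolve_kernel` are hypothetical stand-ins for awkward's kernel lookup:

```python
try:
    import cuda.compute  # noqa: F401  (CCCL's Python interface)
    HAVE_CCCL = True
except ImportError:
    HAVE_CCCL = False

CCCL_KERNELS: dict = {}  # hypothetical registry of CCCL-backed kernels


def resolve_kernel(index, cupy_kernels):
    # Prefer a CCCL implementation when one is registered; otherwise
    # fall back to the existing CuPy kernel table.
    if HAVE_CCCL and index in CCCL_KERNELS:
        return CCCL_KERNELS[index]
    if index in cupy_kernels:
        return cupy_kernels[index]
    raise AssertionError(f"kernel not found: {index!r}")
```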
```yaml
strategy:
  fail-fast: false
  matrix:
    cuda-version:
      - 11
      - 12
      - 13

steps:
  - name: Clean the workspace and mamba
    run: |
      rm -rf * .[!.]* || echo "Nothing to clean"
      rm -rf ~/micromamba* || echo "Nothing to clean"
  - uses: actions/checkout@v6
    with:
      submodules: true

  - name: Get micromamba
    uses: mamba-org/setup-micromamba@v2
    with:
      environment-name: test-env
      init-shell: bash
      create-args: >-
        -c rapidsai
        -c nvidia
        python=3.13
        cccl-python
        cudf
        cupy
        cuda-version=${{ matrix.cuda-version }}
        cuda-toolkit
        cxx-compiler
```
Suggested change:
```yaml
strategy:
  fail-fast: false
  matrix:
    cuda-version: [11, 12, 13]
steps:
  - name: Clean the workspace and mamba
    run: |
      rm -rf * .[!.]* || echo "Nothing to clean"
      rm -rf ~/micromamba* || echo "Nothing to clean"
  - uses: actions/checkout@v6
    with:
      submodules: true
  - name: Get micromamba (CUDA <= 11)
    if: ${{ matrix.cuda-version < 12 }}
    uses: mamba-org/setup-micromamba@v2
    with:
      environment-name: test-env
      init-shell: bash
      create-args: >-
        -c rapidsai
        -c nvidia
        python=3.13
        cudf
        cupy
        cuda-version=${{ matrix.cuda-version }}
        cuda-toolkit
        cxx-compiler
  - name: Get micromamba (CUDA >= 12, add cccl-python)
    if: ${{ matrix.cuda-version >= 12 }}
    uses: mamba-org/setup-micromamba@v2
    with:
      environment-name: test-env
      init-shell: bash
      create-args: >-
        -c rapidsai
        -c nvidia
        python=3.13
        cccl-python
        cudf
        cupy
        cuda-version=${{ matrix.cuda-version }}
        cuda-toolkit
        cxx-compiler
```
This is a hopefully correct fixed matrix: it installs cccl-python only when CUDA ≥ 12, and avoids the solver error when CUDA 11 is selected.
@ariostas - this makes sense, given that we still want to support CUDA 11.
Closes #3749.
Following the study in #3748, this PR adds support for ak.sort() for the CUDA backend, using cuda.compute (https://nvidia.github.io/cccl/python/compute.html).
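A usage sketch, assuming this PR is merged and a CUDA-capable GPU with CuPy is available:

```python
import awkward as ak

# Build a small ragged array directly on the CUDA backend.
arr = ak.Array([[3.0, 1.0, 2.0], [], [5.0, 4.0]], backend="cuda")

# ak.sort dispatches to the GPU implementation added in this PR.
sorted_arr = ak.sort(arr)
print(sorted_arr.to_list())  # [[1.0, 2.0, 3.0], [], [4.0, 5.0]]
```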