4 changes: 2 additions & 2 deletions .github/workflows/pr.yaml
@@ -258,7 +258,7 @@ jobs:
wheel-build-cpp:
needs: checks
secrets: inherit
uses: rapidsai/shared-workflows/.github/workflows/wheels-build.yaml@python-3.14
uses: rapidsai/shared-workflows/.github/workflows/wheels-build.yaml@build-wheels-old-ctk
with:
matrix_filter: group_by([.ARCH, (.CUDA_VER|split(".")|map(tonumber)|.[0])]) | map(max_by(.PY_VER|split(".")|map(tonumber)))
build_type: pull-request
@@ -269,7 +269,7 @@ jobs:
wheel-build-python:
needs: wheel-build-cpp
secrets: inherit
uses: rapidsai/shared-workflows/.github/workflows/wheels-build.yaml@python-3.14
uses: rapidsai/shared-workflows/.github/workflows/wheels-build.yaml@build-wheels-old-ctk
with:
# Build a wheel for each CUDA x ARCH x minimum supported Python version
matrix_filter: group_by({CUDA_VER, ARCH}) | map(min_by(.PY_VER | split(".") | map(tonumber)))
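Those `matrix_filter` jq expressions are easy to sanity-check locally. A sketch (the two-entry matrix is made up; only the jq filter itself comes from the workflow) showing the second filter keeping the minimum Python version per CUDA x ARCH group:

```shell
# Hypothetical two-entry matrix with the same CUDA_VER and ARCH but two Python versions.
MATRIX='[{"ARCH":"amd64","CUDA_VER":"12.9.1","PY_VER":"3.11"},{"ARCH":"amd64","CUDA_VER":"12.9.1","PY_VER":"3.14"}]'

# Keep the minimum PY_VER per (CUDA_VER, ARCH) group, comparing versions numerically.
# Only the 3.11 entry survives.
echo "${MATRIX}" | jq -c 'group_by({CUDA_VER, ARCH}) | map(min_by(.PY_VER | split(".") | map(tonumber)))'
```

The `split(".") | map(tonumber)` step matters: comparing `"3.9"` and `"3.14"` as strings would sort them the wrong way around.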
38 changes: 38 additions & 0 deletions ci/download-torch-wheels.sh
@@ -0,0 +1,38 @@
#!/bin/bash
# SPDX-FileCopyrightText: Copyright (c) 2026, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0

# [description]
#
# Downloads a CUDA variant of 'torch' from the correct index, based on CUDA major version.
#
# This exists to avoid using 'pip --extra-index-url', which could allow for CPU-only 'torch'
# to be downloaded from pypi.org.
#

set -e -u -o pipefail

TORCH_WHEEL_DIR="${1}"

# Ensure CUDA-enabled 'torch' packages are always used.
#
# Downloading + passing the downloaded file as a requirement forces the use of this
# package, so we don't accidentally end up with a CPU-only 'torch' from 'pypi.org'
# (which can happen because pip doesn't support index priority).
#
# Not appending this to PIP_CONSTRAINT, because we don't want the torch '--extra-index-url'
# to leak outside of this script into other 'pip {download,install}' calls.
rapids-dependency-file-generator \
--output requirements \
--file-key "torch_only" \
--matrix "cuda=${RAPIDS_CUDA_VERSION%.*};arch=$(arch);py=${RAPIDS_PY_VERSION};dependencies=${RAPIDS_DEPENDENCIES};require_gpu_pytorch=true" \
| tee ./torch-constraints.txt

rapids-pip-retry download \
--isolated \
--prefer-binary \
--no-deps \
-d "${TORCH_WHEEL_DIR}" \
--constraint "${PIP_CONSTRAINT}" \
--constraint ./torch-constraints.txt \
'torch'
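A quick note on the `${RAPIDS_CUDA_VERSION%.*}` expansion used in the `--matrix` argument: `%.*` strips the shortest trailing `.suffix`, reducing a full CTK version to the `{major}.{minor}` form the dependency matrix keys on. A minimal sketch:

```shell
# Bash parameter expansion: '%.*' deletes the shortest suffix matching '.*',
# i.e. the patch component of the version string.
RAPIDS_CUDA_VERSION="12.9.1"
echo "${RAPIDS_CUDA_VERSION%.*}"   # prints 12.9
```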
Member Author
Just picking a place on the diff to have a threaded conversation.

Here's an interesting one... on the Python 3.14 + CUDA 13.1.1 + latest dependencies jobs (arm64 and amd64), the solve is falling back to numba-cuda==0.24.0, which has an sdist but no Python 3.14 wheels, leading to it being built from source and that build failing!

    Downloading http://pip-cache.local.gha-runners.nvidia.com/packages/04/51/8935ff9ae5150e1ffed945bf1b95002a6a5e1f9256aeb1143e1c159b68c5/numba_cuda-0.24.0.tar.gz (1.3 MB)
...
    Installing build dependencies: started
    Running command installing build dependencies for numba-cuda
...
  Building wheels for collected packages: numba-cuda
...
    g++ -fno-strict-overflow -Wsign-compare -DNDEBUG -g -O3 -Wall -fPIC -I/tmp/pip-build-env-y5e9zevx/overlay/lib/python3.14/site-packages/numpy/_core/include -Inumba_cuda/numba/cuda/cext -I/pyenv/versions/3.14.3/include/python3.14 -c numba_cuda/numba/cuda/cext/_dispatcher.cpp -o build/temp.linux-x86_64-cpython-314/numba_cuda/numba/cuda/cext/_dispatcher.o -std=c++11
    numba_cuda/numba/cuda/cext/_dispatcher.cpp:1018:2: error: #error "Python minor version is not supported."
     1018 | #error "Python minor version is not supported."
...
    Building wheel for numba-cuda (pyproject.toml): finished with status 'error'
    ERROR: Failed building wheel for numba-cuda
  Failed to build numba-cuda
  error: failed-wheel-build-for-install

(build link)

numba-cuda 0.26.0 was the first version with Python 3.14 wheels... something must be holding the solver back from using that.

@jameslamb jameslamb Mar 4, 2026

ahhhh there it is: torch is ==-pinning cuda-bindings and cuda-pathfinder

...
Collecting cuda-bindings==13.0.3 (from torch==2.10.0+cu130)
Collecting cuda-pathfinder~=1.1 (from cuda-bindings==13.0.3->torch==2.10.0+cu130)
...

Later numba-cuda[cu13] releases require cuda-pathfinder>=1.3.1.

$ docker run --rm -it python:3.14 bash
$ pip install pkginfo
$ pip download --no-deps 'numba-cuda==0.26.0'
$ pkginfo --json ./numba_cuda*.whl
...
  "requires_dist": [
    "numba>=0.60.0",
    "cuda-bindings<14.0.0,>=12.9.1",
    "cuda-core<1.0.0,>=0.5.1",
    "packaging",
    "cuda-bindings<13.0.0,>=12.9.1; extra == \"cu12\"",
    "cuda-pathfinder<2.0.0,>=1.3.1; extra == \"cu12\"",
    "cuda-toolkit[cccl,cudart,nvcc,nvjitlink,nvrtc]==12.*; extra == \"cu12\"",
    "cuda-bindings==13.*; extra == \"cu13\"",
    "cuda-pathfinder<2.0.0,>=1.3.1; extra == \"cu13\"",
    "cuda-toolkit[cccl,cudart,nvjitlink,nvrtc,nvvm]==13.*; extra == \"cu13\""
  ],
...

numba-cuda[cu13]==0.24.0 doesn't constrain cuda-pathfinder

$ pip download --no-deps 'numba-cuda==0.24.0'
$ pkginfo --json ./numba_cuda-0.24.0*.tar.gz
...
  "requires_dist": [
    "numba>=0.60.0",
    "cuda-bindings<14.0.0,>=12.9.1",
    "cuda-core<1.0.0,>=0.3.2",
    "packaging",
    "cuda-bindings<13.0.0,>=12.9.1; extra == \"cu12\"",
    "cuda-core<1.0.0,>=0.3.0; extra == \"cu12\"",
    "cuda-toolkit[cccl,cudart,nvcc,nvjitlink,nvrtc]==12.*; extra == \"cu12\"",
    "cuda-bindings==13.*; extra == \"cu13\"",
    "cuda-core<1.0.0,>=0.3.2; extra == \"cu13\"",
    "cuda-toolkit[cccl,cudart,nvjitlink,nvrtc,nvvm]==13.*; extra == \"cu13\""
  ],
...

Looks like that was added here: NVIDIA/numba-cuda#308

It looks like this wasn't caught on earlier PRs because CI fell back to a CPU-only torch 😬

  Collecting torch>=2.10.0 (from -r test-pytorch-requirements.txt (line 4))
    Obtaining dependency information for torch>=2.10.0 from http://pip-cache.local.gha-runners.nvidia.com/packages/69/2b/51e663ff190c9d16d4a8271203b71bc73a16aa7619b9f271a69b9d4a936b/torch-2.10.0-cp314-cp314-manylinux_2_28_aarch64.whl.metadata
    Downloading http://pip-cache.local.gha-runners.nvidia.com/packages/69/2b/51e663ff190c9d16d4a8271203b71bc73a16aa7619b9f271a69b9d4a936b/torch-2.10.0-cp314-cp314-manylinux_2_28_aarch64.whl.metadata (31 kB)

(rmm#3316 - wheel-tests-integration-optional / 13.1.1, 3.14, arm64, ubuntu24.04, l4, latest-driver, latest-deps)


Alright, it goes deeper than this.

Latest numba-cuda and latest torch alone are happily installable together on Python 3.14.

pip download \
    --no-deps \
    --index-url https://download.pytorch.org/whl/cu130 \
    'torch==2.10.0+cu130'

pip install \
    --prefer-binary \
    ./torch-*.whl \
    'numba-cuda[cu13]>=0.22.1'

# Successfully installed ... cuda-bindings-13.0.3 cuda-core-0.6.0 cuda-pathfinder-1.4.0 cuda-toolkit-13.0.2 ... numba-0.64.0 numba-cuda-0.28.2 ... nvidia-cublas-13.1.0.3 nvidia-cuda-cccl-13.0.85 ... torch-2.10.0+cu130

I think the problem looks like this:

  1. torch-2.10+cu130 depends on a bunch of nvidia-{thing}=={version-from-CTK-13.0.2} wheels
  2. newer numba-cuda depends on cuda-toolkit[cccl,cudart,nvrtc,nvvm]==13.* (since NVIDIA/numba-cuda#604, "Set up a new VM-based CI infrastructure")
  3. in this CI job, we're constraining to cuda-toolkit==13.1.*
  4. the solver backtracks to numba-cuda==0.24.0 (which didn't have cuda-toolkit pinnings, and whose nvidia-nccl and similar dependencies are compatible with torch-2.10's)
  5. numba-cuda==0.24.0 didn't have wheels for Python 3.14, so pip tries to build it from source
  6. that build from source fails with the error above that basically means "this doesn't support Python 3.14"

What we really want here is a big loud solver error that says "torch-2.10+cu130 only works with the packages pinned in cuda-toolkit==13.0.2, which is not installable here".
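One possible mitigation (hypothetical, not part of this PR): adding an explicit floor on numba-cuda to the generated constraints file would forbid the silent backtrack to 0.24.0 and force the resolver to fail loudly instead:

```shell
# Hypothetical guard: with this floor in the constraints file, pip cannot fall
# back to the 0.24.0 sdist, so the cuda-pathfinder/cuda-toolkit conflict
# surfaces as a resolver error rather than a failed source build.
cat >> ./torch-constraints.txt <<'EOF'
numba-cuda>=0.26.0
EOF
grep 'numba-cuda' ./torch-constraints.txt   # prints numba-cuda>=0.26.0
```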


Alright on the latest build, here's what happened.

Looking at the most recent CI run (build link)

All of these environments look like I'd expect them to and show we're covering a wide range of CTK versions.

NOTE: we end up using CTK 12.4 in the arm64 jobs with RAPIDS_CUDA_VERSION=12.2.2, because there weren't aarch64 cuBLAS wheels for earlier CTKs. CUDA 12.2 will have to be tested in nightlies (I'll do that next on this PR and add a follow-up comment).

Regular wheel tests

wheel-tests / 12.2.2, 3.11, arm64, ubuntu22.04, a100, latest-driver, latest-deps


(link)

Looks exactly like what we want... cuda-toolkit 12.4 (allowed on arm), nvJitLink 12.9, latest numba-cuda

Successfully installed ... cuda-bindings-12.9.5 cuda-core-0.6.0 cuda-pathfinder-1.4.0 cuda-python-12.9.5 cuda-toolkit-12.4.0 ... numba-cuda-0.28.2 numpy-2.4.2 nvidia-cuda-cccl-cu12-12.4.99 nvidia-cuda-nvcc-cu12-12.4.99 nvidia-cuda-nvrtc-cu12-12.4.99 nvidia-cuda-runtime-cu12-12.4.99 nvidia-nvjitlink-cu12-12.9.86 ... rmm-cu12-26.4.0a55

wheel-tests / 12.9.1, 3.11, amd64, ubuntu22.04, l4, latest-driver, oldest-deps


(link)

No cuda-toolkit, oldest numba-cuda (oldest-deps!), nvjitlink 12.9

Successfully installed ... cuda-bindings-12.9.5 cuda-core-0.6.0 cuda-pathfinder-1.4.0 cuda-python-12.9.5 ... numba-cuda-0.22.1 numpy-1.23.5 ... nvidia-nvjitlink-cu12-12.9.86 ...rmm-cu12-26.4.0a55

wheel-tests / 12.9.1, 3.14, amd64, ubuntu24.04, h100, latest-driver, latest-deps


(link)

Looks good, everything from CTK 12.9 and latest numba-cuda

Successfully installed ... cuda-bindings-12.9.5 cuda-core-0.6.0 cuda-pathfinder-1.4.0 cuda-python-12.9.5 cuda-toolkit-12.9.1... numba-cuda-0.28.2 numpy-2.4.2 nvidia-cuda-cccl-cu12-12.9.27 nvidia-cuda-nvcc-cu12-12.9.86 nvidia-cuda-nvrtc-cu12-12.9.86 nvidia-cuda-runtime-cu12-12.9.79 nvidia-nvjitlink-cu12-12.9.86 ... rmm-cu12-26.4.0a55

wheel-tests / 13.0.2, 3.12, amd64, ubuntu24.04, l4, latest-driver, latest-deps


(link)

Looks good, latest numba-cuda, cuda-toolkit 13.0, most CTK libraries from 13.0, nvJitLink from 13.1.

Successfully installed ... cuda-bindings-13.1.1 cuda-core-0.6.0 cuda-pathfinder-1.4.0 cuda-python-13.1.1 cuda-toolkit-13.0.2 ... numba-cuda-0.28.2 numpy-2.4.2 nvidia-cuda-cccl-13.0.85 nvidia-cuda-nvrtc-13.0.88 nvidia-cuda-runtime-13.0.96 ... nvidia-nvjitlink-13.1.115 ... rmm-cu13-26.4.0a55

wheel-tests / 13.0.2, 3.12, arm64, rockylinux8, l4, latest-driver, latest-deps


(link)

Looks good, everything from CTK 13.0 and latest numba-cuda

Successfully installed ... cuda-bindings-13.1.1 cuda-core-0.6.0 cuda-pathfinder-1.4.0 cuda-python-13.1.1 cuda-toolkit-13.0.2 ... numba-cuda-0.28.2 numpy-2.4.2 nvidia-cuda-cccl-13.0.85 nvidia-cuda-nvrtc-13.0.88 nvidia-cuda-runtime-13.0.96 nvidia-nvjitlink-13.1.115 nvidia-nvvm-13.0.88 ... rmm-cu13-26.4.0a55

wheel-tests / 13.1.1, 3.13, amd64, rockylinux8, rtxpro6000, latest-driver, latest-deps


(link)

Looks good, everything from CTK 13.1 and latest numba-cuda

Successfully installed ... cuda-bindings-13.1.1 cuda-core-0.6.0 cuda-pathfinder-1.4.0 cuda-python-13.1.1 cuda-toolkit-13.1.1... numba-cuda-0.28.2 numpy-2.4.2 nvidia-cuda-cccl-13.1.115 nvidia-cuda-nvrtc-13.1.115 nvidia-cuda-runtime-13.1.80 nvidia-nvjitlink-13.1.115 nvidia-nvvm-13.1.115 ... rmm-cu13-26.4.0a55

wheel-tests / 13.1.1, 3.14, amd64, ubuntu24.04, rtxpro6000, latest-driver, latest-deps


(link)

Looks good, everything from CTK 13.1 and latest numba-cuda

Successfully installed ... cuda-bindings-13.1.1 cuda-core-0.6.0 cuda-pathfinder-1.4.0 cuda-python-13.1.1 cuda-toolkit-13.1.1 ... numba-cuda-0.28.2 numpy-2.4.2 nvidia-cuda-cccl-13.1.115 nvidia-cuda-nvrtc-13.1.115 nvidia-cuda-runtime-13.1.80 nvidia-nvjitlink-13.1.115 nvidia-nvvm-13.1.115 ... rmm-cu13-26.4.0a55

wheel-tests / 13.1.1, 3.14, arm64, ubuntu24.04, l4, latest-driver, latest-deps


(link)

Looks good, everything from CTK 13.1 and latest numba-cuda

Successfully installed ... cuda-bindings-13.1.1 cuda-core-0.6.0 cuda-pathfinder-1.4.0 cuda-python-13.1.1 cuda-toolkit-13.1.1... numba-cuda-0.28.2 numpy-2.4.2 nvidia-cuda-cccl-13.1.115 nvidia-cuda-nvrtc-13.1.115 nvidia-cuda-runtime-13.1.80 nvidia-nvjitlink-13.1.115 nvidia-nvvm-13.1.115 ... rmm-cu13-26.4.0a55

PyTorch / CuPy tests

wheel-tests-integration-optional / 12.2.2, 3.11, arm64, ubuntu22.04, a100, latest-driver, latest-deps


(link)

As expected, PyTorch tests were skipped because this project doesn't test PyTorch versions old enough to run against CTK 12.2.

Skipping PyTorch tests (requires CUDA 12.6-12.9 or 13.0, found 12.2.2)

CuPy tests pulled in latest numba-cuda, cuda-toolkit 12.4 (intentionally allowed on arm64, it's fine), and the latest 12.x nvjitlink (12.9). This looks like what we want!

Successfully installed ... cuda-bindings-12.9.5 cuda-core-0.6.0 cuda-pathfinder-1.4.0 cuda-python-12.9.5 cuda-toolkit-12.4.0 cupy-cuda12x-14.0.1 ... numba-cuda-0.28.2 numpy-2.4.2 nvidia-cuda-cccl-cu12-12.4.99 nvidia-cuda-nvcc-cu12-12.4.99 nvidia-cuda-nvrtc-cu12-12.4.99 nvidia-cuda-runtime-cu12-12.4.99 nvidia-nvjitlink-cu12-12.9.86 ... rmm-cu12-26.4.0a55

wheel-tests-integration-optional / 12.9.1, 3.11, amd64, ubuntu22.04, l4, latest-driver, oldest-deps


(link)

For PyTorch tests, no cuda-toolkit was installed in the environment; the solve fell all the way back to numba-cuda==0.22.1 (makes sense, oldest-deps!) and used nvidia-* packages from CTK 12.9.

Successfully installed ... cuda-bindings-12.9.5 cuda-core-0.6.0 cuda-pathfinder-1.4.0 cuda-python-12.9.5 ... numba-cuda-0.22.1 numpy-1.23.5 nvidia-cublas-cu12-12.9.1.4 ... nvidia-nvjitlink-cu12-12.9.86 ... rmm-cu12-26.4.0a55 ... torch-2.9.0+cu129 ...

CuPy tests downgraded CuPy to 13.6.0 (makes sense, oldest-deps!) and that brought fastrlock down with it.

Successfully installed cupy-cuda12x-13.6.0 fastrlock-0.8.3

wheel-tests-integration-optional / 12.9.1, 3.14, amd64, ubuntu24.04, h100, latest-driver, latest-deps


(build link)

For PyTorch tests, cuda-toolkit 12.9 gets installed. It's the correct version (12.9) and we see the expected versions of CTK libraries, like cuBLAS 12.9 and nvJitLink 12.9.

Successfully installed ... cuda-bindings-12.9.4 cuda-core-0.6.0 cuda-pathfinder-1.4.0 cuda-python-12.9.4 cuda-toolkit-12.9.1 ... numba-cuda-0.28.2 numpy-2.4.2 nvidia-cublas-cu12-12.9.1.4 ... nvidia-nvjitlink-cu12-12.9.86 ... rmm-cu12-26.4.0a55 ... torch-2.10.0+cu129 ...

CuPy tests kept everything in that environment and just added CuPy

Successfully installed cupy-cuda12x-14.0.1

wheel-tests-integration-optional / 13.0.2, 3.12, amd64, ubuntu24.04, l4, latest-driver, latest-deps


(link)

For PyTorch tests, cuda-toolkit 13.0.2 was installed along with latest numba-cuda (0.28.2) and CTK 13.0 packages (e.g. cuBLAS 13.1, nvJitLink 13.0.88). See https://docs.nvidia.com/cuda/archive/13.0.2/cuda-toolkit-release-notes/index.html to confirm those versions.

Successfully installed ... cuda-bindings-13.0.3 cuda-core-0.6.0 cuda-pathfinder-1.4.0 cuda-python-13.0.3 cuda-toolkit-13.0.2 ... numba-cuda-0.28.2 numpy-2.4.2 nvidia-cublas-13.1.0.3 ... nvidia-nvjitlink-13.0.88 ... rmm-cu13-26.4.0a55 ... torch-2.10.0+cu130 ...

CuPy tests kept everything in that environment and just added CuPy

Successfully installed cupy-cuda13x-14.0.1

wheel-tests-integration-optional / 13.0.2, 3.12, arm64, rockylinux8, l4, latest-driver, latest-deps


(link)

PyTorch tests pulled in cuda-toolkit==13.0.2 and CTK 13.0 libraries (including nvJitLink 13.0)

Successfully installed ... cuda-bindings-13.0.3 cuda-core-0.6.0 cuda-pathfinder-1.4.0 cuda-python-13.0.3 cuda-toolkit-13.0.2 ... numba-cuda-0.28.2 numpy-2.4.2 nvidia-cublas-13.1.0.3 ... nvidia-nvjitlink-13.0.88 ... rmm-cu13-26.4.0a55 ... torch-2.10.0+cu130 ...

CuPy tests kept everything in that environment and just added CuPy

Successfully installed cupy-cuda13x-14.0.1

wheel-tests-integration-optional / 13.1.1, 3.13, amd64, rockylinux8, rtxpro6000, latest-driver, latest-deps


(link)

As expected, skipped because there aren't PyTorch wheels supporting CUDA 13.1 yet.

Skipping PyTorch tests (requires CUDA 12.6-12.9 or 13.0, found 13.1.1) 

CuPy tests pulled in cuda-toolkit 13.1, latest numba-cuda (0.28.2), and corresponding nvidia-* libraries.

Successfully installed ... cuda-bindings-13.1.1 cuda-core-0.6.0 cuda-pathfinder-1.4.0 cuda-python-13.1.1 cuda-toolkit-13.1.1 cupy-cuda13x-14.0.1 ... numba-cuda-0.28.2 numpy-2.4.2 nvidia-cuda-cccl-13.1.115 nvidia-cuda-nvrtc-13.1.115 nvidia-cuda-runtime-13.1.80 nvidia-nvjitlink-13.1.115 ... rmm-cu13-26.4.0a55

wheel-tests-integration-optional / 13.1.1, 3.14, amd64, ubuntu24.04, rtxpro6000, latest-driver, latest-deps


(link)

As expected, skipped because there aren't PyTorch wheels supporting CUDA 13.1 yet.

Skipping PyTorch tests (requires CUDA 12.6-12.9 or 13.0, found 13.1.1) 

CuPy tests pulled in cuda-toolkit 13.1, latest numba-cuda (0.28.2), and corresponding nvidia-* libraries.

Successfully installed ... cuda-bindings-13.1.1 cuda-core-0.6.0 cuda-pathfinder-1.4.0 cuda-python-13.1.1 cuda-toolkit-13.1.1 cupy-cuda13x-14.0.1 ... numba-cuda-0.28.2 numpy-2.4.2 nvidia-cuda-cccl-13.1.115 nvidia-cuda-nvrtc-13.1.115 nvidia-cuda-runtime-13.1.80 nvidia-nvjitlink-13.1.115 ... rmm-cu13-26.4.0a55

wheel-tests-integration-optional / 13.1.1, 3.14, arm64, ubuntu24.04, l4, latest-driver, latest-deps


(link)

As expected, skipped because there aren't PyTorch wheels supporting CUDA 13.1 yet.

Skipping PyTorch tests (requires CUDA 12.6-12.9 or 13.0, found 13.1.1) 

CuPy tests pulled in cuda-toolkit 13.1, latest numba-cuda (0.28.2), and corresponding nvidia-* libraries.

Successfully installed ... cuda-bindings-13.1.1 cuda-core-0.6.0 cuda-pathfinder-1.4.0 cuda-python-13.1.1 cuda-toolkit-13.1.1 cupy-cuda13x-14.0.1... numba-cuda-0.28.2 numpy-2.4.2 nvidia-cuda-cccl-13.1.115 nvidia-cuda-nvrtc-13.1.115 nvidia-cuda-runtime-13.1.80 nvidia-nvjitlink-13.1.115 ... rmm-cu13-26.4.0a55


(updates from testing with nightly matrix)

wheel-tests / 12.2.2, 3.11, amd64, ubuntu22.04, v100, earliest-driver, latest-deps

(link)

Looks exactly like what we want... cuda-toolkit 12.2, 12.2 versions of most CTK libraries, nvJitLink 12.9, latest numba-cuda.

Successfully installed ... cuda-bindings-12.9.5 cuda-core-0.6.0 cuda-pathfinder-1.4.0 cuda-python-12.9.5 cuda-toolkit-12.2.2 ... numba-cuda-0.28.2 numpy-2.4.2 nvidia-cuda-cccl-cu12-12.2.140 nvidia-cuda-nvcc-cu12-12.2.140 nvidia-cuda-nvrtc-cu12-12.2.140 nvidia-cuda-runtime-cu12-12.2.140 nvidia-nvjitlink-cu12-12.9.86 ... rmm-cu12-26.4.0a57

2 changes: 1 addition & 1 deletion ci/test_python_integrations.sh
@@ -40,7 +40,7 @@ if [ "${CUDA_MAJOR}" -gt 12 ] || { [ "${CUDA_MAJOR}" -eq 12 ] && [ "${CUDA_MINOR
rapids-dependency-file-generator \
--output conda \
--file-key test_pytorch \
--matrix "cuda=${RAPIDS_CUDA_VERSION%.*};arch=$(arch);py=${RAPIDS_PY_VERSION};dependencies=${RAPIDS_DEPENDENCIES}" \
--matrix "cuda=${RAPIDS_CUDA_VERSION%.*};arch=$(arch);py=${RAPIDS_PY_VERSION};dependencies=${RAPIDS_DEPENDENCIES};require_gpu_pytorch=true" \
--prepend-channel "${CPP_CHANNEL}" \
--prepend-channel "${PYTHON_CHANNEL}" \
| tee env.yaml
7 changes: 3 additions & 4 deletions ci/test_wheel.sh
@@ -12,17 +12,16 @@ LIBRMM_WHEELHOUSE=$(RAPIDS_PY_WHEEL_NAME="librmm_${RAPIDS_PY_CUDA_SUFFIX}" rapid
RMM_WHEELHOUSE=$(rapids-download-from-github "$(rapids-package-name "wheel_python" rmm --stable --cuda "$RAPIDS_CUDA_VERSION")")

# generate constraints (possibly pinning to oldest support versions of dependencies)
rapids-generate-pip-constraints test_python ./constraints.txt
rapids-generate-pip-constraints test_python "${PIP_CONSTRAINT}"

# notes:
#
# * echo to expand wildcard before adding `[test]` requires for pip
# * need to provide --constraint="${PIP_CONSTRAINT}" because that environment variable is
# ignored if any other --constraint are passed via the CLI
# * just providing --constraint="${PIP_CONSTRAINT}" to be explicit, and because
# that environment variable is ignored if any other --constraint are passed via the CLI
#
rapids-pip-retry install \
-v \
--constraint ./constraints.txt \
--constraint "${PIP_CONSTRAINT}" \
"$(echo "${LIBRMM_WHEELHOUSE}"/librmm_"${RAPIDS_PY_CUDA_SUFFIX}"*.whl)" \
"$(echo "${RMM_WHEELHOUSE}"/rmm_"${RAPIDS_PY_CUDA_SUFFIX}"*.whl)[test]"
35 changes: 19 additions & 16 deletions ci/test_wheel_integrations.sh
@@ -4,25 +4,23 @@

set -eou pipefail

RAPIDS_INIT_PIP_REMOVE_NVIDIA_INDEX="true"
export RAPIDS_INIT_PIP_REMOVE_NVIDIA_INDEX
source rapids-init-pip

RAPIDS_PY_CUDA_SUFFIX="$(rapids-wheel-ctk-name-gen "${RAPIDS_CUDA_VERSION}")"
LIBRMM_WHEELHOUSE=$(RAPIDS_PY_WHEEL_NAME="librmm_${RAPIDS_PY_CUDA_SUFFIX}" rapids-download-wheels-from-github cpp)
RMM_WHEELHOUSE=$(rapids-download-from-github "$(rapids-package-name "wheel_python" rmm --stable --cuda "$RAPIDS_CUDA_VERSION")")

# generate constraints (possibly pinning to oldest support versions of dependencies)
rapids-generate-pip-constraints test_python ./constraints.txt
rapids-generate-pip-constraints test_python "${PIP_CONSTRAINT}"

# notes:
#
# * echo to expand wildcard before adding `[test]` requires for pip
# * need to provide --constraint="${PIP_CONSTRAINT}" because that environment variable is
# ignored if any other --constraint are passed via the CLI
# * just providing --constraint="${PIP_CONSTRAINT}" to be explicit, and because
# that environment variable is ignored if any other --constraint are passed via the CLI
#
PIP_INSTALL_SHARED_ARGS=(
--constraint=./constraints.txt
--prefer-binary
--constraint="${PIP_CONSTRAINT}"
"$(echo "${LIBRMM_WHEELHOUSE}"/librmm_"${RAPIDS_PY_CUDA_SUFFIX}"*.whl)"
"$(echo "${RMM_WHEELHOUSE}"/rmm_"${RAPIDS_PY_CUDA_SUFFIX}"*.whl)[test]"
@@ -39,19 +37,24 @@ CUDA_MINOR=$(echo "${RAPIDS_CUDA_VERSION}" | cut -d'.' -f2)

echo "::group::PyTorch Tests"

if [ "${CUDA_MAJOR}" -gt 12 ] || { [ "${CUDA_MAJOR}" -eq 12 ] && [ "${CUDA_MINOR}" -ge 8 ]; }; then
rapids-logger "Generating PyTorch test requirements"
rapids-dependency-file-generator \
--output requirements \
--file-key test_wheels_pytorch \
--matrix "cuda=${RAPIDS_CUDA_VERSION%.*};arch=$(arch);py=${RAPIDS_PY_VERSION}" \
| tee test-pytorch-requirements.txt
# Update this when 'torch' publishes CUDA wheels supporting newer CTKs.
#
# See notes in 'dependencies.yaml' for details on supported versions.
if \
{ [ "${CUDA_MAJOR}" -eq 12 ] && [ "${CUDA_MINOR}" -ge 6 ]; } \
|| { [ "${CUDA_MAJOR}" -eq 13 ] && [ "${CUDA_MINOR}" -le 0 ]; }; \
then

# ensure a CUDA variant of 'torch' is used
rapids-logger "Downloading PyTorch CUDA wheels"
TORCH_WHEEL_DIR="$(mktemp -d)"
./ci/download-torch-wheels.sh "${TORCH_WHEEL_DIR}"

rapids-logger "Installing PyTorch test requirements"
rapids-pip-retry install \
-v \
"${PIP_INSTALL_SHARED_ARGS[@]}" \
-r test-pytorch-requirements.txt
"${TORCH_WHEEL_DIR}"/torch-*.whl

timeout 15m python -m pytest -k "torch" ./python/rmm/rmm/tests \
&& EXITCODE_PYTORCH=$? || EXITCODE_PYTORCH=$?
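The `&& EXITCODE_PYTORCH=$? || EXITCODE_PYTORCH=$?` pattern above is worth a note: it records the command's exit status without tripping `set -e`, so the script can finish the log group and decide later whether to fail. A minimal sketch (`run_tests` is a stand-in for the pytest invocation):

```shell
# '&& VAR=$? || VAR=$?' captures the exit status either way: on success the
# '&&' branch runs; on failure '$?' still holds the failing status when the
# '||' branch assigns it. Because the command is part of an &&/|| list,
# 'set -e' does not abort the script.
set -e
run_tests() { return 3; }   # stand-in for the pytest invocation
run_tests && EXITCODE=$? || EXITCODE=$?
echo "captured exit code: ${EXITCODE}"   # prints 'captured exit code: 3'
```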
@@ -60,7 +63,7 @@ if [ "${CUDA_MAJOR}" -gt 12 ] || { [ "${CUDA_MAJOR}" -eq 12 ] && [ "${CUDA_MINOR
EXITCODE="${EXITCODE_PYTORCH}"
fi
else
rapids-logger "Skipping PyTorch tests (requires CUDA 12.8+, found ${RAPIDS_CUDA_VERSION})"
rapids-logger "Skipping PyTorch tests (requires CUDA 12.6-12.9 or 13.0, found ${RAPIDS_CUDA_VERSION})"
fi

echo "::endgroup::"
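The version window in that `if` condition can be factored out for illustration. A sketch (the `supports_pytorch` helper is hypothetical; the 12.6-12.9 / 13.0 window mirrors the script's comment and will need updating as torch publishes wheels for newer CTKs):

```shell
# Sketch of the gate above: run PyTorch tests only for CUDA 12.6-12.9 or 13.0.x.
supports_pytorch() {
  local ver="$1"
  local major="${ver%%.*}"    # e.g. 12 from 12.9.1
  local rest="${ver#*.}"
  local minor="${rest%%.*}"   # e.g. 9 from 12.9.1
  if { [ "${major}" -eq 12 ] && [ "${minor}" -ge 6 ]; } \
    || { [ "${major}" -eq 13 ] && [ "${minor}" -le 0 ]; }; then
    echo "run"
  else
    echo "skip"
  fi
}

supports_pytorch "12.9.1"   # prints run
supports_pytorch "13.1.1"   # prints skip
```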
@@ -71,7 +74,7 @@ rapids-logger "Generating CuPy test requirements"
rapids-dependency-file-generator \
--output requirements \
--file-key test_wheels_cupy \
--matrix "cuda=${RAPIDS_CUDA_VERSION%.*};arch=$(arch);py=${RAPIDS_PY_VERSION}" \
--matrix "cuda=${RAPIDS_CUDA_VERSION%.*};arch=$(arch);py=${RAPIDS_PY_VERSION};use_cuda_wheels=true" \
| tee test-cupy-requirements.txt

rapids-logger "Installing CuPy test requirements"
118 changes: 100 additions & 18 deletions dependencies.yaml
@@ -46,10 +46,6 @@ files:
- depends_on_cupy
- depends_on_librmm
- depends_on_rmm
test_wheels_pytorch:
output: none
includes:
- depends_on_pytorch
test_wheels_cupy:
output: none
includes:
@@ -131,6 +127,10 @@ files:
key: test
includes:
- test_python
torch_only:
output: none
includes:
- depends_on_pytorch
channels:
- rapidsai-nightly
- rapidsai
@@ -238,14 +238,12 @@ dependencies:
- output_types: conda
packages:
- &doxygen doxygen=1.9.1
# 'cuda_version' intentionally does not contain fallback entries... we want
# a loud error if an unsupported 'cuda' value is passed
cuda_version:
specific:
- output_types: conda
matrices:
- matrix:
cuda: "12.0"
packages:
- cuda-version=12.0
- matrix:
cuda: "12.2"
packages:
@@ -270,6 +268,51 @@
cuda: "13.1"
packages:
- cuda-version=13.1
- output_types: requirements
matrices:
# if use_cuda_wheels=false is provided, do not add dependencies on any CUDA wheels
# (e.g. for DLFW and pip devcontainers)
- matrix:
use_cuda_wheels: "false"
packages:
- matrix:
arch: aarch64
cuda: "12.2"
use_cuda_wheels: "true"
packages:
# some components (like nvidia-cublas-cu12 and nvidia-cuda-nvcc-cu12) didn't have
# aarch64 wheels until CTK 12.3, so allow a slightly looser bound here
- cuda-toolkit>=12.2,<12.4
- matrix:
cuda: "12.2"
use_cuda_wheels: "true"
packages:
- cuda-toolkit==12.2.*
- matrix:
cuda: "12.5"
use_cuda_wheels: "true"
packages:
- cuda-toolkit==12.5.*
- matrix:
cuda: "12.8"
use_cuda_wheels: "true"
packages:
- cuda-toolkit==12.8.*
- matrix:
cuda: "12.9"
use_cuda_wheels: "true"
packages:
- cuda-toolkit==12.9.*
- matrix:
cuda: "13.0"
use_cuda_wheels: "true"
packages:
- cuda-toolkit==13.0.*
- matrix:
cuda: "13.1"
use_cuda_wheels: "true"
packages:
- cuda-toolkit==13.1.*
develop:
common:
- output_types: conda
@@ -397,25 +440,64 @@ dependencies:
# pip recognizes the index as a global option for the requirements.txt file
- --extra-index-url=https://pypi.anaconda.org/rapidsai-wheels-nightly/simple
depends_on_pytorch:
common:
- output_types: conda
packages:
- pytorch-gpu>=2.10.0
specific:
- output_types: [requirements, pyproject]
- output_types: conda
matrices:
- matrix:
cuda: "12.*"
require_gpu_pytorch: "true"
packages:
- --extra-index-url=https://download.pytorch.org/whl/cu128
- pytorch-gpu>=2.9
- matrix:
packages:
- --extra-index-url=https://download.pytorch.org/whl/cu130
- output_types: [requirements, pyproject]
- pytorch>=2.9
# The 'pytorch.org' indices referenced in --extra-index-url below host CPU-only variants too,
# so requirements like '>=' are not safe.
#
# Using '==' and a version with the CUDA specifier like '+cu130' is the most reliable way to ensure
# the packages we want are pulled (at the expense of needing to maintain this list).
#
# 'torch' tightly pins wheels to a single {major}.{minor} CTK version.
#
# This list only contains entries exactly matching CUDA {major}.{minor} that we test in RAPIDS CI,
# to ensure a loud error alerts us to the need to update this list (or CI scripts) when new
# CTKs are added to the support matrix.
- output_types: requirements
matrices:
# avoid pulling in 'torch' in places like DLFW builds that prefer to install it other ways
- matrix:
no_pytorch: "true"
packages:
# matrices below ensure CUDA 'torch' packages are used
- matrix:
cuda: "12.9"
dependencies: "oldest"
require_gpu_pytorch: "true"
packages:
- &torch_cu129_index --extra-index-url=https://download.pytorch.org/whl/cu129
- torch==2.9.0+cu129
- matrix:
cuda: "12.9"
require_gpu_pytorch: "true"
packages:
- *torch_cu129_index
- torch==2.10.0+cu129
- matrix:
cuda: "13.0"
dependencies: "oldest"
require_gpu_pytorch: "true"
packages:
- &torch_index_cu13 --extra-index-url=https://download.pytorch.org/whl/cu130
- torch==2.9.0+cu130
- matrix:
cuda: "13.0"
require_gpu_pytorch: "true"
packages:
- *torch_index_cu13
- torch==2.10.0+cu130
- matrix:
require_gpu_pytorch: "false"
packages:
- torch>=2.10.0
- torch>=2.9
depends_on_cupy:
common:
- output_types: conda