Description
In mid-26.04, RAPIDS was building its wheels with v13.1.1 of the CUDA toolkit (including libnvJitLink 13.1.1) and directly linking against libnvJitLink for JIT-LTO (example: rapidsai/cuvs#1405).
This resulted in runtime issues in environments with v13.0.x of the CTK, like this:
libcugraph.so: undefined symbol: __nvJitLinkGetErrorLog_13_1, version libnvJitLink.so.13
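One way to confirm this kind of mismatch in a given environment is to ask the dynamic loader whether the installed libnvJitLink exports the symbol named in the error. The following is a minimal sketch using only Python's standard `ctypes` module; the helper name `probe_symbol` is hypothetical, and the library and symbol names are copied from the error message above. It only checks the default symbol lookup and does not inspect ELF symbol versions.

```python
import ctypes


def probe_symbol(library_name, symbol_name):
    """Report whether a shared library exports a symbol.

    Returns True or False when the library loads, and None when the
    library itself cannot be loaded.
    """
    try:
        lib = ctypes.CDLL(library_name)
    except OSError:
        return None  # the loader could not find or open the library
    try:
        lib[symbol_name]  # ctypes looks the symbol up when it is indexed
    except AttributeError:
        return False  # library loaded, but the symbol is missing
    return True


if __name__ == "__main__":
    # Names taken from the error message in this issue.
    result = probe_symbol("libnvJitLink.so.13", "__nvJitLinkGetErrorLog_13_1")
    messages = {
        None: "libnvJitLink.so.13 could not be loaded",
        False: "symbol missing: this nvJitLink is likely older than 13.1",
        True: "symbol present",
    }
    print(messages[result])
```

A `False` result would be consistent with a 13.0.x libnvJitLink being picked up at runtime by a library that was linked against 13.1.1.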
Requiring nvidia-nvjitlink>=13.1 at runtime would solve those issues, but it'd also make RAPIDS wheels incompatible with cuda-toolkit[nvjitlink]<13.1, which torch 2.10 (the latest release) pins to:
- wheels CI: stricter torch index selection, test oldest versions of dependencies cugraph-gnn#413
- (pytorch/pytorch - .github/scripts/generate_binary_build_matrix.py)
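The conflict between the two requirements can be illustrated with a small sketch. It uses simplified tuple comparison rather than a full PEP 440 implementation, the candidate versions are illustrative placeholders rather than real `nvidia-nvjitlink` releases, and the `>=13.0` floor for a CTK 13.0.x build is an assumption, not something stated in this issue.

```python
def parse(version):
    """Turn '13.1.1' into (13, 1, 1) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def satisfies(version, constraints):
    """Check a version against a list of (operator, bound) pairs."""
    checks = {
        ">=": lambda v, b: v >= b,
        "<": lambda v, b: v < b,
    }
    return all(checks[op](parse(version), parse(bound)) for op, bound in constraints)


# Illustrative candidates only; NOT a list of real nvidia-nvjitlink releases.
CANDIDATES = ["13.0.0", "13.0.1", "13.1.0", "13.1.1"]

TORCH_CAP = [("<", "13.1")]                # the torch 2.10 pin described above
FLOOR_IF_BUILT_ON_13_1 = [(">=", "13.1")]  # floor needed for a CTK 13.1.1 build
FLOOR_IF_BUILT_ON_13_0 = [(">=", "13.0")]  # ASSUMED floor for a CTK 13.0.x build


def installable(*constraint_sets):
    """Return the candidates that satisfy every constraint set at once."""
    merged = [c for constraints in constraint_sets for c in constraints]
    return [v for v in CANDIDATES if satisfies(v, merged)]


print(installable(TORCH_CAP, FLOOR_IF_BUILT_ON_13_1))  # no common version
print(installable(TORCH_CAP, FLOOR_IF_BUILT_ON_13_0))  # 13.0.x versions remain
```

With a `>=13.1` floor there is no version an installer could pick alongside torch, while a lower floor leaves the 13.0.x versions available.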
In an offline discussion with @bdice, @vyasr, and @divyegala, we tried several options (example: rapidsai/cuvs#1855) and decided to try building RAPIDS wheels against CTK 13.0.x for RAPIDS 26.04, to avoid losing compatibility with projects tightly pinned to earlier nvJitLink versions.
This tracks that work.
Benefits of this work
- allows RAPIDS to continue adopting JIT-LTO while also staying compatible with torch and other projects tightly pinning to earlier nvidia-nvjitlink versions
Acceptance Criteria
- all RAPIDS libraries build wheels against CTK 13.0
- RAPIDS conda builds continue to build against the latest CUDA 13 CTK RAPIDS supports (as of this writing, 13.1.1)
- RAPIDS devcontainers continue to support the latest CUDA 13 CTK RAPIDS supports
- RAPIDS CUDA 12 wheels continue to build against the latest CUDA 12 CTK RAPIDS supports (as of this writing, 12.9.1)
- cugraph-gnn wheels CI is successfully testing against CUDA 12 and CUDA 13 torch wheels
Approach
- ci-imgs changes (ci-wheel: restore CUDA 13.0.2 and 12.2.2 images ci-imgs#373)
- shared-workflows changes (WIP: wheels-build: build on CUDA 13.0 shared-workflows#510)
- library changes, in RAPIDS dependency order (paired with wheels CI: test mix of cuda-toolkit version in CI #256)
  - rmm (WIP: [NOT READY FOR REVIEW] build wheels with CUDA 13.0.x, test wheels against mix of CTK versions rmm#2270)
  - kvikio (WIP: build wheels with CUDA 13.0.x, test wheels against mix of CTK versions kvikio#942)
  - dask-cuda (not necessary: pure Python)
  - ucxx (WIP: build wheels with CUDA 13.0.x, test wheels against mix of CTK versions ucxx#604)
  - raft (WIP: build wheels with CUDA 13.0.x, test wheels against mix of CTK versions raft#2971)
  - cuvs (WIP: [NOT READY FOR REVIEW] enforce a floor on libnvjitlink, build wheels with CUDA 13.0.x, test wheels against mix of CTK versions cuvs#1862)
  - nvforest
  - cudf (WIP: [NOT READY FOR REVIEW] enforce a floor on libnvjitlink, build wheels with CUDA 13.0.x, test wheels against mix of CTK versions cudf#21671)
  - cuopt
  - cucim
  - cuxfilter
  - rapidsmpf
  - cugraph
  - cuml
  - cugraph-gnn (wheels CI: stricter torch index selection, test oldest versions of dependencies cugraph-gnn#413)
  - nx-cugraph (not necessary: pure Python)
- revert ci-imgs CUDA 12.2.2 images if we end up not needing them (ci-wheel: skip older CUDA versions ci-imgs#374)
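Rolling out the library changes "in RAPIDS dependency order" amounts to a topological sort of the libraries. The sketch below shows the idea with Python's standard `graphlib`; the dependency map is an illustrative, partial assumption covering only some of the libraries listed above, and is not an authoritative description of RAPIDS packaging.

```python
from graphlib import TopologicalSorter

# ASSUMPTION: an illustrative, partial map of "library -> libraries it
# depends on". It is not taken from this issue and omits several projects.
DEPENDS_ON = {
    "rmm": set(),
    "kvikio": set(),
    "ucxx": {"rmm"},
    "raft": {"rmm"},
    "cuvs": {"raft"},
    "cudf": {"rmm", "kvikio"},
    "cugraph": {"raft", "cudf"},
    "cuml": {"raft", "cuvs", "cudf"},
    "cugraph-gnn": {"cugraph"},
}


def rollout_order(depends_on):
    """Return one valid order in which every library follows its dependencies."""
    return list(TopologicalSorter(depends_on).static_order())


order = rollout_order(DEPENDS_ON)
print(order)
```

More than one valid order usually exists, so the printed list is just one ordering in which each library appears after everything it depends on.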
Notes
N/A