Conversation

@jameslamb
Member

@jameslamb commented Jan 8, 2026

use CUDA 13.1 devcontainers

Follow-up to #1677

In that PR, I forgot to switch devcontainer testing here back to CUDA 13.1 (I'd temporarily kept it at 13.0 because there weren't yet NCCL packages with CUDA 13.1 support). This fixes that.

react to cutlass removals in RAFT

rapidsai/raft#2916 removed headers used by cuVS and stopped exporting cutlass from RAFT.

This brings those headers and some related patches over here to cuVS.

Related: rapidsai/cuml#7658

@jameslamb added the improvement (Improves an existing functionality) and non-breaking (Introduces a non-breaking change) labels Jan 8, 2026
@copy-pr-bot

copy-pr-bot bot commented Jan 8, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.


@jameslamb
Member Author

/ok to test

@jameslamb
Member Author

/ok to test

@jameslamb changed the title from "WIP: prefer CUDA 13.1 devcontainers" to "prefer CUDA 13.1 devcontainers" Jan 8, 2026
@jameslamb
Member Author

This isn't working yet, but I'm taking it out of draft so I don't have to keep commenting /ok to test. Since many people are focused on this fix anyway, keeping it a draft wasn't sparing anyone many notifications.

@jameslamb marked this pull request as ready for review January 8, 2026 16:50
@jameslamb requested review from a team as code owners January 8, 2026 16:50
@jameslamb
Member Author

I suspect that in CI we'll see the same thing I now see locally in a cuda13.1-pip devcontainer:

/home/coder/cuvs/cpp/src/neighbors/detail/cagra/../../detail/../ivf_pq/../../cluster/detail/../../distance/detail/fused_distance_nn/cutlass_base.cuh:21:10: fatal error: raft/util/cutlass_utils.cuh: No such file or directory
   21 | #include <raft/util/cutlass_utils.cuh>  // RAFT_CUTLASS_TRY
      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.

rapidsai/raft#2916 removed that header from RAFT, but it's still included here in two places:

#include <raft/util/cutlass_utils.cuh> // RAFT_CUTLASS_TRY

#include <raft/util/cutlass_utils.cuh>

@divyegala requested a review from a team as a code owner January 8, 2026 17:16
@jameslamb changed the title from "prefer CUDA 13.1 devcontainers" to "prefer CUDA 13.1 devcontainers, react to some cutlass removals in RAFT" Jan 8, 2026
@jameslamb
Member Author

Here's the next error (reproducible locally in a cuda13.1-pip devcontainer too):

  sccache /usr/local/cuda/bin/nvcc -forward-unknown-to-host-compiler -DCCCL_DISABLE_PDL -DCUB_DISABLE_NAMESPACE_MAGIC -DCUB_IGNORE_NAMESPACE_MAGIC_ERROR -DCUTLASS_NAMESPACE=cuvs_cutlass -DCUVS_BUILD_MG_ALGOS -DLIBCUDACXX_ENABLE_EXPERIMENTAL_MEMORY_RESOURCE -DRAFT_LOG_ACTIVE_LEVEL=RAPIDS_LOGGER_LOG_LEVEL_INFO -DRAFT_SYSTEM_LITTLE_ENDIAN=1 -DTHRUST_DEVICE_SYSTEM=THRUST_DEVICE_SYSTEM_CUDA -DTHRUST_DISABLE_ABI_NAMESPACE -DTHRUST_HOST_SYSTEM=THRUST_HOST_SYSTEM_CPP -DTHRUST_IGNORE_ABI_NAMESPACE_ERROR -I/__w/cuvs/cuvs/cpp/include -I/__w/cuvs/cuvs/python/libcuvs/build/py3-none-linux_x86_64/cuvs-cpp/include -I/__w/cuvs/cuvs/python/libcuvs/build/py3-none-linux_x86_64/_deps/nvidiacutlass-src/include -I/__w/cuvs/cuvs/python/libcuvs/build/py3-none-linux_x86_64/_deps/nvidiacutlass-build/include -I/__w/cuvs/cuvs/python/libcuvs/build/py3-none-linux_x86_64/_deps/cccl-src/lib/cmake/thrust/../../../thrust -I/__w/cuvs/cuvs/python/libcuvs/build/py3-none-linux_x86_64/_deps/cccl-src/lib/cmake/libcudacxx/../../../libcudacxx/include -I/__w/cuvs/cuvs/python/libcuvs/build/py3-none-linux_x86_64/_deps/cccl-src/lib/cmake/cub/../../../cub -isystem /usr/local/cuda/include -isystem /pyenv/versions/3.13.11/lib/python3.13/site-packages/libraft/include -isystem /pyenv/versions/3.13.11/lib/python3.13/site-packages/rapids_logger/include -isystem /pyenv/versions/3.13.11/lib/python3.13/site-packages/librmm/include -isystem /usr/local/cuda/targets/x86_64-linux/include -isystem /usr/local/cuda/targets/x86_64-linux/include/cccl -O3 -DNDEBUG -std=c++20 "--generate-code=arch=compute_75,code=[sm_75]" "--generate-code=arch=compute_80,code=[sm_80]" "--generate-code=arch=compute_86,code=[sm_86]" "--generate-code=arch=compute_90a,code=[sm_90a]" "--generate-code=arch=compute_100f,code=[sm_100f]" "--generate-code=arch=compute_120a,code=[sm_120a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -Xcompiler=-Wno-deprecated-declarations -DRAFT_HIDE_DEPRECATION_WARNINGS -Xcompiler=-Wall,-Werror,-Wno-error=deprecated-declarations,-Wno-reorder -Werror=all-warnings --expt-extended-lambda --expt-relaxed-constexpr -DCUDA_API_PER_THREAD_DEFAULT_STREAM -Xfatbin=-compress-all --compress-mode=size -Xcompiler=-fopenmp -MD -MT cuvs-cpp/CMakeFiles/cuvs-cagra-search.dir/src/neighbors/cagra_search_float.cu.o -MF cuvs-cpp/CMakeFiles/cuvs-cagra-search.dir/src/neighbors/cagra_search_float.cu.o.d -x cu -rdc=true -c /__w/cuvs/cuvs/cpp/src/neighbors/cagra_search_float.cu -o cuvs-cpp/CMakeFiles/cuvs-cagra-search.dir/src/neighbors/cagra_search_float.cu.o
  /__w/cuvs/cuvs/python/libcuvs/build/py3-none-linux_x86_64/_deps/nvidiacutlass-src/include/cutlass/cuda_host_adapter.hpp(145): error: identifier "PFN_cuTensorMapEncodeTiled" is undefined
    ( "cuTensorMapEncodeTiled", &pfn, cudaEnableDefault, &cuda_status); if (cuda_status != cudaDriverEntryPointSuccess || cuda_err != cudaSuccess) { return CUDA_ERROR_UNKNOWN; } return reinterpret_cast<PFN_cuTensorMapEncodeTiled>(pfn)(args...); };
                                                                                                                                                                                                          ^

  /__w/cuvs/cuvs/python/libcuvs/build/py3-none-linux_x86_64/_deps/nvidiacutlass-src/include/cutlass/cuda_host_adapter.hpp(146): error: identifier "PFN_cuTensorMapEncodeIm2col" is undefined
    ( "cuTensorMapEncodeIm2col", &pfn, cudaEnableDefault, &cuda_status); if (cuda_status != cudaDriverEntryPointSuccess || cuda_err != cudaSuccess) { return CUDA_ERROR_UNKNOWN; } return reinterpret_cast<PFN_cuTensorMapEncodeIm2col>(pfn)(args...); };
                                                                                                                                                                                                           ^

  2 errors detected in the compilation of "/__w/cuvs/cuvs/cpp/src/neighbors/cagra_search_float.cu".

(build link)

Does bringing over these utilities from RAFT require declaring some other dependency? Or do we maybe need to change the flags cutlass is compiled with to avoid some codepaths? Or maybe these patches that RAFT had (https://github.com/rapidsai/raft/pull/2916/files#diff-5c9671513a52b5c647b9db97761ff01bb6d9a5d15ced9741a1760cec9d7dc9b7) need to be ported over here?
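On that last option: rapids-cmake applies per-package source patches through an override file, so porting RAFT's cutlass patches would mostly be wiring like the sketch below. The file locations and the override file's name here are assumptions for illustration, not necessarily what this PR ends up doing.

# cmake/thirdparty/get_cutlass.cmake (hypothetical location): register an
# override file whose "patches" entries carry the cutlass patches that RAFT
# applied, before the CPM call that fetches cutlass runs.
include("${rapids-cmake-dir}/cpm/package_override.cmake")
rapids_cpm_package_override("${CMAKE_CURRENT_LIST_DIR}/../patches/cutlass_override.json")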

Contributor

@bdice left a comment


Let’s fix all the raft names to cuvs since it’s defined in cuvs now.

"$<INSTALL_INTERFACE:include>"
)
target_link_libraries(cuvs_cpp_headers INTERFACE raft::raft rmm::rmm)
target_link_libraries(cuvs_cpp_headers INTERFACE nvidia::cutlass::cutlass raft::raft rmm::rmm)
Contributor


I don't believe we need cutlass here, as it isn't part of any public header.
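If that's right, cutlass can be dropped from the INTERFACE headers target and linked privately to the compiled library instead; a minimal sketch, assuming the compiled target is named cuvs:

# Headers target stays cutlass-free, since no installed header includes cutlass:
target_link_libraries(cuvs_cpp_headers INTERFACE raft::raft rmm::rmm)
# cutlass is only reached from .cu/.cpp implementation files, so a private
# link on the compiled library is sufficient:
target_link_libraries(cuvs PRIVATE nvidia::cutlass::cutlass)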

Member

@dantegd left a comment


Just had a couple of questions

Also, there are now two cutlass_utils.hpp files:

  • cpp/include/cuvs/util/cutlass_utils.hpp: a public header containing just the cutlass_error exception
  • cpp/src/util/cutlass_utils.hpp: an internal header with the CUVS_CUTLASS_TRY macro

This split makes sense (the exception is public API, the macro is internal), but it might confuse future contributors. Consider adding a comment in the internal header noting that it's separate from the public one.

- if(CUDA_STATIC_RUNTIME)
-   set(CUDART_LIBRARY "${CUDA_cudart_static_LIBRARY}" CACHE FILEPATH "fixing cutlass cmake code" FORCE)
- endif()
+ set(CUDART_LIBRARY "${CUDA_cudart_static_LIBRARY}" CACHE FILEPATH "fixing cutlass cmake code" FORCE)
Member


Why the change to always static?

Member


You mean the other way around? The conditional is removed now. This option just tells CUTLASS as much.

Contributor


The change was part of raft and was not mirrored to cuvs. The always-static cudart is part of this effort: rapidsai/build-planning#235


rapids_export_package(
  BUILD NvidiaCutlass cuvs-exports GLOBAL_TARGETS nvidia::cutlass::cutlass
)
Member Author


Jobs downstream of libcuvs builds are failing like this:

 │ │       -- Found rmm: $PREFIX/lib/cmake/rmm/rmm-config.cmake (found version "26.02.0")
 │ │       -- Found raft: $PREFIX/lib/cmake/raft/raft-config.cmake (found version "26.02.0")
 │ │       CMake Error at $BUILD_PREFIX/share/cmake-4.2/Modules/CMakeFindDependencyMacro.cmake:93 (find_package):
 │ │         By not providing "FindNvidiaCutlass.cmake" in CMAKE_MODULE_PATH this
 │ │         project has asked CMake to find a package configuration file provided by
 │ │         "NvidiaCutlass", but CMake did not find one.
 │ │       
 │ │         Could not find a package configuration file provided by "NvidiaCutlass"
 │ │         with any of the following names:
 │ │       
 │ │           NvidiaCutlassConfig.cmake
 │ │           nvidiacutlass-config.cmake
 │ │       

(build link)

I think I know what's happening, based on my read of https://docs.rapids.ai/api/rapids-cmake/legacy/dependency_tracking/

Since nvidia::cutlass::cutlass is a PRIVATE dependency of cuvs, we shouldn't be including it in the INSTALL export set and therefore generating a find_dependency(NvidiaCutlass).

I just pushed 9d102ce attempting to fix that.
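The rough shape of that fix (a sketch of the idea, not the exact commit): keep the build-tree export and drop the install-tree one, so the installed cuvs-config.cmake stops emitting find_dependency(NvidiaCutlass).

# Build-tree consumers can still resolve NvidiaCutlass...
rapids_export_package(
  BUILD NvidiaCutlass cuvs-exports GLOBAL_TARGETS nvidia::cutlass::cutlass
)
# ...but there is deliberately no matching INSTALL rapids_export_package()
# call: cutlass is a PRIVATE dependency of the installed shared library, so
# downstream find_package(cuvs) should not need it.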

Member Author


Looks like one more similar change for cuco (also a PRIVATE dependency) is needed:

 │ │       CMake Error at $BUILD_PREFIX/share/cmake-4.2/Modules/CMakeFindDependencyMacro.cmake:93 (find_package):
 │ │         By not providing "Findcuco.cmake" in CMAKE_MODULE_PATH this project has
 │ │         asked CMake to find a package configuration file provided by "cuco", but
 │ │         CMake did not find one.
 │ │       
 │ │         Could not find a package configuration file provided by "cuco" with any of
 │ │         the following names:
 │ │       
 │ │           cucoConfig.cmake
 │ │           cuco-config.cmake

(build link)

Pushed 523e320
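Same idea as the cutlass fix. If cuco comes in through rapids-cmake's pre-configured package, the sketch would be dropping the install export set from the fetch call (the exact call site is an assumption):

# Fetch cuco and record it only in the build-tree export set; as a PRIVATE
# dependency it should not appear in the install-tree cuvs-config.cmake.
include("${rapids-cmake-dir}/cpm/cuco.cmake")
rapids_cpm_cuco(BUILD_EXPORT_SET cuvs-exports)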

include("${rapids-cmake-dir}/export/find_package_root.cmake")
rapids_export_find_package_root(
BUILD NvidiaCutlass [=[${CMAKE_CURRENT_LIST_DIR}]=]
EXPORT_SET cuvs-exports
Contributor


This needs to be in the cuvs-static export set and not the general export set.

PRIVATE static dependencies are still exported.

Member Author


Ah ok, I did not realize that. Fixed in 59fd826
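For anyone following along, that fix presumably looks something like this; the cuvs-static-exports name is a guess at what the static export set is actually called:

include("${rapids-cmake-dir}/export/find_package_root.cmake")
rapids_export_find_package_root(
  BUILD NvidiaCutlass [=[${CMAKE_CURRENT_LIST_DIR}]=]
  # static builds propagate PRIVATE deps, so cutlass must stay findable there
  EXPORT_SET cuvs-static-exports
)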

@jameslamb
Member Author

/merge

@rapids-bot bot merged commit 6a8bbf9 into rapidsai:main Jan 10, 2026
99 checks passed