Description
Hey,
Nice work on this project! I cannot run torchcomms with the NCCLX backend on B200 GPUs. I tried running the quick-start AllReduce example, but it fails with the following error:
Traceback (most recent call last):
  File "/root/dcl/benchmark/example.py", line 42, in <module>
    main()
  File "/root/dcl/benchmark/example.py", line 21, in main
    tensor = torch.full(
             ^^^^^^^^^^^
torch.AcceleratorError: CUDA error: named symbol not found
Search for `cudaErrorSymbolNotFound' in https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html for more information.
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
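Since the error message notes that the stack trace may be misleading, one way to follow its suggestion is to rerun the example with CUDA_LAUNCH_BLOCKING=1 so that kernel launches are synchronous and the failing call is reported where it happens. Below is a minimal sketch of such a rerun wrapper; the helper name `blocking_env` and the script name `rerun_blocking.py` are hypothetical, and it assumes the example can be launched directly with the Python interpreter.

```python
import os
import subprocess
import sys


def blocking_env(base=None):
    """Return a copy of the environment with synchronous CUDA launches enabled.

    `blocking_env` is a hypothetical helper name used only for this sketch.
    """
    env = dict(os.environ if base is None else base)
    # The variable has to be set before the CUDA context is created,
    # so it is passed to a fresh child process rather than set mid-run.
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env


if __name__ == "__main__" and len(sys.argv) > 1:
    # Assumed usage: python rerun_blocking.py /root/dcl/benchmark/example.py
    result = subprocess.run([sys.executable, *sys.argv[1:]], env=blocking_env())
    sys.exit(result.returncode)
```

If the traceback still points at torch.full with synchronous launches, the failure is more likely coming from that call itself than from an earlier asynchronous kernel.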
I installed torchcomms with the following command: pip install --pre torch torchcomms --index-url https://download.pytorch.org/whl/nightly/cu128
Note:
- Changing the backend to NCCL makes the example work.
- I have successfully used NCCLX on Ampere and Hopper, so my guess is that the error is specific to Blackwell.
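To help narrow down the Blackwell guess, here is a rough sketch that compares the device's compute capability with the architectures the installed PyTorch build was compiled for. It only inspects PyTorch's own kernels (via `torch.cuda.get_device_capability` and `torch.cuda.get_arch_list`), not the NCCLX binaries, so it can rule out the PyTorch wheel but cannot confirm anything about NCCLX itself. The helper `arch_supported` is a hypothetical name, and its matching rule (same-major `sm_XX` cubins or any older `compute_XX` PTX) is a simplification that ignores architecture-specific suffixes such as `a`.

```python
def arch_supported(capability, arch_list):
    """Roughly check whether any compiled arch can run on the given device.

    capability: (major, minor) tuple, e.g. (10, 0) for B200.
    arch_list:  strings such as "sm_90" or "compute_90".
    Hypothetical helper; the rule below is a simplification.
    """
    major, minor = capability
    device = major * 10 + minor
    for arch in arch_list:
        kind, _, suffix = arch.partition("_")
        digits = "".join(ch for ch in suffix if ch.isdigit())
        if not digits:
            continue
        value = int(digits)
        # Native cubins run on devices with the same major version
        # and an equal or higher minor version.
        if kind == "sm" and value // 10 == major and value <= device:
            return True
        # PTX can be JIT-compiled for any newer device.
        if kind == "compute" and value <= device:
            return True
    return False


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        torch = None
    if torch is not None and torch.cuda.is_available():
        cap = torch.cuda.get_device_capability(0)
        archs = torch.cuda.get_arch_list()
        print("device capability:", cap)
        print("compiled arch list:", archs)
        print("covered by this build:", arch_supported(cap, archs))
```

Because switching the backend to NCCL makes the example work, I would expect this check to pass for the PyTorch wheel, which would leave the NCCLX build as the remaining suspect.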
Machine Details
nvidia-smi
Sun Nov 9 01:32:20 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.195.03             Driver Version: 570.195.03     CUDA Version: 12.8     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA B200                    On  |   00000000:29:00.0 Off |                    0 |
| N/A   40C    P0            147W / 1000W |       0MiB / 183359MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA B200                    On  |   00000000:3A:00.0 Off |                    0 |
| N/A   31C    P0            144W / 1000W |       0MiB / 183359MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+
nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2025 NVIDIA Corporation
Built on Fri_Feb_21_20:23:50_PST_2025
Cuda compilation tools, release 12.8, V12.8.93
Build cuda_12.8.r12.8/compiler.35583870_0