Summary
segger segment crashes with a segmentation fault on systems running NVIDIA driver 590.x (CUDA 13.1). The crash is caused by the UCX communication library, which is loaded transitively through cugraph at import time. UCX calls cuCtxGetDevice_v2 in the system's libcuda.so.1 (CUDA 13.1 driver) before a CUDA context is initialized, causing a segfault before any segger code actually runs.
This issue is not fixable by adjusting the CUDA toolkit or conda environment — libcuda.so.1 is always the system-global driver library. It is also not a transient problem: cuspatial has been archived (July 2025, read-only) and will never receive CUDA 13 builds, meaning segger cannot run on any system with a CUDA 13.x driver without changes to its dependencies.
Environment
- OS: Linux (RHEL-based), x86_64
- GPU: 2× NVIDIA RTX A4000 (16 GB each)
- NVIDIA Driver: 590.48.01 (CUDA 13.1)
- Python: 3.11.15 (conda-forge)
- segger: 0.1.0 (installed from dpeerlab/segger main branch)
- PyTorch: 2.5.0+cu121 (works correctly — torch.cuda.is_available() returns True)
- RAPIDS: 25.4.x (cudf-cu12, cuml-cu12, cugraph-cu12, cuspatial-cu12)
- UCX: ucx-py-cu12 0.43.0, libucx-cu12 1.18.1
Reproducing the issue
segger segment -i /path/to/ist/data/ -o /path/to/output/
Immediately crashes with:
[photon:912752:0:912752] Caught signal 11 (Segmentation fault: Sent by the kernel at address (nil))
==== backtrace (tid: 912752) ====
0 .../libucs.so(ucs_handle_error+0x294)
...
4 /lib64/libcuda.so.1(+0x31a708)
5 /lib64/libcuda.so.1(cuCtxGetDevice_v2+0x20)
6 .../libffi.so.8(+0x702a)
...
Segmentation fault (core dumped)
Root cause analysis
The crash occurs in the NVIDIA driver's cuCtxGetDevice_v2 function, called via ctypes/libffi by the UCX library during Python module import. The import chain is:
segger segment
→ segger.cli.segment
→ segger.data.ISTDataModule
→ segger.data.utils.anndata
→ segger.data.utils.neighbors (line 8: `import cugraph`)
→ cugraph.__init__
→ cugraph.structure.graph_primtypes_wrapper
→ cugraph.dask.__init__
→ cugraph.dask.comms.comms
→ raft_dask.common.comms
→ UCX (libucp.so / libucs.so)
→ libcuda.so.1 cuCtxGetDevice_v2 ← SEGFAULT
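The diagnosis that the process dies from signal 11 (rather than raising a Python exception) can be confirmed independently of segger by running the offending import in a throwaway subprocess and inspecting the return code. A hedged sketch; the deliberate NULL-pointer read stands in for `import cugraph` so the snippet is runnable on any machine:

```python
import signal
import subprocess
import sys

def dies_by_segfault(code: str) -> bool:
    """Run `code` in a fresh interpreter and report whether the child was
    killed by SIGSEGV. subprocess reports death-by-signal-N as returncode -N."""
    proc = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,  # silence the child's stderr backtrace
    )
    return proc.returncode == -signal.SIGSEGV

# On an affected machine, the real check would be:
#     dies_by_segfault("import cugraph")
# Here a deliberate NULL-pointer read stands in so the snippet runs anywhere:
print(dies_by_segfault("import ctypes; ctypes.string_at(0)"))
```

This distinguishes the two failure modes seen below: a segfaulting import yields returncode -11, while an ImportError (e.g. after uninstalling UCX) yields a normal nonzero exit.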
Why it happens
- libcuda.so.1 is always system-global. It is provided by the NVIDIA kernel module and cannot be installed per-environment via conda or pip. On this system it is the CUDA 13.1 driver.
- UCX probes the CUDA driver at import time by calling cuCtxGetDevice_v2 before any CUDA context has been created. On the CUDA 13.1 driver, this results in a segfault instead of a graceful error return.
- RAPIDS cu12 packages ship UCX libraries compiled against CUDA 12.x, creating a mismatch with the CUDA 13.1 system driver.
- PyTorch handles this correctly — torch.cuda.is_available() works fine with the same driver, demonstrating that the CUDA 13.1 driver is functional and backward-compatible for well-behaved clients.
- cuspatial is archived and will never have CUDA 13 builds. The cuspatial repository was archived by RAPIDS on July 28, 2025. The cuspatial-cu13 entry on PyPI is a zero-version placeholder. This means segger's dependency on cuspatial is a permanent blocker for CUDA 13.x systems — not a temporary gap that will be filled by a future release.
- UCX is not needed by segger. UCX provides multi-node multi-GPU communication for Dask distributed workloads. Segger runs single-node and does not use Dask distributed, yet UCX is loaded unconditionally because cugraph imports its dask submodule at package init time.
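The contrast with PyTorch suggests a safe probing pattern: cuDriverGetVersion is documented by the CUDA driver API as callable before cuInit and touches no context, so it can be used as a preflight check even on drivers where context-dependent calls like cuCtxGetDevice crash. A sketch (returns None on machines without an NVIDIA driver):

```python
import ctypes

def driver_cuda_version():
    """Return the (major, minor) CUDA version supported by the installed
    NVIDIA driver, or None if libcuda.so.1 is absent or the call fails.
    cuDriverGetVersion needs no cuInit and no CUDA context, so it is safe
    even where context-dependent driver calls segfault."""
    try:
        libcuda = ctypes.CDLL("libcuda.so.1")
    except OSError:
        return None
    version = ctypes.c_int(0)
    if libcuda.cuDriverGetVersion(ctypes.byref(version)) != 0:
        return None
    # Version encoding per CUDA docs: e.g. 13010 -> (13, 1)
    return version.value // 1000, (version.value % 1000) // 10

print(driver_cuda_version())
```

On the system above this would report (13, 1), flagging the driver/cu12-package mismatch before any RAPIDS import is attempted.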
What was tried (and failed)
| Attempt | Result |
| --- | --- |
| Downgrade conda cuda-toolkit to 12.1 | Same segfault — the toolkit is irrelevant, libcuda.so.1 is always system-global |
| export UCX_MEMTYPE_CACHE=n; export UCX_TLS=tcp,self | Same segfault — crash happens before UCX config is read |
| unset LD_LIBRARY_PATH | Same segfault |
| Install RAPIDS via conda (mamba install -c rapidsai) | Same segfault — conda UCX also calls into system libcuda.so.1 |
| CUDA_VISIBLE_DEVICES="" | No segfault, but then no GPU is available for computation |
| Uninstall UCX packages (ucx-py-cu12, libucx-cu12, etc.) | No segfault, but import cugraph fails with ImportError: libucp.so.0, because cugraph unconditionally imports its Dask/distributed submodule, which requires UCX |
Possible solutions
- The most durable fix would be to replace cuspatial entirely, though replicating its spatial primitives would likely be substantial work.
- Alternatively, replace cugraph (or defer its import until a GPU graph is actually needed) so that UCX is never loaded; this seems fragile, since the unconditional dask import is a cugraph internal that could change in any release.
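Short of replacing either library, a guarded import in segger.data.utils.neighbors would at least turn the post-UCX-uninstall ImportError into a clean CPU fallback. This is a hypothetical sketch, not segger's actual code, and a try/except cannot catch the segfault itself; it only helps once the UCX packages are absent (or once cugraph defers its dask import upstream):

```python
# Hypothetical guarded import for segger.data.utils.neighbors (a sketch,
# not segger's current code). With the UCX packages uninstalled, the
# `import cugraph` raises ImportError instead of segfaulting, which this
# catches; with UCX present and a mismatched driver, the segfault still
# happens before Python can intervene.
try:
    import cugraph  # transitively loads UCX when its packages are installed
    HAS_CUGRAPH = True
except ImportError:
    cugraph = None
    HAS_CUGRAPH = False

def neighbor_backend() -> str:
    """Name the nearest-neighbor backend the pipeline should use."""
    return "cugraph" if HAS_CUGRAPH else "cpu"
```

A CPU path (e.g. scipy/sklearn nearest-neighbor search) behind this switch would let single-node runs proceed on CUDA 13.x systems while keeping the GPU path where it still works.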