
[Bug]: Impossible to install with cuda 13 on linux #9318

@skier233

Description


System Info


```shell
# Install torch/torchvision
pip install torch==2.9.0 torchvision --index-url https://download.pytorch.org/whl/cu130
python - <<'PY'
import torch, os
print("torch", torch.__version__, "built with CUDA", torch.version.cuda)
print("LD_LIBRARY_PATH =", os.environ.get("LD_LIBRARY_PATH"))
PY
```

Output:

```
torch 2.9.0+cu130 built with CUDA 13.0
LD_LIBRARY_PATH = None
```

Then install TensorRT-LLM:

```shell
pip install --extra-index-url https://pypi.nvidia.com tensorrt-llm==1.2.0rc2
```

This pulls in TensorRT-LLM's CUDA 13.0 dependencies, but it also downgrades PyTorch to the CUDA 12.8 build:

```shell
python - <<'PY'
import torch, os
print("torch", torch.__version__, "built with CUDA", torch.version.cuda)
print("LD_LIBRARY_PATH =", os.environ.get("LD_LIBRARY_PATH"))
PY
```

Output:

```
torch 2.9.0+cu128 built with CUDA 12.8
LD_LIBRARY_PATH = None
```

This makes the environment unusable: torch now expects CUDA 12.8, while tensorrt-llm has replaced the underlying CUDA runtime with 13.0.
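One way to see which CUDA runtime wheels the two installs left behind is to list the `nvidia-*` packages pip has installed. A minimal sketch (the helper name `cuda_majors` is mine, not part of any tool); it parses the `-cuNN` suffix NVIDIA uses on its PyPI wheel names:

```python
import re
from importlib import metadata

def cuda_majors(names):
    """Extract CUDA major versions from wheel names like 'nvidia-cuda-runtime-cu13'."""
    majors = set()
    for name in names:
        m = re.search(r"-cu(\d+)$", name)
        if m:
            majors.add(int(m.group(1)))
    return majors

if __name__ == "__main__":
    installed = [d.metadata["Name"] for d in metadata.distributions()]
    print("nvidia wheels:", sorted(n for n in installed if n and n.startswith("nvidia-")))
    print("CUDA majors present:", cuda_majors(n for n in installed if n))
```

If both majors show up at once (e.g. `{12, 13}`), the environment is already in the mixed state described above.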

This manifests in the following error:

```
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/mnt/c/Coding/Testing/PyTorch/MultiClassImageClassification/src/modern_imclas/export_tensorrt.py", line 307, in <module>
    main()
  File "/mnt/c/Coding/Testing/PyTorch/MultiClassImageClassification/src/modern_imclas/export_tensorrt.py", line 294, in main
    compile_with_torch_trt(
  File "/mnt/c/Coding/Testing/PyTorch/MultiClassImageClassification/src/modern_imclas/export_tensorrt.py", line 200, in compile_with_torch_trt
    import torch_tensorrt  # type: ignore
    ^^^^^^^^^^^^^^^^^^^^^
  File "/home/tyler/miniconda3/envs/new_training_env/lib/python3.12/site-packages/torch_tensorrt/__init__.py", line 79, in <module>
    _register_with_torch()
  File "/home/tyler/miniconda3/envs/new_training_env/lib/python3.12/site-packages/torch_tensorrt/__init__.py", line 66, in _register_with_torch
    torch.ops.load_library(linked_file_full_path)
  File "/home/tyler/miniconda3/envs/new_training_env/lib/python3.12/site-packages/torch/_ops.py", line 1392, in load_library
    ctypes.CDLL(path)
  File "/home/tyler/miniconda3/envs/new_training_env/lib/python3.12/ctypes/__init__.py", line 379, in __init__
    self._handle = _dlopen(self._name, mode)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^
OSError: libcudart.so.13: cannot open shared object file: No such file or directory
```
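The mismatch is visible in the version strings themselves. A small sketch (helper names are mine, and it assumes the last digit of a `cuNNN` tag is the minor version, as in `cu128`/`cu130`) that extracts the CUDA major from a torch build tag and from a shared-object name like the one in the traceback:

```python
import re

def torch_cuda_major(version):
    """CUDA major from a torch version string like '2.9.0+cu128' (cu128 -> 12)."""
    m = re.search(r"\+cu(\d+)", version)
    if not m:
        return None
    digits = m.group(1)
    return int(digits[:-1])  # 'cu128' -> 12, 'cu130' -> 13

def soname_major(soname):
    """Major version from a shared-object name like 'libcudart.so.13'."""
    m = re.search(r"\.so\.(\d+)", soname)
    return int(m.group(1)) if m else None

print(torch_cuda_major("2.9.0+cu128"))   # 12
print(soname_major("libcudart.so.13"))   # 13
```

12 vs. 13 is exactly the failure above: torch_tensorrt was linked against `libcudart.so.13`, but only the CUDA 12 runtime wheels remain after the downgrade.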

I have not been able to find any version combination of torch, torchvision, torch-tensorrt, CUDA, and tensorrt-llm that works together.

Who can help?

@kaiyux

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction


```shell
# Install torch/torchvision
pip install torch==2.9.0 torchvision --index-url https://download.pytorch.org/whl/cu130
python - <<'PY'
import torch, os
print("torch", torch.__version__, "built with CUDA", torch.version.cuda)
print("LD_LIBRARY_PATH =", os.environ.get("LD_LIBRARY_PATH"))
PY
```

Output:

```
torch 2.9.0+cu130 built with CUDA 13.0
LD_LIBRARY_PATH = None
```

Then install TensorRT-LLM:

```shell
pip install --extra-index-url https://pypi.nvidia.com tensorrt-llm==1.2.0rc2
```

This pulls in TensorRT-LLM's CUDA 13.0 dependencies, but it also downgrades PyTorch to the CUDA 12.8 build:

```shell
python - <<'PY'
import torch, os
print("torch", torch.__version__, "built with CUDA", torch.version.cuda)
print("LD_LIBRARY_PATH =", os.environ.get("LD_LIBRARY_PATH"))
PY
```

Output:

```
torch 2.9.0+cu128 built with CUDA 12.8
LD_LIBRARY_PATH = None
```

This makes the environment unusable: torch now expects CUDA 12.8, while tensorrt-llm has replaced the underlying CUDA runtime with 13.0.

Expected behavior

The CUDA version torch is built against and the CUDA runtime installed by tensorrt-llm should align.

actual behavior

The installed CUDA runtime version and the CUDA version the PyTorch build expects are misaligned, making the release unusable.
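A misalignment like this could be caught up front, before any native library is loaded. A hypothetical preflight sketch (all function names are mine) that compares `torch.version.cuda` against the CUDA majors advertised by installed `nvidia-*` wheel names:

```python
import re

def torch_runtime_major(version_cuda):
    """torch.version.cuda is a string like '12.8'; return its major."""
    return int(version_cuda.split(".")[0])

def wheel_majors(package_names):
    """CUDA majors from installed wheel names, e.g. 'nvidia-cuda-runtime-cu13' -> 13."""
    return {int(m.group(1)) for n in package_names
            for m in [re.search(r"-cu(\d+)$", n)] if m}

def aligned(version_cuda, package_names):
    """True if torch's CUDA major matches the installed runtime wheels (or none exist)."""
    majors = wheel_majors(package_names)
    return not majors or torch_runtime_major(version_cuda) in majors

# The broken state from this report: torch built for 12.8, runtime wheels for 13.
print(aligned("12.8", ["nvidia-cuda-runtime-cu13"]))  # False
print(aligned("13.0", ["nvidia-cuda-runtime-cu13"]))  # True
```

In a live environment the package list would come from `importlib.metadata.distributions()` and the version string from `torch.version.cuda`.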

additional notes

n/a

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and checked the documentation and examples for answers to frequently asked questions.

Metadata

Labels

Installation — Setting up and building TRTLLM: compilation, pip install, dependencies, env config, CMake.
bug — Something isn't working
