
TensorRT-LLM Release 0.18.0


@kaiyux released this 02 Apr 09:08
3c04620

Hi,

We are very pleased to announce the 0.18.0 version of TensorRT-LLM. This update includes:

Key Features and Enhancements

  • Features that were previously available in the 0.18.0.dev pre-releases are not included in this release.
  • [BREAKING CHANGE] Windows platform support is deprecated as of v0.18.0. All Windows-related code and functionality will be completely removed in future releases.

Known Issues

  • The PyTorch workflow on SBSA is incompatible with bare metal environments like Ubuntu 24.04. Please use the PyTorch NGC Container for optimal support on SBSA platforms.

Infrastructure Changes

  • The base Docker image for TensorRT-LLM is updated to nvcr.io/nvidia/pytorch:25.03-py3.
  • The base Docker image for TensorRT-LLM Backend is updated to nvcr.io/nvidia/tritonserver:25.03-py3.
  • The dependent TensorRT version is updated to 10.9.
  • The dependent CUDA version is updated to 12.8.1.
  • The dependent NVIDIA ModelOpt version is updated to 0.25 on Linux.
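
As a rough illustration of how the dependency versions listed above might be checked in a local environment, here is a minimal Python sketch. It is not part of TensorRT-LLM: the helper names (`parse`, `matches`, `check_environment`) are hypothetical, and the pip distribution names `tensorrt` and `nvidia-modelopt` are assumptions about how those packages are installed in your environment. The CUDA version is left out, since how it is reported depends on the setup.

```python
from importlib import metadata

# Versions quoted from the release notes above. The keys are assumed pip
# distribution names; adjust them to match your environment.
EXPECTED = {
    "tensorrt": "10.9",
    "nvidia-modelopt": "0.25",
}


def parse(version):
    """Turn '10.9.0.34' into (10, 9, 0, 34), stopping at the first non-numeric piece."""
    parts = []
    for piece in version.split("+")[0].split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def matches(installed, expected):
    """Return True when `installed` starts with the components of `expected`."""
    want = parse(expected)
    return parse(installed)[:len(want)] == want


def check_environment(expected=EXPECTED):
    """Map each package to (installed version or None, whether it matches)."""
    report = {}
    for package, want in expected.items():
        try:
            found = metadata.version(package)
        except metadata.PackageNotFoundError:
            report[package] = (None, False)
            continue
        report[package] = (found, matches(found, want))
    return report


if __name__ == "__main__":
    for package, (found, ok) in check_environment().items():
        status = "ok" if ok else "mismatch or missing"
        print(f"{package}: installed={found} expected={EXPECTED[package]} -> {status}")
```

The comparison is prefix-based on purpose: the notes give only major and minor components (for example 10.9), so any patch release of that series is treated as a match.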