TensorRT-LLM Release 0.18.0

kaiyux released this 02 Apr 09:08

3c04620

Hi,

We are very pleased to announce the 0.18.0 version of TensorRT-LLM. This update includes:

Key Features and Enhancements

Features that were previously available in the 0.18.0.dev pre-releases are not included in this release.
[BREAKING CHANGE] Windows platform support is deprecated as of v0.18.0. All Windows-related code and functionality will be completely removed in future releases.

Known Issues

The PyTorch workflow on SBSA is incompatible with bare-metal environments such as Ubuntu 24.04. Please use the PyTorch NGC Container for optimal support on SBSA platforms.

Infrastructure Changes

The base Docker image for TensorRT-LLM is updated to nvcr.io/nvidia/pytorch:25.03-py3.
The base Docker image for TensorRT-LLM Backend is updated to nvcr.io/nvidia/tritonserver:25.03-py3.
The dependent TensorRT version is updated to 10.9.
The dependent CUDA version is updated to 12.8.1.
The dependent NVIDIA ModelOpt version is updated to 0.25 for the Linux platform.
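
As a rough illustration of how the dependency versions listed above could be checked in a local Python environment, here is a minimal sketch. It is not part of TensorRT-LLM. The helper names (`matches_release`, `check_environment`) are hypothetical, and the distribution names `tensorrt` and `nvidia-modelopt` are assumptions about how the packages are installed in your environment. The CUDA version is not checked here.

```python
from importlib import metadata

# Versions taken from the release notes above. The distribution names
# (dictionary keys) are assumptions and may differ in your environment.
EXPECTED = {
    "tensorrt": "10.9",
    "nvidia-modelopt": "0.25",
}


def matches_release(installed: str, expected: str) -> bool:
    """Return True when `installed` begins with the dotted components of `expected`.

    Any local version suffix after "+" is ignored, so "0.25.0+cu128"
    is compared as "0.25.0".
    """
    installed_parts = installed.split("+")[0].split(".")
    expected_parts = expected.split(".")
    return installed_parts[: len(expected_parts)] == expected_parts


def check_environment(expected=EXPECTED):
    """Map each distribution name to (installed_version_or_None, matches)."""
    report = {}
    for dist, want in expected.items():
        try:
            have = metadata.version(dist)
        except metadata.PackageNotFoundError:
            # Distribution is not installed under this name.
            report[dist] = (None, False)
            continue
        report[dist] = (have, matches_release(have, want))
    return report


if __name__ == "__main__":
    for dist, (have, ok) in check_environment().items():
        status = "ok" if ok else "mismatch or missing"
        print(f"{dist}: installed={have} expected={EXPECTED[dist]} -> {status}")
```

The comparison is a component-wise prefix match rather than a full version parse, which keeps the sketch dependency-free. Inside the NGC containers named above, the listed versions should already be in place, so a check like this is mainly useful for custom environments.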
