For more information, see the TensorRT 10.15 Release Notes.
Sample changes
- Added two safety samples, sampleSafeMNIST and sampleSafePluginV3, to demonstrate how to use TensorRT with the safety workflow.
- Added trtSafeExec to accompany the safety workflow release.
- Added python/stream_writer to showcase how to serialize a TensorRT engine directly to a custom stream using the IStreamWriter interface, rather than writing to a file or a contiguous memory buffer (see the first sketch after this list).
- Added python/strongly_type_autocast to demonstrate how to convert FP32 ONNX models to mixed precision (FP32-FP16) using ModelOpt's AutoCast tool and then build the engine with TensorRT's strongly typed mode (see the second sketch after this list).
- Added sampleCudla to demonstrate how to use the cuDLA API to run TensorRT engines on the Deep Learning Accelerator (DLA) hardware, which is available on NVIDIA Jetson and DRIVE platforms.
- Deprecated sampleCharRNN.
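
  A minimal sketch of the stream-serialization idea behind python/stream_writer (not the shipped sample): a custom IStreamWriter subclass forwards serialized engine bytes to an open file object. The Python signature of write() and the build_serialized_network_to_stream entry point are assumptions mirroring the C++ IStreamWriter and IBuilder::buildSerializedNetworkToStream APIs.

  ```python
  import tensorrt as trt

  class FileStreamWriter(trt.IStreamWriter):
      """Receives chunks of the serialized engine and forwards them to a file."""

      def __init__(self, fileobj):
          super().__init__()
          self._file = fileobj

      def write(self, data):
          # The interface expects the number of bytes actually written
          # (signature assumed from the C++ IStreamWriter::write).
          return self._file.write(data)

  logger = trt.Logger(trt.Logger.WARNING)
  builder = trt.Builder(logger)
  network = builder.create_network()
  # ... populate `network`, e.g. with trt.OnnxParser ...
  config = builder.create_builder_config()

  with open("model.engine", "wb") as f:
      writer = FileStreamWriter(f)
      # Assumed Python binding of IBuilder::buildSerializedNetworkToStream.
      builder.build_serialized_network_to_stream(network, config, writer)
  ```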
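
  A minimal sketch of the TensorRT side of the python/strongly_type_autocast workflow: after ModelOpt's AutoCast has produced a mixed-precision ONNX model (that step is not shown here), the engine is built with the strongly typed network creation flag so layer precisions follow the types recorded in the model rather than builder heuristics. The input filename is a hypothetical placeholder.

  ```python
  import tensorrt as trt

  logger = trt.Logger(trt.Logger.WARNING)
  builder = trt.Builder(logger)
  # Strongly typed mode: the network's tensor types dictate layer precisions.
  network = builder.create_network(
      1 << int(trt.NetworkDefinitionCreationFlag.STRONGLY_TYPED)
  )
  parser = trt.OnnxParser(network, logger)

  with open("model_autocast.onnx", "rb") as f:  # hypothetical AutoCast output
      if not parser.parse(f.read()):
          for i in range(parser.num_errors):
              print(parser.get_error(i))
          raise RuntimeError("Failed to parse the mixed-precision ONNX model")

  config = builder.create_builder_config()
  engine_bytes = builder.build_serialized_network(network, config)
  with open("model.engine", "wb") as f:
      f.write(engine_bytes)
  ```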
Plugin changes
- Deprecated bertQKVToContextPlugin; it will be removed in a future release. No alternative is planned.
Parser changes
- Added support for the RotaryEmbedding, RMSNormalization, and TensorScatter operators for improved LLM support.
- Added support for more specialized quantization ops used by models quantized through TensorRT Model Optimizer.
- Added the kREPORT_CAPABILITY_DLA flag to enable per-node validation when building DLA engines through TensorRT.
- Added the kENABLE_PLUGIN_OVERRIDE flag to enable TensorRT plugin overrides for nodes that share names with user plugins (both flags are shown in the sketch after this list).
- Improved error reporting for models with multiple subgraphs, such as those introduced by Loop or Scan nodes.
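
  A minimal sketch of enabling the two new parser flags from Python. The enum member names REPORT_CAPABILITY_DLA and ENABLE_PLUGIN_OVERRIDE are assumptions inferred from the C++ flag names; consult the ONNX parser documentation for the exact bindings.

  ```python
  import tensorrt as trt

  logger = trt.Logger(trt.Logger.WARNING)
  builder = trt.Builder(logger)
  network = builder.create_network()
  parser = trt.OnnxParser(network, logger)

  # Report per-node DLA capability while parsing (assumed enum name).
  parser.set_flag(trt.OnnxParserFlag.REPORT_CAPABILITY_DLA)
  # Let user plugins override nodes that share their names (assumed enum name).
  parser.set_flag(trt.OnnxParserFlag.ENABLE_PLUGIN_OVERRIDE)

  with open("model.onnx", "rb") as f:
      if not parser.parse(f.read()):
          for i in range(parser.num_errors):
              print(parser.get_error(i))
  ```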
Demo changes
- demoDiffusion: The Stable Diffusion 1.5, 2.0, and 2.1 pipelines have been deprecated and removed.