Unable to use FP8 convolution kernels with TensorRT 10.12.0.36 when running FP8 models on NVIDIA H100 #4560

@WenjingHuangHPC

Description

Hello,

I was testing FP8 inference on convolution layers using Torch-TensorRT with TensorRT 10.12.0.36 on an H100 GPU.
Even though the model structure meets the FP8 convolution requirements (channel counts that are multiples of 32), TensorRT still did not select FP8 convolution kernels, while FP8 works fine for Linear layers in the same environment.
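The multiple-of-32 alignment mentioned above is easy to verify per layer; a minimal sketch (the rule itself is taken from this report, and the function name is illustrative):

```python
def fp8_conv_aligned(in_channels: int, out_channels: int, multiple: int = 32) -> bool:
    """Check whether a conv layer's channel counts satisfy the
    multiple-of-32 alignment this report cites for FP8 conv kernels."""
    return in_channels % multiple == 0 and out_channels % multiple == 0

# A 64 -> 128 conv qualifies; a typical 3-channel input stem does not.
print(fp8_conv_aligned(64, 128))  # True
print(fp8_conv_aligned(3, 64))    # False
```

Note that the first conv of a typical vision model (3 input channels) never satisfies this, which may by itself keep TensorRT on higher-precision kernels for that layer.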

Environment:

  • TensorRT 10.12.0.36
  • Torch-TensorRT 2.8.0
  • CUDA: 12.6

Here is the Python script:
fp8conv.py
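The attached fp8conv.py is not reproduced here. For context, the FP8 format involved is presumably E4M3 (`torch.float8_e4m3fn` in PyTorch/Torch-TensorRT); the following is a rough pure-Python sketch of E4M3 round-to-nearest, purely to illustrate the numerics, not TensorRT's implementation:

```python
import math

def quantize_e4m3(x: float) -> float:
    """Round x to the nearest value representable in FP8 E4M3
    (4 exponent bits, 3 mantissa bits, max finite value 448)."""
    if x == 0.0:
        return 0.0
    sign = -1.0 if x < 0 else 1.0
    ax = min(abs(x), 448.0)          # saturate at the E4M3 max
    e = max(math.floor(math.log2(ax)), -6)  # -6 is the min normal exponent
    q = 2.0 ** (e - 3)               # spacing: 3 mantissa bits per binade
    return sign * min(round(ax / q) * q, 448.0)

print(quantize_e4m3(0.3))     # 0.3125 (nearest E4M3 value)
print(quantize_e4m3(1000.0))  # 448.0  (saturated)
```

With only 3 mantissa bits, quantization error is on the order of several percent, which is why FP8 inference flows calibrate per-tensor scales rather than casting raw values.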

TensorBoard shows that no Tensor Cores are being used.

(screenshot attached)
