Hello,
I was testing FP8 inference on convolution layers using Torch-TensorRT with TensorRT 10.12.0.36 on an H100 GPU.
Even though the model meets the FP8 convolution requirements (channel counts are multiples of 32), TensorRT still did not select FP8 convolution kernels, whereas FP8 works fine for Linear layers in the same environment.
Environment:
- TensorRT 10.12.0.36
- Torch-TensorRT 2.8.0
- CUDA 12.6
Here is the Python script:
fp8conv.py
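
For context, a minimal sketch along the lines of the attached script (assuming the standard nvidia-modelopt FP8 PTQ flow for Torch-TensorRT; the toy model below is a stand-in, not the exact contents of fp8conv.py):

```python
import torch
import torch.nn as nn
import torch_tensorrt
import modelopt.torch.quantization as mtq
from modelopt.torch.quantization.utils import export_torch_mode

# Toy model; all channel counts are multiples of 32, as required for FP8 conv
model = nn.Sequential(
    nn.Conv2d(32, 64, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(64, 64, kernel_size=3, padding=1),
).cuda().eval()

dummy_input = torch.randn(1, 32, 224, 224, device="cuda")

# FP8 post-training quantization: insert quantizers, then calibrate
def calibrate_loop(m):
    m(dummy_input)

mtq.quantize(model, mtq.FP8_DEFAULT_CFG, forward_loop=calibrate_loop)

with torch.no_grad():
    with export_torch_mode():
        exported = torch.export.export(model, (dummy_input,))
        trt_model = torch_tensorrt.dynamo.compile(
            exported,
            inputs=[dummy_input],
            enabled_precisions={torch.float8_e4m3fn},
            min_block_size=1,
        )
        out = trt_model(dummy_input)
```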
TensorBoard also shows that no Tensor Cores are being used.
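
The same check can be done without TensorBoard via the PyTorch profiler; FP8 Tensor Core kernels typically carry fp8/e4m3 substrings in their CUDA kernel names (a rough sketch, reusing trt_model and dummy_input from above):

```python
from torch.profiler import profile, ProfilerActivity

# List the CUDA kernels TensorRT actually dispatched
with profile(activities=[ProfilerActivity.CUDA]) as prof:
    trt_model(dummy_input)

print(prof.key_averages().table(sort_by="cuda_time_total", row_limit=20))
```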
