How to enable FP8 convolution in TensorRT 10.2 #3987

Hello,

I am using TensorRT 10.2 and noticed that normal FP8 convolution has been updated in this release.
However, when I build a simple Q/DQ + Conv ONNX model, the FP8 convolution is not selected. FP8 tactics are not even timed.

Here is the model I used. It was quantized with TensorRT-Model-Optimizer, and I ran it on an H100 GPU.
<img width="459" alt="image" src="https://github.com/NVIDIA/TensorRT/assets/68811848/0f9f6f09-485c-4d96-9074-0a3048e024b4">
- [simple_conv_fp8.onnx.zip](https://github.com/user-attachments/files/16123237/simple_conv_fp8.onnx.zip)
- `trtexec` command:
```
$ trtexec --onnx=simple_conv_fp8.onnx --fp16 --fp8 --profilingVerbosity=detailed --verbose --exportLayerInfo=layerinfo.json
```
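To check the exported `layerinfo.json` for any sign of FP8, a small script along these lines might help. This is only a sketch: it assumes the file is plain JSON and that an FP8 layer would carry the substring "FP8" somewhere in its datatype or tactic strings. It makes no assumption about the exact schema, and `find_fp8_mentions` and `report` are hypothetical helper names, not part of TensorRT.

```python
import json


def find_fp8_mentions(node, path="$"):
    """Recursively collect (path, value) pairs for string values that mention FP8.

    The walk is schema-agnostic: it visits every dict value and list item,
    so it does not depend on the exact layout trtexec writes.
    """
    hits = []
    if isinstance(node, dict):
        for key, value in node.items():
            hits.extend(find_fp8_mentions(value, f"{path}.{key}"))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            hits.extend(find_fp8_mentions(value, f"{path}[{index}]"))
    elif isinstance(node, str) and "fp8" in node.lower():
        # Assumption: FP8 kernels or tensors are labeled with "FP8" in some string field.
        hits.append((path, node))
    return hits


def report(json_path="layerinfo.json"):
    """Print every FP8 mention found in the exported layer info file."""
    with open(json_path) as f:
        info = json.load(f)
    hits = find_fp8_mentions(info)
    if not hits:
        print("No string in the layer info mentions FP8.")
    for path, value in hits:
        print(f"{path}: {value}")
    return hits
```

Calling `report("layerinfo.json")` after the `trtexec` run lists each match with its location in the file. Note that node names containing "fp8" would also match, so a hit is a hint to inspect that layer rather than proof that an FP8 kernel was chosen.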

