Closed
Labels
Feature Request (request for new functionality) · Module:Plugins (issues when using TensorRT plugins) · Module:Quantization (issues related to quantization) · triaged (issue has been triaged by maintainers)
Description
Hello,
There is a limitation that has held since the introduction of FP8 support (see the Limitations section in the release notes):
There are no optimized FP8 Convolutions for Group Convolutions and Depthwise Convolutions. Therefore, INT8 is still recommended for ConvNets containing these convolution ops.
Is this something that will be addressed at some point? If not, could you give a hint as to why?
Also, would it be technically sound and feasible to leverage the CUTLASS framework to implement these kernels and integrate them into TensorRT as a plugin (for a custom node)?
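For context, depthwise convolution is the grouped case where the number of groups equals the number of channels, so each input channel is convolved with its own single filter — this is the op the release-notes quote says has no optimized FP8 kernel. A minimal pure-Python sketch of the semantics (illustrative only, not the TensorRT or CUTLASS implementation; assumes stride 1 and no padding, with `x` shaped [C][H][W] and `w` shaped [C][kH][kW]):

```python
def depthwise_conv2d(x, w):
    """Naive depthwise 2D convolution: one filter per channel, stride 1, no padding."""
    C, H, W = len(x), len(x[0]), len(x[0][0])
    kH, kW = len(w[0]), len(w[0][0])
    oH, oW = H - kH + 1, W - kW + 1
    out = [[[0.0] * oW for _ in range(oH)] for _ in range(C)]
    for c in range(C):  # unlike a dense conv, channel c reads only filter c
        for i in range(oH):
            for j in range(oW):
                out[c][i][j] = sum(
                    x[c][i + di][j + dj] * w[c][di][dj]
                    for di in range(kH)
                    for dj in range(kW)
                )
    return out
```

Because each output channel touches only one input channel, the arithmetic intensity per output element is far lower than in a dense convolution, which is presumably why a dedicated kernel (rather than the generic FP8 GEMM path) is needed.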
Best,