Closed
Labels
Feature Request (request for new functionality) · Module:Plugins (issues when using TensorRT plugins) · Module:Quantization (issues related to quantization) · triaged (issue has been triaged by maintainers)
Description
Hello,
There is a limitation that has held since the introduction of FP8 support (see the Limitations section in the release notes):
There are no optimized FP8 Convolutions for Group Convolutions and Depthwise Convolutions. Therefore, INT8 is still recommended for ConvNets containing these convolution ops.
Is this something that will be addressed at some point? If not, could you give a hint as to why?
Also, would it be technically sound and feasible to leverage the CUTLASS framework to implement these kernels and integrate them into TensorRT as a plugin (for a custom node)?
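For context, depthwise convolution is the grouped case where the number of groups equals the number of channels, so each input channel is convolved with its own single filter — this is the op the release-notes quote says has no optimized FP8 kernel. A minimal pure-Python sketch of the semantics (illustrative only, not the TensorRT or CUTLASS implementation; assumes stride 1 and no padding, with `x` shaped [C][H][W] and `w` shaped [C][kH][kW]):

```python
def depthwise_conv2d(x, w):
    """Naive depthwise 2D convolution: one filter per channel, stride 1, no padding."""
    C, H, W = len(x), len(x[0]), len(x[0][0])
    kH, kW = len(w[0]), len(w[0][0])
    oH, oW = H - kH + 1, W - kW + 1
    out = [[[0.0] * oW for _ in range(oH)] for _ in range(C)]
    for c in range(C):  # unlike a dense conv, channel c reads only filter c
        for i in range(oH):
            for j in range(oW):
                out[c][i][j] = sum(
                    x[c][i + di][j + dj] * w[c][di][dj]
                    for di in range(kH)
                    for dj in range(kW)
                )
    return out
```

Because each output channel touches only one input channel, the arithmetic intensity per output element is far lower than in a dense convolution, which is presumably why a dedicated kernel (rather than the generic FP8 GEMM path) is needed.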
Best,