After converting the ONNX fp32 model to a TensorRT fp16 engine, numerical overflow occurs during inference. The model was trained in PyTorch with automatic mixed precision (AMP), and inference with the ONNX fp32 model works correctly. So it may be necessary to manually pin certain layers to fp32 in TensorRT. Is it possible to determine which TensorRT layers should stay in full precision from the dtypes that the ops actually ran in during PyTorch mixed-precision training? How can this be done?
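One way we imagine checking this on the PyTorch side (a minimal sketch, not something we have validated end to end) is to run a single forward pass under autocast and record the output dtype of every module with forward hooks; modules whose outputs stay `torch.float32` under autocast are candidates to pin to fp32 in TensorRT. The model/input names here are placeholders:

```python
import torch

def record_autocast_dtypes(model, example_input):
    """Run one forward pass under autocast and record each module's output dtype."""
    dtypes = {}

    def make_hook(name):
        def hook(module, inputs, output):
            if isinstance(output, torch.Tensor):
                dtypes[name] = output.dtype  # float16 vs float32 under autocast
        return hook

    handles = [m.register_forward_hook(make_hook(n)) for n, m in model.named_modules()]
    try:
        with torch.no_grad(), torch.autocast(device_type="cuda", dtype=torch.float16):
            model(example_input)
    finally:
        for h in handles:
            h.remove()
    return dtypes

# dtypes = record_autocast_dtypes(model.cuda().eval(), dummy_input.cuda())
# fp32_modules = [name for name, dt in dtypes.items() if dt == torch.float32]
```

Is mapping these module names back onto the corresponding ONNX/TensorRT layers a reasonable approach, or is there a more direct way?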
On the TensorRT side, we have already tried keeping some of the usual layers in fp32 (Softmax, LayerNorm, Sigmoid, ...), with no luck.
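For reference, this is roughly how we set the per-layer precision constraints (a sketch with the ONNX path as a placeholder; the exact `trt.LayerType` values available, e.g. `NORMALIZATION`, depend on the TensorRT version, and precision constraints are only honored when the corresponding builder flag is set):

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.INFO)
builder = trt.Builder(logger)
network = builder.create_network(1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)
with open("model.onnx", "rb") as f:
    parser.parse(f.read())

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)
# Without this flag the builder may ignore per-layer precision settings
config.set_flag(trt.BuilderFlag.OBEY_PRECISION_CONSTRAINTS)

# Layer types we tried to pin to fp32 (version-dependent set)
fp32_types = {trt.LayerType.SOFTMAX, trt.LayerType.ACTIVATION}

for i in range(network.num_layers):
    layer = network.get_layer(i)
    if layer.type in fp32_types:
        layer.precision = trt.float32
        layer.set_output_type(0, trt.float32)

engine_bytes = builder.build_serialized_network(network, config)
```

The engine still overflows with this, so either we are pinning the wrong layers or the constraints are not being applied the way we expect.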