Conversion from onnx hurts accuracy when model is using fp16 #2922

@omera-nv

Description

I have an ONNX model (a T5 encoder that I exported from PyTorch) that I want to convert to TensorRT. The fp32 conversion works great, but when I build the engine in fp16 the model's accuracy collapses and it produces nothing useful. I also tried converting the ONNX model to fp16 before converting it to TensorRT: the fp16 ONNX model's accuracy is good, but the conversion to TensorRT once again hurts it badly.

Environment

TensorRT Version: 8.5.3.1
NVIDIA GPU: NVIDIA RTX A4000
NVIDIA Driver Version: 515.43.04
CUDA Version: 11.7
CUDNN Version: 8.5.0
Operating System: Ubuntu 22.04
Python Version (if applicable): 3.10
Tensorflow Version (if applicable): N/A
PyTorch Version (if applicable): 2.0.0
Baremetal or Container (if so, version): Baremetal

Relevant Files

The fp32 and fp16 onnx models can be downloaded from this link: https://drive.google.com/drive/folders/1zeAW2oPP-2VwnK-SKcRVVqed30BZLMKk?usp=sharing

Steps To Reproduce

This can be reproduced using polygraphy (after making tensorrt use np.bool_ instead of np.bool):

polygraphy run t5_fp32_encoder.onnx --onnxrt --trt
polygraphy run t5_fp32_encoder.onnx --onnxrt --trt --fp16
polygraphy run t5_fp16_encoder.onnx --onnxrt --trt --fp16

The first command converts the fp32 model and works. The second and third commands build a TensorRT fp16 engine from the fp32 model and from the fp16 model, respectively, and both fail.
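The `np.bool` note in the reproduction steps refers to NumPy having removed the deprecated `np.bool` alias in release 1.24, so older code that still references it raises `AttributeError`. The issue describes changing TensorRT itself; the sketch below is an alternative that is assumed here, not taken from the issue: restore the alias at runtime before importing `tensorrt`. The helper name `restore_bool_alias` is hypothetical, and the demo uses a stand-in object so it runs without NumPy installed.

```python
import types


def restore_bool_alias(np_module):
    """Add a `bool` attribute to a NumPy-like module if it is missing.

    Hypothetical helper: it points `bool` at `bool_`, mirroring the
    np.bool -> np.bool_ substitution described in the issue.
    """
    if not hasattr(np_module, "bool"):
        np_module.bool = np_module.bool_
    return np_module


# Stand-in for a NumPy build that lacks the alias (assumption for the demo).
fake_numpy = types.SimpleNamespace(bool_="numpy.bool_ placeholder")
restore_bool_alias(fake_numpy)
print(fake_numpy.bool)

# With a real installation the intended order would be:
#   import numpy as np
#   restore_bool_alias(np)
#   import tensorrt  # imported only after the alias exists
```

An existing `bool` attribute is left untouched, so the helper should be harmless on NumPy versions that still provide (or have reintroduced) the name.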

EDIT: I just noticed that the entire output of the fp16 TensorRT model is zeros, as can be seen in the following line of the Polygraphy output:

...
[I]         trt-runner-N0-05/01/23-14:36:19: encoder_last_hidden_state | Stats: mean=2.2204e-16, std-dev=0, var=0, median=2.2204e-16, min=2.2204e-16 at (0, 0, 0), max=2.2204e-16 at (0, 0, 0), avg-magnitude=2.2204e-16
...
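To check for this kind of collapsed output automatically, a stats line like the one above can be parsed and tested for zero spread. This is a minimal sketch that assumes the log format shown in the excerpt; the function names `parse_stats` and `looks_degenerate` and the tolerance are illustrative choices, not part of Polygraphy.

```python
import re

# Matches "name=value" pairs as they appear in the excerpt above.
STAT_RE = re.compile(
    r"(mean|std-dev|var|median|min|max|avg-magnitude)=([-+0-9.eE]+|nan|inf)"
)


def parse_stats(line):
    """Return the first value found for each statistic in a 'Stats:' line."""
    stats = {}
    for key, value in STAT_RE.findall(line):
        stats.setdefault(key, float(value))
    return stats


def looks_degenerate(stats, tol=1e-12):
    """True when the tensor is constant (std-dev of 0) and within tol of zero."""
    return stats["std-dev"] == 0.0 and abs(stats["mean"]) <= tol


LINE = (
    "encoder_last_hidden_state | Stats: mean=2.2204e-16, std-dev=0, var=0, "
    "median=2.2204e-16, min=2.2204e-16 at (0, 0, 0), "
    "max=2.2204e-16 at (0, 0, 0), avg-magnitude=2.2204e-16"
)

stats = parse_stats(LINE)
print(stats["mean"], stats["std-dev"], looks_degenerate(stats))
```

For the line from the issue, the check reports a degenerate output: every statistic is the same tiny constant and the standard deviation is 0. Incidentally, 2.2204e-16 matches float64 machine epsilon (`sys.float_info.epsilon`) to the printed precision; whether that is significant is not established here.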

Labels: triaged (issue has been triaged by maintainers)