Description
I converted a DAT model into a TensorRT engine at fp16 precision, but when I perform inference with it, the output contains only nan values.
Environment
TensorRT Version: 10.0.1.6
NVIDIA GPU: RTX 3060
NVIDIA Driver Version: 551.61
CUDA Version: 12.3
CUDNN Version: 8.9.7.29
Operating System: Windows 11
Relevant Files
Model Link: 4x-Nomos8kDAT (.onnx format)
Steps To Reproduce
- Convert the model into an `fp16` engine:

trtexec --onnx=4xNomos8kDAT.onnx --saveEngine=4xNomos8kDAT-fp16.trt --shapes=input:1x3x128x128 --inputIOFormats=fp16:chw --outputIOFormats=fp16:chw --fp16
Perform Inference
- Code Used: https://github.com/Haoming02/TensorRT-Cpp/tree/bf16
- Replace every `__nv_bfloat16` with `half`, and `cuda_bf16.h` with `cuda_fp16.h`
- See only a pure black output
- When adding a debug log to the `outputData`, it simply prints `nan` (see the sketch after this list)
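For reference, the debug check on `outputData` looked roughly like the sketch below. This is a minimal reconstruction rather than the exact code from the linked repo; the device pointer `d_output`, the element count `outputSize`, and the helper name `dumpOutputStats` are placeholders, assuming the engine has already written its fp16 output into `d_output` via the usual enqueue call.

```cpp
// Minimal sketch (not the exact code from the linked repo) of the fp16 output
// check: copy the raw half output back to the host, convert to float, and
// count how many values are nan.
#include <cuda_fp16.h>
#include <cuda_runtime.h>
#include <cmath>
#include <cstdio>
#include <vector>

void dumpOutputStats(const half* d_output, size_t outputSize)
{
    // Copy the fp16 output buffer from device to host.
    std::vector<half> outputData(outputSize);
    cudaMemcpy(outputData.data(), d_output, outputSize * sizeof(half),
               cudaMemcpyDeviceToHost);
    if (outputData.empty()) return;

    // With the fp16 engine every element comes back as nan;
    // with the bf16 engine the values look sane.
    size_t nanCount = 0;
    for (size_t i = 0; i < outputSize; ++i)
    {
        const float v = __half2float(outputData[i]);
        if (std::isnan(v)) ++nanCount;
    }
    std::printf("output[0] = %f, nan count: %zu / %zu\n",
                __half2float(outputData[0]), nanCount, outputSize);
}
```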
Misc
Interestingly, if I convert the model into bf16 precision with the following command:

trtexec --onnx=4xNomos8kDAT.onnx --saveEngine=4xNomos8kDAT-bf16.trt --shapes=input:1x3x128x128 --inputIOFormats=bf16:chw --outputIOFormats=bf16:chw --bf16

and use the same code above to perform inference, the output is correct. So only fp16 causes the nan issue...
- The model size is ~120 MB for `fp32`, ~70 MB for `fp16`, and ~100 MB for `bf16`
- The inference speed is similar between `fp32` and `bf16`, but almost twice as fast for `fp16`
Previously, I also tried converting the model with TensorRT 8.6. When specifying the fp16 flag, it printed some warnings about inaccuracy; however, these warnings were not present when converting the model with TensorRT 10.0.