# [5620660][ONNX] Remove toposort after quantization (#524)
## What does this PR do?
**Type of change:** Bug fix
**Overview:** Loading a model with ONNX GraphSurgeon after quantization
and FP16 conversion produced an ONNX model with an FP16 graph output
instead of FP32, even though the Cast_to_fp32 node had been correctly
placed at the graph output. This PR fixes the issue by removing the
toposort step performed after quantization.
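The regression can be observed by inspecting the graph output dtype with ONNX GraphSurgeon. A minimal sketch, assuming a model already produced by the quantization command in the Usage section below (the path `quantized_model.onnx` is a placeholder, not part of this PR):

```python
import onnx
import onnx_graphsurgeon as gs

# Placeholder path: a model produced by quantization + FP16 conversion.
model = onnx.load("quantized_model.onnx")
graph = gs.import_onnx(model)

# Before this fix, the graph output reported float16 even though the
# Cast_to_fp32 node feeds it; after the fix it should report float32.
for out in graph.outputs:
    print(out.name, out.dtype)
```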
## Usage
```bash
$ python -m modelopt.onnx.quantization --onnx_path=$MODEL_NAME.onnx --high_precision_dtype=fp16
```
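For reference, a minimal sketch of the equivalent Python call, assuming the `quantize` entry point accepts the same options as the CLI flags above (check the ModelOpt docs for the exact signature):

```python
from modelopt.onnx.quantization import quantize

# Assumption: quantize() mirrors the CLI flags shown above;
# "model.onnx" is a placeholder path.
quantize(onnx_path="model.onnx", high_precision_dtype="fp16")
```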
## Testing
See bug 5620660.
## Before your PR is "*Ready for review*"
- **Make sure you read and follow [Contributor
guidelines](https://github.com/NVIDIA/TensorRT-Model-Optimizer/blob/main/CONTRIBUTING.md)**
and your commits are signed.
- **Is this change backward compatible?**: Yes
- **Did you write any new necessary tests?**: No
- **Did you add or update any necessary documentation?**: No
- **Did you update
[Changelog](https://github.com/NVIDIA/TensorRT-Model-Optimizer/blob/main/CHANGELOG.rst)?**:
No
Signed-off-by: gcunhase <[email protected]>