[NVBUG: 5373030] Disable the weight adjustment for int32 bias from onnxruntime (#510)
## What does this PR do?
**Type of change:**
Bug Fix
**Overview:**
- Disable the weight adjustment for int32 bias in onnxruntime by default
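For context, a minimal sketch of how an int32 bias is typically quantized in QDQ-style quantization (this is an illustration of the general technique, not the actual modelopt/onnxruntime implementation): the bias scale is the product of the input and weight scales, and a "weight adjustment" pass may rescale weights so the resulting int32 bias stays in range. This PR turns that adjustment off by default.

```python
import numpy as np

def quantize_bias_int32(bias, input_scale, weight_scale):
    """Quantize a float bias to int32 using the combined scale
    (illustrative; function name and signature are hypothetical)."""
    bias_scale = input_scale * weight_scale
    q = np.round(bias / bias_scale)
    info = np.iinfo(np.int32)
    return np.clip(q, info.min, info.max).astype(np.int32)

bias = np.array([0.5, -1.25], dtype=np.float32)
q_bias = quantize_bias_int32(bias, input_scale=0.02, weight_scale=0.05)
# Dequantizing recovers the original bias when the scales are well chosen
deq = q_bias * (0.02 * 0.05)
```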
## Usage
```shell
python -m modelopt.onnx.quantization --onnx_path=code031_gemm_batch.onnx --simplify --calibration_eps trt --quantize_mode fp8 --disable_mha_qdq --high_precision_dtype fp16
```
## Testing
Verified that the code031_gemm_batch.onnx model quantizes successfully with the command above.
## Before your PR is "*Ready for review*"
- **Make sure you read and follow the [Contributor
guidelines](https://github.com/NVIDIA/TensorRT-Model-Optimizer/blob/main/CONTRIBUTING.md)**
and that your commits are signed.
- **Is this change backward compatible?**: Yes
- **Did you write any new necessary tests?**: No
- **Did you add or update any necessary documentation?**: Yes
- **Did you update
[Changelog](https://github.com/NVIDIA/TensorRT-Model-Optimizer/blob/main/CHANGELOG.rst)?**:
No
---------
Signed-off-by: ajrasane <[email protected]>