[5620217] Mixed-precision: handle 8-bit layer name matching error (#535)
## What does this PR do?
Handles an 8-bit layer name matching error that occurs when running with a
mixed-precision config.
**Type of change:** Bug fix
**Overview:** Depending on the export method, a model's
`weight_tensor.name` may appear either as an opaque ID (e.g.
`onnx::MatMul_9335`) or as a full dotted name (e.g.
`model.layers.2.attn.qkv_proj.MatMul.weight`). The comparison of
`8bit_layers` against node names is adjusted to handle both forms.
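The comparison described above can be sketched as follows. This is an illustrative helper, not the PR's actual implementation; the function name `weight_matches_layer` and the exact normalization rules are assumptions based on the two naming styles mentioned above.

```python
def weight_matches_layer(weight_name: str, layer_names: set[str]) -> bool:
    """Check whether an ONNX weight tensor name corresponds to one of the
    8-bit layers, tolerating both export naming styles (hypothetical sketch).
    """
    # Style 1: full dotted name, e.g. "model.layers.2.attn.qkv_proj.MatMul.weight".
    # Strip the trailing ".weight" so it can be compared to the node name.
    if weight_name.endswith(".weight"):
        base = weight_name.rsplit(".weight", 1)[0]
    else:
        # Style 2: opaque ID, e.g. "onnx::MatMul_9335" - nothing to strip.
        base = weight_name
    if base in layer_names:
        return True
    # Fallback: tolerate partial matches when one side carries extra
    # prefixes/suffixes introduced by the exporter.
    return any(base in name or name in base for name in layer_names)
```

For example, `model.layers.2.attn.qkv_proj.MatMul.weight` would match an 8-bit layer entry `model.layers.2.attn.qkv_proj.MatMul`, while an unrelated weight would not.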
## Testing
- Tested using mixed_int4_experiment.py
- Executed with the model downloaded from
onnx-community/Qwen2.5-1.5B-Instruct
- Also tested with the onnxruntime-genai exported model from
meta-llama/Llama-3.1-8B-Instruct
## Before your PR is "*Ready for review*"
<!-- If you haven't finished some of the above items you can still open
`Draft` PR. -->
- **Make sure you read and follow [Contributor
guidelines](https://github.com/NVIDIA/TensorRT-Model-Optimizer/blob/main/CONTRIBUTING.md)**
and your commits are signed.
- **Is this change backward compatible?**: Yes
- **Did you write any new necessary tests?**: Yes/No
- **Did you add or update any necessary documentation?**: Yes/No
- **Did you update
[Changelog](https://github.com/NVIDIA/TensorRT-Model-Optimizer/blob/main/CHANGELOG.rst)?**:
Yes/No <!--- Only for new features, API changes, critical bug fixes or
bw breaking changes. -->
## Additional Information
<!-- E.g. related issue. -->
Signed-off-by: unknown <[email protected]>