Commit 9cd0824
Make all tensors on same device for svdquant with cpu-offloading (NVIDIA#550)
## What does this PR do?
**Type of change:** Bug Fix

**Overview:**
While running SVDQuant with CPU offloading enabled using the diffusers PTQ example (sd3.5-medium model), "not all tensors on the same device" errors were observed at the following steps:
1. AWQ scale computation: get_scale() using x_max and w_max
2. Loss update for each alpha: update_loss()
3. _apply_weight_pre_quant_scale(): while multiplying by the pre-quant scale
4. apply_pre_quant_scale_and_smooth(): while multiplying by the pre-quant scale
The same errors are expected with the Flux model when SVDQuant and CPU offloading are both enabled.
This change updates the places above to ensure the tensors involved are on the same device, using `.to(device)`.
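The pattern behind the fix can be sketched as follows. This is a hedged illustration, not the actual ModelOpt code: `get_awq_scale` is a hypothetical analogue of the `get_scale()` step above, and the formula shown is the standard AWQ-style scale, assumed here for demonstration. The point is only the `.to(...)` call that aligns devices before the elementwise op.

```python
import torch

def get_awq_scale(x_max: torch.Tensor, w_max: torch.Tensor, alpha: float = 0.5) -> torch.Tensor:
    # With CPU offloading, activation stats (x_max) may live on the CPU
    # while weight stats (w_max) are on the GPU. Moving x_max onto
    # w_max's device first avoids the "Expected all tensors to be on
    # the same device" RuntimeError in the division below.
    x_max = x_max.to(w_max.device)
    # Illustrative AWQ-style scale: x_max^alpha / w_max^(1 - alpha),
    # clamped to avoid zeros.
    return (x_max.pow(alpha) / w_max.pow(1.0 - alpha)).clamp(min=1e-4)

# CPU-only demo; under offloading, x_max could be on "cpu" and w_max on "cuda:0".
x_max = torch.tensor([2.0, 4.0])
w_max = torch.tensor([1.0, 2.0])
scale = get_awq_scale(x_max, w_max)
assert scale.device == w_max.device
```

The same one-line alignment applies to the loss update and the pre-quant-scale multiplications: move the smaller tensor onto the device of the tensor it is combined with before the operation.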
## Testing
- Ran SVDQuant with CPU offloading enabled on sd3.5-medium, using an RTX 5090 on Windows 11 22621. With this change, the final ONNX model (transformer) was produced without any error.
---------
Signed-off-by: vipandya <[email protected]>

Parent: e3e399a
1 file changed: +11, -4 lines