Commit a14c3e6

wenhuach21 authored and XuehaoSun committed
fix severe vram leak regression in auto-round format packing
1 parent 072cb8b commit a14c3e6

1 file changed, +1 -1 lines changed

auto_round/export/export_to_autoround/export.py

Lines changed: 1 addition & 1 deletion
@@ -215,7 +215,7 @@ def pack_layer(layer_name, model, backend, device=None):
             qlayer.pack(layer, scale, device=device)
         else:
             qlayer.pack(layer, scale, zp, None, device=device)
-        qlayer.to(device)
+        qlayer.to(orig_device)
     else:
         scale = scale.to(torch.float32).t().contiguous()
         if isinstance(zp, torch.Tensor):
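
The one-line change above moves the packed layer back to `orig_device` instead of leaving it on the packing `device`. As a rough illustration of that pattern only, here is a minimal stdlib-only sketch. `StubLayer` and `pack_and_restore` are hypothetical names that do not appear in auto-round, and the stub merely tracks a device string rather than allocating real memory; it assumes `orig_device` is captured before packing begins.

```python
from dataclasses import dataclass, field


@dataclass
class StubLayer:
    """Hypothetical stand-in for a quantized layer; it only tracks its device."""
    device: str = "cpu"
    packed: bool = False
    moves: list = field(default_factory=list)

    def to(self, device):
        # Mimics torch.nn.Module.to: the move happens in place and returns self.
        self.moves.append(device)
        self.device = device
        return self

    def pack(self, device=None):
        # Pretend the packing work runs on `device` (for example a GPU).
        if device is not None:
            self.to(device)
        self.packed = True


def pack_and_restore(qlayer, device=None):
    """Pack on `device`, then return the layer to where it started."""
    orig_device = qlayer.device  # remember the starting device before any move
    try:
        qlayer.pack(device=device)
    finally:
        # Calling qlayer.to(device) here instead would leave each packed layer
        # on the accelerator, so memory use would grow with every layer.
        qlayer.to(orig_device)
    return qlayer


layers = [StubLayer() for _ in range(3)]
for layer in layers:
    pack_and_restore(layer, device="cuda:0")

on_accelerator = [layer for layer in layers if layer.device != "cpu"]
print(len(on_accelerator))
```

In this sketch every layer ends up back on its starting device after packing, which is the behavior the commit title describes as fixing the VRAM leak.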
