Merged
4 changes: 4 additions & 0 deletions modelopt/torch/export/unified_export_hf.py
@@ -332,6 +332,10 @@ def _export_quantized_weight(

     setattr(sub_module, weight_name, nn.Parameter(quantized_weight, requires_grad=False))
+
+    # Register the corrected weight_scale as a buffer
+    if weight_scale is not None:
+        sub_module.register_buffer(quantizer_attrs.weight_scale, weight_scale)
Comment on lines +335 to +337
⚠️ Potential issue | 🔴 Critical

Fix duplicate buffer registration

register_buffer throws if the name is already registered. Earlier in this function we always register quantizer_attrs.weight_scale, so this new call will raise KeyError for every quantized module, breaking export. Update the existing buffer instead of re-registering it.

-    if weight_scale is not None:
-        sub_module.register_buffer(quantizer_attrs.weight_scale, weight_scale)
+    if weight_scale is not None:
+        setattr(sub_module, quantizer_attrs.weight_scale, weight_scale)
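For context, here is a minimal standalone sketch (not part of the PR) of why plain assignment works in the suggested fix: `nn.Module.__setattr__` routes assignments whose name is already present in `module._buffers` back into the buffer dict, so the tensor keeps its buffer status and still appears in `state_dict()`.

```python
# Standalone sketch (not PR code): updating an existing buffer via setattr.
import torch
import torch.nn as nn

m = nn.Module()
m.register_buffer("weight_scale", torch.ones(1))

# Assignment to a name already in m._buffers updates the buffer in place
# rather than creating a plain attribute.
setattr(m, "weight_scale", torch.full((1,), 0.5))

assert "weight_scale" in dict(m.named_buffers())       # still a buffer
assert m.state_dict()["weight_scale"].item() == 0.5    # still serialized
```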
🤖 Prompt for AI Agents
In modelopt/torch/export/unified_export_hf.py around lines 335-337, the code unconditionally calls sub_module.register_buffer(quantizer_attrs.weight_scale, weight_scale), which raises when that buffer name was already registered earlier. Instead, check whether the buffer name already exists on sub_module: if it does, update the existing buffer value (e.g., assign to sub_module._buffers[quantizer_attrs.weight_scale] or equivalent); otherwise, register it. Only call register_buffer when the name is absent, so duplicate registration (KeyError) is avoided.
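A short sketch of the guarded registration the prompt describes; the helper name `set_or_register_buffer` is illustrative, not code from this PR:

```python
# Illustrative guarded pattern (assumption: not the PR's actual code).
import torch
import torch.nn as nn

def set_or_register_buffer(module: nn.Module, name: str, tensor: torch.Tensor) -> None:
    """Update the buffer if `name` is already registered, otherwise register it."""
    if name in module._buffers:
        # Already a buffer: update the entry in place so buffer status
        # (and state_dict membership) is preserved.
        module._buffers[name] = tensor
    else:
        # First registration for this name.
        module.register_buffer(name, tensor)
```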



def _export_hf_checkpoint(
    model: nn.Module, dtype: torch.dtype | None = None