Skip to content

Conversation

Etelis
Copy link

@Etelis Etelis commented Sep 11, 2025

Follow-up to PR #459 (which was closed due to an accidental history rewrite). The branch has been restored to the pre-rewrite tip (2cf7124).

This PR:

  • Implements zero-point decompression for asymmetric quantization in the packed compressor
  • Adds tests (in-memory decompress_model, calibration fixtures)
  • Uses a standard-deviation-based similarity threshold for reconstruction

Refs vllm-project/llm-compressor#1704.

- Fix decompress_weight method in PackedQuantizationCompressor to support unpacking zero-points
- Add comprehensive tests for zero-point packing/unpacking with GROUP and CHANNEL strategies
- Add end-to-end integration tests for asymmetric quantization workflow
- Ensure packed tensors are contiguous for safetensors compatibility

Resolves issue referenced in vllm-project/llm-compressor#1704
@Etelis Etelis force-pushed the fix/zero-point-decompression branch from 2cf7124 to 126fc89 Compare September 11, 2025 14:57
@Etelis
Copy link
Author

Etelis commented Sep 11, 2025

@dsikka, @brian-dellabetta,

I've addressed all your concerns.
Sorry about that.

Copy link
Collaborator

@dsikka dsikka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good!

Some questions about the e2e test but otherwise, looks great!

}

state_dict = model.state_dict()
compressed_state_dict = compressor.compress(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may be simpler to use compress_model / decompress_model, both of which will work on the model in memory

compress / decompress are primarily meant for disk

if hasattr(module, param_name):
getattr(module, param_name).data = value.clone()
else:
module.register_parameter(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain why we need to separately register parameters here?

if hasattr(module, param_name):
getattr(module, param_name).data = value.clone()
else:
module.register_parameter(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comment as above

Copy link
Contributor

@brian-dellabetta brian-dellabetta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good pending @dsikka 's comments, thanks for updating!

Copy link
Member

@rahul-tuli rahul-tuli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM pending @dsikka 's comments! Thank you for this contribution

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants