Conversation
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
👋 Hi! Thank you for contributing to llm-compressor. Please add the ready label when the PR is ready for review. Note: this is required to run the full testing suite, so please only add the label once the PR is code complete and local testing has been performed.
Code Review
This pull request updates the codebase to align with changes in the compressed-tensors library. The main changes include deprecating is_rank0 in favor of is_main_process and removing support for sparsity compression, which is now reflected by a warning. The tests have been significantly updated to remove sparsity-related checks and adapt to the new APIs.
I've found a couple of issues in the test suite updates. One is a bug in test_quant_model_reload where a function call has no effect due to not using its return value. The other is a potential robustness issue in the DummyLinearModel test helper. Please see my detailed comments.
I was unable to create individual review comments, so my feedback is included below.
tests/llmcompressor/transformers/compression/test_compress_tensor_utils.py (90)
The function _remove_zp returns a new dictionary and does not modify its argument in-place. The result of _remove_zp(og_state_dict) is not assigned to any variable, so this line has no effect. This will likely cause the assertion on line 92 to fail if og_state_dict contains zero-point keys that are not present in reconstructed_state_dict.
You should reassign the result back to og_state_dict.
og_state_dict = _remove_zp(og_state_dict) # HACK: remove extra zero points added during quant init
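To illustrate the bug, here is a minimal sketch. The body of `_remove_zp` is an assumption (only its name and return-a-new-dict behavior come from the review); the point is that discarding the return value leaves the original dict untouched:

```python
# Hypothetical stand-in for the test helper: returns a filtered copy,
# does NOT mutate its argument in place.
def _remove_zp(state_dict):
    return {k: v for k, v in state_dict.items() if not k.endswith("zero_point")}

og_state_dict = {"linear.weight": 1.0, "linear.weight_zero_point": 0.0}

_remove_zp(og_state_dict)  # return value discarded -- og_state_dict is unchanged
assert "linear.weight_zero_point" in og_state_dict

og_state_dict = _remove_zp(og_state_dict)  # reassigning actually applies the filter
assert "linear.weight_zero_point" not in og_state_dict
```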
tests/llmcompressor/transformers/compression/test_compress_tensor_utils.py (175-176)
The removal of the is not None checks for weight_scale and zero_point makes this test utility less robust. If None is passed for these arguments, nn.Parameter(None) will be created. This changes the behavior from the attribute not existing (as in the previous implementation) to the attribute existing with a None value. This could lead to unexpected behavior in code that checks for the presence of these attributes. It would be safer to restore the if ... is not None checks.
if weight_scale is not None:
self.linear.weight_scale = nn.Parameter(weight_scale, requires_grad=False)
if zero_point is not None:
self.linear.weight_zero_point = nn.Parameter(zero_point, requires_grad=False)
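The guarded-attribute pattern the suggestion restores can be sketched with plain objects (no torch dependency; `DummyLinear` here is a hypothetical stand-in for the test helper). Setting the attribute only when a value is provided means downstream `hasattr` checks keep working, whereas unconditional assignment would make the attribute exist even for `None` inputs:

```python
# Hypothetical sketch of the conditional-attribute pattern from the suggestion.
class DummyLinear:
    def __init__(self, weight_scale=None, zero_point=None):
        if weight_scale is not None:
            self.weight_scale = weight_scale          # set only when provided
        if zero_point is not None:
            self.weight_zero_point = zero_point       # otherwise attribute is absent

m = DummyLinear(weight_scale=0.5)
assert hasattr(m, "weight_scale")
assert not hasattr(m, "weight_zero_point")  # presence checks behave as expected
```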
Force-pushed e29a4f1 to e25b87b
This pull request has merge conflicts that must be resolved before it can be merged.
HDCharles left a comment
Can we have one PR for the is_rank0 changes and a separate one for the refactor?
See vllm-project/compressed-tensors#631