Conversation
Signed-off-by: yiliu30 <yi4.liu@intel.com>
Signed-off-by: yiliu30 <yi4.liu@intel.com>
Signed-off-by: yiliu30 <yi4.liu@intel.com>
Signed-off-by: yiliu30 <yi4.liu@intel.com>
Signed-off-by: yiliu30 <yi4.liu@intel.com>
Signed-off-by: yiliu30 <yi4.liu@intel.com>
Signed-off-by: yiliu30 <yi4.liu@intel.com>
Signed-off-by: yiliu30 <yi4.liu@intel.com>
|
👋 Hi! Thank you for contributing to llm-compressor. Please add the ready label when the PR is ready for review. Note: This is required to complete the testing suite, please only add the label once the PR is code complete and local testing has been performed. |
Signed-off-by: yiliu30 <yi4.liu@intel.com>
Signed-off-by: yiliu30 <yi4.liu@intel.com>
Signed-off-by: yiliu30 <yi4.liu@intel.com>
Signed-off-by: yiliu30 <yi4.liu@intel.com>
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces support for AutoRound quantization by adding a new AutoRoundModifier, an example script, and corresponding tests. My review focuses on improving the portability and clarity of the new example script and addressing several issues within the AutoRoundModifier implementation. Key feedback includes removing hardcoded paths and values, generalizing model-specific logic, and completing the handling of quantization parameters to ensure correctness and reusability.
Signed-off-by: yiliu30 <yi4.liu@intel.com>
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces support for the AutoRound quantization algorithm, including a new AutoRoundModifier, an example script, and tests. The core implementation is a great start, but there are several areas that need improvement for robustness and usability. Specifically, the example scripts contain hardcoded paths and development artifacts that should be cleaned up. The AutoRoundModifier implementation has hardcoded values for the quantization scheme and device, which limits its flexibility. There is also a critical TODO regarding weight updates that needs to be addressed to ensure correctness. My review includes suggestions to address these points.
Signed-off-by: yiliu30 <yi4.liu@intel.com>
Signed-off-by: yiliu30 <yi4.liu@intel.com>
Signed-off-by: yiliu30 <yi4.liu@intel.com>
Signed-off-by: yiliu30 <yi4.liu@intel.com>
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces support for AutoRound quantization by adding an AutoRoundModifier. The changes include the modifier implementation, a new dependency, an example script, and tests. Overall, the implementation is solid, but there are a few issues to address. The example scripts contain hardcoded paths and typos that should be cleaned up. More importantly, the AutoRoundModifier implementation has a potential bug in handling multiple configuration groups and appears to be missing logic for updating the weight_zero_point after quantization. I've provided specific comments and suggestions to address these points.
Signed-off-by: yiliu30 <yi4.liu@intel.com>
Signed-off-by: yiliu30 <yi4.liu@intel.com>
|
/gemini summary |
Summary of ChangesThis pull request integrates the AutoRound quantization algorithm into the Highlights
Changelog
Activity
|
Signed-off-by: yiliu30 <yi4.liu@intel.com>
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces support for the Intel AutoRound quantization algorithm, which is a valuable addition. The implementation includes the core AutoRoundModifier, an end-to-end example, and unit tests. The code is generally well-structured, but I've identified a few areas for improvement related to correctness and robustness, particularly concerning the handling of quantization parameters and module attribute modifications.
| for name, mod in state.model.named_modules(): | ||
| mod._tmp_name = name |
There was a problem hiding this comment.
Adding a temporary attribute _tmp_name to every module in the model can be fragile and may cause attribute name collisions. A more robust approach would be to avoid modifying the modules directly. For instance, you could use a dictionary that maps id(module) to its name, or use a closure in the hook registration to capture the module name. This would make the implementation cleaner and less intrusive.
Signed-off-by: yiliu30 <yi4.liu@intel.com>
Signed-off-by: yiliu30 <yi4.liu@intel.com>
|
1994 |
Address [todo: issue number]
Highlights
AutoRoundModifierto enable AutoRound quantization forwNa16.Next stage (in later PRs)