[bugfix] fix gptq transformers>=5.0#9042

Open
Jintao-Huang wants to merge 4 commits into modelscope:main from Jintao-Huang:fix_gptmodel_qwen3_5
Conversation

@Jintao-Huang
Collaborator

No description provided.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request adds FP8 to the quantization support documentation and introduces a check for hf_device_map in the GPTQ quantization pipeline. Review feedback: avoid hardcoding the device to cuda:0 in the quantization logic, and fix inconsistent table alignment in the English documentation.

Comment on lines +283 to +284
if not hasattr(self.model, 'hf_device_map'):
    self.model.hf_device_map = {'': torch.device('cuda:0')}
Contributor


Severity: high

Hardcoding the device to cuda:0 can lead to failures or incorrect behavior on multi-GPU systems or CPU-only environments. It is safer to use the model's current device when initializing the hf_device_map.

Suggested change
-    if not hasattr(self.model, 'hf_device_map'):
-        self.model.hf_device_map = {'': torch.device('cuda:0')}
+    if not hasattr(self.model, 'hf_device_map'):
+        self.model.hf_device_map = {'': self.model.device}
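The reasoning behind the suggestion can be sketched without torch at all: the guard only fills in hf_device_map when the model has none, and the fill-in value should track the model's actual placement rather than a fixed GPU index. Below is a minimal, self-contained sketch of that logic; the `Model` class and `ensure_device_map` helper are hypothetical stand-ins for a transformers PreTrainedModel (which exposes a `device` property) and the guarded assignment in the PR.

```python
class Model:
    """Hypothetical stand-in for a transformers model; `device` mimics
    PreTrainedModel.device (the device of the first parameter)."""

    def __init__(self, device):
        self.device = device


def ensure_device_map(model):
    # Hardcoding {'': 'cuda:0'} would misreport placement on CPU-only or
    # multi-GPU hosts; deriving the map from the model's own device keeps
    # the two consistent. Models that already carry hf_device_map (e.g.
    # loaded with device_map='auto') are left untouched.
    if not hasattr(model, 'hf_device_map'):
        model.hf_device_map = {'': model.device}
    return model.hf_device_map


print(ensure_device_map(Model('cpu')))  # → {'': 'cpu'}
```

With the hardcoded version, the same call on a CPU-only host would claim the model lives on cuda:0, and downstream GPTQ code that trusts hf_device_map would move tensors to a device that may not exist.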


3 participants