[bugfix] Fix Qwen3.5 LoRA merge export producing wrong state_dict keys#9057
Redamency wants to merge 1 commit into modelscope:main from
Conversation
(Fixes modelscope#9046.) In transformers>=5.5.0, `save_pretrained` calls `revert_weight_conversion`, which incorrectly applies weight-key renaming for composite models like Qwen3.5. Reverting the conversion mapping `^model.language_model -> model` re-prefixes keys such as `model.language_model.layers.X.*` (yielding `model.language_model.language_model.language_model.layers.X.*`) and turns `model.visual.*` into `model.language_model.visual.*`. Fix: pass `save_original_format=False` to `save_pretrained` to skip the buggy `revert_weight_conversion` step; the in-memory state_dict already has keys matching the model's safetensors format. A version check via `inspect.signature` keeps backward compatibility with older transformers versions that lack this parameter.
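The key corruption described above can be illustrated with a simplified, hypothetical one-pass reversal of the conversion mapping (this is not the transformers source; the real `revert_weight_conversion` is more involved and, per the report, can stack the prefix more than once):

```python
import re

# Reverting "^model.language_model -> model" means renaming keys that start
# with "model" back to "model.language_model". Keys that already carry the
# "model.language_model" prefix get it stacked again, and keys for sibling
# submodules (e.g. the visual tower) are wrongly moved under it.
reverse_pattern = re.compile(r"^model")
replacement = "model.language_model"

keys = [
    "model.language_model.layers.0.self_attn.q_proj.weight",
    "model.visual.blocks.0.attn.qkv.weight",
]
reverted = [reverse_pattern.sub(replacement, k) for k in keys]
for k in reverted:
    print(k)
# The first key gains an extra "language_model." prefix; the second is
# misplaced under "model.language_model".
```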
Contributor
Code Review
This pull request updates `save_checkpoint` in `swift/model/utils.py` to conditionally pass `save_original_format=False` to `save_pretrained` when the parameter is supported, preventing a weight-conversion bug in transformers>=5.5. One review note: the `import inspect` statement should be moved to the top of the file to follow PEP 8 and avoid re-importing the module on every function call.
```python
# that corrupts state_dict keys for composite models (e.g. Qwen3.5).
# See: https://github.com/modelscope/ms-swift/issues/9046
save_kwargs = {}
import inspect
```
Contributor
According to PEP 8 style guidelines, imports should be placed at the top of the file. Please move import inspect to the top-level imports of this module. This improves code organization and avoids re-importing the module on every function call.
References
- PEP 8 states that imports should be at the top of the file, just after any module comments and docstrings, and before module globals and constants. Placing imports inside functions is discouraged. (link)
Issue
Fixes #9046
Root Cause
When using transformers>=5.5.0, the `save_pretrained` method applies `revert_weight_conversion`, which uses the model's conversion mapping (e.g. `^model.language_model -> model`) in reverse during saving. For composite models like Qwen3.5 (which has a `visual` submodule), this causes state_dict keys to be incorrectly prefixed.
For example:
- `model.language_model.layers.X.*` becomes `model.language_model.language_model.language_model.layers.X.*`
- `model.visual.*` becomes `model.language_model.visual.*`

This makes the exported model unusable.
Fix
Pass `save_original_format=False` to `save_pretrained` to skip the buggy `revert_weight_conversion` step. The fix uses `inspect.signature` to check parameter availability, for backward compatibility with older transformers versions that lack the parameter.
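A minimal sketch of the guarded call, following the names in the PR description (the helper name `save_model_compat` is hypothetical; the real change lives inside `save_checkpoint` in `swift/model/utils.py`):

```python
import inspect


def save_model_compat(model, output_dir):
    """Save a model, skipping revert_weight_conversion where supported.

    save_original_format=False tells newer transformers to keep the
    in-memory state_dict keys as-is (they already match the model's
    safetensors format). The inspect.signature check keeps older
    transformers versions, which lack the parameter, working unchanged.
    """
    save_kwargs = {}
    params = inspect.signature(model.save_pretrained).parameters
    if "save_original_format" in params:
        save_kwargs["save_original_format"] = False
    model.save_pretrained(output_dir, **save_kwargs)
```

Probing the bound method's signature rather than comparing version strings means the guard keeps working even if the parameter is backported or renamed away in a future release.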
Testing
Verified with Qwen3.5-0.8B + LoRA merge export: