[Refactor] lora: reuse load_weights packed mapping #991
dongbo910220 wants to merge 6 commits into vllm-project:main
Conversation
Pull request overview
This refactor centralizes the packed→sublayer mapping for diffusion LoRA on each model’s load_weights() stacked_params_mapping, removing duplicated mapping definitions and making LoRA behavior consistent with the weight-loading path.
Changes:
- Remove diffusion transformers' class-level `packed_modules_mapping` and expose their existing `stacked_params_mapping` via `self.stacked_params_mapping` inside `load_weights()`.
- Update `DiffusionLoRAManager` to derive packed→sublayer mappings from `stacked_params_mapping`, and adjust `_expand_expected_modules_for_packed_layers` documentation and logging accordingly.
- Update LoRA manager tests to use `stacked_params_mapping`-based behavior and verify that replacement/activation logic for packed layers still works as intended.
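The derivation described above can be sketched as follows. The tuple format mirrors vLLM's `load_weights()` convention of `(packed_param, shard_param, shard_id)` triples; the helper itself is an illustrative placeholder, not the PR's actual implementation.

```python
# Hypothetical sketch: derive a {packed_module: [sublayer, ...]} mapping
# from a load_weights()-style stacked_params_mapping. Names here are
# illustrative, not the PR's real code.

def derive_packed_modules_mapping(stacked_params_mapping):
    """Build {packed_module: [sublayer, ...]} from (packed, shard, id) triples."""
    packed = {}
    for packed_name, shard_name, _shard_id in stacked_params_mapping:
        sublayers = packed.setdefault(packed_name.strip("."), [])
        shard = shard_name.strip(".")
        if shard not in sublayers:  # guard against duplicate entries
            sublayers.append(shard)
    return packed

# Example: a fused QKV projection split into three sublayers.
mapping = [
    (".qkv_proj", ".q_proj", "q"),
    (".qkv_proj", ".k_proj", "k"),
    (".qkv_proj", ".v_proj", "v"),
]
print(derive_packed_modules_mapping(mapping))
# → {'qkv_proj': ['q_proj', 'k_proj', 'v_proj']}
```

Because both the weight loader and the LoRA manager would read from the same triples, the two paths cannot drift apart, which is the consistency property the refactor is after.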
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| `vllm_omni/diffusion/models/z_image/z_image_transformer.py` | Removes `packed_modules_mapping` and exposes `stacked_params_mapping` from `load_weights()` for Z-Image. |
| `vllm_omni/diffusion/models/wan2_2/wan2_2_transformer.py` | Same refactor for the Wan 3D transformer; `load_weights()` now assigns `self.stacked_params_mapping`. |
| `vllm_omni/diffusion/models/sd3/sd3_transformer.py` | Same refactor for the SD3 transformer, covering both self- and cross-attention fused projections. |
| `vllm_omni/diffusion/models/qwen_image/qwen_image_transformer.py` | Same refactor for the Qwen Image transformer, including cross-attention and `.to_out` handling. |
| `vllm_omni/diffusion/models/ovis_image/ovis_image_transformer.py` | Same refactor for the Ovis Image transformer, exposing packed self-/cross-attention mappings. |
| `vllm_omni/diffusion/models/longcat_image/longcat_image_transformer.py` | Same refactor for the LongCat Image transformer, including FFN/context-FFN name remapping. |
| `vllm_omni/diffusion/models/glm_image/glm_image_transformer.py` | Same refactor for the GLM Image transformer, exposing fused QKV mappings. |
| `vllm_omni/diffusion/models/flux2_klein/flux2_klein_transformer.py` | Same refactor for the Flux2 Klein transformer, covering fused QKV and `add_kv_proj` mappings. |
| `vllm_omni/diffusion/lora/utils.py` | Updates the docstring to state that the packed→sublayer mapping is derived from each model's `stacked_params_mapping`. |
| `vllm_omni/diffusion/lora/manager.py` | Changes `_compute_packed_modules_mapping` to derive mappings from `stacked_params_mapping` (with validation and conflict warnings) and tweaks related warning messages. |
| `tests/diffusion/lora/test_lora_manager.py` | Adjusts tests to use `stacked_params_mapping` on the pipeline and aligns comments with the new mapping source while preserving behavior checks. |
Signed-off-by: dongbo910220 <1275604947@qq.com>
Force-pushed from a0d52aa to 517eb73.
@dongbo910220 Please resolve conflicts. Thanks!
@Gaohan123 Resolved. Thanks!
@Gaohan123, ROCm CI is still on vLLM 0.15.0 (see `docker/Dockerfile.rocm`), so `@config(config=...)` fails with `TypeError: config() got an unexpected keyword argument 'config'`. This is an environment/version mismatch, not a logic regression. The project docs explicitly recommend vLLM 0.16.0 (`docs/getting_started/quickstart.md`). Could we upgrade the ROCm CI image to vLLM ≥ 0.16 (or reinstall vLLM in ROCm CI), or should we add a backward-compatibility shim?
@dongbo910220 Hey, reusing each model's `load_weights()` `stacked_params_mapping` as the single source of truth for the LoRA packed→sublayer mapping is a clean refactor — it avoids divergent mappings. Tests are passing too. Is this ready for merge?
@lishunyang12 Thanks! Yes, it’s ready to merge. Local tests are passing; the remaining CI issue is a ROCm environment mismatch (vLLM 0.15.x) and is not caused by this change.
lishunyang12 left a comment:
Left a couple of comments. The main concern is a regression in z_image's QKV mapping.
Purpose

Based on @ZJY0516's feedback, reuse each diffusion model’s `load_weights()` `stacked_params_mapping` as the single source of truth for the packed→sublayer mapping used by LoRA.

Test Plan

`pytest -q tests/diffusion/lora`

Test Result

Passed.
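The "single source of truth" pattern from the purpose statement can be sketched as below: a model's `load_weights()` defines its stacked-params mapping once and exposes it on `self`, so the LoRA manager can reuse the same definition. The class and loading loop are simplified placeholders mirroring the PR description, not the actual transformer code.

```python
# Illustrative-only sketch of exposing stacked_params_mapping from
# load_weights(). ToyTransformer is a hypothetical stand-in for the
# diffusion transformers touched by this PR.

class ToyTransformer:
    def load_weights(self, weights):
        # (packed_param, shard_param, shard_id) triples in vLLM's convention.
        stacked_params_mapping = [
            (".qkv_proj", ".q_proj", "q"),
            (".qkv_proj", ".k_proj", "k"),
            (".qkv_proj", ".v_proj", "v"),
        ]
        # Expose the mapping so LoRA code can derive packed→sublayer info
        # from the same definition the weight loader uses.
        self.stacked_params_mapping = stacked_params_mapping
        for name, tensor in weights:
            ...  # normal stacked-weight loading logic would go here
```

With this in place, nothing needs a class-level `packed_modules_mapping`: the LoRA manager simply reads `model.stacked_params_mapping` after weights are loaded.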