
[Refactor] lora: reuse load_weights packed mapping #991

Open

dongbo910220 wants to merge 6 commits into vllm-project:main from dongbo910220:refactor_diffusion_lora_packed_mapping

Conversation

dongbo910220 (Contributor) commented Jan 27, 2026

Based on @ZJY0516's feedback, this PR reuses each diffusion model's load_weights() stacked_params_mapping as the single source of truth for the packed→sublayer mapping used by LoRA.

Purpose

  • Refactor diffusion LoRA packed→sublayer mapping to be derived from each model’s load_weights() stacked_params_mapping, avoiding duplicated/possibly divergent mappings across code paths.
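The derivation can be sketched as follows. This is a hypothetical illustration, not the actual vllm-omni code: it assumes the vLLM-style convention in which stacked_params_mapping is a list of (packed_name, shard_name, shard_id) tuples, and inverts it into a packed→sublayer dict of the kind LoRA needs.

```python
# Hypothetical sketch of the derivation this PR describes. Assumes the
# vLLM-style convention where stacked_params_mapping is a list of
# (packed_name, shard_name, shard_id) tuples built in load_weights().
from collections import defaultdict


def derive_packed_modules_mapping(stacked_params_mapping):
    """Invert (packed, shard, shard_id) tuples into packed -> [sublayers]."""
    packed = defaultdict(list)
    for packed_name, shard_name, _shard_id in stacked_params_mapping:
        if shard_name not in packed[packed_name]:
            packed[packed_name].append(shard_name)
    return dict(packed)


stacked = [
    ("qkv_proj", "q_proj", "q"),
    ("qkv_proj", "k_proj", "k"),
    ("qkv_proj", "v_proj", "v"),
]
derive_packed_modules_mapping(stacked)
# -> {"qkv_proj": ["q_proj", "k_proj", "v_proj"]}
```

Because both the weight loader and the LoRA manager would read from the same tuples, the two code paths cannot drift apart.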

Test Plan

pytest -q tests/diffusion/lora

Test Result

passed


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft.


Copilot AI left a comment

Pull request overview

This refactor centralizes the packed→sublayer mapping for diffusion LoRA on each model’s load_weights() stacked_params_mapping, removing duplicated mapping definitions and making LoRA behavior consistent with the weight-loading path.

Changes:

  • Remove diffusion transformers’ class-level packed_modules_mapping and expose their existing stacked_params_mapping via self.stacked_params_mapping inside load_weights().
  • Update DiffusionLoRAManager to derive packed→sublayer mappings from stacked_params_mapping, and adjust _expand_expected_modules_for_packed_layers documentation and logging accordingly.
  • Update LoRA manager tests to use stacked_params_mapping-based behavior and verify replacement/activation logic for packed layers still works as intended.
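The model-side pattern described above can be sketched like this. ToyTransformer is purely illustrative (not a real vllm-omni class); only the self.stacked_params_mapping attribute name is taken from the PR.

```python
# Hypothetical sketch of the model-side pattern: load_weights() already
# builds a stacked_params_mapping for fused weight loading; the refactor
# additionally records it on the instance so the LoRA manager can reuse it.
class ToyTransformer:
    def load_weights(self, weights):
        stacked_params_mapping = [
            # (packed param name, original shard name, shard id)
            ("qkv_proj", "q_proj", "q"),
            ("qkv_proj", "k_proj", "k"),
            ("qkv_proj", "v_proj", "v"),
        ]
        # Expose the mapping as the single source of truth for the
        # LoRA manager, replacing the class-level packed_modules_mapping.
        self.stacked_params_mapping = stacked_params_mapping
        loaded = set()
        for name, _tensor in weights:
            # ... real models remap `name` via stacked_params_mapping here ...
            loaded.add(name)
        return loaded
```

A manager can then read model.stacked_params_mapping after load_weights() has run, instead of maintaining its own copy of the mapping.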

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated no comments.

Summary per file:

  • vllm_omni/diffusion/models/z_image/z_image_transformer.py: Removes packed_modules_mapping and exposes stacked_params_mapping from load_weights() for Z-Image.
  • vllm_omni/diffusion/models/wan2_2/wan2_2_transformer.py: Same refactor for the Wan 3D transformer; load_weights() now assigns self.stacked_params_mapping.
  • vllm_omni/diffusion/models/sd3/sd3_transformer.py: Same refactor for the SD3 transformer, covering both self- and cross-attention fused projections.
  • vllm_omni/diffusion/models/qwen_image/qwen_image_transformer.py: Same refactor for the Qwen Image transformer, including cross-attention and .to_out handling.
  • vllm_omni/diffusion/models/ovis_image/ovis_image_transformer.py: Same refactor for the Ovis Image transformer, exposing packed self-/cross-attention mappings.
  • vllm_omni/diffusion/models/longcat_image/longcat_image_transformer.py: Same refactor for the LongCat Image transformer, including FFN/context FFN name remapping.
  • vllm_omni/diffusion/models/glm_image/glm_image_transformer.py: Same refactor for the GLM Image transformer, exposing fused QKV mappings.
  • vllm_omni/diffusion/models/flux2_klein/flux2_klein_transformer.py: Same refactor for the Flux2 Klein transformer, covering fused QKV and add_kv_proj mappings.
  • vllm_omni/diffusion/lora/utils.py: Updates the docstring to state that the packed→sublayer mapping is derived from each model's stacked_params_mapping.
  • vllm_omni/diffusion/lora/manager.py: Changes _compute_packed_modules_mapping to derive mappings from stacked_params_mapping (with validation and conflict warnings) and tweaks related warning messages.
  • tests/diffusion/lora/test_lora_manager.py: Adjusts tests to use stacked_params_mapping on the pipeline and aligns comments with the new mapping source while preserving behavior checks.


Signed-off-by: dongbo910220 <1275604947@qq.com>
dongbo910220 force-pushed the refactor_diffusion_lora_packed_mapping branch from a0d52aa to 517eb73 on February 1, 2026 05:30
@Gaohan123 (Collaborator)

@dongbo910220 Please resolve conflicts. Thanks!

@Gaohan123 Gaohan123 added this to the v0.16.0 milestone Feb 11, 2026
@dongbo910220 (Contributor, Author)

> @dongbo910220 Please resolve conflicts. Thanks!

@Gaohan123 Resolved. Thanks!

@hsliuustc0106 hsliuustc0106 added the ready label to trigger buildkite CI label Feb 15, 2026
Signed-off-by: dongbo910220 <1275604947@qq.com>

dongbo910220 commented Feb 15, 2026

@Gaohan123 , ROCm CI is still on vLLM 0.15.0 (see docker/Dockerfile.rocm), so @config(config=...) fails with TypeError: config() got an unexpected keyword argument 'config'. This is an environment/version mismatch, not a logic regression.

The project docs explicitly recommend vLLM 0.16.0 (docs/getting_started/quickstart.md).

Could we upgrade the ROCm CI image to vLLM ≥ 0.16 (or reinstall vLLM in ROCm CI), or should we add a backward-compatibility shim?
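One possible shape for such a shim, purely illustrative: old_config and new_config below are stand-ins for the two vLLM versions, not real vLLM APIs, and the sketch assumes the only difference between versions is whether the decorator factory takes the config positionally or via a config= keyword.

```python
# Hypothetical backward-compatibility shim. It inspects the decorator
# factory's signature and chooses the calling convention accordingly.
import inspect


def config_compat(config_decorator, cfg):
    """Call a config()-style decorator factory across both signatures.

    Passes cfg via the `config` keyword when the callable declares one
    (the vLLM >= 0.16 style in this sketch), positionally otherwise
    (the 0.15.x style).
    """
    params = inspect.signature(config_decorator).parameters
    if "config" in params:
        return config_decorator(config=cfg)
    return config_decorator(cfg)


# Stand-ins for the two versions (assumptions, not real vLLM code):
def old_config(cfg):            # 0.15.x style: positional argument
    return ("old", cfg)


def new_config(*, config):      # 0.16.x style: keyword-only `config`
    return ("new", config)
```

The trade-off is that a shim keeps the ROCm pipeline green immediately, while upgrading the image removes the divergence at its source.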

@lishunyang12 (Contributor)

@dongbo910220 Hey, reusing each model's load_weights() stacked_params_mapping as the single source of truth for LoRA packed→sublayer mapping is a clean refactor — avoids divergent mappings. Tests are passing too. Is this ready for merge?


dongbo910220 commented Feb 21, 2026

> @dongbo910220 Hey, reusing each model's load_weights() stacked_params_mapping as the single source of truth for LoRA packed→sublayer mapping is a clean refactor — avoids divergent mappings. Tests are passing too. Is this ready for merge?

@lishunyang12 Thanks! Yes, it's ready to merge. Local tests are passing; the remaining CI failure is a ROCm environment mismatch (vLLM 0.15.x), not something caused by this change.

lishunyang12 (Contributor) left a comment:

Left a couple of comments. The main concern is a regression in z_image's QKV mapping.

