Skip to content

[BugFix]: fix a lot of bug#1565

Merged
hsliuustc0106 merged 4 commits intovllm-project:mainfrom
princepride:fix-a-lot-of-bug
Feb 28, 2026
Merged

[BugFix]: fix a lot of bug#1565
hsliuustc0106 merged 4 commits intovllm-project:mainfrom
princepride:fix-a-lot-of-bug

Conversation

@princepride
Copy link
Collaborator

@princepride princepride commented Feb 28, 2026

Purpose

Fix multiple bugs affecting Bagel and GLM-Image model pipelines, including model initialization failures, weight loading errors, and intermittent KV cache transfer issues.

1. Fix Bagel ViT weight loading failures (pipeline_bagel.py)

  • Parameter aliasing: SiglipNaViTWrapper created patch_embed_weight/patch_embed_bias attributes that aliased canonical parameters in patch_embedding, causing false positives in the strict weight loading check. Removed the aliases and access the patch embedding parameters directly in forward().
  • Config mismatch: The ViT config (vit_config.json) specifies 27 layers with a head, but the actual Bagel checkpoint has 26 layers and no head. Added overrides to align the SiglipVisionModel config with the checkpoint.

2. Fix GLM-Image transformer weight loading check failure (pipeline_glm_image.py)

GlmImagePipeline.load_weights() stripped the transformer. prefix before delegating to the transformer, but didn't re-add it to the returned set of loaded weight names. The strict check compared prefixed expected names against unprefixed loaded names, causing all transformer weights to be reported as uninitialized.

3. Fix intermittent KV cache transfer failure (shm_connector.py, chunk_transfer_adapter.py)

SharedMemoryConnector.get() returned (None, 0) on failure, which is truthy, causing the caller's polling loop to exit prematurely and treat a not-yet-ready transfer as a successful zero-byte read. Changed all failure return paths to None (falsy) and added a corresponding None guard in chunk_transfer_adapter.py before tuple unpacking.

4. Fix vllm 0.16.0 import incompatibility (glm_image_ar.py)

from vllm.attention.layer import Attention does not exist in vllm 0.16.0. Updated to the correct path from vllm.model_executor.layers.attention import Attention.

Test Plan

  • Run Bagel end-to-end inference:
python3 examples/offline_inference/bagel/end2end.py --prompts "A cute cat" --modality text2img
  • Run GLM-Image end-to-end inference:
python3 end2end.py --model-path /proj-tango-pvc/users/zhipeng.wang/workspace/models/GLM-Image --prompt "A cat sitting on the table" --output cat.png --config-path /proj-tango-pvc/users/zhipeng.wang/workspace/vllm-omni/vllm_omni/model_executor/stage_configs/glm_image.yaml

Test Result

  • Bagel:
image
  • GLM-Image:
image

Signed-off-by: princepride <wangzhipeng628@gmail.com>
Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7c4c436caa

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Signed-off-by: princepride <wangzhipeng628@gmail.com>
Signed-off-by: princepride <wangzhipeng628@gmail.com>
@david6666666 david6666666 added the ready label to trigger buildkite CI label Feb 28, 2026
@hsliuustc0106 hsliuustc0106 merged commit 3d9fa8d into vllm-project:main Feb 28, 2026
6 of 7 checks passed
yJader pushed a commit to omni-nicelab/vllm-omni-batching that referenced this pull request Mar 3, 2026
Signed-off-by: princepride <wangzhipeng628@gmail.com>

Signed-off-by: jader <yjader@foxmail.com>
yJader pushed a commit to omni-nicelab/vllm-omni-batching that referenced this pull request Mar 3, 2026
Signed-off-by: princepride <wangzhipeng628@gmail.com>

Signed-off-by: jader <yjader@foxmail.com>
yJader pushed a commit to omni-nicelab/vllm-omni-batching that referenced this pull request Mar 3, 2026
Signed-off-by: princepride <wangzhipeng628@gmail.com>
hsliuustc0106 added a commit to hsliuustc0106/vllm-omni-skills that referenced this pull request Mar 4, 2026
### vllm-omni-perf
- Source: [PR #1619](vllm-project/vllm-omni#1619) - [Bugfix] Fix Qwen3-TTS code predictor crash due to missing vLLM config context
- Changes:
  - Bug fix: [Bugfix] Fix Qwen3-TTS code predictor crash due to missing vLLM config context

### vllm-omni-contrib
- Source: [PR #1615](vllm-project/vllm-omni#1615) - [Doc] Fix links in the configuration doc
- Changes:
  - Bug fix: [Doc] Fix links in the configuration doc

### vllm-omni-perf
- Source: [PR #1609](vllm-project/vllm-omni#1609) - [Bugfix] Fix filepath resolution for model with subdir and GLM-Image generation
- Changes:
  - Bug fix: [Bugfix] Fix filepath resolution for model with subdir and GLM-Image generation

### vllm-omni-image-gen
- Source: [PR #1609](vllm-project/vllm-omni#1609) - [Bugfix] Fix filepath resolution for model with subdir and GLM-Image generation
- Changes:
  - Bug fix: [Bugfix] Fix filepath resolution for model with subdir and GLM-Image generation
- Additions:
  - GLM-Image
  - GLM-Image
  - GLM-Image
  - GLM-Image
  - GLM-Image
  - GLM-Image
  - GLM-Image
  - GLM-Image

### vllm-omni-api
- Source: [PR #1609](vllm-project/vllm-omni#1609) - [Bugfix] Fix filepath resolution for model with subdir and GLM-Image generation
- Changes:
  - Bug fix: [Bugfix] Fix filepath resolution for model with subdir and GLM-Image generation

### vllm-omni-serving
- Source: [PR #1602](vllm-project/vllm-omni#1602) - [Bugfix] fix kernel error for qwen3-omni
- Changes:
  - Bug fix: [Bugfix] fix kernel error for qwen3-omni

### vllm-omni-perf
- Source: [PR #1598](vllm-project/vllm-omni#1598) - [BugFix] Fix load_weights error when loading HunyuanImage3.0
- Changes:
  - Bug fix: [BugFix] Fix load_weights error when loading HunyuanImage3.0

### vllm-omni-image-gen
- Source: [PR #1598](vllm-project/vllm-omni#1598) - [BugFix] Fix load_weights error when loading HunyuanImage3.0
- Changes:
  - Bug fix: [BugFix] Fix load_weights error when loading HunyuanImage3.0
- Additions:
  - HunyuanImage3
  - HunyuanImage3Pipeline
  - HunyuanImage3
  - HunyuanImage-3
  - HunyuanImage-3
  - HunyuanImage-3
  - HunyuanImage3Pipeline
  - HunyuanImage3Pipeline
  - HunyuanImage3Pipeline
  - HunyuanImage3Pipeline
  - HunyuanImage3Pipeline
  - HunyuanImage3Pipeline
  - HunyuanImage3Pipeline
  - HunyuanImage3Pipeline
  - HunyuanImage-3

### vllm-omni-quantization
- Source: [PR #1598](vllm-project/vllm-omni#1598) - [BugFix] Fix load_weights error when loading HunyuanImage3.0
- Changes:
  - Bug fix: [BugFix] Fix load_weights error when loading HunyuanImage3.0

### vllm-omni-distributed
- Source: [PR #1598](vllm-project/vllm-omni#1598) - [BugFix] Fix load_weights error when loading HunyuanImage3.0
- Changes:
  - Bug fix: [BugFix] Fix load_weights error when loading HunyuanImage3.0

### vllm-omni-contrib
- Source: [PR #1576](vllm-project/vllm-omni#1576) - 0.16.0 release

### vllm-omni-audio-tts
- Source: [PR #1570](vllm-project/vllm-omni#1570) - [bugfix] Fix unexpected argument 'is_finished' in function llm2code2wav_async_chunk of mimo-audio
- Changes:
  - Bug fix: [bugfix] Fix unexpected argument 'is_finished' in function llm2code2wav_async_chunk of mimo-audio

### vllm-omni-api
- Source: [PR #1566](vllm-project/vllm-omni#1566) - [Bugfix] Import InputPreprocessor into Renderer
- Changes:
  - Bug fix: [Bugfix] Import InputPreprocessor into Renderer

### vllm-omni-perf
- Source: [PR #1565](vllm-project/vllm-omni#1565) - [BugFix]: fix a lot of bug
- Changes:
  - Bug fix: [BugFix]: fix a lot of bug

### vllm-omni-contrib
- Source: [PR #1564](vllm-project/vllm-omni#1564) - [NPU][Bugfix] Align GPU side and recover qwen3-tts
- Changes:
  - Bug fix: [NPU][Bugfix] Align GPU side and recover qwen3-tts

### vllm-omni-audio-tts
- Source: [PR #1564](vllm-project/vllm-omni#1564) - [NPU][Bugfix] Align GPU side and recover qwen3-tts
- Changes:
  - Bug fix: [NPU][Bugfix] Align GPU side and recover qwen3-tts

### vllm-omni-perf
- Source: [PR #1562](vllm-project/vllm-omni#1562) - [BugFix] Fix unexpected crash when init OmniDiffusion
- Changes:
  - Bug fix: [BugFix] Fix unexpected crash when init OmniDiffusion

### vllm-omni-api
- Source: [PR #1562](vllm-project/vllm-omni#1562) - [BugFix] Fix unexpected crash when init OmniDiffusion
- Changes:
  - Bug fix: [BugFix] Fix unexpected crash when init OmniDiffusion

### vllm-omni-quantization
- Source: [PR #1562](vllm-project/vllm-omni#1562) - [BugFix] Fix unexpected crash when init OmniDiffusion
- Changes:
  - Bug fix: [BugFix] Fix unexpected crash when init OmniDiffusion

### vllm-omni-distributed
- Source: [PR #1562](vllm-project/vllm-omni#1562) - [BugFix] Fix unexpected crash when init OmniDiffusion
- Changes:
  - Bug fix: [BugFix] Fix unexpected crash when init OmniDiffusion

### vllm-omni-api
- Source: [PR #1554](vllm-project/vllm-omni#1554) - fix(qwen3-tts): fix Base ICL voice clone producing corrupted audio
- Changes:
  - Bug fix: fix(qwen3-tts): fix Base ICL voice clone producing corrupted audio

### vllm-omni-cicd
- Source: [PR #1543](vllm-project/vllm-omni#1543) - [CI] Modify some CI test cases to run on L4 environment to reduce H100 resource usage.

### vllm-omni-perf
- Source: [PR #1540](vllm-project/vllm-omni#1540) - Fix no embed text spk tokens
- Changes:
  - Bug fix: Fix no embed text spk tokens

### vllm-omni-distributed
- Source: [PR #1540](vllm-project/vllm-omni#1540) - Fix no embed text spk tokens
- Changes:
  - Bug fix: Fix no embed text spk tokens

### vllm-omni-perf
- Source: [PR #1539](vllm-project/vllm-omni#1539) - [Debug] Enable curl retry aligned with openai

### vllm-omni-quantization
- Source: [PR #1539](vllm-project/vllm-omni#1539) - [Debug] Enable curl retry aligned with openai

### vllm-omni-distributed
- Source: [PR #1539](vllm-project/vllm-omni#1539) - [Debug] Enable curl retry aligned with openai

### vllm-omni-image-gen
- Source: [PR #1538](vllm-project/vllm-omni#1538) - [CI][skip ci]Update H100 image link based on #1518

### vllm-omni-perf
- Source: [PR #1536](vllm-project/vllm-omni#1536) - [Bugfix] Fix transformers 5.x compat issues in online TTS serving
- Changes:
  - Bug fix: [Bugfix] Fix transformers 5.x compat issues in online TTS serving

### vllm-omni-serving
- Source: [PR #1536](vllm-project/vllm-omni#1536) - [Bugfix] Fix transformers 5.x compat issues in online TTS serving
- Changes:
  - Bug fix: [Bugfix] Fix transformers 5.x compat issues in online TTS serving

### vllm-omni-cicd
- Source: [PR #1534](vllm-project/vllm-omni#1534) - [Debug] Merge vllm pull 35368

### vllm-omni-contrib
- Source: [PR #1530](vllm-project/vllm-omni#1530) - [Docs] update async chunk docs diagram [skip ci]

### vllm-omni-distributed
- Source: [PR #1524](vllm-project/vllm-omni#1524) - [BugFix] Restore talker's config
- Changes:
  - Bug fix: [BugFix] Restore talker's config

### vllm-omni-api
- Source: [PR #1522](vllm-project/vllm-omni#1522) - [Bugfix] Use uds for zmq address if not set --stage-id
- Changes:
  - New feature: [Bugfix] Use uds for zmq address if not set --stage-id

### vllm-omni-perf
- Source: [PR #1521](vllm-project/vllm-omni#1521) - Revert gpu_1 job to use regular image

### vllm-omni-perf
- Source: [PR #1518](vllm-project/vllm-omni#1518) - Use pull through cache image for H100 pool

### vllm-omni-perf
- Source: [PR #1515](vllm-project/vllm-omni#1515) - [Bugfix] fix offline text_to_image error from #1009
- Changes:
  - Bug fix: [Bugfix] fix offline text_to_image error from #1009

### vllm-omni-image-gen
- Source: [PR #1515](vllm-project/vllm-omni#1515) - [Bugfix] fix offline text_to_image error from #1009
- Changes:
  - Bug fix: [Bugfix] fix offline text_to_image error from #1009
- Additions:
  - num-images-per-prompt

### vllm-omni-quantization
- Source: [PR #1515](vllm-project/vllm-omni#1515) - [Bugfix] fix offline text_to_image error from #1009
- Changes:
  - Bug fix: [Bugfix] fix offline text_to_image error from #1009

### vllm-omni-distributed
- Source: [PR #1515](vllm-project/vllm-omni#1515) - [Bugfix] fix offline text_to_image error from #1009
- Changes:
  - Bug fix: [Bugfix] fix offline text_to_image error from #1009

### vllm-omni-api
- Source: [PR #1509](vllm-project/vllm-omni#1509) - [Chore] remove unused logger in omni_diffusion (#531)

### vllm-omni-perf
- Source: [PR #1505](vllm-project/vllm-omni#1505) - [Doc] Update installation instructions for vllm 0.16.0

### vllm-omni-quantization
- Source: [PR #1505](vllm-project/vllm-omni#1505) - [Doc] Update installation instructions for vllm 0.16.0

### vllm-omni-distributed
- Source: [PR #1505](vllm-project/vllm-omni#1505) - [Doc] Update installation instructions for vllm 0.16.0

### vllm-omni-contrib
- Source: [PR #1505](vllm-project/vllm-omni#1505) - [Doc] Update installation instructions for vllm 0.16.0

### vllm-omni-video-gen
- Source: [PR #1504](vllm-project/vllm-omni#1504) - [Feature][Wan2.2] Speed up diffusion model startup by multi-thread weight loading
- Changes:
  - New feature: [Feature][Wan2.2] Speed up diffusion model startup by multi-thread weight loading

### vllm-omni-perf
- Source: [PR #1504](vllm-project/vllm-omni#1504) - [Feature][Wan2.2] Speed up diffusion model startup by multi-thread weight loading
- Changes:
  - New feature: [Feature][Wan2.2] Speed up diffusion model startup by multi-thread weight loading

### vllm-omni-api
- Source: [PR #1504](vllm-project/vllm-omni#1504) - [Feature][Wan2.2] Speed up diffusion model startup by multi-thread weight loading
- Changes:
  - New feature: [Feature][Wan2.2] Speed up diffusion model startup by multi-thread weight loading

### vllm-omni-cicd
- Source: [PR #1504](vllm-project/vllm-omni#1504) - [Feature][Wan2.2] Speed up diffusion model startup by multi-thread weight loading
- Changes:
  - New feature: [Feature][Wan2.2] Speed up diffusion model startup by multi-thread weight loading

### vllm-omni-contrib
- Source: [PR #1500](vllm-project/vllm-omni#1500) - [ROCm] [CI] [Docker] Point to use the latest vLLM v0.16.0 stable version

### vllm-omni-cicd
- Source: [PR #1492](vllm-project/vllm-omni#1492) - [Platform] Enable layerwise offload on all hardware

### vllm-omni-image-gen
- Source: [PR #1491](vllm-project/vllm-omni#1491) - [CI] Update Dockerfile for vllm-omni CI image and remove obsolete dep…

### vllm-omni-cicd
- Source: [PR #1488](vllm-project/vllm-omni#1488) - [XPU][NPU][ROCM] enable cpu_offloading flag for non_cuda

### vllm-omni-audio-tts
- Source: [PR #1482](vllm-project/vllm-omni#1482) - [Fix][Chore] Qwen3-TTS Modeling Minor Code Sanity Improvements
- Changes:
  - Bug fix: [Fix][Chore] Qwen3-TTS Modeling Minor Code Sanity Improvements

### vllm-omni-perf
- Source: [PR #1468](vllm-project/vllm-omni#1468) - [BugFix] process request.num_cached_tokens if it equals to the initial value
- Changes:
  - Bug fix: [BugFix] process request.num_cached_tokens if it equals to the initial value

### vllm-omni-audio-tts
- Source: [PR #1455](vllm-project/vllm-omni#1455) - [Bugfix] Fix case-sensitive task_type matching in Qwen3TTSModelForGeneration
- Changes:
  - Bug fix: [Bugfix] Fix case-sensitive task_type matching in Qwen3TTSModelForGeneration

### vllm-omni-cicd
- Source: [PR #1449](vllm-project/vllm-omni#1449) - [Test] Reduce Perf test case and fix modify stage config
- Changes:
  - Bug fix: [Test] Reduce Perf test case and fix modify stage config

### vllm-omni-cicd
- Source: [PR #1448](vllm-project/vllm-omni#1448) - [Bugfix] Race condition in MultiprocExecutor when concurent access to Scheduler
- Changes:
  - Bug fix: [Bugfix] Race condition in MultiprocExecutor when concurent access to Scheduler

### vllm-omni-cicd
- Source: [PR #1438](vllm-project/vllm-omni#1438) - [Qwen3TTS][Feat] Streaming output
- Changes:
  - New feature: [Qwen3TTS][Feat] Streaming output

### vllm-omni-api
- Source: [PR #1438](vllm-project/vllm-omni#1438) - [Qwen3TTS][Feat] Streaming output
- Changes:
  - New feature: [Qwen3TTS][Feat] Streaming output

### vllm-omni-contrib
- Source: [PR #1438](vllm-project/vllm-omni#1438) - [Qwen3TTS][Feat] Streaming output
- Changes:
  - New feature: [Qwen3TTS][Feat] Streaming output

### vllm-omni-audio-tts
- Source: [PR #1438](vllm-project/vllm-omni#1438) - [Qwen3TTS][Feat] Streaming output
- Changes:
  - New feature: [Qwen3TTS][Feat] Streaming output

### vllm-omni-cicd
- Source: [PR #1435](vllm-project/vllm-omni#1435) - [Doc][Test][Misc] ComfyUI test, more screenshot, and code cleaning

### vllm-omni-video-gen
- Source: [PR #1433](vllm-project/vllm-omni#1433) - [Debug] Multi-Request for Qwen 3 Omni use_audio_in_video

### vllm-omni-audio-tts
- Source: [PR #1433](vllm-project/vllm-omni#1433) - [Debug] Multi-Request for Qwen 3 Omni use_audio_in_video
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ready label to trigger buildkite CI

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants