
Conversation

@sarckk (Collaborator) commented Aug 29, 2025

Purpose

Multi-modal processor tests for H2OVLChatModel and KimiVLForConditionalGeneration are broken on trunk after merging #22628.

I was previously unable to reproduce this issue locally because it only manifests when these test cases run after the Gemma3n test case. The failures are:

H2OVLChatModel

TypeError: _LazyConfigMapping.__init__() missing 1 required positional argument: 'mapping'

KimiVLForConditionalGeneration (OOM)

(EngineCore_0 pid=1298081)   File "/data/users/yhshin/gitrepos/vllm/vllm/model_executor/models/moonvit.py", line 141, in sdpa_attention
(EngineCore_0 pid=1298081)     attn_output = F.scaled_dot_product_attention(q,
(EngineCore_0 pid=1298081)                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_0 pid=1298081) torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 254.47 GiB. GPU 0 has a total capacity of 94.99 GiB of which 78.18 GiB is free. Including non-PyTorch memory, this process has 16.80 GiB memory in use. Of the allocated memory 16.03 GiB is allocated by PyTorch, and 104.27 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid ...
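For context on where an allocation that large comes from: the non-fused scaled_dot_product_attention fallback materializes the full attention-score matrix, so memory grows quadratically with the number of vision tokens. A minimal back-of-the-envelope sketch, using illustrative batch/head/sequence numbers rather than the actual Kimi-VL configuration:

def sdpa_scores_gib(batch: int, num_heads: int, seq_len: int, bytes_per_elem: int = 2) -> float:
    # Size in GiB of the (batch, num_heads, seq_len, seq_len) attention-score
    # tensor that the non-fused SDPA path materializes (bf16/fp16 = 2 bytes).
    return batch * num_heads * seq_len ** 2 * bytes_per_elem / 1024 ** 3

# Illustrative only: roughly 92k vision tokens with 16 heads already lands
# near the 254 GiB allocation reported above.
print(f"{sdpa_scores_gib(batch=1, num_heads=16, seq_len=92_000):.0f} GiB")  # ~252 GiB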

This PR reverts the gemma3n changes from #22628 (commit c498483), which fixes the tests.

Test Plan

pytest tests/models/multimodal/processing/test_tensor_schema.py::test_model_tensor_schema -s 
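Since the failure only shows up when these cases run after the Gemma3n case, a narrower local repro might filter the parametrized cases down to just those models, for example (assuming the collected test IDs contain these architecture names; adjust the -k expression to match what pytest --collect-only actually reports):

pytest tests/models/multimodal/processing/test_tensor_schema.py::test_model_tensor_schema -s -k "Gemma3n or H2OVL or KimiVL"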

Test Result

The Multi-Modal Processor Test job should pass in CI.


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

@gemini-code-assist bot (Contributor) left a comment

Code Review

This PR reverts the gemma3n fast prefill feature to fix failing tests. The changes within the model files are correct. However, the revert is incomplete, leaving behind dead code, configuration options, and tests related to the removed feature. I've added a critical comment pointing out the need for a complete revert to maintain code health.

@heheda12345 enabled auto-merge (squash) August 29, 2025 04:49
@heheda12345 disabled auto-merge August 29, 2025 04:49

@heheda12345 (Collaborator) commented:

Need to ensure Multi-modal processor tests pass before merging this PR.

@heheda12345 added the ready label (ONLY add when PR is ready to merge/full CI is needed) Aug 29, 2025

@heheda12345 (Collaborator) commented:

@sarckk v1-test-e2e-plus-engine fails and is related to this PR.

@mergify bot added the v1 label Aug 29, 2025
@heheda12345 enabled auto-merge (squash) August 29, 2025 16:39
@zou3519 mentioned this pull request Aug 29, 2025
@simon-mo disabled auto-merge August 29, 2025 19:16
@simon-mo merged commit 8c3e199 into vllm-project:main Aug 29, 2025
40 of 42 checks passed
nopperl pushed a commit to pfnet/vllm that referenced this pull request Sep 3, 2025
MatthewBonanni pushed a commit to MatthewBonanni/vllm that referenced this pull request Sep 3, 2025
Signed-off-by: Yong Hoon Shin <[email protected]>
Signed-off-by: Matthew Bonanni <[email protected]>
MatthewBonanni pushed a commit to MatthewBonanni/vllm that referenced this pull request Sep 3, 2025
842974287 pushed a commit to 842974287/vllm that referenced this pull request Sep 3, 2025
Signed-off-by: Yong Hoon Shin <[email protected]>
Signed-off-by: Shiyan Deng <[email protected]>