Precision synchronization issue fix #4429

DreamerLeader · 2025-11-25T07:57:00Z

What this PR does / why we need it?

Does this PR introduce any user-facing change?

How was this patch tested?

vLLM version: v0.11.0
vLLM main: vllm-project/vllm@2918c1b

github-actions · 2025-11-25T07:57:56Z

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

A PR should do only one thing; smaller PRs enable faster reviews.
Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
Write the commit message by filling in the PR description to help reviewers and future developers understand the change.

If CI fails, you can run the linting and testing checks locally according to Contributing and Testing.

gemini-code-assist

Code Review

This pull request aims to fix a precision synchronization issue. My review focuses on improving code efficiency and readability. I've identified a duplicated loop in mooncake_engine.py that impacts performance and suggested a refactoring. Additionally, I've pointed out a small simplification in mooncake_store_connector_v1.py to improve code clarity.

gemini-code-assist · 2025-11-25T07:58:40Z

vllm_ascend/distributed/mooncake/mooncake_store_connector_v1.py

+                skip_save = False
+                if num_computed_token >= len(request.prompt_token_ids):
+                    skip_save = True


This block of code can be simplified to a single line for better readability and conciseness, which is a standard Python practice.

Suggested change

-                skip_save = False
-                if num_computed_token >= len(request.prompt_token_ids):
-                    skip_save = True
+                skip_save = num_computed_token >= len(request.prompt_token_ids)
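
The one-line form behaves the same as the three-line form because the `>=` comparison already evaluates to a `bool`. A minimal sketch of that equivalence follows; the `Request` dataclass and the function names are hypothetical stand-ins for illustration, not code from this PR.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    # Hypothetical stand-in for the real request object; only the field
    # read by the check is modeled here.
    prompt_token_ids: list = field(default_factory=list)


def skip_save_verbose(num_computed_token, request):
    # Three-line form from the original diff.
    skip_save = False
    if num_computed_token >= len(request.prompt_token_ids):
        skip_save = True
    return skip_save


def skip_save_concise(num_computed_token, request):
    # One-line form from the suggestion: the comparison is already a bool.
    return num_computed_token >= len(request.prompt_token_ids)


def forms_agree(prompt_len):
    # Compare both forms for token counts below, at, and above the prompt length.
    request = Request(prompt_token_ids=list(range(prompt_len)))
    return all(
        skip_save_verbose(n, request) == skip_save_concise(n, request)
        for n in range(prompt_len + 3)
    )
```

Both forms skip the save once the computed token count reaches the prompt length, so the change is purely stylistic.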

LCAIZJ · 2025-11-25T12:43:26Z

vllm_ascend/distributed/mooncake/mooncake_engine.py

            if save_spec is None or not save_spec.can_save:
                continue
-
+            torch.npu.synchronize()


You probably need to delete torch.npu.current_stream().synchronize() from the https://github.com/vllm-project/vllm-ascend/blob/main/vllm_ascend/distributed/mooncake/kv_transfer.py file, right? Also, is there any difference between torch.npu.current_stream().synchronize() and torch.npu.synchronize()?
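
On the difference between the two calls: in `torch.cuda`, `Stream.synchronize()` waits only for work queued on that one stream, while `torch.cuda.synchronize()` waits for work on all streams of the device. The sketch below assumes `torch.npu` follows the same semantics, which this thread does not confirm. It is a pure-Python toy model with no NPU calls; `ToyStream`, `ToyDevice`, and the operation names are hypothetical and only illustrate why a stream-level wait can miss work queued on another stream.

```python
class ToyStream:
    """Toy queue of pending operations; not a real device stream."""

    def __init__(self, name):
        self.name = name
        self.pending = []

    def launch(self, op):
        # Queue work asynchronously: nothing is "finished" until a synchronize.
        self.pending.append(op)

    def synchronize(self):
        # Wait for (here: drain) only the work queued on this stream.
        done, self.pending = self.pending, []
        return done


class ToyDevice:
    """Toy device holding several streams."""

    def __init__(self):
        self.streams = {"default": ToyStream("default")}
        self.current = "default"

    def stream(self, name):
        # Return the named stream, creating it on first use.
        return self.streams.setdefault(name, ToyStream(name))

    def current_stream(self):
        return self.streams[self.current]

    def synchronize(self):
        # Device-wide wait: drain every stream, not just the current one.
        done = []
        for s in self.streams.values():
            done.extend(s.synchronize())
        return done


def demo():
    dev = ToyDevice()
    dev.current_stream().launch("attention")
    dev.stream("kv_copy").launch("copy_kv_block")

    # Stream-level wait: work on the other stream is still pending.
    dev.current_stream().synchronize()
    left_after_stream_sync = list(dev.stream("kv_copy").pending)

    # Device-level wait: nothing is left on any stream.
    dev.synchronize()
    left_after_device_sync = list(dev.stream("kv_copy").pending)
    return left_after_stream_sync, left_after_device_sync
```

Under that assumption, the device-wide call is the stronger (and potentially slower) barrier, so whether the existing stream-level call in `kv_transfer.py` becomes redundant depends on which stream the KV data is produced on.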

DreamerLeader added 2 commits November 25, 2025 15:48

Update mooncake_store_connector_v1.py

925caab

Update mooncake_engine.py

7a21d5f
