
Conversation

@henryxuxu0716
Collaborator

What this PR does / why we need it?

Do not perform NZ format conversion by default in floating-point scenarios.

Does this PR introduce any user-facing change?

How was this patch tested?

@henryxuxu0716 changed the title from "V0.11.0 dev" to "Unset NZ in BF16/FP" on Nov 25, 2025
@henryxuxu0716 changed the title from "Unset NZ in BF16/FP" to "Unset NZ in BF16/FP16" on Nov 25, 2025
Contributor

@gemini-code-assist bot left a comment


Code Review

This PR disables NZ conversion by default in floating-point scenarios. The change makes the behavior of the is_enable_nz function depend on the data type.

I found two critical issues:

  1. In vllm_ascend/utils.py, the is_enable_nz function has a side effect: it mutates global state. Once it is called with a floating-point dtype, NZ conversion is permanently disabled, which can affect subsequent operations that use integer dtypes.
  2. In vllm_ascend/ops/common_fused_moe.py, is_enable_nz is incorrectly called with tensor.type (a method) instead of tensor.dtype (the dtype attribute).

I recommend fixing both issues to ensure correctness and stability.

layer.w2_weight = torch.nn.Parameter(w2_data, requires_grad=False)

-if not is_310p() and is_enable_nz():
+if not is_310p() and is_enable_nz(layer.w13_weight.data.type):
Contributor


critical

The call to is_enable_nz here passes layer.w13_weight.data.type, which appears to be a bug. torch.Tensor.type is a method that returns a type string (e.g. 'torch.npu.FloatTensor'), not a torch.dtype object, and is_enable_nz expects a torch.dtype.

Use the .dtype attribute to get the tensor's data type.

Suggested change
-if not is_310p() and is_enable_nz(layer.w13_weight.data.type):
+if not is_310p() and is_enable_nz(layer.w13_weight.data.dtype):
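To make this concrete, here is a small standalone check (with a hypothetical tensor, not code from this PR) showing that .type is a method returning a string, so passing it into a dtype membership test can never match:

import torch

w = torch.zeros(4, 4, dtype=torch.bfloat16)
print(w.dtype)    # torch.bfloat16 -- a torch.dtype object
print(w.type())   # 'torch.BFloat16Tensor' -- a type string
print(w.dtype in [torch.float16, torch.bfloat16])  # True
print(w.type in [torch.float16, torch.bfloat16])   # False: a bound method never equals a dtype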

Comment on lines +81 to +82
if dtype in [torch.float16, torch.bfloat16]:
    _ENABLE_NZ = 0
Contributor


critical

Mutating the global variable _ENABLE_NZ inside is_enable_nz produces an unintended side effect. Once the function is called with a floating-point dtype (torch.float16 or torch.bfloat16), _ENABLE_NZ is permanently set to 0, so every subsequent call to is_enable_nz, regardless of dtype, returns 0 (i.e. False), disabling NZ format conversion entirely.

To avoid this side effect, do not mutate the global variable; instead, return the appropriate value directly based on dtype.

Suggested change
-if dtype in [torch.float16, torch.bfloat16]:
-    _ENABLE_NZ = 0
+if dtype in [torch.float16, torch.bfloat16]:
+    return 0
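
A minimal sketch of the side-effect-free shape recommended above, assuming _ENABLE_NZ is a module-level int flag as in vllm_ascend/utils.py (the surrounding names and default value are illustrative, not the PR's exact code):

import torch

_ENABLE_NZ = 1  # assumed module-level default; actual initialization may differ

def is_enable_nz(dtype=None):
    # Floating-point dtypes opt out of NZ conversion locally, without
    # mutating the global flag, so later calls with other dtypes still
    # observe the original setting.
    if dtype in [torch.float16, torch.bfloat16]:
        return 0
    return _ENABLE_NZ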

mazhixin000 and others added 4 commits November 25, 2025 10:36
…ctions (vllm-project#4370)

### What this PR does / why we need it?

Add single-node PD disaggregation instructions for the Qwen 2.5VL model.

### Does this PR introduce _any_ user-facing change?
no

---------

Signed-off-by: mazhixin <[email protected]>
Signed-off-by: mazhixin000 <[email protected]>
Co-authored-by: mazhixin <[email protected]>
Signed-off-by: 刘哲续 <[email protected]>
@github-actions

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

  • A PR should do only one thing; smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
  • Write the commit message and fill in the PR description to help reviewers and future developers understand the change.

If CI fails, you can run linting and testing checks locally according to the Contributing and Testing guides.
