[NPU] [Features] [Bugfix] Support mindiesd adaln #1537
gcanlin merged 14 commits into vllm-project:main from
Conversation
Signed-off-by: jiangmengyu18 <451528648@qq.com>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 8b3b5a0972
…yu18/vllm-omni into support-mindiesd-adaln
Signed-off-by: jiangmengyu18 <451528648@qq.com>
…yu18/vllm-omni into support-mindiesd-adaln
gcanlin
left a comment
Looks good now. Thanks!
cc @hsliuustc0106
lishunyang12
left a comment
Bugfix looks right -- the old code passing scale/shift as weight/bias to npu_layer_norm_eval was definitely wrong for batch > 1. Left one comment on the exception handling.
Co-authored-by: SYLAR <125541396+lishunyang12@users.noreply.github.com> Signed-off-by: Hongsheng Liu <liuhongsheng4@huawei.com>
```python
import torch_npu

output = torch_npu.npu_layer_norm_eval(
    x, normalized_shape=[self.hidden_size], weight=(1 + scale_result), bias=shift_result, eps=self.eps
)
```
Duplicated line. Remove one.
You mean the redundant logger.warning_once? I have a pull request to solve it.
lishunyang12
left a comment
Bugfix looks correct — applying scale/shift outside npu_layer_norm_eval avoids the silent broadcasting issue. Left a nit about a duplicated line.
Signed-off-by: jiangmengyu18 <451528648@qq.com>
Signed-off-by: jiangmengyu18 <451528648@qq.com> Signed-off-by: Hongsheng Liu <liuhongsheng4@huawei.com> Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com> Co-authored-by: SYLAR <125541396+lishunyang12@users.noreply.github.com>
Purpose
The following is a detailed explanation of the bug in the AdaLN implementation based on torch_npu.

First of all, `torch_npu.npu_layer_norm_eval` is an optimized NPU implementation of `torch.nn.functional.layer_norm`. It is used here because it is exactly equivalent to AdaLN when batch_size = 1, but it can cause precision issues when batch_size > 1.

Since AdaLN normalizes the last dimension of `x`, which is `d`, `normalized_shape` must be `[d]`, and the shapes of `weight` and `bias` must also be `[d]`. However, the shapes of `scale_result` and `shift_result` are `[b, 1, d]`. Although `torch_npu.npu_layer_norm_eval` does not raise an error when `weight` and `bias` have shape `[b, 1, d]`, it takes `weight[0][0]` and `bias[0][0]` for broadcasting instead of using the full weight and bias. This leads to precision issues when batch_size > 1.

For example:

error result:

reproduce the error result a second time:

correct result:

Test Plan
Test Result
The table below shows the time consumption of AdaLN when using native, torch_npu, and mindiesd, respectively.
Performance:

native:
torch_npu:

mindiesd:

Essential Elements of an Effective PR Description Checklist
`supported_models.md` and `examples` for a new model. Please run `mkdocs serve` to sync the documentation editions to `./docs`.

BEFORE SUBMITTING, PLEASE READ https://github.com/vllm-project/vllm-omni/blob/main/CONTRIBUTING.md (anything written below this line will be removed by GitHub Actions)