split_qkv_rmsnorm_rope_kernel issue #212

@codeblind2333

When I launch Qwen3 with SGLang on a 910A, I get the following error. What does it mean?
Capturing batches (bs=24 avail_mem=3.45 GB): 0%| | 0/7 [00:06<?, ?it/s]
[2025-11-28 15:05:04 TP1] Scheduler hit an exception: Traceback (most recent call last):
File "/usr/local/python3.11.13/lib/python3.11/site-packages/sglang/srt/model_executor/cuda_graph_runner.py", line 344, in __init__
self.capture()
File "/usr/local/python3.11.13/lib/python3.11/site-packages/sglang/srt/model_executor/cuda_graph_runner.py", line 502, in capture
_capture_one_stream()
File "/usr/local/python3.11.13/lib/python3.11/site-packages/sglang/srt/model_executor/cuda_graph_runner.py", line 486, in _capture_one_stream
) = self.capture_one_batch_size(bs, forward, stream_idx)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/python3.11.13/lib/python3.11/site-packages/sglang/srt/model_executor/cuda_graph_runner.py", line 693, in capture_one_batch_size
run_once()
File "/usr/local/python3.11.13/lib/python3.11/site-packages/sglang/srt/model_executor/cuda_graph_runner.py", line 680, in run_once
logits_output_or_pp_proxy_tensors = forward(
^^^^^^^^
File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/python3.11.13/lib/python3.11/site-packages/sglang/srt/models/qwen3.py", line 427, in forward
hidden_states = self.model(
^^^^^^^^^^^
File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1784, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/python3.11.13/lib/python3.11/site-packages/sglang/srt/models/qwen2.py", line 362, in forward
hidden_states, residual = layer(
^^^^^^
File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1784, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/python3.11.13/lib/python3.11/site-packages/sglang/srt/models/qwen3.py", line 309, in forward
hidden_states, residual = self.layer_communicator.prepare_mlp(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/python3.11.13/lib/python3.11/site-packages/sglang/srt/layers/communicator.py", line 497, in prepare_mlp
return self._communicate_with_all_reduce_and_layer_norm_fn(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/python3.11.13/lib/python3.11/site-packages/sglang/srt/layers/communicator.py", line 780, in _gather_hidden_states_and_residual
_ = prepare_weight_cache(hidden_states, context.cache)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/python3.11.13/lib/python3.11/site-packages/sglang/srt/utils/common.py", line 664, in prepare_weight_cache
torch_npu.npu_prefetch(
File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/_ops.py", line 1243, in __call__
return self._op(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: The Inner error is reported as above. The process exits for this inner error, and the current working operator name is split_qkv_rmsnorm_rope_kernel.
Since the operator is called asynchronously, the stacktrace may be inaccurate. If you want to get the accurate stacktrace, please set the environment variable ASCEND_LAUNCH_BLOCKING=1.
Note: ASCEND_LAUNCH_BLOCKING=1 will force ops to run in synchronous mode, resulting in performance degradation. Please unset ASCEND_LAUNCH_BLOCKING in time after debugging.
[ERROR] 2025-11-28-15:05:03 (PID:53712, Device:1, RankID:-1) ERR00100 PTA call acl api failed.
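
The error message itself notes that the stack trace may be inaccurate because operators are launched asynchronously, and suggests setting ASCEND_LAUNCH_BLOCKING=1 to get an accurate one. A minimal sketch of doing that from Python, assuming the server is started from a wrapper script; the helper name `blocking_debug_env` is hypothetical and is not part of SGLang or torch_npu:

```python
import os


def blocking_debug_env(base_env=None):
    """Return a copy of an environment with synchronous Ascend op launch enabled.

    Hypothetical helper for illustration only. It does not touch the
    calling process's environment.
    """
    env = dict(os.environ if base_env is None else base_env)
    # Variable named in the error message: forces ops to run synchronously,
    # so the Python stack trace points at the call that actually failed.
    env["ASCEND_LAUNCH_BLOCKING"] = "1"
    return env


if __name__ == "__main__":
    env = blocking_debug_env()
    print("ASCEND_LAUNCH_BLOCKING =", env["ASCEND_LAUNCH_BLOCKING"])
```

The returned dictionary can be passed as `env=` to `subprocess.run` together with whatever launch command was used originally. As the message warns, synchronous mode degrades performance, so the variable should be unset again after debugging.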
