Description
Your current environment
The output of `python collect_env.py`
vllm: v0.11.0
vllm-ascend: v0.11.0-dev
🐛 Describe the bug
Run:
export VLLM_ASCEND_ENABLE_FLASHCOMM1=1
vllm serve /root/.cache/modelscope/hub/models/Qwen/Qwen2.5-VL-7B-Instruct \
--max_model_len 16384 \
--max-num-batched-tokens 16384 \
--tensor-parallel-size 4
It crashes immediately:
(Worker_TP0 pid=51886) ERROR 11-14 08:13:55 [multiproc_executor.py:671] WorkerProc hit an exception.
(Worker_TP0 pid=51886) ERROR 11-14 08:13:55 [multiproc_executor.py:671] Traceback (most recent call last):
(Worker_TP0 pid=51886) ERROR 11-14 08:13:55 [multiproc_executor.py:671] File "/vllm-workspace/vllm/vllm/v1/executor/multiproc_executor.py", line 666, in worker_busy_loop
(Worker_TP0 pid=51886) ERROR 11-14 08:13:55 [multiproc_executor.py:671] output = func(*args, **kwargs)
(Worker_TP0 pid=51886) ERROR 11-14 08:13:55 [multiproc_executor.py:671] File "/vllm-workspace/vllm-ascend/vllm_ascend/worker/worker_v1.py", line 227, in determine_available_memory
(Worker_TP0 pid=51886) ERROR 11-14 08:13:55 [multiproc_executor.py:671] self.model_runner.profile_run()
(Worker_TP0 pid=51886) ERROR 11-14 08:13:55 [multiproc_executor.py:671] File "/vllm-workspace/vllm-ascend/vllm_ascend/worker/model_runner_v1.py", line 2557, in profile_run
(Worker_TP0 pid=51886) ERROR 11-14 08:13:55 [multiproc_executor.py:671] hidden_states = hidden_states[logit_indices]
(Worker_TP0 pid=51886) ERROR 11-14 08:13:55 [multiproc_executor.py:671] RuntimeError: copy_between_host_and_device_opapi:build/CMakeFiles/torch_npu.dir/compiler_depend.ts:56 NPU function error: SUSPECT REMOTE ERROR, error code is 507057