[Bug]: V0.9.1 hangs after enabling full_cuda_graph: true #4180

@csy0225

Your current environment

The output of `python collect_env.py`
Your output of above commands here

🐛 Describe the bug

vllm-ascend version: v0.9.1
Model: Qwen2.5-32B-Instruct
After enabling --compilation-config '{"full_cuda_graph": true}', inference hangs. A py-spy dump shows the process stuck at graph_task_update_end on the NPU side. How can I locate and resolve this?
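
For anyone trying to reproduce this, here is a minimal sketch of the two commands involved. Only the `--compilation-config` value and the use of `py-spy dump` come from the report above. The `vllm serve` entry point, the model path argument, and the PID are assumptions for illustration, and the helper function names are hypothetical.

```python
import json
import shlex


def build_serve_command(model: str, full_cuda_graph: bool) -> list[str]:
    """Build the argv for a server launch with the reported compilation config.

    Assumption: the server is started through the `vllm serve` entry point;
    the report only states the --compilation-config value.
    """
    # json.dumps emits lowercase `true`/`false`, matching the flag in the report.
    compilation_config = json.dumps({"full_cuda_graph": full_cuda_graph})
    return ["vllm", "serve", model, "--compilation-config", compilation_config]


def build_pyspy_dump_command(pid: int) -> list[str]:
    """Build the argv for dumping the Python stacks of a (hung) process."""
    return ["py-spy", "dump", "--pid", str(pid)]


if __name__ == "__main__":
    # Hypothetical model path and PID; substitute the real values.
    print(shlex.join(build_serve_command("/path/to/Qwen2.5-32B-Instruct", True)))
    print(shlex.join(build_pyspy_dump_command(12345)))
```

Running the same launch with `full_cuda_graph` set to `False` and comparing the two `py-spy dump` outputs may help confirm whether the hang is specific to full-graph capture.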
