Description
Your current environment
The output of above commands
npu-smi 25.2.0
CANN version=8.3.RC1.alpha003
sentence-transformers 5.1.2
torch 2.7.1+cpu
torch_npu 2.7.1.dev20250724
torchaudio 2.8.0
torchvision 0.22.1
transformers 4.57.1
vllm 0.11.1.dev0+gb8b302cde.d20251107.empty /workspace/vllm
vllm_ascend 0.11.0rc1.dev147+gd0086d432 /workspace/vllm-ascend
How would you like to use vllm on ascend
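Problem: On a single node with 16 cards, the GLM4.6-W8A8 model can be deployed with a sequence length of up to 198k. On a two-node, 16-card setup, dp2 with tp8 runs out of device memory, so tp16 is required. Following the reference example for the launch parameters, environment variables, and ranktable, the service fails to start with an error. With ray the service does start, but only in single-operator (eager) mode; NPU registration also fails intermittently during ray startup, and graph mode cannot be brought up at all.
Logs, scripts, and the ranktable are in the attachments.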
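For readers comparing the two layouts mentioned above, here is a minimal sketch of the arithmetic. It is not taken from the attached scripts. `ParallelLayout` is a hypothetical helper, and it assumes an idealized case in which weights are sharded evenly across tensor-parallel ranks and replicated across data-parallel replicas. KV cache, activations, and framework overhead are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParallelLayout:
    """Hypothetical helper describing a dp x tp layout (illustration only)."""

    dp: int  # data-parallel replicas
    tp: int  # tensor-parallel ranks per replica

    @property
    def world_size(self) -> int:
        # Each DP replica spans `tp` devices.
        return self.dp * self.tp

    def weight_fraction_per_device(self) -> float:
        # Idealized assumption: weights shard evenly across TP ranks
        # and are fully replicated across DP replicas.
        return 1.0 / self.tp


dp2_tp8 = ParallelLayout(dp=2, tp=8)
tp16 = ParallelLayout(dp=1, tp=16)

# Both layouts occupy the same 16 devices.
print(dp2_tp8.world_size, tp16.world_size)

# Under the assumption above, each device holds twice as much
# of the weights with dp2/tp8 as with tp16.
print(dp2_tp8.weight_fraction_per_device() / tp16.weight_fraction_per_device())
```

Under these assumptions both layouts use all 16 devices, but dp2 with tp8 keeps 1/8 of the weights on each device while tp16 keeps 1/16, which is consistent with dp2/tp8 running out of memory first.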