-
Either run it on 4 GPUs or buy another 2.
Given that you have generous GPU specs, another option would be to run 3 replicas, with each replica taking 2 GPUs.
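If you go the replica route, each replica is an independent server, so incoming requests need to be spread across the three endpoints. A minimal round-robin sketch (host names and ports are hypothetical, not from this thread; assumes each replica was started with something like `vllm serve MODEL --tensor-parallel-size 2`):

```python
from itertools import cycle

# Hypothetical endpoints for three vLLM replicas, each occupying 2 GPUs.
REPLICAS = [
    "http://host1:8000",
    "http://host2:8000",
    "http://host3:8000",
]

_rr = cycle(REPLICAS)

def next_replica() -> str:
    """Return the next replica URL in round-robin order."""
    return next(_rr)
```

In practice a reverse proxy (nginx, HAProxy) or a router in front of the replicas does the same job; this just illustrates the idea.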
-
Dear vLLM experts, I am trying to deploy vLLM in distributed mode. At our research institute we have 4 nodes, each with 1x A100, and they work well as a distributed Ray cluster. Now we have acquired another node with 2x L40S. Ray shows all 6 GPUs, but that one node has 2 GPUs. How do I start vLLM so it uses all the GPUs?
currently we use:
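For context, a multi-node vLLM launch over Ray generally looks something like the sketch below (a sketch only, not the poster's actual command; the model name, address, and parallelism split are placeholder assumptions). Note that vLLM's tensor parallelism generally expects homogeneous GPUs within a parallel group, so mixing A100s and L40S cards in one group may not work as hoped:

```shell
# On the head node:
ray start --head

# On each worker node, joining the head:
ray start --address=<head-node-ip>:6379

# Then launch vLLM from the head node; tensor_parallel * pipeline_parallel
# must equal the number of GPUs you want to use (here 2 * 3 = 6).
vllm serve meta-llama/Llama-3.1-8B-Instruct \
    --tensor-parallel-size 2 \
    --pipeline-parallel-size 3
```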