-
Either run it on 4 GPUs or buy another 2.
Given that you have generous GPU specs, another option would be to run 3 replicas, with each replica taking 2 GPUs.
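If you go the replica route, each replica is an independent server, so incoming requests need to be spread across the three endpoints. A minimal round-robin sketch (host names and ports are hypothetical, not from this thread; assumes each replica was started with something like `vllm serve MODEL --tensor-parallel-size 2`):

```python
from itertools import cycle

# Hypothetical endpoints for three vLLM replicas, each occupying 2 GPUs.
REPLICAS = [
    "http://host1:8000",
    "http://host2:8000",
    "http://host3:8000",
]

_rr = cycle(REPLICAS)

def next_replica() -> str:
    """Return the next replica URL in round-robin order."""
    return next(_rr)
```

In practice a reverse proxy (nginx, HAProxy) or a router in front of the replicas does the same job; this just illustrates the idea.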
-
Dear vLLM experts, I am trying to deploy vLLM in distributed mode. At our research institute we have 4 nodes, each with 1x A100, and they work well as a distributed Ray cluster. Now we have acquired another node with 2x L40S. Ray shows all 6 GPUs, but that one node has 2 GPUs. How do I start vLLM so it uses all the GPUs?
currently we use:
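For context, a multi-node vLLM launch over Ray generally looks something like the sketch below (a sketch only, not the poster's actual command; the model name, address, and parallelism split are placeholder assumptions). Note that vLLM's tensor parallelism generally expects homogeneous GPUs within a parallel group, so mixing A100s and L40S cards in one group may not work as hoped:

```shell
# On the head node:
ray start --head

# On each worker node, joining the head:
ray start --address=<head-node-ip>:6379

# Then launch vLLM from the head node; tensor_parallel * pipeline_parallel
# must equal the number of GPUs you want to use (here 2 * 3 = 6).
vllm serve meta-llama/Llama-3.1-8B-Instruct \
    --tensor-parallel-size 2 \
    --pipeline-parallel-size 3
```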