DeepSeek R1 unsloth 2.51-bit: multi-GPU shows no performance advantage over a single GPU #921
-
Test results: 8× 4090D shows no performance improvement over a single 4090D.
The corresponding optimization rules are as follows:
Replies: 3 comments 2 replies
-
@Azure-Tang, could you help with this?
-
You're correct. Currently, KT's multi-GPU implementation is pipeline-based and is designed for users who have multiple GPUs but limited VRAM on each device. At this stage, multi-GPU does not provide any speedup; it only makes it possible to deploy a model across several smaller GPUs. We are actively working on improving this to deliver performance gains in future releases.
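To illustrate why a pipeline-style layer split does not speed up single-request decoding, here is a toy latency model. It is only a sketch under stated assumptions, not KT's actual code: the function names, the layer count, and the per-layer and per-hop timings are all hypothetical. The point is that each token must pass through every layer in order, so only one GPU is busy at a time and the per-token time stays roughly the same while the per-GPU memory footprint shrinks.

```python
def split_layers(num_layers, num_gpus):
    """Assign contiguous layer ranges to GPUs, as evenly as possible."""
    base, extra = divmod(num_layers, num_gpus)
    stages, start = [], 0
    for gpu in range(num_gpus):
        size = base + (1 if gpu < extra else 0)
        stages.append(range(start, start + size))
        start += size
    return stages


def token_latency_ms(num_layers, num_gpus, layer_ms, hop_ms):
    """Per-token latency when one request flows through the stages in order.

    Stages run one after another for a single token, so compute time is the
    sum over all layers no matter how many GPUs hold them. Each boundary
    between two non-empty stages adds one activation transfer (hop_ms).
    """
    stages = [s for s in split_layers(num_layers, num_gpus) if len(s) > 0]
    compute = sum(len(s) * layer_ms for s in stages)
    hops = (len(stages) - 1) * hop_ms
    return compute + hops


def max_layers_per_gpu(num_layers, num_gpus):
    """Largest number of layers any single GPU has to hold."""
    return max(len(s) for s in split_layers(num_layers, num_gpus))


if __name__ == "__main__":
    # Hypothetical numbers chosen only for illustration.
    LAYERS, LAYER_MS, HOP_MS = 61, 1.0, 0.1
    for gpus in (1, 8):
        print(
            f"{gpus} GPU(s): "
            f"{token_latency_ms(LAYERS, gpus, LAYER_MS, HOP_MS):.1f} ms/token, "
            f"at most {max_layers_per_gpu(LAYERS, gpus)} layers per GPU"
        )
```

In this model, going from 1 to 8 GPUs cuts the layers held per GPU but leaves per-token latency essentially unchanged (slightly worse because of the transfers), which matches the behaviour reported in this thread.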
-
Quick question: what speed does this machine get running Q4 with a single CPU?