2× Intel(R) Xeon(R) Gold 6326 CPU @ 2.90GHz, 512 GB of DDR4 memory, two A10 GPUs, only reaching 6 tokens/s: how can this be optimized? #1139
Replies: 2 comments 6 replies
-
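Restricting the run to a single GPU should improve performance (see the documentation for changing the environment variable or the Docker parameters, or install only one card).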
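As a minimal sketch of the environment-variable route mentioned above: CUDA applications only see the devices listed in `CUDA_VISIBLE_DEVICES`, so setting it to a single index before launching the server hides the second A10. The helper name `single_gpu_env` is hypothetical, and the launch command is left as a placeholder because the thread does not give it.

```python
import os


def single_gpu_env(gpu_index=0, base_env=None):
    """Return a copy of the environment that exposes only one GPU to CUDA.

    gpu_index is the device index as reported by the driver (assumed 0 here).
    base_env defaults to the current process environment.
    """
    env = dict(os.environ if base_env is None else base_env)
    # CUDA_VISIBLE_DEVICES restricts which devices a CUDA process can enumerate.
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_index)
    return env


if __name__ == "__main__":
    env = single_gpu_env(0)
    print("CUDA_VISIBLE_DEVICES =", env["CUDA_VISIBLE_DEVICES"])
    # Pass `env` to subprocess.run([...your launch command...], env=env);
    # the actual command is not stated in this thread, so it is omitted.
```

With Docker, the equivalent is to expose only one device to the container (for example through the `--gpus` option) instead of all of them.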
-
When I ran this on a single GPU, total memory usage was only 72 GB. I'm not sure whether I have something configured incorrectly.
-
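2× Intel(R) Xeon(R) Gold 6326 CPU @ 2.90GHz, 512 GB of DDR4 memory, two A10 GPUs. I'm running the Q4-quantized deepseek-r1 model with the DeepSeek-V3-Chat-multi-gpu.yaml optimization file, but it only reaches 6 tokens/s, and each of the two GPUs is currently using only a little over 6 GB of VRAM. How can I optimize this to improve performance?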