[Multi-GPUs] Issue with Significant Discrepancy in Task Completion Time Between GPUs #2238
2 comments · 3 replies
-
Maybe the Python GIL is the cause 🤔 Try Linux to see if the problem persists.
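If the GIL really is the bottleneck, one quick way to test it without switching OS is to give each GPU its own process instead of its own thread. A minimal sketch, assuming the openai-whisper Python API (`whisper.load_model` / `model.transcribe`) and placeholder audio file names:

```python
# Minimal sketch: run one transcription per GPU in separate OS processes,
# which bypasses the GIL entirely. Model size and audio paths are placeholders.
import multiprocessing as mp

import whisper


def transcribe_on_gpu(gpu_index: int, audio_path: str) -> None:
    # Each process loads its own copy of the model on its own device.
    model = whisper.load_model("medium", device=f"cuda:{gpu_index}")
    result = model.transcribe(audio_path)
    print(f"GPU {gpu_index}: {len(result['text'])} characters transcribed")


if __name__ == "__main__":
    # "spawn" avoids CUDA initialization problems that fork can cause.
    mp.set_start_method("spawn")
    jobs = [
        mp.Process(target=transcribe_on_gpu, args=(0, "audio_0.mp3")),
        mp.Process(target=transcribe_on_gpu, args=(1, "audio_1.mp3")),
    ]
    for job in jobs:
        job.start()
    for job in jobs:
        job.join()
```

If per-GPU speed evens out when every GPU has its own process, the slowdown is host-side (Python threading/GIL) rather than a hardware limit.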
-
Yes, I've run into this problem as well. I even set the PCIe lanes to 1x for all slots and got the same issue: the first GPU runs at full speed, then there's massive degradation in the 2nd and 3rd GPUs' performance. Just to really confirm this, I put all GPUs on 1x PCIe risers, and it made no difference: same behavior. It seems like there is some power-limit throttling going on, where the subsequent GPUs can't ramp up and just limp along. I also suspect this has something to do with how Whisper is interfacing with the hardware and/or PyTorch. I can run multiple Stable Diffusion instances on multiple GPUs on the same machine with no issue and it behaves normally; only Whisper does this. This is very frustrating, I hope the team can take a look at it :)
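One way to check the power-throttling suspicion would be to log clocks, power draw, and utilization per GPU while the jobs run: if the slow GPUs show high SM clocks but low utilization, the bottleneck is the host process rather than a power limit. A rough diagnostic sketch polling nvidia-smi's query interface (the polling interval and field list are just examples):

```python
# Poll per-GPU power draw, SM clock, utilization, and performance state
# once per second while the Whisper jobs run.
import subprocess
import time

QUERY = "index,power.draw,clocks.sm,utilization.gpu,pstate"

while True:
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=True,
    )
    print(out.stdout.strip())
    print("-" * 40)
    time.sleep(1)
```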
-
**Additional Context:**
Could you please assist in identifying and resolving this performance discrepancy so that both GPUs work at their optimal efficiency?
Additionally, I ran whisper.cpp, and both GPUs performed excellently, running at full efficiency. I therefore believe the issue lies in Whisper's model optimization, which results in one GPU performing well while the other does not.
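For an apples-to-apples comparison with whisper.cpp (which is typically run as one process per GPU), a hedged sketch that launches one openai-whisper CLI process per GPU, each pinned with CUDA_VISIBLE_DEVICES; file names and model size are placeholders:

```python
# Launch one whisper CLI process per GPU, isolated at the driver level via
# CUDA_VISIBLE_DEVICES, so each process sees only its assigned device.
import os
import subprocess

AUDIO = {0: "audio_0.mp3", 1: "audio_1.mp3"}

procs = []
for gpu_index, audio_path in AUDIO.items():
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu_index))
    procs.append(
        subprocess.Popen(
            ["whisper", audio_path, "--model", "medium", "--device", "cuda"],
            env=env,
        )
    )

for proc in procs:
    proc.wait()
```

Unlike selecting `cuda:N` inside a single Python process, CUDA_VISIBLE_DEVICES hides the other GPUs from each process entirely, so the processes cannot contend for the first GPU's context.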