Non-linear performance scaling on CPU over 8 physical cores? #10993
Closed
drHuangMHT
started this conversation in
General
Replies: 1 comment
-
Just upgraded from Windows Server 2019 to 2022 and the problem is solved, sort of. The time taken per iteration is halved, but overall utilization is still around 65% with one instance. Closing for now.
-
I know running models on CPUs is a dumb idea, but I'm seeing some weird behavior on my dual-socket EPYC system.
When I launch one instance of webui, utilization sits around 67% and not all logical cores are used. With two instances, the time per iteration stays the same, so output effectively doubles. However, when I upgraded from 8 physical cores to 16 physical cores per socket, the inference time did not change: performance is identical to the 8-core setup despite having twice as many CPU cores. Cinebench results scaled as expected.
I can hear you guys screaming "buy a used GPU pls", but is this expected?
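To put a number on the observation above, here is a minimal sketch of the usual scaling-efficiency arithmetic (speedup divided by the increase in core count). The function name `scaling_efficiency` and the 10-second timing are hypothetical, chosen only for illustration; the post gives no absolute timings, only that the per-iteration time was unchanged after doubling the cores.

```python
def scaling_efficiency(time_before: float, cores_before: int,
                       time_after: float, cores_after: int) -> float:
    """Return parallel scaling efficiency when moving between two core counts.

    1.0 means perfectly linear scaling; 0.5 means doubling the cores
    produced no speedup at all.
    """
    speedup = time_before / time_after        # how much faster one iteration became
    core_ratio = cores_after / cores_before   # how many times more cores were added
    return speedup / core_ratio


# Hypothetical timing (assumption): only the ratio matters here, since the
# post reports the same per-iteration time on 8 and 16 cores per socket.
seconds_per_iteration = 10.0
efficiency = scaling_efficiency(seconds_per_iteration, 8,
                                seconds_per_iteration, 16)
print(f"scaling efficiency: {efficiency:.2f}")
```

An efficiency of 0.5 for a doubling of cores means the extra cores contributed nothing to a single instance, which matches the description of one instance not using all logical cores while a second instance still doubles total output.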