MJX GPU vs CPU performance #2812
Replies: 1 comment
-
Hi Matei, it is the latter (total SPS). As I've said many times in the past, GPUs are not faster than CPUs, they are simply, on average, bigger. A CPU with the same transistor count as, say, an NVIDIA 4090 is, unfortunately, a rare thing. But if you have access to a high core-count CPU (the 96-core Threadripper, the 144-core NVIDIA Grace, etc.), and you don't need high-throughput transfers to a discrete chip (as one does when doing RL on anything other than Apple Silicon), then just stick to CPUs for your physics. All the difficulties GPUs have with branching and early termination are gone, and you get your choice of 32- or 64-bit numerical precision. I won't even tell you how much easier it is to debug the code, since that's my problem... but OMG. I wish it weren't the case that we had to write entirely new codebases just because the market for powerful CPUs is small. Unfortunately, the benefits GPUs offer for trivially parallelizable code (neural networks) are large and the demand is huge, so we can expect users to keep pairing beefy GPUs with puny CPUs for many years to come 😞 Feel free to send a PR to clarify.
-
Looking at the (terrific) MJX documentation, I have a question regarding the performance comparison in the Sharp Bits section.
Comparing the 64-core CPU against the A100 GPU, my understanding of the data is that the CPU is running 128 parallel instances of the scene (2 × the number of cores), while the GPU is running 8,192 parallel instances. However, the CPU gets more steps per second, even for a single humanoid (2.7M vs. 950K).
Are these numbers in steps per second per instance, or total steps per second? If it's the latter, then it would mean the CPU wins out even for a single humanoid. If it's the former, the GPU is clearly ahead, as it's running two orders of magnitude more scenes.
Thanks a lot!
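For readers who want to see the two readings side by side, here is a minimal sketch of the arithmetic. It uses only the figures quoted in this post (2.7M vs. 950K steps per second, 128 vs. 8,192 instances); the instance counts are my reading of the documentation rather than confirmed values, and the variable and function names are made up for illustration.

```python
# Figures as quoted in this thread; the instance counts are the asker's
# reading of the MJX docs and are treated here as assumptions.
cpu_total_sps = 2.7e6      # 64-core CPU, reported steps per second
cpu_instances = 2 * 64     # assumed: 2 instances per core
gpu_total_sps = 950e3      # A100 GPU, reported steps per second
gpu_instances = 8192       # assumed batch size on the GPU


def per_instance_sps(total_sps: float, instances: int) -> float:
    """Steps per second advanced by each individual scene instance."""
    return total_sps / instances


cpu_per_instance = per_instance_sps(cpu_total_sps, cpu_instances)
gpu_per_instance = per_instance_sps(gpu_total_sps, gpu_instances)

# If the reported numbers are totals, the CPU leads on aggregate throughput,
# and each CPU instance also advances far faster than each GPU instance.
print(f"CPU: {cpu_total_sps:,.0f} total, {cpu_per_instance:,.1f} per instance")
print(f"GPU: {gpu_total_sps:,.0f} total, {gpu_per_instance:,.1f} per instance")
```

Under the "total SPS" reading, each CPU instance advances at roughly 21,000 steps per second and each GPU instance at roughly 116, so the CPU is ahead both in aggregate and per instance for this scene.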