-
I was trying to run parallel training with the CPU version of deepmd-kit to compare performance between CPU single-thread, multi-thread, and multi-node runs. Command -- hostfile Content -- I want to know why it is not running in parallel, or whether this feature is limited to the GPU version of deepmd-kit. Usually, we are able to launch multiple processes for regular code via mpirun. If I want to train on a CPU cluster with multiple cores, how should I go about it?
Replies: 2 comments 2 replies
-
Did you install horovod?
-
Thanks, @njzjz. It worked after installing horovod.
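For anyone hitting the same problem, a quick way to confirm the environment before launching is to check that horovod is importable and that an MPI launcher is on the PATH. This is only a minimal sketch using the Python standard library. The helper names `horovod_available` and `mpirun_available` are made up for illustration and are not part of deepmd-kit or horovod.

```python
import importlib.util
import shutil


def horovod_available() -> bool:
    """Return True if the 'horovod' package can be found in this Python environment."""
    # find_spec returns None when a top-level package is not installed.
    return importlib.util.find_spec("horovod") is not None


def mpirun_available() -> bool:
    """Return True if an 'mpirun' executable is found on the PATH."""
    return shutil.which("mpirun") is not None


if __name__ == "__main__":
    # Hypothetical pre-flight check; run it in the same environment used for training.
    print("horovod importable:", horovod_available())
    print("mpirun on PATH:", mpirun_available())
```

If the first check prints `False`, the training will likely not run in parallel, which matches the fix reported above.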