Parallelism cannot be tuned by changing environment variables (TF_INTRA_OP_PARALLELISM_THREADS, OMP_NUM_THREADS) #2789
-
I'm trying to train a 4-element deep potential on a workstation equipped with 4 RTX 3080Ti cards and one i9-10900X CPU. The CPU has one socket and 10 cores. When only one GPU card is used, the training speed is around
Here two GPU cards are used, and the
However, the record.txt reads:
It seems that no matter how the environment variables are set, the parallelism settings do not change.
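For reference, a minimal sketch of the kind of setup in question, exporting the thread-count variables before launching a two-card run (the thread counts, device list, and input.json name are illustrative placeholders, not the actual values from the run above; the launch command follows the parallel-training documentation linked in the reply below):

```sh
# Sketch only: illustrative thread counts and device list
export OMP_NUM_THREADS=5                    # OpenMP threads per process
export TF_INTRA_OP_PARALLELISM_THREADS=5    # TensorFlow intra-op thread pool
export TF_INTER_OP_PARALLELISM_THREADS=2    # TensorFlow inter-op thread pool

# Two-card data-parallel training via Horovod (input.json is a placeholder name)
CUDA_VISIBLE_DEVICES=0,1 horovodrun -np 2 dp train input.json
```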
Replies: 2 comments
-
I'm totally new to parallel training in DP; I think I misunderstood the mechanism of parallel training. According to the manual (https://docs.deepmodeling.com/projects/deepmd/en/master/train/parallel-training.html#tuning-learning-rate), I should manually decrease "numb_steps". For example, I used to train the model for 8,000,000 steps. If I use two cards, should I then set numb_steps to 8,000,000 / 2 = 4,000,000 to achieve accuracy similar to the one-card run? Am I correct?
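For concreteness, the change being asked about would be a single key in the training section of input.json, e.g. for the two-card case (a fragment only; every other key in the input file is assumed to stay as it was):

```json
{
    "training": {
        "numb_steps": 4000000,
        "_comment": "8,000,000 steps on one card, halved to 4,000,000 on two cards"
    }
}
```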
-
Parallel training is equivalent to increasing the batch size.
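Put differently: in data-parallel training each global step consumes one mini-batch per card, so with N cards the effective batch size is roughly N times the single-card batch size. That is why, as discussed above, numb_steps can be reduced by about the same factor (e.g. 8,000,000 steps on one card vs. 4,000,000 on two) while the model still sees roughly the same total number of training samples.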