I was trying to run deepmd in the background by adding '&' and then calling wait until the task finished. The job was submitted to a Slurm system, and the script looked like this:
#!/bin/bash
#SBATCH -p gpu
#SBATCH -N 1
#SBATCH --gpus-per-node=1
#SBATCH --cpus-per-gpu=18
export KMP_BLOCKTIME=0
export KMP_AFFINITY=granularity=fine,verbose,compact,1,0
export OMP_NUM_THREADS=9
export TF_INTRA_OP_PARALLELISM_THREADS=9
export TF_INTER_OP_PARALLELISM_THREADS=2
cd **
dp train input.json -l train.log --init-frz-model **.pb &
wait
The job kept running after the training process had finished, until I killed it manually.
It appears that the job was being kept alive by a separate process monitoring GPU performance 😂. I apologize for my carelessness.
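For anyone who hits the same thing: a bare `wait` blocks until every background child of the shell has exited, so a long-running monitor started from the same script would keep the job alive. Below is a minimal sketch of one way around it, assuming the monitor was launched in the background from the same batch script (that part is not shown in the script above). The two `sleep` commands are hypothetical stand-ins for the GPU monitor and for the `dp train ... &` line.

```bash
#!/bin/bash
# Hypothetical stand-in for a GPU monitoring process that never exits on its own.
sleep 600 &
monitor_pid=$!

# Hypothetical stand-in for the real training command run in the background.
sleep 1 &
train_pid=$!

# Wait only for the training process. A bare `wait` here would also
# wait for the monitor and the job would never finish.
wait "$train_pid"
train_status=$?

# Stop the monitor so the batch script can exit, then reap it.
kill "$monitor_pid"
wait "$monitor_pid" 2>/dev/null

echo "training exited with status $train_status"
```

Capturing `$!` right after each `&` is what makes it possible to wait on the training process alone and still clean up the monitor afterwards.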