Resources on how to use the GPU clusters: das5, ivi and LISA.
-
DO NOT stay idle on a GPU node.
-
Learn on how to create sessions with either
screenortmux. Opening a session and then submitting your job allows you to disconnect and then later resume (from a different location) and monitor the progress.tmuxis not available on lisa. -
nvidia-smiis useful to check the usage of GPUs on a node. -
CUDA_VISIBLE_DEVICES=0 python myscript.pywill only make the GPU:0 visible to python. Alternatively, you can specify it within your python script. -
You can create a bash script with multiple parallel jobs that you will submit via
slurm. Here is an example:
CUDA_VISIBLE_DEVICES=0 python exp1.py & \
CUDA_VISIBLE_DEVICES=1,2 python exp2.py & \
CUDA_VISIBLE_DEVICES=3 python exp3.py & \
waitA quick tutorial on how to use das4 is available here.