Commit 3dd5f2a

committed
adjusted OMP_NUM_THREADS
1 parent 8e57af6 commit 3dd5f2a

File tree

1 file changed (+5 −2 lines changed)


docs/software/ml/pytorch.md

Lines changed: 5 additions & 2 deletions
```diff
@@ -333,7 +333,7 @@ However, this workflow is more involved and intended for advanced Spack users.
 #################################
 # OpenMP environment variables #
 #################################
-export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK # (2)!
+export OMP_NUM_THREADS=8 # (2)!
 
 #################################
 # PyTorch environment variables #
```
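The change above pins `OMP_NUM_THREADS` to a fixed value. As a non-authoritative sketch, the thread count could instead be derived from the Slurm allocation at job start; `NPROC_PER_TASK` and the `:-16`/`:-2` fallbacks below are illustrative assumptions, not part of the original script:

```shell
#!/bin/bash
# Sketch: derive OMP_NUM_THREADS from the Slurm allocation instead of
# hard-coding it. NPROC_PER_TASK (hypothetical) is the number of worker
# processes each task is expected to fork; the fallback values are
# assumptions for running outside a Slurm job.
CPUS_PER_TASK=${SLURM_CPUS_PER_TASK:-16}
NPROC_PER_TASK=${NPROC_PER_TASK:-2}
export OMP_NUM_THREADS=$(( CPUS_PER_TASK / NPROC_PER_TASK ))
echo "OMP_NUM_THREADS=$OMP_NUM_THREADS"
```

With the assumed 16 CPUs per task and 2 forked processes this yields 8 threads, the same value the commit hard-codes.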
```diff
@@ -378,7 +378,10 @@ However, this workflow is more involved and intended for advanced Spack users.
 
 1. The `--uenv` option is used to specify the uenv to use for the job.
    The `--view=default` option is used to load all the packages provided by the uenv.
-2. Only set `OMP_NUM_THREADS` if you are using OpenMP in your code.
+2. Set `OMP_NUM_THREADS` if you are using OpenMP in your code.
+   The number of threads should not be greater than the number of cores per task (`$SLURM_CPUS_PER_TASK`).
+   The optimal number depends on the workload and should be determined by testing.
+   For example, typical PyTorch workloads may fork processes, so the number of threads should be roughly the number of cores per task divided by the number of processes.
 3. These variables are used by PyTorch to initialize the distributed backend.
    The `MASTER_ADDR` and `MASTER_PORT` variables are used to determine the address and port of the master node.
    Additionally, we also need `RANK` and `LOCAL_RANK`, but these must be set per-process; see below.
```
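Note 3 says `RANK` and `LOCAL_RANK` must be set per process. A hedged sketch of doing so from Slurm's per-process variables (`SLURM_PROCID` is the global task rank, `SLURM_LOCALID` the rank within the node); the `localhost`/`29500` fallbacks are assumptions for running outside a job, 29500 being a commonly used PyTorch default port:

```shell
#!/bin/bash
# Sketch: set the distributed-backend variables for each process.
# SLURM_PROCID (global rank) and SLURM_LOCALID (rank within the node) are
# provided by Slurm per task; the fallbacks below are assumptions for
# running outside a Slurm job.
export RANK=${SLURM_PROCID:-0}
export LOCAL_RANK=${SLURM_LOCALID:-0}
export MASTER_ADDR=${MASTER_ADDR:-localhost}   # assumed fallback
export MASTER_PORT=${MASTER_PORT:-29500}       # common PyTorch default port
echo "rank=$RANK local_rank=$LOCAL_RANK master=$MASTER_ADDR:$MASTER_PORT"
```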
