Commit 9462399

Updated deprecated Slurm variable name (#191)
`SLURM_GPUS_PER_TASK` seems to be deprecated on Alps: it expands to an empty string, so the current submission script fails when that value is passed to `torchrun` as `--nproc-per-node`. `SLURM_GPUS_ON_NODE` returns the correct number of GPUs.
1 parent 7fc845c commit 9462399

1 file changed (+1, -1 lines)

docs/guides/mlp_tutorials/llm-nanotron-training.md

Lines changed: 1 addition & 1 deletion
@@ -229,7 +229,7 @@ srun -ul --environment=nanotron bash -c "
 --master-addr=\${MASTER_ADDR} \
 --master-port=\${MASTER_PORT} \
 --nnodes=\${SLURM_NNODES} \
---nproc-per-node=\${SLURM_GPUS_PER_TASK} \
+--nproc-per-node=\${SLURM_GPUS_ON_NODE} \
 \"
 
 torchrun \${TORCHRUN_ARGS} run_train.py --config-file examples/config_tiny_llama_wikitext.yaml
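
The failure described in the commit message came from an unset variable silently expanding to an empty string. A minimal sketch of a guard against that, assuming it runs inside the Slurm job step: the function name `resolve_nproc_per_node` is hypothetical and not part of the repository. It prints the value of `SLURM_GPUS_ON_NODE` and fails with an explicit message when the variable is unset or empty.

```shell
# Hypothetical helper, not part of the original submission script.
# Prints the per-node process count, or fails if Slurm did not provide it.
resolve_nproc_per_node() {
    # Treat an unset variable the same as an empty one.
    if [ -z "${SLURM_GPUS_ON_NODE:-}" ]; then
        echo "error: SLURM_GPUS_ON_NODE is empty; cannot set --nproc-per-node" >&2
        return 1
    fi
    echo "${SLURM_GPUS_ON_NODE}"
}
```

Inside the job step, the result could be captured with something like `NPROC="$(resolve_nproc_per_node)" || exit 1` before building the `torchrun` arguments, so a missing value stops the job early instead of surfacing as a launcher error.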
