diff --git a/pytorch-ddp-accelerate-transformers.md b/pytorch-ddp-accelerate-transformers.md
index 07ac39f725..8815b7f5f2 100644
--- a/pytorch-ddp-accelerate-transformers.md
+++ b/pytorch-ddp-accelerate-transformers.md
@@ -173,7 +173,7 @@ The optimizer needs to be declared based on the model *on the specific device* (
 Lastly, to run the script PyTorch has a convenient `torchrun` command line module that can help. Just pass in the number of nodes it should use as well as the script to run and you are set:
 
 ```bash
-torchrun --nproc_per_nodes=2 --nnodes=1 example_script.py
+torchrun --nproc_per_node=2 --nnodes=1 example_script.py
 ```
 
 The above will run the training script on two GPUs that live on a single machine and this is the barebones for performing only distributed training with PyTorch.
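
To clarify the two flags involved in the fix above: `--nproc_per_node` sets how many worker processes (typically one per GPU) torchrun spawns on each machine, while `--nnodes` sets how many machines participate. A minimal sketch of a two-machine, two-GPU-per-machine launch, assuming placeholder hostname `example.host`, port `29500`, and script name `example_script.py`, run once per machine with its own `--node_rank`:

```bash
# Machine 0 (hosts the rendezvous; hostname and port are placeholders)
torchrun --nnodes=2 --nproc_per_node=2 --node_rank=0 \
  --master_addr=example.host --master_port=29500 example_script.py

# Machine 1 (points at the same rendezvous address)
torchrun --nnodes=2 --nproc_per_node=2 --node_rank=1 \
  --master_addr=example.host --master_port=29500 example_script.py
```

With `--nnodes=1`, as in the corrected command in the patch, the rendezvous arguments can be omitted and torchrun simply spawns two local processes, one per GPU.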