# Pytorch distributed training demo using a single device with multiple GPUs
Go through the whole process of distributed training by training a simple CNN model.
## Installation

OS X & Linux:

```sh
git clone git@github.com:SANJINGSHOU14/Pytorch-distributed-training-demo.git
```

Windows:

Just download the zip of this repository.
## Usage

Train the model on a single device with 4 GPUs:

```sh
torchrun \
    --standalone \
    --nnodes=1 \
    --nproc-per-node=4 \
    YOUR_TRAINING_SCRIPT.py (--arg1 ... train script args...)
```

Remember to replace `YOUR_TRAINING_SCRIPT.py` with your script name.
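`torchrun` spawns one worker process per GPU and sets the `RANK`, `LOCAL_RANK`, and `WORLD_SIZE` environment variables for each. A minimal sketch of what `YOUR_TRAINING_SCRIPT.py` might look like — the tiny CNN and random data are hypothetical stand-ins for the repo's actual model and dataset, and the script falls back to the `gloo` backend on CPU when no GPU is present:

```python
import os

import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP


def main():
    # torchrun normally sets these; the defaults allow a
    # single-process smoke test without the launcher.
    os.environ.setdefault("RANK", "0")
    os.environ.setdefault("WORLD_SIZE", "1")
    os.environ.setdefault("LOCAL_RANK", "0")
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")

    use_cuda = torch.cuda.is_available()
    dist.init_process_group(backend="nccl" if use_cuda else "gloo")
    local_rank = int(os.environ["LOCAL_RANK"])
    device = torch.device(f"cuda:{local_rank}") if use_cuda else torch.device("cpu")

    # Hypothetical tiny CNN standing in for the repo's model.
    model = nn.Sequential(
        nn.Conv2d(1, 8, kernel_size=3, padding=1),
        nn.ReLU(),
        nn.Flatten(),
        nn.Linear(8 * 28 * 28, 10),
    ).to(device)
    ddp_model = DDP(model, device_ids=[local_rank] if use_cuda else None)

    optimizer = torch.optim.SGD(ddp_model.parameters(), lr=0.01)
    loss_fn = nn.CrossEntropyLoss()

    # One step on random data just to show the loop shape; real code
    # would iterate over a DataLoader with a DistributedSampler.
    x = torch.randn(16, 1, 28, 28, device=device)
    y = torch.randint(0, 10, (16,), device=device)
    optimizer.zero_grad()
    loss = loss_fn(ddp_model(x), y)
    loss.backward()  # DDP all-reduces gradients across workers here
    optimizer.step()

    dist.destroy_process_group()
    return loss.item()


if __name__ == "__main__":
    print(f"loss: {main():.4f}")
```

Each worker builds the same model, and `DistributedDataParallel` synchronizes gradients during `backward()`, so all 4 GPU copies stay in step.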
## Development setup

```sh
pip install torch
```

## Meta

Chao – [email protected]
Distributed under the GNU GPL v3 license. See `LICENSE` for more information.
## Contributing

- Fork it (https://github.com/SANJINGSHOU14/Pytorch-distributed-training-demo/fork)
- Create your feature branch (`git checkout -b feature/fooBar`)
- Commit your changes (`git commit -am 'Add some fooBar'`)
- Push to the branch (`git push origin feature/fooBar`)
- Create a new Pull Request