# Pytorch distributed training demo using a single device with multiple GPUs
Go through the whole process of distributed training by training a simple CNN model.
## Installation

OS X & Linux:

```sh
git clone git@github.com:SANJINGSHOU14/Pytorch-distributed-training-demo.git
```

Windows:

Just download the zip of this repository.
## Usage

Train the model on a single device with 4 GPUs:

```sh
torchrun \
    --standalone \
    --nnodes=1 \
    --nproc-per-node=4 \
    YOUR_TRAINING_SCRIPT.py (--arg1 ... train script args...)
```

Remember to replace `YOUR_TRAINING_SCRIPT.py` with your script name.
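`torchrun` spawns one worker process per GPU and sets the `RANK`, `LOCAL_RANK`, and `WORLD_SIZE` environment variables for each. A minimal sketch of what `YOUR_TRAINING_SCRIPT.py` might look like — the tiny CNN and random data are hypothetical stand-ins for the repo's actual model and dataset, and the script falls back to the `gloo` backend on CPU when no GPU is present:

```python
import os

import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP


def main():
    # torchrun normally sets these; the defaults allow a
    # single-process smoke test without the launcher.
    os.environ.setdefault("RANK", "0")
    os.environ.setdefault("WORLD_SIZE", "1")
    os.environ.setdefault("LOCAL_RANK", "0")
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")

    use_cuda = torch.cuda.is_available()
    dist.init_process_group(backend="nccl" if use_cuda else "gloo")
    local_rank = int(os.environ["LOCAL_RANK"])
    device = torch.device(f"cuda:{local_rank}") if use_cuda else torch.device("cpu")

    # Hypothetical tiny CNN standing in for the repo's model.
    model = nn.Sequential(
        nn.Conv2d(1, 8, kernel_size=3, padding=1),
        nn.ReLU(),
        nn.Flatten(),
        nn.Linear(8 * 28 * 28, 10),
    ).to(device)
    ddp_model = DDP(model, device_ids=[local_rank] if use_cuda else None)

    optimizer = torch.optim.SGD(ddp_model.parameters(), lr=0.01)
    loss_fn = nn.CrossEntropyLoss()

    # One step on random data just to show the loop shape; real code
    # would iterate over a DataLoader with a DistributedSampler.
    x = torch.randn(16, 1, 28, 28, device=device)
    y = torch.randint(0, 10, (16,), device=device)
    optimizer.zero_grad()
    loss = loss_fn(ddp_model(x), y)
    loss.backward()  # DDP all-reduces gradients across workers here
    optimizer.step()

    dist.destroy_process_group()
    return loss.item()


if __name__ == "__main__":
    print(f"loss: {main():.4f}")
```

Each worker builds the same model, and `DistributedDataParallel` synchronizes gradients during `backward()`, so all 4 GPU copies stay in step.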
## Development setup

```sh
pip install torch
```

## Meta

Chao – [email protected]
Distributed under the GNU GPL v3 license. See `LICENSE` for more information.
## Contributing

- Fork it (https://github.com/SANJINGSHOU14/Pytorch-distributed-training-demo/fork)
- Create your feature branch (`git checkout -b feature/fooBar`)
- Commit your changes (`git commit -am 'Add some fooBar'`)
- Push to the branch (`git push origin feature/fooBar`)
- Create a new Pull Request