Hi, I was just wondering about the difference between the "single_gpu" and "data_parallel" training scripts, since they seem to have the same structure and modules, and also use the same API.
By the way, could you explain how to use the distributed one? I'm a little confused about how to set the URL and how to get started with it.
Thanks.