Understanding the replicated DataLoaders in DDP #9251
Unanswered
jatentaki asked this question in DDP / multi-GPU / multi-node
Replies: 1 comment
I have two questions regarding the behavior of `DataLoader`s when training on multiple GPUs with `ddp`/`ddp_spawn`. Let me first define my terms: I use "GPU worker" to mean the process driving each of the N GPUs for the model's forward/backward passes, and "data worker" to mean a process created by `torch.utils.data.DataLoader` to load and preprocess batches.

With `ddp`, the dataset is recreated for each GPU worker, but the total number of data workers seems to stay constant: does each GPU worker get its share of N_total_data_workers / N_gpu_workers? Is this documented somewhere?
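For reference, a minimal sketch of the kind of setup being asked about, assuming PyTorch Lightning with a `Trainer` that accepts `gpus` and `accelerator="ddp"` (flag names vary across Lightning versions); the `ToyModule` class and all values in it are hypothetical, not from the original post. The comments reflect standard PyTorch/Lightning behavior: `num_workers` is a per-`DataLoader` setting, and under `ddp` each GPU worker calls `train_dataloader()` and builds its own `DataLoader`.

```python
# Hypothetical sketch, not the original poster's code.
import torch
from torch.utils.data import DataLoader, TensorDataset
import pytorch_lightning as pl


class ToyModule(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(8, 1)

    def training_step(self, batch, batch_idx):
        x, y = batch
        return torch.nn.functional.mse_loss(self.layer(x), y)

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=0.1)

    def train_dataloader(self):
        dataset = TensorDataset(torch.randn(256, 8), torch.randn(256, 1))
        # Under ddp/ddp_spawn this method runs in every GPU worker process,
        # so each process creates its own DataLoader and its own
        # `num_workers` data workers; Lightning injects a DistributedSampler
        # so the processes draw disjoint shards of the dataset.
        return DataLoader(dataset, batch_size=32, num_workers=4)


if __name__ == "__main__":
    # Exact Trainer arguments depend on the Lightning version installed.
    trainer = pl.Trainer(gpus=2, accelerator="ddp", max_epochs=1)
    trainer.fit(ToyModule())
```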