ddp replicates whole script #7950
-
Hi all, I am running a script on a single node with multiple GPUs using DDP. It performs some preprocessing and then runs a testing loop:

```python
# ... some preprocessing code ...

model = LitModel(path_weights)
dataloader = DataLoader(dataset, batch_size=512,
                        shuffle=False, num_workers=32)
trainer = pl.Trainer(accelerator='ddp', gpus=8)
trainer.test(model, dataloader)
```

The issue I have is that the whole script seems to be executed several times: I can see that the preprocessing code is being called 8 times. It appears to be the same issue as posted on SO (https://stackoverflow.com/questions/66261729/pytorch-lightning-duplicates-main-script-in-ddp-mode). Thanks!
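For context, DDP spawns one copy of the whole script per GPU, so any module-level work runs once per process. A minimal sketch of the usual workaround, guarding the expensive step on the rank environment variable set by the DDP launcher (the `LOCAL_RANK` name and the `preprocess` helper here are illustrative assumptions, not from the original script):

```python
import os

def preprocess():
    # Stand-in for the expensive preprocessing in the original script.
    return "done"

# Under DDP each spawned process re-executes the entire script.
# LOCAL_RANK (set by the launcher) identifies which copy this is;
# only the rank-0 process performs the heavy work.
if os.environ.get("LOCAL_RANK", "0") == "0":
    result = preprocess()
```

Note that with this guard alone the non-zero ranks simply skip the work; they still need some way to obtain the result (e.g. reading it from disk, or a collective broadcast).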
Replies: 1 comment 1 reply
-
After reading more carefully I realized that this is what it is supposed to do. I would like to know, however, whether there is a way to avoid replicating some heavy calculations several times: say I want to do the preprocessing only once and share the results with all the other processes?
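One way to compute something once and share it is `torch.distributed.broadcast_object_list`, which sends an arbitrary picklable object from one rank to all others. A minimal self-contained sketch using the `gloo` (CPU) backend and two spawned processes; the address, port, and the dictionary payload are illustrative assumptions:

```python
import os
import torch.distributed as dist
import torch.multiprocessing as mp

def run(rank, world_size):
    # Minimal process-group setup for a single-node demo (assumed values).
    os.environ["MASTER_ADDR"] = "127.0.0.1"
    os.environ["MASTER_PORT"] = "29517"
    dist.init_process_group("gloo", rank=rank, world_size=world_size)

    # Only rank 0 performs the heavy computation; other ranks start empty.
    if rank == 0:
        result = {"weights": [1, 2, 3]}  # stand-in for an expensive result
    else:
        result = None

    # Broadcast the object from rank 0 to every process in the group.
    obj_list = [result]
    dist.broadcast_object_list(obj_list, src=0)
    result = obj_list[0]

    # Every rank now holds the same result.
    assert result == {"weights": [1, 2, 3]}
    dist.destroy_process_group()

if __name__ == "__main__":
    mp.spawn(run, args=(2,), nprocs=2)
```

Inside a `LightningModule` the same idea applies once the trainer has initialized the process group; alternatively, rank-0-only setup can be done in hooks Lightning runs on a single process, such as `prepare_data`.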