How exactly does replace_sampler_ddp work in lightning source code? #9057
Unanswered
wangchu1
asked this question in Lightning Trainer API: Trainer, LightningModule, LightningDataModule
Replies: 1 comment 2 replies
-
Dear @fate3439, This is actually done here: https://github.com/PyTorchLightning/pytorch-lightning/blob/e1442d247e0e4967dd2772bdcf5166226c974f89/pytorch_lightning/trainer/data_loading.py#L307. Here is the flow: FitLoop -> reset_train_dataloader -> call model.train_dataloader or datamodule.train_dataloader, then apply modifications to the DataLoader, such as injecting the distributed sampler into it. Best,
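To make that concrete, here is a minimal sketch of what that sampler injection roughly looks like. The helper name `add_distributed_sampler` and the exact set of attributes copied over are illustrative assumptions, not Lightning's actual internals:

```python
import torch.distributed as dist
from torch.utils.data import DataLoader
from torch.utils.data.distributed import DistributedSampler

def add_distributed_sampler(loader: DataLoader, shuffle: bool) -> DataLoader:
    """Hypothetical sketch of the replace_sampler_ddp=True behaviour:
    rebuild the user's DataLoader around a DistributedSampler.
    Assumes torch.distributed has already been initialised."""
    sampler = DistributedSampler(
        loader.dataset,
        num_replicas=dist.get_world_size(),
        rank=dist.get_rank(),
        shuffle=shuffle,  # shuffled for train, unshuffled for val/test
    )
    # sampler and shuffle are mutually exclusive in DataLoader,
    # so shuffling is delegated entirely to the sampler here.
    return DataLoader(
        loader.dataset,
        batch_size=loader.batch_size,
        sampler=sampler,
        num_workers=loader.num_workers,
        collate_fn=loader.collate_fn,
        pin_memory=loader.pin_memory,
        drop_last=loader.drop_last,
    )
```

Roughly speaking, reset_train_dataloader performs this kind of re-instantiation for the training loader with shuffling enabled, while the val/test loaders are rebuilt without shuffling.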
-
By searching for replace_sampler_ddp with GitHub code search, I can pinpoint the related function at the following link:
https://github.com/PyTorchLightning/pytorch-lightning/blob/e1442d247e0e4967dd2772bdcf5166226c974f89/pytorch_lightning/trainer/data_loading.py#L122
Now, my question is: when and where does the above auto_add_sampler function get called to assign distributed loaders to the trainer? I'm particularly interested in where the code base sets the shuffle flag for train_loader and val_loader. Suppose I call:
trainer.fit(model, train_dataloader, val_dataloader)
where train_dataloader and val_dataloader are PyTorch DataLoaders wrapping a map-style dataset. When we don't use a LightningDataModule, will the trainer code still handle automatically adding the DDP sampler, and use shuffling for the train loader but not for the val loader?
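For concreteness, here is a minimal self-contained version of the setup I mean; the model and datasets are toy placeholders, and I'm assuming the default replace_sampler_ddp=True:

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset
import pytorch_lightning as pl

class MyModel(pl.LightningModule):
    # Minimal placeholder model, just to make the example self-contained.
    def __init__(self):
        super().__init__()
        self.layer = nn.Linear(32, 2)

    def training_step(self, batch, batch_idx):
        x, y = batch
        return nn.functional.cross_entropy(self.layer(x), y)

    def validation_step(self, batch, batch_idx):
        x, y = batch
        self.log("val_loss", nn.functional.cross_entropy(self.layer(x), y))

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=1e-3)

# Plain map-style datasets wrapped in plain PyTorch DataLoaders, no DataModule.
train_ds = TensorDataset(torch.randn(1000, 32), torch.randint(0, 2, (1000,)))
val_ds = TensorDataset(torch.randn(200, 32), torch.randint(0, 2, (200,)))
train_dataloader = DataLoader(train_ds, batch_size=64, shuffle=True)
val_dataloader = DataLoader(val_ds, batch_size=64)

# With DDP and replace_sampler_ddp=True (the default), I expect the trainer to
# swap in DistributedSamplers: shuffled for train, unshuffled for validation.
trainer = pl.Trainer(gpus=2, accelerator="ddp", replace_sampler_ddp=True)
trainer.fit(MyModel(), train_dataloader, val_dataloader)
```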
Thanks for any help!