Bypassing DataLoaders and Iterating Over Samples Manually #7635
Unanswered
amorehead asked this question in: Lightning Trainer API: Trainer, LightningModule, LightningDataModule
Replies: 0
Hello.
I have recently been working on a dense prediction problem in which each 2D target matrix has a variable height and width, so I have been using a batch size of 1. After benchmarking the PyTorch DataLoader class on my target CPU architecture for training (Power9/PowerPC), I found that iterating over my data with a DataLoader is significantly slower (on the order of 5x) than manually looping over indices and calling my Dataset's __getitem__() method directly. Given this slowdown, I am looking for a way to bypass DataLoaders in PyTorch Lightning and instead iterate over the samples manually. Is there any sensible way to do this?
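No one replied to this discussion, but one common workaround is to return a plain Python iterable from the `train_dataloader`/`val_dataloader` hooks instead of a `DataLoader`, since the Lightning `Trainer` generally accepts any iterable (verify this against your Lightning version). Below is a minimal, self-contained sketch; the class names `ToyDataset` and `ManualSampleIterable` are hypothetical, and `ToyDataset` stands in for a real `torch.utils.data.Dataset`:

```python
class ToyDataset:
    """Stand-in for a map-style torch.utils.data.Dataset whose targets
    have variable height and width (hence batch size 1)."""

    def __init__(self, samples):
        self.samples = samples

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        return self.samples[idx]


class ManualSampleIterable:
    """Iterates a dataset via __getitem__, yielding one sample at a time.

    This skips DataLoader's worker processes and collation entirely,
    which is the manual for-loop pattern described above.
    """

    def __init__(self, dataset):
        self.dataset = dataset

    def __len__(self):
        return len(self.dataset)

    def __iter__(self):
        for idx in range(len(self.dataset)):
            yield self.dataset[idx]


if __name__ == "__main__":
    # In a LightningModule or LightningDataModule, train_dataloader()
    # could return ManualSampleIterable(dataset) instead of a DataLoader.
    loader = ManualSampleIterable(ToyDataset([[1, 2], [3, 4, 5]]))
    for sample in loader:
        print(sample)
```

A lighter-touch alternative worth benchmarking first: keep the `DataLoader` but pass `batch_size=None` (disables automatic batching and collation) and `num_workers=0`, which removes much of the overhead that hurts at batch size 1.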