Bypassing DataLoaders and Iterating Over Samples Manually #7635
Unanswered
amorehead asked this question in: Lightning Trainer API: Trainer, LightningModule, LightningDataModule
Replies: 0
Hello.
I have recently been working on a dense prediction problem in which each 2D target matrix has a variable height and width, so I have been using a batch size of 1. After benchmarking the PyTorch DataLoader class on my target CPU architecture for training (Power9/PowerPC), I found that iterating over my data with a DataLoader is significantly slower (on the order of 5x) than manually looping over indices and calling my Dataset's __getitem__() method directly. Given this slowdown, I am looking for a way to bypass DataLoaders in PyTorch Lightning and instead iterate over the samples manually. Is there any sensible way to do this?
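No one replied to this discussion, but one common workaround is to return a plain Python iterable from the `train_dataloader`/`val_dataloader` hooks instead of a `DataLoader`, since the Lightning `Trainer` generally accepts any iterable (verify this against your Lightning version). Below is a minimal, self-contained sketch; the class names `ToyDataset` and `ManualSampleIterable` are hypothetical, and `ToyDataset` stands in for a real `torch.utils.data.Dataset`:

```python
class ToyDataset:
    """Stand-in for a map-style torch.utils.data.Dataset whose targets
    have variable height and width (hence batch size 1)."""

    def __init__(self, samples):
        self.samples = samples

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        return self.samples[idx]


class ManualSampleIterable:
    """Iterates a dataset via __getitem__, yielding one sample at a time.

    This skips DataLoader's worker processes and collation entirely,
    which is the manual for-loop pattern described above.
    """

    def __init__(self, dataset):
        self.dataset = dataset

    def __len__(self):
        return len(self.dataset)

    def __iter__(self):
        for idx in range(len(self.dataset)):
            yield self.dataset[idx]


if __name__ == "__main__":
    # In a LightningModule or LightningDataModule, train_dataloader()
    # could return ManualSampleIterable(dataset) instead of a DataLoader.
    loader = ManualSampleIterable(ToyDataset([[1, 2], [3, 4, 5]]))
    for sample in loader:
        print(sample)
```

A lighter-touch alternative worth benchmarking first: keep the `DataLoader` but pass `batch_size=None` (disables automatic batching and collation) and `num_workers=0`, which removes much of the overhead that hurts at batch size 1.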