Is PyTorch Lightning slow between epochs? #12080
Unanswered
pamparana34 asked this question in Lightning Trainer API: Trainer, LightningModule, LightningDataModule

I am using PyTorch Lightning, and during an epoch I get around 25 iterations/second, which is comparable to my vanilla PyTorch code. However, the time between one epoch ending and the next one starting is relatively large, on the order of a few seconds. Is this normal behaviour on PyTorch Lightning's side? It really massively increases my training time.

I am creating my `train_dataloader` as follows (see the sketch below). The `persistent_workers` flag seems important; without it my iteration rate drops severely. My training step is nothing special. My dataset does some heavy lifting in its constructor, but as far as I can tell it is only created once. I am at a loss as to why the delay is so pronounced. To put this into perspective, a whole epoch takes about a second, but starting the next one takes another three.

Is there any way I can get a hint of what might be happening? Or are there tricks I can use to speed this up?
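The snippet itself is not shown above, so here is a minimal sketch of the kind of setup described; `HeavyDataset`, the batch size, and the worker count are illustrative assumptions, not the asker's actual code:

```python
import time

import torch
from torch.utils.data import DataLoader, Dataset


class HeavyDataset(Dataset):
    """Stand-in for a dataset that does expensive one-time work in __init__."""

    def __init__(self):
        time.sleep(2)  # placeholder for the "heavy lifting" done once at construction
        self.data = torch.randn(1000, 8)

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        return self.data[idx]


train_loader = DataLoader(
    HeavyDataset(),
    batch_size=32,       # assumed value
    num_workers=4,       # assumed value; must be > 0 for persistent_workers
    shuffle=True,
    # Keep worker processes alive across epochs. Without this flag, workers are
    # torn down at the end of each epoch and re-spawned at the start of the
    # next, which is one common source of a multi-second pause between epochs.
    persistent_workers=True,
)
```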
Replies: 3 comments

- Hey @pamparana34
- 1: No custom
- @pamparana34 You can use the profiler to get some insight into your code:

  trainer = Trainer(profiler="simple")
  trainer = Trainer(profiler="advanced")

  https://pytorch-lightning.readthedocs.io/en/1.5.10/common/trainer.html#profiler
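As a concrete illustration of that advice, here is a self-contained sketch (assuming pytorch_lightning 1.x; `ToyDataset` and `ToyModel` are illustrative stand-ins, not code from this thread) that attaches the simple profiler to a short training run:

```python
import torch
import torch.nn.functional as F
import pytorch_lightning as pl
from torch.utils.data import DataLoader, Dataset


class ToyDataset(Dataset):
    def __len__(self):
        return 256

    def __getitem__(self, idx):
        return torch.randn(8), torch.randn(1)


class ToyModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(8, 1)

    def training_step(self, batch, batch_idx):
        x, y = batch
        return F.mse_loss(self.layer(x), y)

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=0.01)


if __name__ == "__main__":  # guard needed for multiprocessing dataloader workers
    loader = DataLoader(
        ToyDataset(), batch_size=32, num_workers=2, persistent_workers=True
    )
    # profiler="simple" prints a per-action timing report when fit() finishes,
    # which should show whether the time between epochs is spent in the
    # dataloader, in hooks, or elsewhere.
    trainer = pl.Trainer(max_epochs=3, profiler="simple")
    trainer.fit(ToyModel(), loader)
```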