Use lightning to run a single X * batch-sized fit step(s)? #13134
Unanswered
aniongithub asked this question in Lightning Trainer API: Trainer, LightningModule, LightningDataModule
Replies: 0 comments
Currently, calling `trainer.fit` runs the trainer until the specified training conditions are met or an early exit is indicated. I'd like to use pytorch-lightning as part of a larger, graph-like training and inference data flow. This means a blocking `fit` call that lasts an epoch or more wouldn't really work for my use case, because my orchestrator might run some other code both before and after the `LightningModule`/`LightningDataModule` (perhaps producing/consuming the inputs and outputs of the module).

Is there a way to run this such that each call to pytorch-lightning runs one step of the under-the-hood steps shown here? This would let me orchestrate the larger dataflow while invoking Lightning to deal with distributed training of each batch.
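
To make the unit of work concrete: in plain-PyTorch terms, each call should roughly do the following (toy model and data; `X` stands in for `accumulate_grad_batches`):

```python
import torch

model = torch.nn.Linear(32, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
X = 4  # stand-in for accumulate_grad_batches


def run_one_accumulated_step(batches):
    """Accumulate gradients over X batches, then take one optimizer step."""
    optimizer.zero_grad()
    for x, y in batches:
        loss = torch.nn.functional.mse_loss(model(x), y) / X
        loss.backward()
    optimizer.step()


# Example invocation with random data (shapes are arbitrary).
batches = [(torch.randn(8, 32), torch.randn(8, 1)) for _ in range(X)]
run_one_accumulated_step(batches)
```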
Is using `fast_dev_run` the correct way of doing this? Can I simply keep calling `trainer.fit(...)` in a loop on a trainer created with `fast_dev_run=X`, where `X = trainer.accumulate_grad_batches`, whenever my orchestrator reaches a pytorch-lightning node that needs to run one training step inside my larger workflow (roughly the pattern sketched below)? If not, what other options are available to me to do this?
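
For concreteness, here is a minimal, self-contained sketch of the calling pattern I'm asking about. The toy module and data are placeholders for my real workload, and the fixed loop stands in for my orchestrator; I'm not sure this is a supported usage, which is exactly my question:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
import pytorch_lightning as pl


class ToyModule(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(32, 1)

    def training_step(self, batch, batch_idx):
        x, y = batch
        loss = torch.nn.functional.mse_loss(self.layer(x), y)
        self.log("train_loss", loss)
        return loss

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=0.01)


def make_loader():
    x, y = torch.randn(256, 32), torch.randn(256, 1)
    return DataLoader(TensorDataset(x, y), batch_size=8)


X = 4  # stand-in for accumulate_grad_batches
model = ToyModule()
# Assumption: fast_dev_run and accumulate_grad_batches can be combined like this.
trainer = pl.Trainer(fast_dev_run=X, accumulate_grad_batches=X,
                     enable_progress_bar=False)

# Each time my orchestrator reaches the "training" node, run one
# X-batch-sized chunk of training by calling fit() again.
for _ in range(10):  # placeholder for visits by the orchestrator
    trainer.fit(model, train_dataloaders=make_loader())
```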
Thanks in advance!
Edit: Trying it out, it appears that `fast_dev_run` calls `setup` for every call to `fit`. I was able to prevent re-initialization of my datasets using a flag (roughly as sketched below), but regardless of that, I never see the loss or accuracy change over the course of many iterations. So it appears `fast_dev_run` isn't what I'm looking for.