How to make a model independent of the Trainer's strategy during inference? #11243
Unanswered
jakub-h asked this question in Lightning Trainer API: Trainer, LightningModule, LightningDataModule
Replies: 0
I have a model trained using `ddp` and saved using `pytorch_lightning.callbacks.model_checkpoint.ModelCheckpoint`. When I load it afterwards and try to make predictions for a batch (correctly preprocessed by the same procedure used during training), I get almost the same outputs regardless of the input (there is a difference, but it is negligible). However, when I use a Trainer (with parameters identical to those used during training, plus the DataModule used in training) for the inference, I obtain correct predictions. By correct I mean meaningful and dependent on the input (i.e. not almost constant for any input). Trainer used for training:
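Roughly like this (a sketch; only `strategy="ddp"` and `sync_batchnorm=True` are the settings that matter here, while the GPU count, epoch count, and checkpoint monitor are placeholders):

```python
import pytorch_lightning as pl
from pytorch_lightning.callbacks import ModelCheckpoint

# Placeholder monitor key; the real callback configuration is not shown here.
checkpoint_callback = ModelCheckpoint(monitor="val_loss")

trainer = pl.Trainer(
    gpus=2,                # placeholder GPU count
    strategy="ddp",        # distributed data parallel, as used in training
    sync_batchnorm=True,   # turning this off at inference reproduces the problem
    max_epochs=100,        # placeholder
    callbacks=[checkpoint_callback],
)
# trainer.fit(model, datamodule=dm)
```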
Then I experimented with the Trainer's params and figured out that when I use `sync_batchnorm=False`, `dp` instead of `ddp`, or `cpu` instead of `gpu` for the inference Trainer, I get the wrong results again.
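For comparison, the inference path that does work looks roughly like this (the argument values mirror the training sketch above and are equally placeholders; it also assumes the DataModule provides a `predict_dataloader`):

```python
# Same strategy and sync_batchnorm as in training; any deviation
# (sync_batchnorm=False, dp, cpu) brings the near-constant outputs back.
infer_trainer = pl.Trainer(gpus=2, strategy="ddp", sync_batchnorm=True)
predictions = infer_trainer.predict(model, datamodule=dm)
```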
Am I missing something crucial? I would like to use the trained model independently of the Trainer, and I don't see any reason why a model trained with `ddp` could not be run on, e.g., the CPU for inference. Specifically, I want to use Captum, which works with plain PyTorch models without Trainers.
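This is the kind of Trainer-free usage I'm after (a sketch; `MyModel`, the checkpoint path, `batch`, and the attribution target are placeholders):

```python
import torch
from captum.attr import IntegratedGradients

from my_project.models import MyModel  # placeholder import

# Load the checkpoint onto the CPU, outside any Trainer.
model = MyModel.load_from_checkpoint("checkpoints/best.ckpt", map_location="cpu")
model.eval()  # put BatchNorm/Dropout into inference mode

with torch.no_grad():
    preds = model(batch)  # batch preprocessed exactly as during training

# Captum operates on the plain nn.Module; no Trainer involved.
ig = IntegratedGradients(model)
attributions = ig.attribute(batch, target=0)  # placeholder target index
```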
I work on a CentOS Stream 8 server with `torch==1.10.0+cu111` and `pytorch-lightning==1.5.0`. Thank you for any guidance, I'm clueless.