load_from_checkpoint giving different validation results #6678
-
I'm creating a classifier that first trains a VAE and then passes it into a convolutional network. The pseudocode below roughly describes it:

```python
import pytorch_lightning as pl
import torch.nn as nn


class VAE(pl.LightningModule):
    ...  # encoder/decoder details omitted


class ConvNetwork(pl.LightningModule):
    def __init__(self, vae):
        super().__init__()
        # Trying both ways: pass in the entire model vs. loading a checkpoint
        # self.vae = vae
        # self.vae = VAE.load_from_checkpoint(vae)
        freeze_training(self.vae)  # sets all params to requires_grad=False
        self.sub_network = nn.Sequential(
            # Mix of convolutional layers, ReLU activations, and batch normalization
        )

    def forward(self, data):
        vae_decoded_results = self.vae(data)
        results_that_differ_wildly = self.sub_network(vae_decoded_results)
        return results_that_differ_wildly
```
If I train the VAE and pass the entire model in before training the convolutional network, I get good training/validation results. What I would prefer, however, is to train the VAE in a separate script, save off checkpoints, and then pass the path of the checkpoint into the convolutional network. In the convolutional network's `__init__` I load the VAE, freeze training on it, and proceed to train the convolutional network. When I do this, my training results seem okay, but my validation results are all over the place. Some things I've checked:

- The parameters of the loaded VAE appear to match those of the VAE trained in the same script.
- The VAE's outputs appear to match in both cases.
I can't for the life of me figure out why the results from a loaded model would differ so wildly from the results of a model I train and pass in within a single script, especially when the parameters and VAE output appear to match. I'm sure I'm just missing something stupid.
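For reference, the separate-script workflow looks roughly like this (the file names, paths, and trainer settings below are placeholders, not my exact code):

```python
# Script 1: train the VAE and save a checkpoint.
import pytorch_lightning as pl

vae = VAE()
trainer = pl.Trainer(max_epochs=50)       # placeholder settings
trainer.fit(vae, vae_train_dataloader)    # placeholder dataloader
trainer.save_checkpoint("vae.ckpt")       # placeholder path

# Script 2: pass the checkpoint path to the convolutional network;
# its __init__ calls VAE.load_from_checkpoint on it.
conv_net = ConvNetwork(vae="vae.ckpt")
trainer = pl.Trainer(max_epochs=50)
trainer.fit(conv_net, conv_train_dataloader)
```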
-
Just a wild guess, but maybe the model is in `train` mode after loading from a checkpoint. Have you tried `model.eval()` in addition to setting the `requires_grad`? I'm thinking about BN layers and so on, where this is important (see here).
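For example, a `freeze_training` along these lines would cover both (just a sketch; adapt it to your actual helper):

```python
import torch.nn as nn


def freeze_training(module: nn.Module) -> None:
    # Stop gradient updates on every parameter.
    for param in module.parameters():
        param.requires_grad = False
    # Switch to eval mode so BatchNorm uses its running statistics
    # (and Dropout is disabled) instead of per-batch behavior.
    module.eval()
```

One caveat: `nn.Module.train()` recursively flips all submodules back to train mode, and Lightning puts the whole `LightningModule` into train mode when fitting starts, so a frozen submodule can silently end up back in train mode. Re-applying `self.vae.eval()` (e.g., in `on_train_start` or at the top of `training_step`) should guard against that.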