"load_from_checkpoint" for Module that uses "from_pretrained" #12513
Unanswered
zorikg asked this question in Lightning Trainer API: Trainer, LightningModule, LightningDataModule
Replies: 1 comment
- As far as I can see from my experiments, although `__init__` reloads the pretrained model, the checkpoint you saved will eventually overwrite it. You can also see discussion #9236.
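  In other words, `load_from_checkpoint` first instantiates the class (so `from_pretrained` runs and pulls the original pretrained weights), and then loads the checkpoint's `state_dict` onto the instance, which replaces those weights with the fine-tuned ones. A minimal sketch of how one might verify this, assuming the `BertMNLIFinetuner` from the question and a hypothetical checkpoint path:

  ```python
  import torch

  ckpt_path = "path/to/finetuned.ckpt"  # hypothetical path

  # __init__ runs here, so from_pretrained loads the original pretrained
  # weights first; Lightning then loads the checkpoint's state_dict,
  # which overwrites them with the fine-tuned values.
  model = BertMNLIFinetuner.load_from_checkpoint(ckpt_path)
  model.eval()

  # Optional sanity check: the module's parameters should now match the
  # tensors stored in the checkpoint, not the stock pretrained weights.
  ckpt = torch.load(ckpt_path, map_location="cpu")
  name, param = next(iter(model.state_dict().items()))
  assert torch.equal(param.cpu(), ckpt["state_dict"][name])
  ```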
Let's say that I fine-tune a model following the example from the guide (code attached below for convenience). It is not clear to me how to load this model for inference or additional training.
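The guide code isn't reproduced verbatim here; below is a minimal sketch of the pattern in question (class name as in the guide, `training_step` and `configure_optimizers` omitted):

```python
import torch
import pytorch_lightning as pl
from transformers import BertModel


class BertMNLIFinetuner(pl.LightningModule):
    def __init__(self):
        super().__init__()
        # The pretrained backbone is loaded every time __init__ runs,
        # including when the class is instantiated by load_from_checkpoint.
        self.bert = BertModel.from_pretrained(
            "bert-base-cased", output_attentions=True
        )
        # MNLI has three labels: entailment, neutral, contradiction.
        self.W = torch.nn.Linear(self.bert.config.hidden_size, 3)

    def forward(self, input_ids, attention_mask, token_type_ids):
        out = self.bert(
            input_ids=input_ids,
            attention_mask=attention_mask,
            token_type_ids=token_type_ids,
        )
        h_cls = out.last_hidden_state[:, 0]  # [CLS] token representation
        return self.W(h_cls), out.attentions
```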
If I do `BertMNLIFinetuner.load_from_checkpoint(<path_to_saved_checkpoint>)`, then the behavior I observe is that the constructor is still called, and so is the line `self.bert = BertModel.from_pretrained("bert-base-cased", output_attentions=True)`, which essentially overrides the model weights (right?). What is the recommended practice for loading a checkpoint of a model created with this pattern (i.e. one that loads a pretrained model as part of the LightningModule's construction)?
Thanks,
Zorik