Unable to load trained model #8

Description

@YukinoshitaKaren

I trained llama2-7b-chat-hf and the run produced 6 safetensors files. But when I try to load them with AutoModelForCausalLM.from_pretrained, it fails:

RuntimeError: Error(s) in loading state_dict for LlamaForCausalLM:
        size mismatch for model.embed_tokens.weight: copying a param with shape torch.Size([65537024]) from checkpoint, the shape in current model is torch.Size([32000, 4096]).
        size mismatch for model.norm.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4096]).
        size mismatch for lm_head.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([32000, 4096]).
        You may consider adding `ignore_mismatched_sizes=True` in the model `from_pretrained` method.
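For reference, the mismatch is visible directly in the numbers from the traceback: the checkpoint stores model.embed_tokens.weight as a flat 1-D tensor of 65,537,024 elements, while the model expects a 32000 × 4096 matrix (131,072,000 elements), and the other two tensors are empty. Flattened 1-D and zero-size parameters like these often indicate the files were saved from a sharded/partitioned training state (e.g. DeepSpeed ZeRO-3 or FSDP) rather than a consolidated state dict. A tiny stdlib check of the arithmetic (`can_reshape` is a hypothetical helper for illustration, not part of transformers):

```python
from math import prod

def can_reshape(numel: int, shape: tuple) -> bool:
    """Return True if a flat buffer of `numel` elements fills `shape` exactly."""
    return numel == prod(shape)

# Numbers taken from the traceback above:
print(can_reshape(65_537_024, (32000, 4096)))    # False: shard is smaller than the full matrix
print(can_reshape(0, (4096,)))                   # False: empty tensor, weight lives in another shard
print(can_reshape(32000 * 4096, (32000, 4096)))  # True: a consolidated tensor would fit
```

Since the shard holds fewer elements than even one full embedding matrix, `ignore_mismatched_sizes=True` would not recover the weights; the checkpoint likely needs to be consolidated by the trainer that wrote it before `from_pretrained` can read it.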
