
Conversion of Megatron-LM checkpoint to HF transformers checkpoint fails (ALiBi used during training) #21

@gagangayari

Description


I have a Megatron-LM checkpoint trained using ALiBi. Since ALiBi doesn't add positional embeddings, my checkpoint doesn't contain them either.

When converting my checkpoint to an HF transformers checkpoint using src/transformers/models/megatron_gpt_bigcode/checkpoint_reshaping_and_interoperability.py, I get the following error:

AttributeError: 'dict' object has no attribute 'to'

I believe this is because the function get_element_from_dict_by_path is not consistent in its return type:
[screenshot: get_element_from_dict_by_path in checkpoint_reshaping_and_interoperability.py]
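For reference, the helper looks roughly like this (paraphrased from the transformers conversion scripts; the exact docstring and formatting may differ). Note the side effect: if a key along the path is missing, it silently creates and returns an empty dict instead of raising.

```python
def get_element_from_dict_by_path(d, path):
    """Walk a nested dict along a dot-separated path.

    If a key is missing, an empty dict is created at that position
    and returned, so a caller expecting a tensor can silently
    receive {} instead.
    """
    for k in path.split("."):
        if k not in d:
            d[k] = {}
        d = d[k]
    return d
```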

It returns the positional embeddings (a tensor) when the checkpoint contains them.
It returns an empty dictionary when the checkpoint doesn't contain them (as in my case).

The error then surfaces at line 412, where the script tries to convert the data type of that return value.

[screenshot: the .to(dtype) call around line 412]
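The failing call is something like the following (a reconstruction, not the literal script; the variable names and state-dict keys are assumed for illustration):

```python
# Reconstruction of the failing call around line 412 (names assumed):
pos_embeddings = get_element_from_dict_by_path(
    tp_state_dicts[0], "model.language_model.embedding.position_embeddings.weight"
)
# For an ALiBi checkpoint the path is absent, so pos_embeddings is {},
# and the next line raises: AttributeError: 'dict' object has no attribute 'to'
output_state_dict["transformer.wpe.weight"] = pos_embeddings.to(dtype)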

Can we add support for checkpoints trained using ALiBi?
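A minimal guard along these lines would make the script tolerate ALiBi checkpoints, assuming the reconstruction above (the key names are illustrative):

```python
import torch

pos_embeddings = get_element_from_dict_by_path(
    tp_state_dicts[0], "model.language_model.embedding.position_embeddings.weight"
)
# Only write the key when the checkpoint actually has learned position
# embeddings; an ALiBi checkpoint yields an empty dict here, which is skipped.
if isinstance(pos_embeddings, torch.Tensor):
    output_state_dict["transformer.wpe.weight"] = pos_embeddings.to(dtype)
```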
