convert_llama_weights_to_hf.py - problem converting llama 3.2 1B  #33760

@raulod

System Info

latest transformers
python 3.11.5

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

We are using convert_llama_weights_to_hf.py to convert the original Llama-3.2-1B checkpoint to HF format, but the resulting files are incomplete. We successfully used the same script with the 3.1 models.

Below is the command line. I passed the model size as 8B because 1B is not supported by convert_llama_weights_to_hf.py.

python .venv/lib/python3.11/site-packages/transformers/models/llama/convert_llama_weights_to_hf.py --input_dir Meta-Llama-3.2-1B/ --model_size 8B --output_dir hf --llama_version 3.1

The conversion completes, but the resulting HF output appears to be incomplete. When we further converted the generated HF files to GGUF, the resulting model failed to load with the error "missing merges.txt".
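One way to narrow down the "missing merges.txt" error is to look at what the generated tokenizer.json actually contains. The following is a minimal sketch, not part of the conversion script: it assumes tokenizer.json uses the usual Hugging Face tokenizers layout with a top-level "model" object holding "type", "vocab" and "merges", and the function name summarize_tokenizer is hypothetical.

```python
import json
from pathlib import Path


def summarize_tokenizer(model_dir):
    """Summarize tokenizer.json found in model_dir.

    Assumption: the file has a top-level "model" object with optional
    "type", "vocab" and "merges" entries. Missing entries are reported
    as None or 0 instead of raising.
    """
    model_dir = Path(model_dir)
    with (model_dir / "tokenizer.json").open(encoding="utf-8") as f:
        data = json.load(f)

    model = data.get("model") or {}
    merges = model.get("merges")
    return {
        "model_type": model.get("type"),
        "vocab_size": len(model.get("vocab") or {}),
        # None means the "merges" key is absent altogether.
        "num_merges": len(merges) if merges is not None else None,
        # Some tools look for a standalone merges.txt next to tokenizer.json.
        "has_merges_txt": (model_dir / "merges.txt").exists(),
    }
```

Calling summarize_tokenizer("hf") on the converted output and again on the downloaded directory would show whether the two tokenizers differ in model type or in the presence of merges.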

We then downloaded the same model from the Hugging Face website and noticed that it has the same safetensors file size but different JSON files. For example, below are the HF files produced by convert_llama_weights_to_hf.py:

-rw-r--r--  1 root root        689 Sep 27 15:21 config.json
-rw-r--r--  1 root root        121 Sep 27 15:21 generation_config.json
-rw-r--r--  1 root root 2471645608 Sep 27 15:21 model.safetensors
-rw-r--r--  1 root root         73 Sep 27 15:21 special_tokens_map.json
-rw-r--r--  1 root root      50526 Sep 27 15:21 tokenizer_config.json
-rw-r--r--  1 root root   17208712 Sep 27 15:21 tokenizer.json

Below are the files downloaded from the Hugging Face website:

-rw-r--r--  1 root root        843 Sep 27 14:25 config.json
-rw-r--r--  1 root root        186 Sep 27 14:25 generation_config.json
drwxr-xr-x  9 root root        174 Sep 27 14:59 .git
-rw-r--r--  1 root root       1519 Sep 27 14:25 .gitattributes
-rw-r--r--  1 root root       7712 Sep 27 14:25 LICENSE.txt
-rw-r--r--  1 root root 2471645608 Sep 27 14:59 model.safetensors
drwxr-xr-x  2 root root         75 Sep 27 14:59 original
-rw-r--r--  1 root root      35371 Sep 27 14:25 README.md
-rw-r--r--  1 root root        301 Sep 27 14:25 special_tokens_map.json
-rw-r--r--  1 root root      50500 Sep 27 14:25 tokenizer_config.json
-rw-r--r--  1 root root    9085657 Sep 27 14:25 tokenizer.json
-rw-r--r--  1 root root       6021 Sep 27 14:25 USE_POLICY.md
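The size differences in the two listings only show that the JSON files differ, not how. A small sketch like the one below could list which top-level keys differ between the generated and downloaded copies. The function name diff_json_files and the default file list are assumptions for illustration; tokenizer.json is left out of the defaults only because it is large.

```python
import json
from pathlib import Path


def diff_json_files(generated_dir, reference_dir,
                    names=("config.json", "generation_config.json",
                           "special_tokens_map.json", "tokenizer_config.json")):
    """Compare top-level keys of same-named JSON files in two directories.

    Assumption: each file holds a JSON object at the top level.
    """
    report = {}
    for name in names:
        gen_path = Path(generated_dir) / name
        ref_path = Path(reference_dir) / name
        if not gen_path.exists() or not ref_path.exists():
            report[name] = "missing in one directory"
            continue
        gen = json.loads(gen_path.read_text(encoding="utf-8"))
        ref = json.loads(ref_path.read_text(encoding="utf-8"))
        report[name] = {
            "only_in_generated": sorted(gen.keys() - ref.keys()),
            "only_in_reference": sorted(ref.keys() - gen.keys()),
            # Keys present in both files whose values are not equal.
            "changed": sorted(k for k in gen.keys() & ref.keys()
                              if gen[k] != ref[k]),
        }
    return report
```

For example, diff_json_files("hf", "path/to/downloaded/model") would compare the converter output from the command above against the downloaded directory; the second path is a placeholder.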

Expected behavior

The generated files and the files downloaded from Hugging Face should have been the same.
