Description
System Info
latest transformers
python 3.11.5
Who can help?
No response
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
We are using convert_llama_weights_to_hf.py to convert the original Llama-3.2-1B checkpoint to the HF format, but the resulting files are incomplete. We successfully used the same script with the 3.1 models.
Below is the command line. I passed the model size as 8B because convert_llama_weights_to_hf.py does not support 1B.
python .venv/lib/python3.11/site-packages/transformers/models/llama/convert_llama_weights_to_hf.py --input_dir Meta-Llama-3.2-1B/ --model_size 8B --output_dir hf --llama_version 3.1
The conversion completes, but the resulting HF files appear to be incomplete. When we then converted the generated HF files to GGUF, the resulting model failed to load with the error "missing merges.txt".
We then downloaded the same model from the Hugging Face website and noticed that the safetensors file has the same size but the JSON files differ. For example, below are the HF files produced by convert_llama_weights_to_hf.py:
-rw-r--r-- 1 root root 689 Sep 27 15:21 config.json
-rw-r--r-- 1 root root 121 Sep 27 15:21 generation_config.json
-rw-r--r-- 1 root root 2471645608 Sep 27 15:21 model.safetensors
-rw-r--r-- 1 root root 73 Sep 27 15:21 special_tokens_map.json
-rw-r--r-- 1 root root 50526 Sep 27 15:21 tokenizer_config.json
-rw-r--r-- 1 root root 17208712 Sep 27 15:21 tokenizer.json
Below are the files downloaded from the Hugging Face website:
-rw-r--r-- 1 root root 843 Sep 27 14:25 config.json
-rw-r--r-- 1 root root 186 Sep 27 14:25 generation_config.json
drwxr-xr-x 9 root root 174 Sep 27 14:59 .git
-rw-r--r-- 1 root root 1519 Sep 27 14:25 .gitattributes
-rw-r--r-- 1 root root 7712 Sep 27 14:25 LICENSE.txt
-rw-r--r-- 1 root root 2471645608 Sep 27 14:59 model.safetensors
drwxr-xr-x 2 root root 75 Sep 27 14:59 original
-rw-r--r-- 1 root root 35371 Sep 27 14:25 README.md
-rw-r--r-- 1 root root 301 Sep 27 14:25 special_tokens_map.json
-rw-r--r-- 1 root root 50500 Sep 27 14:25 tokenizer_config.json
-rw-r--r-- 1 root root 9085657 Sep 27 14:25 tokenizer.json
-rw-r--r-- 1 root root 6021 Sep 27 14:25 USE_POLICY.md
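The size differences in the two listings only show that the JSON files differ, not which fields differ. A minimal sketch for narrowing that down, using only the Python standard library, is given here. The function name `diff_json_dirs` is hypothetical and not part of transformers, and the sketch assumes each JSON file holds an object at the top level. It only compares top-level keys and does not explain why the values differ.

```python
import json
from pathlib import Path


def diff_json_dirs(generated_dir, reference_dir):
    """Compare top-level keys of same-named *.json files in two directories.

    Hypothetical helper for illustration; assumes every JSON file is an
    object (dict) at the top level.
    """
    generated_dir, reference_dir = Path(generated_dir), Path(reference_dir)
    report = {}
    for gen_path in sorted(generated_dir.glob("*.json")):
        ref_path = reference_dir / gen_path.name
        if not ref_path.exists():
            # No file with this name in the reference directory.
            report[gen_path.name] = None
            continue
        gen = json.loads(gen_path.read_text(encoding="utf-8"))
        ref = json.loads(ref_path.read_text(encoding="utf-8"))
        report[gen_path.name] = {
            # Keys present on one side only.
            "only_generated": sorted(gen.keys() - ref.keys()),
            "only_reference": sorted(ref.keys() - gen.keys()),
            # Keys present on both sides whose values are not equal.
            "changed": sorted(k for k in gen.keys() & ref.keys() if gen[k] != ref[k]),
        }
    return report
```

Calling `diff_json_dirs("hf", "<downloaded model directory>")` and printing the result would list, per file, the keys that exist on only one side and the keys whose values differ. For tokenizer.json the "changed" entries are likely to be large nested sections, so a deeper look at those sections may still be needed.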
Expected behavior
The generated files and the files downloaded from Hugging Face should be the same.