Commit 011027c

Update ExecuTorch instructions in the model release template (#2975)
* up
* Update quantize_and_upload.py
1 parent cffba61 · commit 011027c

File tree: 1 file changed (+20 / -15 lines)

.github/scripts/torchao_model_releases/quantize_and_upload.py

@@ -584,34 +584,39 @@ def _untie_weights_and_save_locally(model_id):
 Once ExecuTorch is [set-up](https://pytorch.org/executorch/main/getting-started.html), exporting and running the model on device is a breeze.
 
 ExecuTorch's LLM export scripts require the checkpoint keys and parameters have certain names, which differ from those used in Hugging Face.
-So we first use a conversion script that converts the Hugging Face checkpoint key names to ones that ExecuTorch expects:
+So we first use a script that converts the Hugging Face checkpoint key names to ones that ExecuTorch expects:
+The following script does this for you.
 
 [TODO: fix command below where necessary]
 ```Shell
 python -m executorch.examples.models.qwen3.convert_weights $(hf download {quantized_model}) pytorch_model_converted.bin
 ```
 
-Once we have the checkpoint, we export it to ExecuTorch with the XNNPACK backend as follows.
-(ExecuTorch LLM export script requires config.json have certain key names. The correct config to use for the LLM export script is located at [TODO: fill in, e.g., examples/models/qwen3/config/4b_config.json] within the ExecuTorch repo.)
+Once we have the checkpoint, we export it to ExecuTorch with a max_seq_length/max_context_length of 1024 to the XNNPACK backend as follows.
+
+[TODO: fix config path in note where necessary]
+(Note: ExecuTorch LLM export script requires config.json have certain key names. The correct config to use for the LLM export script is located at examples/models/qwen3/config/4b_config.json within the ExecuTorch repo.)
 
 [TODO: fix command below where necessary]
 ```Shell
 python -m executorch.examples.models.llama.export_llama \
---model "qwen3_4b" \
---checkpoint pytorch_model_converted.bin \
---params examples/models/qwen3/config/4b_config.json \
---output_name="model.pte" \
--kv \
---use_sdpa_with_kv_cache \
--X \
---xnnpack-extended-ops \
---max_context_length 1024 \
---max_seq_length 1024 \
---dtype fp32 \
---metadata '{{"get_bos_id":199999, "get_eos_ids":[200020,199999]}}'
+--model "qwen3_4b" \
+--checkpoint pytorch_model_converted.bin \
+--params examples/models/qwen3/config/4b_config.json \
+--output_name model.pte \
+-kv \
+--use_sdpa_with_kv_cache \
+-X \
+--xnnpack-extended-ops \
+--max_context_length 1024 \
+--max_seq_length 1024 \
+--dtype fp32 \
+--metadata '{"get_bos_id":199999, "get_eos_ids":[200020,199999]}'
 ```
 
 After that you can run the model in a mobile app (see [Running in a mobile app](#running-in-a-mobile-app)).
+
+(We try to keep these instructions up-to-date, but if you find they do not work, check out our [CI test in ExecuTorch](https://github.com/pytorch/executorch/blob/main/.ci/scripts/test_torchao_huggingface_checkpoints.sh) for the latest source of truth, and let us know we need to update our model card.)
 """
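A note on the templated command: `{quantized_model}` is a placeholder that `quantize_and_upload.py` fills in with the id of the uploaded model. As a minimal sketch of what the rendered conversion step looks like, assuming a hypothetical model id `pytorch/Qwen3-4B-int8` (illustrative only, not a confirmed release):

```Shell
# Hypothetical rendering of the template's conversion step.
# "pytorch/Qwen3-4B-int8" stands in for {quantized_model} and is not a real release.
# `hf download` prints the local snapshot directory, which convert_weights reads;
# the second argument is the output checkpoint path.
python -m executorch.examples.models.qwen3.convert_weights \
  $(hf download pytorch/Qwen3-4B-int8) \
  pytorch_model_converted.bin
```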
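Before wiring the exported `model.pte` into a mobile app, it can be smoke-tested on the host. A sketch using ExecuTorch's example `llama_main` runner, assuming it has been built per the getting-started guide linked in the template; the binary location, tokenizer format, and flag names vary across ExecuTorch versions:

```Shell
# Assumes llama_main was built from the ExecuTorch llama example;
# paths, tokenizer file, and flags below may differ between releases.
cmake-out/examples/models/llama/llama_main \
  --model_path=model.pte \
  --tokenizer_path=tokenizer.json \
  --prompt="Once upon a time"
```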
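The template's closing note points at the CI script that exercises these steps end to end; if the model-card instructions drift, the script can be fetched directly for comparison:

```Shell
# Fetch the source-of-truth CI test referenced in the template's closing note.
curl -sL https://raw.githubusercontent.com/pytorch/executorch/main/.ci/scripts/test_torchao_huggingface_checkpoints.sh
```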