Commit 011027c

Update ExecuTorch instructions in the model release template (#2975)
* up
* Update quantize_and_upload.py
1 parent cffba61 · commit 011027c

File tree: 1 file changed (+20 / -15 lines)

.github/scripts/torchao_model_releases/quantize_and_upload.py

@@ -584,34 +584,39 @@ def _untie_weights_and_save_locally(model_id):
 Once ExecuTorch is [set-up](https://pytorch.org/executorch/main/getting-started.html), exporting and running the model on device is a breeze.
 
 ExecuTorch's LLM export scripts require the checkpoint keys and parameters have certain names, which differ from those used in Hugging Face.
-So we first use a conversion script that converts the Hugging Face checkpoint key names to ones that ExecuTorch expects:
+So we first use a script that converts the Hugging Face checkpoint key names to ones that ExecuTorch expects:
+The following script does this for you.
 
 [TODO: fix command below where necessary]
 ```Shell
 python -m executorch.examples.models.qwen3.convert_weights $(hf download {quantized_model}) pytorch_model_converted.bin
 ```
 
-Once we have the checkpoint, we export it to ExecuTorch with the XNNPACK backend as follows.
-(ExecuTorch LLM export script requires config.json have certain key names. The correct config to use for the LLM export script is located at [TODO: fill in, e.g., examples/models/qwen3/config/4b_config.json] within the ExecuTorch repo.)
+Once we have the checkpoint, we export it to ExecuTorch with a max_seq_length/max_context_length of 1024 to the XNNPACK backend as follows.
+
+[TODO: fix config path in note where necessary]
+(Note: ExecuTorch LLM export script requires config.json have certain key names. The correct config to use for the LLM export script is located at examples/models/qwen3/config/4b_config.json within the ExecuTorch repo.)
 
 [TODO: fix command below where necessary]
 ```Shell
 python -m executorch.examples.models.llama.export_llama \
---model "qwen3_4b" \
---checkpoint pytorch_model_converted.bin \
---params examples/models/qwen3/config/4b_config.json \
---output_name="model.pte" \
--kv \
---use_sdpa_with_kv_cache \
--X \
---xnnpack-extended-ops \
---max_context_length 1024 \
---max_seq_length 1024 \
---dtype fp32 \
---metadata '{{"get_bos_id":199999, "get_eos_ids":[200020,199999]}}'
+--model "qwen3_4b" \
+--checkpoint pytorch_model_converted.bin \
+--params examples/models/qwen3/config/4b_config.json \
+--output_name model.pte \
+-kv \
+--use_sdpa_with_kv_cache \
+-X \
+--xnnpack-extended-ops \
+--max_context_length 1024 \
+--max_seq_length 1024 \
+--dtype fp32 \
+--metadata '{"get_bos_id":199999, "get_eos_ids":[200020,199999]}'
 ```
 
 After that you can run the model in a mobile app (see [Running in a mobile app](#running-in-a-mobile-app)).
+
+(We try to keep these instructions up-to-date, but if you find they do not work, check out our [CI test in ExecuTorch](https://github.com/pytorch/executorch/blob/main/.ci/scripts/test_torchao_huggingface_checkpoints.sh) for the latest source of truth, and let us know we need to update our model card.)
 """
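A note on the templated command: `{quantized_model}` is a placeholder that `quantize_and_upload.py` fills in with the id of the uploaded model. As a minimal sketch of what the rendered conversion step looks like, assuming a hypothetical model id `pytorch/Qwen3-4B-int8` (illustrative only, not a confirmed release):

```Shell
# Hypothetical rendering of the template's conversion step.
# "pytorch/Qwen3-4B-int8" stands in for {quantized_model} and is not a real release.
# `hf download` prints the local snapshot directory, which convert_weights reads;
# the second argument is the output checkpoint path.
python -m executorch.examples.models.qwen3.convert_weights \
  $(hf download pytorch/Qwen3-4B-int8) \
  pytorch_model_converted.bin
```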
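Before wiring the exported `model.pte` into a mobile app, it can be smoke-tested on the host. A sketch using ExecuTorch's example `llama_main` runner, assuming it has been built per the getting-started guide linked in the template; the binary location, tokenizer format, and flag names vary across ExecuTorch versions:

```Shell
# Assumes llama_main was built from the ExecuTorch llama example;
# paths, tokenizer file, and flags below may differ between releases.
cmake-out/examples/models/llama/llama_main \
  --model_path=model.pte \
  --tokenizer_path=tokenizer.json \
  --prompt="Once upon a time"
```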
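The template's closing note points at the CI script that exercises these steps end to end; if the model-card instructions drift, the script can be fetched directly for comparison:

```Shell
# Fetch the source-of-truth CI test referenced in the template's closing note.
curl -sL https://raw.githubusercontent.com/pytorch/executorch/main/.ci/scripts/test_torchao_huggingface_checkpoints.sh
```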