Commit 4164026
Update README
1 parent 26078ae commit 4164026

1 file changed

extension/llm/export/README.md
Lines changed: 14 additions & 46 deletions
@@ -23,9 +23,9 @@ The LLM export process transforms a model from its original format to an optimiz
 
 ## Usage
 
-The export API supports two configuration approaches:
+The export API supports a Hydra-style CLI that you can configure with a YAML file, CLI arguments, or both.
 
-### Option 1: Hydra CLI Arguments
+### Hydra CLI Arguments
 
 Use structured configuration arguments directly on the command line:
 
@@ -41,7 +41,7 @@ python -m extension.llm.export.export_llm \
   quantization.qmode=8da4w
 ```
 
-### Option 2: Configuration File
+### Configuration File
 
 Create a YAML configuration file and reference it:
 
@@ -78,53 +78,21 @@ debug:
   verbose: true
 ```
 
-**Important**: You cannot mix both approaches. Use either CLI arguments OR a config file, not both.
+You can also still provide additional overrides with CLI arguments:
 
-## Example Commands
-
-### Export Qwen3 0.6B with XNNPACK backend and quantization
 ```bash
-python -m extension.llm.export.export_llm \
-  base.model_class=qwen3_0_6b \
-  base.params=examples/models/qwen3/0_6b_config.json \
-  base.metadata='{"get_bos_id": 151644, "get_eos_ids":[151645]}' \
-  model.use_kv_cache=true \
-  model.use_sdpa_with_kv_cache=true \
-  model.dtype_override=FP32 \
-  export.max_seq_length=512 \
-  export.output_name=qwen3_0_6b.pte \
-  quantization.qmode=8da4w \
-  backend.xnnpack.enabled=true \
-  backend.xnnpack.extended_ops=true \
-  debug.verbose=true
+python -m extension.llm.export.export_llm \
+  --config my_config.yaml \
+  base.model_class="llama2" \
+  +export.max_context_length=1024
 ```
 
-### Export Phi-4-Mini with custom checkpoint
-```bash
-python -m extension.llm.export.export_llm \
-  base.model_class=phi_4_mini \
-  base.checkpoint=/path/to/phi4_checkpoint.pth \
-  base.params=examples/models/phi-4-mini/config.json \
-  base.metadata='{"get_bos_id":151643, "get_eos_ids":[151643]}' \
-  model.use_kv_cache=true \
-  model.use_sdpa_with_kv_cache=true \
-  export.max_seq_length=256 \
-  export.output_name=phi4_mini.pte \
-  backend.xnnpack.enabled=true \
-  debug.verbose=true
-```
+Note that if a config file is specified, any CLI argument that is not already in the config file must be prepended with a `+`. You can read more about this in the Hydra [docs](https://hydra.cc/docs/advanced/override_grammar/basic/).
 
-### Export with CoreML backend (iOS optimization)
-```bash
-python -m extension.llm.export.export_llm \
-  base.model_class=llama3 \
-  model.use_kv_cache=true \
-  export.max_seq_length=128 \
-  backend.coreml.enabled=true \
-  backend.coreml.compute_units=ALL \
-  quantization.pt2e_quantize=coreml_c4w \
-  debug.verbose=true
-```
+
+## Example Commands
+
+Please refer to the docs for some of our example supported models ([Llama](https://github.com/pytorch/executorch/blob/main/examples/models/llama/README.md), [Qwen3](https://github.com/pytorch/executorch/tree/main/examples/models/qwen3/README.md), [Phi-4-mini](https://github.com/pytorch/executorch/tree/main/examples/models/phi_4_mini/README.md)).
 
 ## Configuration Options
 
@@ -134,4 +102,4 @@ For a complete reference of all available configuration options, see the [LlmConfig
 
 - [Llama Examples](../../../examples/models/llama/README.md) - Comprehensive Llama export guide
 - [LLM Runner](../runner/) - Running exported models
-- [ExecuTorch Documentation](https://pytorch.org/executorch/) - Framework overview
+- [ExecuTorch Documentation](https://pytorch.org/executorch/) - Framework overview
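
As an aside, the override example in the diff above passes `--config my_config.yaml` without showing that file. A config consistent with that example might look like the following sketch; the top-level keys mirror the dotted CLI argument groups (`base.*`, `model.*`, `export.*`, `debug.*`), but the concrete values and the choice of fields here are assumptions, and the authoritative schema is the `LlmConfig` reference mentioned in the README:

```yaml
# Hypothetical my_config.yaml for the override example above (values are assumptions).
# Each top-level section corresponds to a dotted CLI argument group.
base:
  model_class: llama2        # base.model_class="llama2" on the CLI would override this
model:
  use_kv_cache: true
export:
  max_seq_length: 512
  output_name: llama2.pte
debug:
  verbose: true
```

Because `export.max_context_length` does not appear in this file, overriding it on the command line requires the `+` prefix (`+export.max_context_length=1024`), per Hydra's override grammar.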
