
Commit be3f271

Update evaluation.md

Highlight ability to use different options and encourage users to experiment with them.

1 parent f460c3e

1 file changed: 3 additions, 1 deletion

torchchat/utils/docs/evaluation.md (3 additions, 1 deletion)
````diff
@@ -55,12 +55,14 @@ Running multiple tasks directly by creating a PTE mobile model:
 python3 torchchat.py eval stories15M --pte-path stories15M.pte --tasks wikitext hellaswag
 ```
 
-Now let's evaluate the effect of quantization on evaluation results:
+Now let's evaluate the effect of quantization on evaluation results by exporting with quantization using `--quantize` and an exemplary quantization configuration:
 ```
 python3 torchchat.py export stories15M --output-pte-path stories15M.pte --quantize torchchat/quant_config/mobile.json
 python3 torchchat.py eval stories15M --pte-path stories15M.pte --tasks wikitext hellaswag
 ```
 
+Now try your own export options to explore different trade-offs between model size, evaluation speed and accuracy using model quantization!
+
 ### Evaluation with model exported to DSO with AOT Inductor (AOTI)
 
 Running an exported model with AOT Inductor (DSO model). Advantageously, because we can
````
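The added prose invites readers to experiment with export options. As a minimal sketch of what such an experiment could look like, assuming the stories15M checkpoint is already set up and using only the flags shown in the diff above (the `stories15M_fp.pte` / `stories15M_q.pte` output names are illustrative, not from the commit):

```bash
# Hypothetical comparison: export once without quantization and once with
# the mobile quantization config, then run the same eval tasks on both.
python3 torchchat.py export stories15M --output-pte-path stories15M_fp.pte
python3 torchchat.py export stories15M --output-pte-path stories15M_q.pte \
    --quantize torchchat/quant_config/mobile.json

# Compare metrics (and on-disk size of the two .pte files) to see the
# trade-off between model size, evaluation speed, and accuracy.
python3 torchchat.py eval stories15M --pte-path stories15M_fp.pte --tasks wikitext hellaswag
python3 torchchat.py eval stories15M --pte-path stories15M_q.pte --tasks wikitext hellaswag
```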
