This repository was archived by the owner on Sep 10, 2025. It is now read-only.

Commit bd7354e

Update evaluation.md (#1442)
* Update evaluation.md:
  1. Remove outdated reference to running eval.py directly.
  2. Explain how we run ET/AOTI models with eval.
  3. Add an example with quantization to show how we can use eval to determine how to process models.
* Update evaluation.md
* Update evaluation.md: highlight the ability to use different options and encourage users to experiment with them.
* Update evaluation.md: wording corrections.
* Update build_native.sh: update to the C++11 ABI for AOTI, similar to ET.
1 parent b4547fd commit bd7354e

File tree: 1 file changed (+19, -7 lines)


torchchat/utils/docs/evaluation.md

Lines changed: 19 additions & 7 deletions
````diff
@@ -23,7 +23,7 @@ The evaluation mode of `torchchat.py` script can be used to evaluate your langua
 
 ## Examples
 
-### Evaluation example with model in Python
+### Evaluation example with model in Python environment
 
 Running wikitext for 10 iterations
 ```
@@ -35,33 +35,45 @@ Running wikitext with torch.compile for 10 iterations
 python3 torchchat.py eval stories15M --compile --tasks wikitext --limit 10
 ```
 
-Running multiple tasks and calling eval.py directly (with torch.compile):
+Running multiple tasks with torch.compile for evaluation and prefill:
 ```
-python3 torchchat.py eval stories15M --compile --tasks wikitext hellaswag
+python3 torchchat.py eval stories15M --compile --compile-prefill --tasks wikitext hellaswag
 ```
 
 ### Evaluation with model exported to PTE with ExecuTorch
 
-Running an exported model with ExecuTorch (as PTE)
+Running an exported model with ExecuTorch (as PTE). Advantageously, because you can
+load an exported PTE model back into the Python environment with torchchat,
+you can run evaluation on the exported model!
 ```
 python3 torchchat.py export stories15M --output-pte-path stories15M.pte
 python3 torchchat.py eval stories15M --pte-path stories15M.pte
 ```
 
-Running multiple tasks and calling eval.py directly (with PTE):
+Running multiple tasks directly on the created PTE mobile model:
 ```
 python3 torchchat.py eval stories15M --pte-path stories15M.pte --tasks wikitext hellaswag
 ```
 
+Now let's evaluate the effect of quantization on evaluation results by exporting with quantization using `--quantize` and an exemplary quantization configuration:
+```
+python3 torchchat.py export stories15M --output-pte-path stories15M.pte --quantize torchchat/quant_config/mobile.json
+python3 torchchat.py eval stories15M --pte-path stories15M.pte --tasks wikitext hellaswag
+```
+
+Now try your own export options to explore different trade-offs between model size, evaluation speed and accuracy using model quantization!
+
 ### Evaluation with model exported to DSO with AOT Inductor (AOTI)
 
-Running an exported model with AOT Inductor (DSO model)
+Running an exported model with AOT Inductor (DSO model). Advantageously, because you can
+load an exported DSO model back into the Python environment with torchchat,
+you can run evaluation on the exported model!
 ```
 python3 torchchat.py export stories15M --dtype fast16 --output-dso-path stories15M.so
 python3 torchchat.py eval stories15M --dtype fast16 --dso-path stories15M.so
 ```
 
-Running multiple tasks and calling eval.py directly (with AOTI):
+Running multiple tasks with AOTI:
 ```
 python3 torchchat.py eval stories15M --dso-path stories15M.so --tasks wikitext hellaswag
 ```
````
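As a concrete follow-on to the "try your own export options" suggestion added in this commit, a baseline and a quantized export can be written to separate PTE files so the same tasks can be scored against both. This is a minimal sketch using only flags that appear in the diff above; the `stories15M_base.pte` and `stories15M_quant.pte` file names are hypothetical, chosen for illustration.

```
# Export a float baseline and a quantized variant to separate PTE files.
# (File names are hypothetical; all flags appear in the doc above.)
python3 torchchat.py export stories15M --output-pte-path stories15M_base.pte
python3 torchchat.py export stories15M --output-pte-path stories15M_quant.pte --quantize torchchat/quant_config/mobile.json

# Evaluate both exports on the same tasks to compare results side by side.
python3 torchchat.py eval stories15M --pte-path stories15M_base.pte --tasks wikitext hellaswag
python3 torchchat.py eval stories15M --pte-path stories15M_quant.pte --tasks wikitext hellaswag
```

Comparing the two runs, along with the sizes of the two `.pte` files, shows the accuracy cost, if any, of the chosen quantization configuration.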

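The same kind of experiment applies to the AOTI path. A sketch, assuming `--limit` (shown in the Python-environment example) composes with `--dso-path` the same way it does there, which keeps runs short while exploring options:

```
# Quick smoke test of an AOTI (DSO) export: flags are taken from the doc above,
# with --limit added to cap the number of evaluation iterations (assumed to compose).
python3 torchchat.py export stories15M --dtype fast16 --output-dso-path stories15M.so
python3 torchchat.py eval stories15M --dtype fast16 --dso-path stories15M.so --tasks wikitext --limit 10
```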