
Commit be3f271

Update evaluation.md

Highlight ability to use different options and encourage users to experiment with them.

1 parent f460c3e

1 file changed: 3 additions, 1 deletion

torchchat/utils/docs/evaluation.md (3 additions, 1 deletion)
````diff
@@ -55,12 +55,14 @@ Running multiple tasks directly by creating a PTE mobile model:
 python3 torchchat.py eval stories15M --pte-path stories15M.pte --tasks wikitext hellaswag
 ```
 
-Now let's evaluate the effect of quantization on evaluation results:
+Now let's evaluate the effect of quantization on evaluation results by exporting with quantization using `--quantize` and an exemplary quantization configuration:
 ```
 python3 torchchat.py export stories15M --output-pte-path stories15M.pte --quantize torchchat/quant_config/mobile.json
 python3 torchchat.py eval stories15M --pte-path stories15M.pte --tasks wikitext hellaswag
 ```
 
+Now try your own export options to explore different trade-offs between model size, evaluation speed and accuracy using model quantization!
+
 ### Evaluation with model exported to DSO with AOT Inductor (AOTI)
 
 Running an exported model with AOT Inductor (DSO model). Advantageously, because we can
````
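The added prose invites readers to experiment with export options. As a minimal sketch of what such an experiment could look like, assuming the stories15M checkpoint is already set up and using only the flags shown in the diff above (the `stories15M_fp.pte` / `stories15M_q.pte` output names are illustrative, not from the commit):

```bash
# Hypothetical comparison: export once without quantization and once with
# the mobile quantization config, then run the same eval tasks on both.
python3 torchchat.py export stories15M --output-pte-path stories15M_fp.pte
python3 torchchat.py export stories15M --output-pte-path stories15M_q.pte \
    --quantize torchchat/quant_config/mobile.json

# Compare metrics (and on-disk size of the two .pte files) to see the
# trade-off between model size, evaluation speed, and accuracy.
python3 torchchat.py eval stories15M --pte-path stories15M_fp.pte --tasks wikitext hellaswag
python3 torchchat.py eval stories15M --pte-path stories15M_q.pte --tasks wikitext hellaswag
```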
