This tool evaluates the perplexity of ONNX Runtime GenAI models and HuggingFace models using the [WikiText-2](https://huggingface.co/datasets/wikitext) dataset. Perplexity is a standard metric for language models: lower values indicate better predictive performance.
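Concretely, perplexity is the exponential of the average negative log-likelihood the model assigns to each token. A minimal illustration (not this tool's code, just the definition):

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp(mean negative log-likelihood) over the evaluated tokens."""
    nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(nll)

# A model that assigns uniform probability 1/4 to every token over a
# 4-token vocabulary has perplexity 4: it is "as confused as" a fair
# 4-way choice at each step.
print(perplexity([math.log(0.25)] * 10))  # ≈ 4.0
```

Lower perplexity means the model spreads less probability mass away from the tokens that actually occur.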
## Attribution
This script is originally based on `perplexity_metrics.py`, with the following additions:
- Multiple context lengths
- Configurable chunk sizes
- Enhanced prefill chunking handling
- HuggingFace model evaluation support
## Scripts
## Requirements
- Python 3.8+
- CUDA 12.x (if using GPU acceleration)
- Install dependencies:
**For CUDA 12.x (recommended for CUDA 12.1-12.9):**
```bash
pip install -r requirements.txt
```
## Supported Models
### ONNX Runtime GenAI Models
- Any ONNX Runtime GenAI model exported with a compatible `genai_config.json` and tokenizer.
### Evaluate a HuggingFace Model

Evaluate a HuggingFace model on its own (here GPT-2 on CPU):

```bash
python run_perplexity.py --hf_model gpt2 --hf_device cpu --i 1024
```
### Evaluate Both ONNX and HuggingFace Models Together
Compare ONNX and HuggingFace models side-by-side:
```bash
python run_perplexity.py \
--models /path/to/onnx_model \
--hf_model meta-llama/Llama-2-7b-hf \
--hf_dtype float16 \
--i 1024 \
--output comparison_results.csv
```
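The resulting CSV can be inspected with standard tools. A minimal sketch of post-processing (the column names here are illustrative assumptions; the script's actual output schema may differ):

```python
import csv
import io

# Stand-in for reading comparison_results.csv; column names are hypothetical.
sample = """model,context_length,perplexity
/path/to/onnx_model,1024,6.12
meta-llama/Llama-2-7b-hf,1024,5.98
"""

rows = list(csv.DictReader(io.StringIO(sample)))
# Lower perplexity is better, so pick the row with the smallest value.
best = min(rows, key=lambda r: float(r["perplexity"]))
print(best["model"])  # → meta-llama/Llama-2-7b-hf
```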
### HuggingFace Model Arguments
- `--hf_model`: HuggingFace model name or local path (e.g., `meta-llama/Llama-2-7b-hf`)
- `--hf_device`: Device to run on (`cuda`, `cpu`, `cuda:0`, etc.); default: `cuda`
- `--hf_dtype`: Data type for model weights; options: `float16`, `bfloat16`, `float32`, `fp16`, `bf16`, `fp32`; default: model default (usually `float32`)
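Since both short and long dtype spellings are accepted, the script presumably normalizes the aliases before loading the model. A hypothetical sketch of that normalization (not the script's actual code):

```python
# Short aliases accepted by --hf_dtype and their canonical spellings (assumed).
_ALIASES = {"fp16": "float16", "bf16": "bfloat16", "fp32": "float32"}

def normalize_dtype(name):
    """Map short dtype aliases to canonical names; pass canonical names through."""
    canonical = _ALIASES.get(name, name)
    if canonical not in ("float16", "bfloat16", "float32"):
        raise ValueError(f"unsupported dtype: {name}")
    return canonical

print(normalize_dtype("fp16"))  # → float16
print(normalize_dtype("bfloat16"))  # → bfloat16
```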