Commit dd9c8a5

Merge pull request #53 from FasterDecoding/llm_judge: LLM judge update

2 parents 077977a + 5574287

4 files changed: +527 −6 lines

.gitignore

Lines changed: 3 additions & 1 deletion

```diff
@@ -169,4 +169,6 @@ test_medusa*
 
 # test
 notebooks/test*.ipynb
-notebooks/*.pdf
+notebooks/*.pdf
+llm_judge/*.sh
+llm_judge/data/mt_bench_test
```
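
The .gitignore hunk adds ignore patterns for the new llm_judge files. As an illustrative check of what those patterns cover, they can be approximated with Python's `fnmatch` (gitignore matching has extra rules, e.g. for negation and trailing slashes, so this is only a sketch); the file paths below are hypothetical examples, not taken from the commit:

```python
from fnmatch import fnmatch

# Patterns from the .gitignore hunk above.
patterns = ["notebooks/*.pdf", "llm_judge/*.sh", "llm_judge/data/mt_bench_test"]

# Hypothetical repository paths used only for illustration.
paths = [
    "llm_judge/run_eval.sh",
    "llm_judge/gen_model_answer_medusa.py",
    "notebooks/report.pdf",
    "llm_judge/data/mt_bench_test",
]

# A path is ignored if any pattern matches it.
ignored = [p for p in paths if any(fnmatch(p, pat) for pat in patterns)]
print(ignored)
```

Note that the tracked scripts (e.g. the `gen_model_answer_*.py` files changed in this commit) are untouched by these patterns; only generated shell scripts, PDFs, and the test data directory are excluded.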

llm_judge/README.md

Lines changed: 4 additions & 1 deletion

````diff
@@ -13,12 +13,15 @@ We report the 3 times running results of the Medusa X Vicuna v1.3 7/13/33b on a
 
 
 ```
-export CUDA_VISIBLE_DEVICES= 0 # set the GPU id
+export CUDA_VISIBLE_DEVICES=0 # set the GPU id
 python gen_model_answer_medusa.py --model-path FasterDecoding/medusa-vicuna-7b-v1.3 --model-id medusa-vicuna-7b-v1.3-0
 python gen_model_answer_medusa.py --model-path FasterDecoding/medusa-vicuna-13b-v1.3 --model-id medusa-vicuna-13b-v1.3-0
 python gen_model_answer_medusa.py --model-path FasterDecoding/medusa-vicuna-33b-v1.3 --model-id medusa-vicuna-33b-v1.3-0
 ```
 
+- Run baseline: replace `gen_model_answer_medusa.py` with `gen_model_answer_baseline.py` (Please note we only implement the greedy inference for wall-time comparison. If you want to use the sampling generator, please refer to the original repository.)
+
+
 - Query the results
 
 ```
````
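
The README hunk reports results averaged over three runs per model size, but shows only the run-0 commands. A minimal sketch of generating all nine invocations, assuming the `--model-id` run-index suffix extends from the `-0` shown in the diff to `-1` and `-2` (that extension is an assumption, not something the commit states):

```shell
# Sketch: emit one gen_model_answer_medusa.py command per model size and run.
# The -0/-1/-2 run suffix on --model-id is extrapolated from the single "-0"
# suffix in the README diff above.
gen_commands() {
  for size in 7b 13b 33b; do
    for run in 0 1 2; do
      echo "python gen_model_answer_medusa.py --model-path FasterDecoding/medusa-vicuna-${size}-v1.3 --model-id medusa-vicuna-${size}-v1.3-${run}"
    done
  done
}
gen_commands
```

Printing the commands (rather than executing them) keeps the sketch runnable without GPUs; piping the output to `sh` would launch the actual benchmark runs.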
