Commit dd9c8a5

Merge pull request #53 from FasterDecoding/llm_judge: LLM judge update

2 parents 077977a + 5574287

4 files changed: +527 −6 lines

.gitignore

Lines changed: 3 additions & 1 deletion

```diff
@@ -169,4 +169,6 @@ test_medusa*
 
 # test
 notebooks/test*.ipynb
-notebooks/*.pdf
+notebooks/*.pdf
+llm_judge/*.sh
+llm_judge/data/mt_bench_test
```
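
The .gitignore hunk adds ignore patterns for the new llm_judge files. As an illustrative check of what those patterns cover, they can be approximated with Python's `fnmatch` (gitignore matching has extra rules, e.g. for negation and trailing slashes, so this is only a sketch); the file paths below are hypothetical examples, not taken from the commit:

```python
from fnmatch import fnmatch

# Patterns from the .gitignore hunk above.
patterns = ["notebooks/*.pdf", "llm_judge/*.sh", "llm_judge/data/mt_bench_test"]

# Hypothetical repository paths used only for illustration.
paths = [
    "llm_judge/run_eval.sh",
    "llm_judge/gen_model_answer_medusa.py",
    "notebooks/report.pdf",
    "llm_judge/data/mt_bench_test",
]

# A path is ignored if any pattern matches it.
ignored = [p for p in paths if any(fnmatch(p, pat) for pat in patterns)]
print(ignored)
```

Note that the tracked scripts (e.g. the `gen_model_answer_*.py` files changed in this commit) are untouched by these patterns; only generated shell scripts, PDFs, and the test data directory are excluded.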

llm_judge/README.md

Lines changed: 4 additions & 1 deletion

````diff
@@ -13,12 +13,15 @@ We report the 3 times running results of the Medusa X Vicuna v1.3 7/13/33b on a
 
 
 ```
-export CUDA_VISIBLE_DEVICES= 0 # set the GPU id
+export CUDA_VISIBLE_DEVICES=0 # set the GPU id
 python gen_model_answer_medusa.py --model-path FasterDecoding/medusa-vicuna-7b-v1.3 --model-id medusa-vicuna-7b-v1.3-0
 python gen_model_answer_medusa.py --model-path FasterDecoding/medusa-vicuna-13b-v1.3 --model-id medusa-vicuna-13b-v1.3-0
 python gen_model_answer_medusa.py --model-path FasterDecoding/medusa-vicuna-33b-v1.3 --model-id medusa-vicuna-33b-v1.3-0
 ```
 
+- Run baseline: replace `gen_model_answer_medusa.py` with `gen_model_answer_baseline.py` (Please note we only implement the greedy inference for wall-time comparison. If you want to use the sampling generator, please refer to the original repository.)
+
+
 - Query the results
 
 ```
````
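
The README hunk reports results averaged over three runs per model size, but shows only the run-0 commands. A minimal sketch of generating all nine invocations, assuming the `--model-id` run-index suffix extends from the `-0` shown in the diff to `-1` and `-2` (that extension is an assumption, not something the commit states):

```shell
# Sketch: emit one gen_model_answer_medusa.py command per model size and run.
# The -0/-1/-2 run suffix on --model-id is extrapolated from the single "-0"
# suffix in the README diff above.
gen_commands() {
  for size in 7b 13b 33b; do
    for run in 0 1 2; do
      echo "python gen_model_answer_medusa.py --model-path FasterDecoding/medusa-vicuna-${size}-v1.3 --model-id medusa-vicuna-${size}-v1.3-${run}"
    done
  done
}
gen_commands
```

Printing the commands (rather than executing them) keeps the sketch runnable without GPUs; piping the output to `sh` would launch the actual benchmark runs.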
