diff --git a/install/requirements.txt b/install/requirements.txt
index bda626257..d051d29cd 100644
--- a/install/requirements.txt
+++ b/install/requirements.txt
@@ -30,3 +30,6 @@ streamlit
 
 # Server mode
 flask
+
+# eval
+lm_eval==0.4.2
diff --git a/torchchat/utils/docs/evaluation.md b/torchchat/utils/docs/evaluation.md
index a3e865169..490500223 100644
--- a/torchchat/utils/docs/evaluation.md
+++ b/torchchat/utils/docs/evaluation.md
@@ -9,7 +9,7 @@
 
 Torchchat provides evaluation functionality for your language model on
 a variety of tasks using the
-[lm-evaluation-harness](https://github.com/facebookresearch/lm_eval)
+[lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness)
 library.
 
 ## Usage
@@ -34,6 +34,6 @@ Running multiple tasks and calling eval.py directly:
 python3 torchchat.py eval stories15M --pte-path stories15M.pte --tasks wikitext hellaswag
 ```
 
-For more information and a list of tasks/metrics see [lm-evaluation-harness](https://github.com/facebookresearch/lm_eval).
+For more information and a list of tasks/metrics see [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness).
 
 [end default]: end
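
Since the diff pins `lm_eval` to an exact release, a quick way to confirm the pin took effect (a minimal sketch, not part of the diff; it assumes `pip install -r install/requirements.txt` has already been run) is to read the installed version from package metadata:

```python
# Sketch (assumption): verify the pinned lm_eval release was installed
# after running `pip install -r install/requirements.txt`.
from importlib.metadata import version

installed = version("lm_eval")
assert installed == "0.4.2", f"expected lm_eval==0.4.2, found {installed}"
print(f"lm_eval {installed} is installed")
```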