README.md (+2 −2)
@@ -89,7 +89,7 @@ accelerate launch main.py \
* `limit` represents the number of problems to solve; if it isn't provided, all problems in the benchmark are selected.
* `allow_code_execution` enables execution of the generated code. It is off by default; read the displayed warning before passing it to enable execution.
* Some models with custom code on the HF hub, like [SantaCoder](https://huggingface.co/bigcode/santacoder), require passing `--trust_remote_code`; for private models, add `--use_auth_token`.
-* `save_generations` saves the post-processed generations in a json file. You can also save references by calling `--save_references`
+* `save_generations` saves the post-processed generations in a json file at `save_generations_path` (by default `generations.json`). You can also save references by calling `--save_references`.
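
Putting the flags above together, a full generation-and-evaluation run might look like the following sketch. The model and task names are illustrative placeholders; check `main.py --help` for the authoritative flag list.

```bash
# Sketch of a run combining the flags described above.
# Model and task are placeholders, not a recommendation.
accelerate launch main.py \
  --model bigcode/santacoder \
  --tasks humaneval \
  --limit 10 \
  --allow_code_execution \
  --trust_remote_code \
  --save_generations \
  --save_generations_path generations.json \
  --save_references
```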
Some tasks don't require code execution, such as `codexglue_code_to_text-<LANGUAGE>`/`codexglue_code_to_text-python-left`/`conala`/`concode`, which use BLEU evaluation. In addition, we generate one candidate solution for each problem in these tasks, so use `n_samples=1` and `batch_size=1`. (Note that `batch_size` should always be less than or equal to `n_samples`.)
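
For example, a run on one of these BLEU-evaluated tasks might look like this sketch; no `--allow_code_execution` is needed since nothing is executed, and the model name is again a placeholder.

```bash
# Sketch: single-candidate generation for a BLEU-evaluated task.
accelerate launch main.py \
  --model bigcode/santacoder \
  --tasks codexglue_code_to_text-python-left \
  --n_samples 1 \
  --batch_size 1
```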
@@ -108,7 +108,7 @@ If you already have the generations in a json file from this evaluation harness
Below is an example; be mindful to specify the arguments appropriate to the task you are evaluating, and note that the `model` value here serves only to document the experiment.
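
A sketch of such an evaluation-only call, assuming the harness reads previously saved generations through a `--load_generations_path` flag (verify the exact flag name against the current CLI):

```bash
# Sketch: evaluate previously saved generations without regenerating.
# --load_generations_path is assumed here; the model name merely labels
# the experiment, as noted above.
accelerate launch main.py \
  --model my-experiment-label \
  --tasks mbpp \
  --allow_code_execution \
  --load_generations_path generations.json
```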