
Commit da3c126 (parent: cc9b6db)

Moved scripts/lm_eval/ to examples/lm_eval/

13 files changed: 14 additions and 14 deletions

.gitignore

Lines changed: 2 additions & 2 deletions
```diff
@@ -1,6 +1,6 @@
 results/
-scripts/lm_eval/prompts/system_message.txt
-scripts/lm_eval/prompts/evaluator_system_message.txt
+examples/lm_eval/prompts/system_message.txt
+examples/lm_eval/prompts/evaluator_system_message.txt
 
 # Python
 __pycache__/
```

scripts/lm_eval/README.md renamed to examples/lm_eval/README.md

Lines changed: 5 additions & 5 deletions
````diff
@@ -7,7 +7,7 @@
 ## Usage
 
 ```bash
-$ python3 scripts/lm_eval/lm-eval.py -h
+$ python3 examples/lm_eval/lm-eval.py -h
 usage: lm-eval.py [-h] [--config CONFIG] [--init_file INIT_FILE] [--evaluator_file EVALUATOR_FILE] [--iterations ITERATIONS] [--limit LIMIT] [--tasks TASKS]
                   [--output_path OUTPUT_PATH]
 
@@ -30,26 +30,26 @@ options:
 
 Early examples that **were meant to** indicate that more evolution iterations improve task performance -- I suspect the prompting may not be ideal yet:
 ```
-$ python3 scripts/lm_eval/lm-eval.py --tasks gsm8k --limit 10 --iterations 1
+$ python3 examples/lm_eval/lm-eval.py --tasks gsm8k --limit 10 --iterations 1
 [..]
 Headline metrics:
 gsm8k exact_match,strict-match 80.000%
 [..]
 
 
-$ python3 scripts/lm_eval/lm-eval.py --tasks gsm8k --limit 10 --iterations 3
+$ python3 examples/lm_eval/lm-eval.py --tasks gsm8k --limit 10 --iterations 3
 [..]
 Headline metrics:
 gsm8k exact_match,strict-match 90.000%
 [..]
 
-$ python3 scripts/lm_eval/lm-eval.py --tasks gsm8k --limit 10 --iterations 10
+$ python3 examples/lm_eval/lm-eval.py --tasks gsm8k --limit 10 --iterations 10
 [..]
 Headline metrics:
 gsm8k exact_match,strict-match 80.000%
 [..]
 
-$ python3 scripts/lm_eval/lm-eval.py --tasks gsm8k --limit 10 --iterations 15
+$ python3 examples/lm_eval/lm-eval.py --tasks gsm8k --limit 10 --iterations 15
 [..]
 Headline metrics:
 gsm8k exact_match,strict-match 70.000%
````

scripts/lm_eval/config.yml renamed to examples/lm_eval/config.yml

Lines changed: 1 addition & 1 deletion
```diff
@@ -23,7 +23,7 @@ prompt:
   num_top_programs: 3
   use_template_stochasticity: true
   # System prompt is created dynamically during the benchmark in file system_message.txt!
-  template_dir: "scripts/lm_eval/prompts"
+  template_dir: "examples/lm_eval/prompts"
 
 # Database configuration
 database:
```
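For context on the `template_dir` comment above: the adapter in `lm-eval.py` (next diff) hard-codes `examples/lm_eval/prompts/system_message.txt`, so the directory configured here and the path in the script have to stay in sync. Below is a minimal sketch of that coupling, assuming the adapter simply writes the dynamically rendered per-task prompt into the configured template directory; the helper name is illustrative, not the repository's exact code.

```python
# Sketch only: why prompt.template_dir in config.yml must match the
# directory that lm-eval.py writes system_message.txt into.
from pathlib import Path

TEMPLATE_DIR = Path("examples/lm_eval/prompts")  # must equal prompt.template_dir in config.yml

def write_system_message(task_prompt: str) -> Path:
    """Write the dynamically generated system prompt where OpenEvolve's
    prompt templates are loaded from."""
    TEMPLATE_DIR.mkdir(parents=True, exist_ok=True)
    target = TEMPLATE_DIR / "system_message.txt"  # same filename the adapter hard-codes
    target.write_text(task_prompt, encoding="utf-8")
    return target
```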
File renamed without changes.

scripts/lm_eval/lm-eval.py renamed to examples/lm_eval/lm-eval.py

Lines changed: 6 additions & 6 deletions
```diff
@@ -42,9 +42,9 @@ def __init__(
         self.config_file = config_file
 
         # folder must match prompt:template_dir in config.yml!
-        self.prompt_path = "scripts/lm_eval/prompts/system_message.txt"
-        self.evaluator_prompt_path = "scripts/lm_eval/prompts/evaluator_system_message.txt"
-        self.best_path = "scripts/lm_eval/openevolve_output/best/best_program.txt"
+        self.prompt_path = "examples/lm_eval/prompts/system_message.txt"
+        self.evaluator_prompt_path = "examples/lm_eval/prompts/evaluator_system_message.txt"
+        self.best_path = "examples/lm_eval/openevolve_output/best/best_program.txt"
         self.base_system_message = "You are an expert task solver, with a lot of commonsense, math, language and coding knowledge.\n\nConsider this task:\n```{prompt}´´´"
 
     def generate(self, prompts: List[str], max_gen_toks: int = None, stop=None, **kwargs):
@@ -138,14 +138,14 @@ def generate_until(self, requests: Iterable[Any], **kw) -> List[str]:
 p = argparse.ArgumentParser(
     description="OpenEvolve <-> lm-evaluation-harness adapter.",
 )
-p.add_argument("--config", default="scripts/lm_eval/config.yml", help="config file")
+p.add_argument("--config", default="examples/lm_eval/config.yml", help="config file")
 p.add_argument(
     "--init_file",
-    default="scripts/lm_eval/initial_content_stub.txt",
+    default="examples/lm_eval/initial_content_stub.txt",
     help="initial content file",
 )
 p.add_argument(
-    "--evaluator_file", default="scripts/lm_eval/evaluator_stub.py", help="evaluator file"
+    "--evaluator_file", default="examples/lm_eval/evaluator_stub.py", help="evaluator file"
 )
 p.add_argument("--iterations", default=5, type=int, help="number of iterations")
 p.add_argument("--limit", default=None, type=int, help="limit the number of examples per task that are executed")
```
File renamed without changes.
File renamed without changes.
