Commit 91408e0

Add config and expand docs
Signed-off-by: Claudio Spiess <[email protected]>
1 parent ee45d5e commit 91408e0

4 files changed, 45 additions and 7 deletions

docs/autopdl.md

Lines changed: 21 additions & 0 deletions
@@ -91,3 +91,24 @@ variables: # define discrete options to sample from
 ```python title="examples/optimizer/gsm8k_evaluator.py" linenums="1"
 --8<-- "./examples/optimizer/gsm8k_evaluator.py"
 ```
+
+We can see an example of a script to run the optimization process in `examples/optimizer/optimize.py`.
+Usage:
+
+```
+python optimize.py optimize -h
+usage: optimize.py optimize [-h] --config CONFIG --dataset-path DATASET_PATH [--experiments-path EXPERIMENTS_PATH]
+[--yield_output | --no-yield_output] [--dry | --no-dry]
+pdl_file
+```
+
+We also need a dataset to optimize against, with `train`, `test`, and `validation` splits. To produce such a dataset, we can use HuggingFace Datasets `load_dataset` and `save_to_disk`. This example requires the dataset to have columns `question`, `reasoning`, and `answer`, which can be created from the original `openai/gsm8k` dataset. Processing scripts are under development and will follow shortly.
+
+We can run an example like so:
+
+```
+cd examples/optimizer
+python optimize.py optimize --config config.yml --dataset-path datasets/gsm8k gsm8k.pdl
+```
+
+Once the process is complete, a file `optimized_gsm8k.pdl` is written. This file contains the optimal configuration and is directly executable by the standard PDL interpreter.
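
Note on the dataset step: the added docs say the processing scripts are still to come, so the following is only a rough sketch of how a dataset with `train`, `test`, and `validation` splits and `question`, `reasoning`, and `answer` columns might be produced. It rests on several assumptions that are not part of this commit: that the final answer is taken from the `#### <answer>` marker that closes each GSM8K solution, that the validation split is carved out of the training data (the original dataset ships only `train` and `test`), and that the names `split_answer`, `build_gsm8k_dataset`, `validation_size`, and `seed` are hypothetical.

```python
import re

# GSM8K solutions end with a line of the form "#### <final answer>".
_FINAL_ANSWER = re.compile(r"####\s*(.+?)\s*$")


def split_answer(example: dict) -> dict:
    """Split a raw GSM8K `answer` field into `reasoning` and `answer`."""
    raw = example["answer"]
    match = _FINAL_ANSWER.search(raw)
    if match is None:
        # No marker found: keep the whole text as reasoning, leave the answer empty.
        return {"question": example["question"], "reasoning": raw.strip(), "answer": ""}
    return {
        "question": example["question"],
        "reasoning": raw[: match.start()].strip(),
        "answer": match.group(1),
    }


def build_gsm8k_dataset(
    output_dir: str = "datasets/gsm8k",
    validation_size: int = 1000,  # assumption: size of the held-out validation split
    seed: int = 0,
) -> None:
    """Download openai/gsm8k, add a validation split, and save it to disk."""
    # Imported lazily so split_answer can be used without `datasets` installed.
    from datasets import DatasetDict, load_dataset

    raw = load_dataset("openai/gsm8k", "main")  # provides "train" and "test"
    # Assumption: take the validation examples from the training data.
    parts = raw["train"].train_test_split(test_size=validation_size, seed=seed)
    dataset = DatasetDict(
        {"train": parts["train"], "validation": parts["test"], "test": raw["test"]}
    )
    # map() overwrites `answer` and adds the new `reasoning` column.
    dataset = dataset.map(split_answer)
    dataset.save_to_disk(output_dir)
```

Calling `build_gsm8k_dataset("datasets/gsm8k")` from `examples/optimizer` would write a directory that matches the `--dataset-path datasets/gsm8k` argument in the documented command and can be read back with `load_from_disk`, which `optimize.py` already imports. Whether the evaluator expects the answer in exactly this form is not confirmed by this commit.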

examples/optimizer/config.yml

Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
+benchmark: "gsm8k"
+initial_test_set_size: 1
+max_test_set_size: 1
+num_candidates: 5
+num_demonstrations: 3
+parallelism: 1
+shuffle_test: false
+test_set_name: "test"
+train_set_name: "train"
+timeout: 120
+experiment_prefix: "granite_3_8b_instruct_gsm8k_3_shot_"
+variables:
+  model:
+    - "watsonx_text/ibm/granite-3-8b-instruct"
+  prompt_pattern:
+    - "cot"
+  num_demonstrations:
+    - 3
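
Note on the `variables` block: the docs describe it as a set of discrete options to sample from. As an illustration of how large that space is, the sketch below enumerates every combination of the listed values. It is not the optimizer's actual sampling code, which this commit does not show; `enumerate_candidates` is a hypothetical helper, and the dictionary simply copies the values from `config.yml`.

```python
from itertools import product

# Copied from the `variables` block of examples/optimizer/config.yml.
variables = {
    "model": ["watsonx_text/ibm/granite-3-8b-instruct"],
    "prompt_pattern": ["cot"],
    "num_demonstrations": [3],
}


def enumerate_candidates(variables: dict) -> list[dict]:
    """Return every combination of the discrete options, one dict per combination."""
    names = list(variables)
    option_lists = (variables[name] for name in names)
    return [dict(zip(names, combo)) for combo in product(*option_lists)]


candidates = enumerate_candidates(variables)
```

With one option per variable, this configuration describes a single combination of variable values. Candidates may still differ in other ways, for example in which demonstrations are drawn from the training split, but that behavior is not visible in this commit.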

examples/optimizer/optimize.py

Lines changed: 5 additions & 6 deletions
@@ -6,16 +6,15 @@
 
 import yaml
 from datasets import load_from_disk
+from fever_evaluator import FEVEREvaluator
+from gsm8k_evaluator import Gsm8kEvaluator
+from gsmhard_evaluator import GsmHardEvaluator
+from mbpp_dataset import MBPPDataset
+from mbpp_evaluator import MBPPEvaluator
 
 from pdl.optimize.config_parser import OptimizationConfig
 from pdl.optimize.pdl_optimizer import PDLOptimizer
 
-from .fever_evaluator import FEVEREvaluator
-from .gsm8k_evaluator import Gsm8kEvaluator
-from .gsmhard_evaluator import GsmHardEvaluator
-from .mbpp_dataset import MBPPDataset
-from .mbpp_evaluator import MBPPEvaluator
-
 if __name__ == "__main__":
     parser = argparse.ArgumentParser(
         description="PDL optimization and benchmarking tool",

mkdocs.yml

Lines changed: 1 addition & 1 deletion
@@ -47,7 +47,7 @@ nav:
   - API Reference: api_reference.md
   - Contribute: contrib.md
   - Viewer: viewer.md
-  - AutoPDL: autopdl.md
+  # - AutoPDL: autopdl.md # Hide documentation for now
 
 # Define some IBM colors
 extra_css:
