Commit 5001f92

Add documentation
Signed-off-by: Claudio Spiess <[email protected]>
1 parent 32f511c commit 5001f92

2 files changed

+458
-4
lines changed

docs/autopdl.md

Lines changed: 90 additions & 1 deletion
@@ -1,4 +1,93 @@
---
hide:
  - navigation
  - toc
  - footer
---

# AutoPDL Tutorial

The following sections show how to use the AutoPDL optimizer to produce optimized PDL programs for specific tasks.

To optimize a PDL program, we need the program, an optimizer configuration, a dataset, and an _evaluator_. An evaluator is a Python subclass of `OptimizerEvaluator` that evaluates a candidate, i.e. a generated configuration instance consisting of, for example, few-shot examples and a prompt style, against one test example. The evaluator class follows this structure:

```python title="src/pdl/optimize/optimizer_evaluator.py" linenums="1"
class OptimizerEvaluator(Thread):
    """Evaluates a candidate (configuration, i.e. fewshots, style) against **one** test example."""

    def __init__(
        self,
        pdl_program: Program,
        example: dict,
        candidate: dict,
        index: int,
        timeout: int,
        yield_output: bool,
        config: OptimizationConfig,
        cwd: Path,
        answer_key: str = "answer",
    ) -> None:
        super().__init__()
        self.pdl_program = pdl_program
        ...

    def get_scope(self) -> ScopeType:
        """
        Constructs a PDL scope for the candidate,
        can take self.candidate and self.config into account
        """

    def extract_answer(self, document: str) -> Any:
        """
        Extracts the final answer from the PDL result document,
        i.e. the string the PDL program returns
        """

    def answer_correct(self, document: str, answer: Any, truth: Any) -> bool:
        """
        Checks the extracted answer against the groundtruth value,
        in self.example[self.answer_key]
        """
```
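
As a rough illustration of what the two answer-handling methods might do for a numeric benchmark, the standalone functions below mirror the signatures of `extract_answer` and `answer_correct` without importing `pdl`. The regular expression, the "last number in the document" heuristic, and the tolerance are assumptions made for this sketch; they are not taken from `gsm8k_evaluator.py`.

```python
import re
from typing import Any

# Hypothetical pattern: optional sign, digits with optional thousands
# separators, optional decimal part.
_NUMBER = re.compile(r"-?\d[\d,]*(?:\.\d+)?")


def extract_answer(document: str) -> Any:
    """Return the last number found in the result document, or None."""
    matches = _NUMBER.findall(document)
    if not matches:
        return None
    # Drop thousands separators before converting.
    return float(matches[-1].replace(",", ""))


def answer_correct(document: str, answer: Any, truth: Any) -> bool:
    """Compare the extracted answer with the ground truth numerically."""
    if answer is None:
        return False
    try:
        expected = float(str(truth).replace(",", ""))
    except ValueError:
        return False
    # Assumed tolerance for floating point comparison.
    return abs(answer - expected) < 1e-6


if __name__ == "__main__":
    doc = "She sells 9 eggs at $2 each, so the answer is 18."
    answer = extract_answer(doc)
    print(answer, answer_correct(doc, answer, "18"))
```

In a real evaluator these would be methods on the `OptimizerEvaluator` subclass, with `truth` read from `self.example[self.answer_key]` as the docstring above describes.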

Let's go through an example for `GSM8K`. Our PDL program uses different prompt patterns from the prompt library, and the variables `prompt_pattern`, `question`, `model`, and `demonstrations` are inserted at runtime by the evaluator.

```yaml title="examples/optimizer/gsm8k.pdl" linenums="1"
--8<-- "./examples/optimizer/gsm8k.pdl"
```
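
To make the runtime variable insertion concrete, here is a minimal sketch of how an evaluator could assemble those four variables for one candidate and one test example. It returns a plain `dict` as a stand-in for the PDL `ScopeType`; the function name `build_scope` and the key layout of `candidate` are hypothetical, and the `question` field is assumed to come from the GSM8K example record.

```python
from typing import Any


def build_scope(
    example: dict[str, Any],
    candidate: dict[str, Any],
    demonstrations_variable_name: str = "demonstrations",
) -> dict[str, Any]:
    """Collect the runtime variables for one (candidate, example) pair."""
    return {
        # Chosen by the optimizer for this candidate.
        "prompt_pattern": candidate["prompt_pattern"],
        "model": candidate["model"],
        demonstrations_variable_name: candidate.get(demonstrations_variable_name, []),
        # Taken from the test example being evaluated.
        "question": example["question"],
    }


if __name__ == "__main__":
    candidate = {
        "prompt_pattern": "cot",
        "model": "watsonx/meta-llama/llama-3-1-8b-instruct",
        "demonstrations": [{"question": "What is 1 + 1?", "answer": "2"}],
    }
    example = {"question": "What is 2 + 3?", "answer": "5"}
    print(build_scope(example, candidate))
```

The key under which demonstrations are stored is a parameter here because the optimizer configuration exposes it as `demonstrations_variable_name`.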

We write a configuration file for the optimizer; see `src/pdl/optimize/config_parser.py` for all fields:

``` { .yaml .copy .annotate title="gsm8k_optimizer_config.yml" linenums="1" }
benchmark: gsm8k # Name our benchmark
budget: null # Set a budget, can be number of iterations, or a duration string e.g. "2h"
budget_growth: double # double validation set size each iteration
# or to_max: reach max_test_set_size by final iteration
initial_test_set_size: 2 # size of test set in first iteration
max_test_set_size: 10 # maximum test set size
num_candidates: 100 # how many candidates to evaluate
num_demonstrations: 5 # how many demonstrations to include per candidate
parallelism: 1 # how many threads to run evaluations across
shuffle_test: false # shuffling of test set
test_set_name: test # name of test set
train_set_name: train # name of train set
validation_set_name: validation # name of validation set
demonstrations_variable_name: demonstrations # variable name to insert demonstrations into
variables: # define discrete options to sample from
  model: # set ${ model } variable
    - watsonx/meta-llama/llama-3-1-8b-instruct
  prompt_pattern: # set ${ prompt_pattern } variable to one of these
    - cot
    - react
    - rewoo
  num_demonstrations: # overrides num demonstrations above
    - 0
    - 3
    - 5
```
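
To get a feel for the size of this search space, the sketch below enumerates every combination of the discrete `variables` and computes the test set sizes implied by `budget_growth: double`. It only illustrates the configuration values; how the optimizer actually samples candidates is not shown in this tutorial, and capping the doubled size at `max_test_set_size` is an assumption. The helper names are hypothetical.

```python
from itertools import product

# Values copied from the `variables` section of the configuration above.
variables = {
    "model": ["watsonx/meta-llama/llama-3-1-8b-instruct"],
    "prompt_pattern": ["cot", "react", "rewoo"],
    "num_demonstrations": [0, 3, 5],
}


def enumerate_combinations(options: dict[str, list]) -> list[dict]:
    """List every combination of the discrete options."""
    names = list(options)
    return [dict(zip(names, values)) for values in product(*options.values())]


def doubling_schedule(initial: int, maximum: int) -> list[int]:
    """Test set sizes when doubling each iteration, capped at `maximum` (assumed)."""
    if initial < 1:
        raise ValueError("initial must be at least 1")
    sizes = []
    size = initial
    while size < maximum:
        sizes.append(size)
        size *= 2
    sizes.append(maximum)
    return sizes


if __name__ == "__main__":
    combos = enumerate_combinations(variables)
    print(len(combos))  # 1 model x 3 patterns x 3 demonstration counts = 9
    print(doubling_schedule(initial=2, maximum=10))  # [2, 4, 8, 10]
```

Candidates can also differ in which demonstrations they contain, so `num_candidates: 100` is not in conflict with there being only nine combinations of the discrete options.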

```python title="examples/optimizer/gsm8k_evaluator.py" linenums="1"
--8<-- "./examples/optimizer/gsm8k_evaluator.py"
```
