The Live Explorer can be installed as follows (MacOS):
```
brew install pdl
```
For other platforms, see installation notes.
You can run PDL with LLM models locally using [Ollama](https://ollama.com), or with other cloud services.
See [here](https://ibm.github.io/prompt-declaration-language/tutorial/#using-ollama-models) for instructions on how to install an Ollama model locally.
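For instance, a minimal PDL program against a local Ollama model might look like the following sketch (the model id is an assumption; substitute any model you have pulled with `ollama pull`):

```yaml
# Hedged sketch: a minimal PDL program calling a local Ollama model.
# The model id below is an assumption, not a requirement.
description: Minimal Ollama example
text:
- "Name one Nobel laureate in physics.\n"
- model: ollama_chat/granite3.2:2b
```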
Most examples in this repository use IBM Granite models on [Ollama](https://ollama.com) and some are on [Replicate](https://replicate.com/). In order to run these examples, you need to create a free account
```
temperature: 0
```
Notice the syntactic differences. Model ids on watsonx start with `watsonx`.
Watsonx also provides a text completion endpoint as shown in the following example. A text completion endpoint does not take chat templates into account:
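A hedged sketch of such a completion-style call (the `watsonx_text/` prefix follows LiteLLM's watsonx text-generation route, and the model id and parameters are assumptions):

```yaml
# Hedged sketch (model id assumed): a completion-style call,
# sent without applying a chat template
text:
- model: watsonx_text/ibm/granite-34b-code-instruct
  input: "The capital of France is"
  parameters:
    temperature: 0
```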
When we execute this program with the PDL interpreter, we obtain the following text:

```java
@SuppressWarnings("unchecked")
public static Map<String, String> deserializeOffsetMap(String lastSourceOffset) throws IOException {
  Map<String, String> offsetMap;
  if (lastSourceOffset == null || lastSourceOffset.isEmpty()) {
    offsetMap = new HashMap<>();
```
The following sections show how to use the AutoPDL optimizer introduced by [Spiess et al. (2025)](https://openreview.net/forum?id=CAeISyE3aR) in "AutoPDL: Automatic Prompt Optimization for LLM Agents" ([arXiv](https://arxiv.org/abs/2504.04365)) to produce optimized PDL programs for specific tasks. Please ensure PDL was installed with the necessary extras.
To optimize a PDL program, we need the program, an optimizer configuration, a dataset, and an _evaluator_. An evaluator is a Python subclass of `OptimizerEvaluator` that evaluates a candidate, i.e. a generated configuration instance consisting of, for example, few-shot demonstrations. The evaluator class follows this structure:
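As a sketch of that structure, the following is an illustrative skeleton; the constructor arguments and method names are assumptions, not the exact interface in `src/pdl/optimize`:

```python
from threading import Thread

# Hypothetical sketch of the evaluator interface; the real class lives in
# src/pdl/optimize and its signatures may differ.
class OptimizerEvaluator(Thread):
    """Scores one candidate configuration against one dataset example."""

    def __init__(self, candidate: dict, example: dict) -> None:
        super().__init__()
        self.candidate = candidate
        self.example = example
        self.result: bool | None = None

    def run(self) -> None:
        # Insert the candidate's variables (e.g. demonstrations) into the
        # PDL program, execute it, and compare against the gold answer.
        prediction = self.evaluate()
        self.result = prediction == self.example["answer"]

    def evaluate(self) -> str:
        # Subclasses execute the PDL program here; this stub echoes a
        # canned prediction so the sketch is self-contained.
        return self.candidate.get("prediction", "")

# Because evaluators are threads, many candidates can be scored in parallel.
evaluator = OptimizerEvaluator({"prediction": "42"}, {"answer": "42"})
evaluator.start()
evaluator.join()
# evaluator.result is now True
```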
Let's go through an example for `GSM8K`. Our PDL program uses different prompt patterns from the prompt library, and the variables `prompt_pattern`, `question`, `model`, and `demonstrations` are inserted at runtime by the evaluator.
We write a configuration file for the optimizer, and save it as `gsm8k_optimizer_config.yml`. See `src/pdl/optimize/config_parser.py` for all fields. Please note that this example uses the `watsonx` inference service, so an API key is required, although you can also use a local model or any other inference service.

```yaml
budget: null # set a budget: a number of iterations, or a duration string e.g. "2h"
budget_growth: double # double validation set size each iteration,
                      # or to_max: reach max_test_set_size by final iteration
initial_test_set_size: 2 # size of test set in first iteration
max_test_set_size: 10 # maximum test set size
num_candidates: 100 # how many candidates to evaluate
num_demonstrations: 5 # how many demonstrations to include per candidate
parallelism: 1 # how many threads to run evaluations across
shuffle_test: false # shuffling of test set
test_set_name: test # name of test set
train_set_name: train # name of train set
validation_set_name: validation # name of validation set
demonstrations_variable_name: demonstrations # variable name to insert demonstrations into
variables: # define discrete options to sample from
  model: # set ${ model } variable
    - watsonx/meta-llama/llama-3-1-8b-instruct
  prompt_pattern: # set ${ prompt_pattern } variable to one of these
    - cot
    - react
    - rewoo
  num_demonstrations: # overrides num_demonstrations above
    - 0
    - 3
    - 5
```
We also need a dataset to optimize against, with `train`, `test`, and `validation` splits. To produce such a dataset, we can use HuggingFace Datasets `load_dataset` and `save_to_disk`. This example requires the dataset to have columns `question`, `reasoning`, and `answer`, which can be created from the original `openai/gsm8k` dataset.
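As a hedged sketch of that preprocessing, the following derives the three columns from `openai/gsm8k` (the helper names, output path, and held-out validation split are illustrative assumptions; the maintained scripts live in `examples/optimizer`):

```python
# Hypothetical sketch: derive the question/reasoning/answer columns the
# optimizer expects from openai/gsm8k. Helper names are assumptions.

def split_answer(row: dict) -> dict:
    # gsm8k answers look like "<step-by-step reasoning> #### <final answer>"
    reasoning, _, answer = row["answer"].partition("####")
    return {"reasoning": reasoning.strip(), "answer": answer.strip()}

def build_dataset(path: str = "var/gsm8k_processed") -> None:
    from datasets import DatasetDict, load_dataset

    gsm8k = load_dataset("openai/gsm8k", "main").map(split_answer)
    # gsm8k ships only train/test splits; hold out part of train as validation
    held_out = gsm8k["train"].train_test_split(test_size=0.1, seed=0)
    DatasetDict(
        train=held_out["train"],
        validation=held_out["test"],
        test=gsm8k["test"],
    ).save_to_disk(path)

if __name__ == "__main__":
    build_dataset()
```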
We provide three scripts in `examples/optimizer` to create datasets, including the rule-based agentic trajectories. These are `process_gsm8k.py`, `process_fever.py`, and `process_mbpp.py`. They load the original datasets, process them, and save them to disk in the required format. Dataset-specific instructions may be found in the respective script files. Note that the scripts create a folder named `var` in the current directory, which contains the processed dataset in a format that can be used by the optimizer; they should therefore be run from the root of the PDL repository.
Let's run the GSM8K dataset processing script:
92
+
93
+
```{ .bash .copy .annotate linenums="1" }
94
+
python examples/optimizer/process_gsm8k.py
95
+
```
This should save the processed dataset in `var/gsm8k_trajectified` and output something like:
```text
Saving the dataset (1/1 shards): 100%|█████████████████████████████████████████████████████████████████| 6449/6449 [00:00<00:00, 557195.73 examples/s]
Saving the dataset (1/1 shards): 100%|█████████████████████████████████████████████████████████████████| 1319/1319 [00:00<00:00, 363559.64 examples/s]
Saving the dataset (1/1 shards): 100%|█████████████████████████████████████████████████████████████████| 1024/1024 [00:00<00:00, 271472.56 examples/s]
```
Note that it is not unusual to observe PDL exceptions during the optimization process.
```text
[15:44:14] Type errors during spec checking:
../../contrib/prompt_library/ReAct.pdl:0 - should be an object
../../contrib/prompt_library/ReAct.pdl:0 - Type errors during spec checking:
../../contrib/prompt_library/ReAct.pdl:0 - should be an object
Retrying: False
Runtime FAILED and took seconds: 10.21
```
Such exceptions, here for example in `ReAct.pdl`, are caused by the _typed_ model call in `ReAct.pdl:98`. If the model output does not result in a parsable JSON that matches the expected type `{ name: string, arguments: object }`, the PDL interpreter raises an exception.
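A hedged sketch of such a typed call (the spec is quoted from the error message above; the surrounding keys are assumptions — see `contrib/prompt_library/ReAct.pdl` for the actual block):

```yaml
# Hedged sketch of a typed model call: the output must parse as JSON
# matching the spec, or the interpreter raises a type error.
- model: "${ model }"
  parser: json
  spec: { name: string, arguments: object }
```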
Once the process is complete, a file `optimized_gsm8k.pdl` is written in the same directory as the source PDL file. This file contains the optimal configuration and is directly executable by the standard PDL interpreter. A log of the optimization process is written to `experiments/` by default.