Merged
75 commits
03dc654
feat: remove tauri cli support for running python interpreter
starpit Apr 10, 2025
1379876
chore: bump rust dependencies to resolve alert
starpit Apr 10, 2025
8cb0ae8
test: tauri github actions test should apt install the deb
starpit Apr 10, 2025
cb0b34b
feat: support for --data and --data-file in rust interpreter
starpit Apr 10, 2025
8f122b6
Add optimizer module
claudiosv Apr 10, 2025
72d0e64
Remove BAM
claudiosv Apr 13, 2025
3d773a8
Initial prompt library
claudiosv Apr 10, 2025
0f31164
Clean up & bring in changes
claudiosv Apr 14, 2025
f08326c
AST dump fixes
claudiosv Apr 21, 2025
36cc142
Documentation and fixes
claudiosv Apr 14, 2025
566d52c
Finish prompt lib docs
claudiosv Apr 21, 2025
b6b9f2a
Address feedback
claudiosv Apr 23, 2025
dfab1e8
Lint
claudiosv Apr 23, 2025
c1542cb
fix: begin phasing in Metadata (common defs, etc. attrs) into rust AST
starpit Apr 10, 2025
ade9c69
chore: update granite-io dependency (#896)
mandel Apr 10, 2025
d73622c
refactor: move def attr into Metadata (rust interpreter)
starpit Apr 11, 2025
087e1d2
feat: introduce Expr typing and apply it to IfBlock.condition
starpit Apr 11, 2025
0ffa988
feat: update rust Call AST to use Expr for condition attr
starpit Apr 11, 2025
0962c0d
chore: bump to rust 2024 edition
starpit Apr 11, 2025
a4f935f
feat: continue to flesh out block metadata structure in rust
starpit Apr 11, 2025
a3a9849
refactor: add metadata attr to remaining rust block asts
starpit Apr 11, 2025
a0232f9
feat: update rust Repeat AST to use Expr for `for` attr (#904)
mandel Apr 11, 2025
71bd1da
refactor: introduce Advanced enum to rust AST
starpit Apr 11, 2025
23eff68
refactor: refactor rust ast to place metadata in common struct
starpit Apr 13, 2025
98e5944
fix: improve deserialization of python-generated model block traces
starpit Apr 13, 2025
2df4341
fix: in rust ast, allow ModelBlock model to be an expr
starpit Apr 13, 2025
a92ce92
feat: initial pdl__id and --trace support for rust interpreter
starpit Apr 13, 2025
ba861d3
fix: update rust interpreter to create Data blocks for expr eval, and…
starpit Apr 14, 2025
0ca9930
fix: populate trace context field in rust interpreter
starpit Apr 14, 2025
9dce51a
Update stop sequences in parameters (#861)
jgchn Apr 14, 2025
3bc9291
refactor: extract platform-generic logic from run_ollama_model() handler
starpit Apr 14, 2025
6aede65
fix: rust interpreter was not handling pdl__context for re-runs of tr…
starpit Apr 15, 2025
5f69858
feat: improve support for importing stdlib in python code blocks
starpit Apr 15, 2025
99fca95
skeleton-of-thought example (#919)
vazirim Apr 15, 2025
9ad6ef2
Bump litellm and openai versions (#920)
vazirim Apr 16, 2025
725549c
fix: improve support for rust interpreter python imports from venv
starpit Apr 16, 2025
8cf57a8
chore: bump tauri and npm dependencies
starpit Apr 16, 2025
db0119b
chore: bump ui to 0.6.1 (#921)
starpit Apr 18, 2025
164a40f
fix: rust ast support for gsm8k, including jsonl parser
starpit Apr 18, 2025
e8f8ae9
chore: bump rust dependencies
starpit Apr 18, 2025
c013192
feat: improve support for tool calling in ollama-rs
starpit Apr 18, 2025
3ba5cab
Fully qualify import (#930)
esnible Apr 22, 2025
3106dce
feat: some regex parser support for rust interpreter
starpit Apr 22, 2025
16d35e6
Change to sys.path for python code block (#931)
vazirim Apr 22, 2025
881e1a1
granite-io hallucination demo example and notebook (#932)
vazirim Apr 22, 2025
6acb270
Lint
claudiosv May 4, 2025
a810769
Refactor
claudiosv May 5, 2025
a19cbef
Tests
claudiosv May 5, 2025
0272c66
Formatting
claudiosv May 5, 2025
c01605f
Lint
claudiosv May 5, 2025
6e7747f
Skip tests for now
claudiosv May 5, 2025
66e1b61
Update schema
claudiosv May 5, 2025
5bb9d70
Add contrib. prompt library (#927)
claudiosv Apr 23, 2025
4a9aed4
Fixed the bug where pdl.__version__ was not set (#882)
vite-falcon Apr 30, 2025
c3f635f
chore: update pre-commit tools (#937)
mandel May 2, 2025
311b59b
Use granite-io async interface (#936)
mandel May 2, 2025
1361fc8
Lint
claudiosv May 5, 2025
091b68d
chore: bump ui dependences
starpit May 3, 2025
3a43a4d
fix: skip failing execution tests (#938)
mandel May 6, 2025
c388836
independent implementation (#934)
vazirim May 6, 2025
8c5fbce
feat: add a `parse_dict` function to `pdl_parser` (#943)
mandel May 7, 2025
bb5367e
Address feedback 1
claudiosv May 9, 2025
ec21e7e
Address feedback 2
claudiosv May 10, 2025
6c7ae94
Lint & fix mypy module warning
claudiosv May 10, 2025
1f8a748
Skip tests again
claudiosv May 10, 2025
cf8d805
Try to resolve wikipedia package in ci
claudiosv May 10, 2025
eb7878e
Fix rebase
claudiosv May 10, 2025
c706ea7
Fix pyproject
claudiosv May 10, 2025
ff5b999
Merge remote-tracking branch 'origin/main' into optimizer-merge
claudiosv May 10, 2025
2b3a607
Move multiprocess to optional
claudiosv May 10, 2025
0913a18
Add ipython to pdl-live
claudiosv May 10, 2025
32f511c
Fix trace metadata
claudiosv May 12, 2025
5001f92
Add documentation
claudiosv May 15, 2025
ee45d5e
Add optimizer test PDL
claudiosv May 16, 2025
91408e0
Add config and expand docs
claudiosv May 16, 2025
1 change: 1 addition & 0 deletions .pre-commit-config.yaml
@@ -56,6 +56,7 @@ repos:
  rev: 'v1.15.0'
  hooks:
  - id: mypy
+ args: [--explicit-package-bases]
Collaborator Author

Without this line, mypy reports this error:

src/pdl/_version.py: error: Source file found twice under different module names: "optimizer.fever_evaluator" and "examples.optimizer.fever_evaluator"
src/pdl/_version.py: note: See https://mypy.readthedocs.io/en/stable/running_mypy.html#mapping-file-paths-to-modules for more info
src/pdl/_version.py: note: Common resolutions include: a) adding `__init__.py` somewhere, b) using `--explicit-package-bases` or adjusting MYPYPATH
Found 1 error in 1 file (errors prevented further checking)

This happens because `examples.optimizer.fever_evaluator` is imported in two different ways: `test_optimizer.py` uses `from examples.optimizer.fever_evaluator import FEVEREvaluator`, while `examples/optimizer/optimize.py` uses the relative import `from .fever_evaluator import FEVEREvaluator`.

  verbose: true
  additional_dependencies: ['types-PyYAML']
  # type check the Python code using pyright
47 changes: 24 additions & 23 deletions contrib/prompt_library/ReAct.pdl
@@ -10,26 +10,28 @@ defs:
  trajectory: ${ trajectory }
  repeat:
  text:
- - def: type
- text: ${ trajectory.keys()|first }
- contribute: []
- - if: ${ type == 'question'}
- then: |
- Question: ${ trajectory[type]|trim }
- - if: ${ type == 'task'}
- then: |
- Task: ${ trajectory[type]|trim }
- - if: ${ type == 'thought'}
- then: |
- Tho: ${ trajectory[type]|trim }
- - if: ${ type == 'action'}
- then: |
- Act: ${ trajectory[type]|trim }
- - if: ${ type == 'observation'}
- then: |
- Obs: ${ trajectory[type]|trim }
- - if: ${ type not in ['question', 'task', 'thought', 'action', 'observation'] }
- then: "${ type }: ${ trajectory[type]|trim }"
+ - defs:
+ type:
+ text: ${ trajectory.keys()|first }
+ - match: ${ type }
+ with:
+ - case: question
+ then: |
+ Question: ${ trajectory[type]|trim }
+ - case: task
+ then: |
+ Task: ${ trajectory[type]|trim }
+ - case: thought
+ then: |
+ Tho: ${ trajectory[type]|trim }
+ - case: action
+ then: |
+ Act: ${ trajectory[type]|trim }
+ - case: observation
+ then: |
+ Obs: ${ trajectory[type]|trim }
+ - if: ${ type not in ['question', 'task', 'thought', 'action', 'observation'] }
+ then: "${ type }: ${ trajectory[type]|trim }"
  - "\n"

  react:
@@ -101,13 +103,12 @@ defs:
  then:
  text:
  - "\nObs: "
- - if: ${ action.name in tools }
+ - if: ${ action.name.lower() in tools }
  then:
- call: ${ tools[action.name] }
+ call: ${ tools[action.name.lower()] }
  args:
  arguments: ${ action.arguments }
  else: "Invalid action. Valid actions are ${ tool_names[:-1]|join(', ') }, and ${ tool_names[-1] }."
- # - "\n"
  else:
  def: exit
  contribute: []
43 changes: 24 additions & 19 deletions contrib/prompt_library/ReWoo.pdl
@@ -20,24 +20,29 @@ defs:
  text: ${ trajectory.keys()|first }
  content:
  text: ${ trajectory.values()|first }
- - if: ${ type in ['task', 'question'] }
- then: |-
- Task: ${ content|trim }
- - if: ${ type == 'thought'}
- then: |-
+ - match: ${ type }
+ with:
+ - case: task
+ then: |-
+ Task: ${ content|trim }
+ - case: question
+ then: |-
+ Task: ${ content|trim }
+ - case: thought
+ then: |-

- Plan: ${ content|trim }
- - if: ${ type == 'action'}
- then:
- text:
- - " #E${ i } = ${ content|trim }"
- - defs:
- i:
- data: ${ i+1 }
- - if: ${ type == 'observation'}
- then: ""
- - if: ${ type not in ['question', 'task', 'thought', 'action', 'observation'] }
- then: "${ type }: ${ content|trim }\n"
+ Plan: ${ content|trim }
+ - case: action
+ then:
+ text:
+ - " #E${ i } = ${ content|trim }"
+ - defs:
+ i:
+ data: ${ i+1 }
+ - case: observation
+ then: ""
+ - if: ${ type not in ['question', 'task', 'thought', 'action', 'observation'] }
+ then: "${ type }: ${ content|trim }\n"
  - "\n"

  rewoo:
@@ -120,9 +125,9 @@ defs:
  ACTION_RAW = ACTION_RAW.replace(k, v)
  result = ACTION_RAW
  tool_output:
- if: ${ ACTION.name in tools }
+ if: ${ ACTION.name.lower() in tools }
  then:
- call: ${ tools[ACTION.name] }
+ call: ${ tools[ACTION.name.lower()] }
  args:
  arguments: ${ ACTION.arguments }
  else: "Invalid action. Valid actions are ${ tools.keys() }"
114 changes: 114 additions & 0 deletions docs/autopdl.md
@@ -0,0 +1,114 @@
---
hide:
- navigation
- toc
- footer
---

# AutoPDL Tutorial

The following sections show how to use the AutoPDL optimizer to produce optimized PDL programs for specific tasks.

To optimize a PDL program, we need the program itself, an optimizer configuration, a dataset, and an _evaluator_. An evaluator is a Python subclass of `OptimizerEvaluator` that scores one candidate against a single test example. A candidate is a generated configuration instance, for example a particular prompt style together with a set of few-shot examples. The evaluator class follows this structure:

```python title="src/pdl/optimize/optimizer_evaluator.py" linenums="1"
class OptimizerEvaluator(Thread):
    """Evaluates a candidate (configuration, i.e. fewshots, style) against **one** test example."""

    def __init__(
        self,
        pdl_program: Program,
        example: dict,
        candidate: dict,
        index: int,
        timeout: int,
        yield_output: bool,
        config: OptimizationConfig,
        cwd: Path,
        answer_key: str = "answer",
    ) -> None:
        super().__init__()
        self.pdl_program = pdl_program
        ...

    def get_scope(self) -> ScopeType:
        """
        Constructs a PDL scope for the candidate,
        can take self.candidate and self.config into account
        """

    def extract_answer(self, document: str) -> Any:
        """
        Extracts the final answer from the PDL result document,
        i.e. the string the PDL program returns
        """

    def answer_correct(self, document: str, answer: Any, truth: Any) -> bool:
        """
        Checks the extracted answer against the groundtruth value,
        in self.example[self.answer_key]
        """
```
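
To make the three hooks more concrete, here is a minimal, self-contained sketch for a task with numeric answers. It deliberately does not subclass the real `OptimizerEvaluator`: the `ToyEvaluator` class, its reduced constructor, and the `The answer is <number>` output convention are all assumptions made for illustration, not part of the PDL code base.

```python
import re
from typing import Any


class ToyEvaluator:
    """Hypothetical stand-in that mirrors the three hooks of OptimizerEvaluator."""

    def __init__(self, example: dict, candidate: dict, answer_key: str = "answer") -> None:
        self.example = example
        self.candidate = candidate
        self.answer_key = answer_key

    def get_scope(self) -> dict:
        # Variables handed to the program: the candidate's settings plus the question.
        return {**self.candidate, "question": self.example["question"]}

    def extract_answer(self, document: str) -> Any:
        # Assumed convention: the program output contains "The answer is <number>".
        match = re.search(r"The answer is\s*(-?\d+(?:\.\d+)?)", document)
        return float(match.group(1)) if match else None

    def answer_correct(self, document: str, answer: Any, truth: Any) -> bool:
        # A missing answer counts as wrong; otherwise compare numerically.
        return answer is not None and abs(answer - float(truth)) < 1e-6


example = {"question": "What is 6 * 7?", "answer": "42"}
candidate = {"prompt_pattern": "cot", "model": "some-model", "demonstrations": []}
evaluator = ToyEvaluator(example, candidate)

document = "6 * 7 = 42. The answer is 42"
answer = evaluator.extract_answer(document)
is_correct = evaluator.answer_correct(document, answer, example[evaluator.answer_key])
```

Keeping extraction and comparison as separate steps means a benchmark with a different output format only needs a different `extract_answer`, while the correctness check can stay the same.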

Let's go through an example for `GSM8K`. Our PDL program uses different prompt patterns from the prompt library, and the variables `prompt_pattern`, `question`, `model`, and `demonstrations` are inserted at runtime by the evaluator.


```yaml title="examples/optimizer/gsm8k.pdl" linenums="1"
--8<-- "./examples/optimizer/gsm8k.pdl"
```

We write a configuration file for the optimizer; see `src/pdl/optimize/config_parser.py` for all fields:

``` { .yaml .copy .annotate title="gsm8k_optimizer_config.yml" linenums="1" }
benchmark: gsm8k # Name our benchmark
budget: null # Set a budget, can be number of iterations, or a duration string e.g. "2h"
budget_growth: double # double validation set size each iteration
                      # or to_max: reach max_test_set_size by final iteration
initial_test_set_size: 2 # size of test set in first iteration
max_test_set_size: 10 # maximum test set size
num_candidates: 100 # how many candidates to evaluate
num_demonstrations: 5 # how many demonstrations to include per candidate
parallelism: 1 # how many threads to run evaluations across
shuffle_test: false # shuffling of test set
test_set_name: test # name of test set
train_set_name: train # name of train set
validation_set_name: validation # name of validation set
demonstrations_variable_name: demonstrations # variable name to insert demonstrations into
variables: # define discrete options to sample from
  model: # set ${ model } variable
    - watsonx/meta-llama/llama-3-1-8b-instruct
  prompt_pattern: # set ${ prompt_pattern } variable to one of these
    - cot
    - react
    - rewoo
  num_demonstrations: # overrides num demonstrations above
    - 0
    - 3
    - 5
```
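
As a rough illustration of what these fields describe, the sketch below draws candidates from the `variables` options and computes a test-set-size schedule for `budget_growth: double`. It is based only on the comments in the configuration above; the function names `sample_candidate` and `doubling_schedule` are hypothetical, and the actual optimizer may sample and schedule differently.

```python
import random


def sample_candidate(variables: dict[str, list], rng: random.Random) -> dict:
    """Pick one option per variable, giving one point in the search space."""
    return {name: rng.choice(options) for name, options in variables.items()}


def doubling_schedule(initial_size: int, max_size: int) -> list[int]:
    """Test-set size per iteration when the size doubles until it reaches the cap."""
    if initial_size < 1:
        raise ValueError("initial_size must be at least 1")
    sizes = [min(initial_size, max_size)]
    while sizes[-1] < max_size:
        sizes.append(min(sizes[-1] * 2, max_size))
    return sizes


# Same options as the `variables` section of the configuration above.
variables = {
    "model": ["watsonx/meta-llama/llama-3-1-8b-instruct"],
    "prompt_pattern": ["cot", "react", "rewoo"],
    "num_demonstrations": [0, 3, 5],
}

rng = random.Random(0)  # fixed seed so the sketch is reproducible
candidates = [sample_candidate(variables, rng) for _ in range(5)]

# initial_test_set_size: 2 and max_test_set_size: 10
sizes = doubling_schedule(initial_size=2, max_size=10)  # [2, 4, 8, 10]
```

With one model, three prompt patterns, and three demonstration counts, there are nine distinct combinations of these variables, so most of the variety among 100 candidates would have to come from which demonstrations are selected.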


The evaluator for this benchmark is in `examples/optimizer/gsm8k_evaluator.py`:

```python title="examples/optimizer/gsm8k_evaluator.py" linenums="1"
--8<-- "./examples/optimizer/gsm8k_evaluator.py"
```

An example script that runs the optimization process is provided in `examples/optimizer/optimize.py`.
Usage:

```
python optimize.py optimize -h
usage: optimize.py optimize [-h] --config CONFIG --dataset-path DATASET_PATH [--experiments-path EXPERIMENTS_PATH]
[--yield_output | --no-yield_output] [--dry | --no-dry]
pdl_file
```

We also need a dataset to optimize against, with `train`, `test`, and `validation` splits. To produce such a dataset, we can use HuggingFace Datasets `load_dataset` and `save_to_disk`. This example requires the dataset to have columns `question`, `reasoning`, and `answer`, which can be created from the original `openai/gsm8k` dataset. Processing scripts are under development and will follow shortly.
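
Until those scripts are available, the following is one possible way to prepare such a dataset. It is a sketch under assumptions rather than the project's own processing: the helper names, the validation split size, and the choice to leave the reasoning text otherwise untouched are all hypothetical. It relies on the raw GSM8K `answer` field ending in `#### <final answer>`, and on `openai/gsm8k` shipping only `train` and `test` splits, so a `validation` split is carved out of `train`.

```python
def split_gsm8k_answer(raw_answer: str) -> dict:
    """Split a raw GSM8K answer of the form '<reasoning> #### <final answer>'.

    If the marker is missing, the whole string is treated as the answer.
    """
    reasoning, _, final = raw_answer.rpartition("####")
    return {"reasoning": reasoning.strip(), "answer": final.strip()}


def build_gsm8k(out_dir: str, validation_size: int = 1000, seed: int = 0) -> None:
    """Download openai/gsm8k, add the columns, and save train/validation/test to disk."""
    # Imported lazily so the helper above works without the `datasets` package.
    from datasets import DatasetDict, load_dataset

    raw = load_dataset("openai/gsm8k", "main")
    processed = raw.map(lambda row: split_gsm8k_answer(row["answer"]))
    # Carve a validation split out of train (size and seed are arbitrary choices).
    parts = processed["train"].train_test_split(test_size=validation_size, seed=seed)
    DatasetDict(
        {"train": parts["train"], "validation": parts["test"], "test": processed["test"]}
    ).save_to_disk(out_dir)


sample = "Natalia sold 48 / 2 = 24 clips in May.\n#### 72"
columns = split_gsm8k_answer(sample)
```

Calling `build_gsm8k("datasets/gsm8k")` from `examples/optimizer` would write the dataset to the path passed as `--dataset-path` in the command below; this step needs network access and the `datasets` package.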

We can run an example like so:

```
cd examples/optimizer
python optimize.py optimize --config config.yml --dataset-path datasets/gsm8k gsm8k.pdl
```

Once the process is complete, a file `optimized_gsm8k.pdl` is written. This file contains the best configuration found during the search and can be run directly with the standard PDL interpreter.
Empty file added examples/optimizer/__init__.py
Empty file.
18 changes: 18 additions & 0 deletions examples/optimizer/config.yml
@@ -0,0 +1,18 @@
benchmark: "gsm8k"
initial_test_set_size: 1
max_test_set_size: 1
num_candidates: 5
num_demonstrations: 3
parallelism: 1
shuffle_test: false
test_set_name: "test"
train_set_name: "train"
timeout: 120
experiment_prefix: "granite_3_8b_instruct_gsm8k_3_shot_"
variables:
  model:
    - "watsonx_text/ibm/granite-3-8b-instruct"
  prompt_pattern:
    - "cot"
  num_demonstrations:
    - 3
63 changes: 63 additions & 0 deletions examples/optimizer/fever.pdl
@@ -0,0 +1,63 @@
description: Demo of ReAct template fever
defs:
  cot:
    import: ../../contrib/prompt_library/CoT
  react:
    import: ../../contrib/prompt_library/ReAct
  rewoo:
    import: ../../contrib/prompt_library/ReWoo
  tools:
    import: ../../contrib/prompt_library/tools

  search_tools:
    data:
      - name: Search
        description: Search Wikipedia for a summary
        parameters:
          type: object
          properties:
            topic:
              type: string
              description: The topic of interest
          required:
            - topic

  task: |-
    Task: On June 2017, the following claim was made: ${ claim }
    Q: Was this claim true or false?
match: ${ prompt_pattern }
with:
  # CoT
  - case: cot
    then:
      text:
        call: ${ cot.chain_of_thought }
        args:
          examples: "${ demonstrations }"
          question: "${ task }"
          model: "${ model }"

  # ReAct
  - case: react
    then:
      text:
        call: ${ react.react }
        args:
          task: ${ task }
          model: ${ model }
          tool_schema: ${ search_tools }
          tools: ${ tools.tools }
          trajectories: ${ demonstrations }

  # ReWOO
  - case: rewoo
    then:
      text:
        call: ${ rewoo.rewoo }
        args:
          task: ${ task }
          model: ${ model }
          tool_schema: ${ search_tools }
          tools: ${ tools.tools }
          trajectories: ${ demonstrations }
          show_plans: false