PDL Optimizer #941
Merged
75 commits
03dc654 feat: remove tauri cli support for running python interpreter (starpit)
1379876 chore: bump rust dependencies to resolve alert (starpit)
8cb0ae8 test: tauri github actions test should apt install the deb (starpit)
cb0b34b feat: support for --data and --data-file in rust interpreter (starpit)
8f122b6 Add optimizer module (claudiosv)
72d0e64 Remove BAM (claudiosv)
3d773a8 Initial prompt library (claudiosv)
0f31164 Clean up & bring in changes (claudiosv)
f08326c AST dump fixes (claudiosv)
36cc142 Documentation and fixes (claudiosv)
566d52c Finish prompt lib docs (claudiosv)
b6b9f2a Address feedback (claudiosv)
dfab1e8 Lint (claudiosv)
c1542cb fix: begin phasing in Metadata (common defs, etc. attrs) into rust AST (starpit)
ade9c69 chore: update granite-io dependency (#896) (mandel)
d73622c refactor: move def attr into Metadata (rust interpreter) (starpit)
087e1d2 feat: introduce Expr typing and apply it to IfBlock.condition (starpit)
0ffa988 feat: update rust Call AST to use Expr for condition attr (starpit)
0962c0d chore: bump to rust 2024 edition (starpit)
a4f935f feat: continue to flesh out block metadata structure in rust (starpit)
a3a9849 refactor: add metadata attr to remaining rust block asts (starpit)
a0232f9 feat: update rust Repeat AST to use Expr for `for` attr (#904) (mandel)
71bd1da refactor: introduce Advanced enum to rust AST (starpit)
23eff68 refactor: refactor rust ast to place metadata in common struct (starpit)
98e5944 fix: improve deserialization of python-generated model block traces (starpit)
2df4341 fix: in rust ast, allow ModelBlock model to be an expr (starpit)
a92ce92 feat: initial pdl__id and --trace support for rust interpreter (starpit)
ba861d3 fix: update rust interpreter to create Data blocks for expr eval, and… (starpit)
0ca9930 fix: populate trace context field in rust interpreter (starpit)
9dce51a Update stop sequences in parameters (#861) (jgchn)
3bc9291 refactor: extract platform-generic logic from run_ollama_model() handler (starpit)
6aede65 fix: rust interpreter was not handling pdl__context for re-runs of tr… (starpit)
5f69858 feat: improve support for importing stdlib in python code blocks (starpit)
99fca95 skeleton-of-thought example (#919) (vazirim)
9ad6ef2 Bump litellm and openai versions (#920) (vazirim)
725549c fix: improve support for rust interpreter python imports from venv (starpit)
8cf57a8 chore: bump tauri and npm dependencies (starpit)
db0119b chore: bump ui to 0.6.1 (#921) (starpit)
164a40f fix: rust ast support for gsm8k, including jsonl parser (starpit)
e8f8ae9 chore: bump rust dependencies (starpit)
c013192 feat: improve support for tool calling in ollama-rs (starpit)
3ba5cab Fully qualify import (#930) (esnible)
3106dce feat: some regex parser support for rust interpreter (starpit)
16d35e6 Change to sys.path for python code block (#931) (vazirim)
881e1a1 granite-io hallucination demo example and notebook (#932) (vazirim)
6acb270 Lint (claudiosv)
a810769 Refactor (claudiosv)
a19cbef Tests (claudiosv)
0272c66 Formatting (claudiosv)
c01605f Lint (claudiosv)
6e7747f Skip tests for now (claudiosv)
66e1b61 Update schema (claudiosv)
5bb9d70 Add contrib. prompt library (#927) (claudiosv)
4a9aed4 Fixed the bug where pdl.__version__ was not set (#882) (vite-falcon)
c3f635f chore: update pre-commit tools (#937) (mandel)
311b59b Use granite-io async interface (#936) (mandel)
1361fc8 Lint (claudiosv)
091b68d chore: bump ui dependences (starpit)
3a43a4d fix: skip failing execution tests (#938) (mandel)
c388836 independent implementation (#934) (vazirim)
8c5fbce feat: add a `parse_dict` function to `pdl_parser` (#943) (mandel)
bb5367e Address feedback 1 (claudiosv)
ec21e7e Address feedback 2 (claudiosv)
6c7ae94 Lint & fix mypy module warning (claudiosv)
1f8a748 Skip tests again (claudiosv)
cf8d805 Try to resolve wikipedia package in ci (claudiosv)
eb7878e Fix rebase (claudiosv)
c706ea7 Fix pyproject (claudiosv)
ff5b999 Merge remote-tracking branch 'origin/main' into optimizer-merge (claudiosv)
2b3a607 Move multiprocess to optional (claudiosv)
0913a18 Add ipython to pdl-live (claudiosv)
32f511c Fix trace metadata (claudiosv)
5001f92 Add documentation (claudiosv)
ee45d5e Add optimizer test PDL (claudiosv)
91408e0 Add config and expand docs (claudiosv)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,114 @@ | ||
| --- | ||
| hide: | ||
| - navigation | ||
| - toc | ||
| - footer | ||
| --- | ||
|
|
||
| # AutoPDL Tutorial | ||
|
|
||
| The following sections show how to use the AutoPDL optimizer to produce optimized PDL programs for specific tasks. | ||
|
|
||
| To optimize a PDL program, we need the program, an optimizer configuration, a dataset, and an _evaluator_. An evaluator is a Python subclass of `OptimizerEvaluator` that evaluates a candidate, i.e. a generated configuration instance consisting of, for example, few-shot examples. The evaluator class follows this structure: | ||
|
|
||
| ```python title="src/pdl/optimize/optimizer_evaluator.py" linenums="1" | ||
| class OptimizerEvaluator(Thread): | ||
| """Evaluates a candidate (configuration, i.e. fewshots, style) against **one** test example.""" | ||
|
|
||
| def __init__( | ||
| self, | ||
| pdl_program: Program, | ||
| example: dict, | ||
| candidate: dict, | ||
| index: int, | ||
| timeout: int, | ||
| yield_output: bool, | ||
| config: OptimizationConfig, | ||
| cwd: Path, | ||
| answer_key: str = "answer", | ||
| ) -> None: | ||
| super().__init__() | ||
| self.pdl_program = pdl_program | ||
| ... | ||
|
|
||
| def get_scope(self) -> ScopeType: | ||
| """ | ||
| Constructs a PDL scope for the candidate, | ||
| can take self.candidate and self.config into account | ||
| """ | ||
|
|
||
| def extract_answer(self, document: str) -> Any: | ||
| """ | ||
| Extracts the final answer from the PDL result document, | ||
| i.e. the string the PDL program returns | ||
| """ | ||
|
|
||
| def answer_correct(self, document: str, answer: Any, truth: Any) -> bool: | ||
| """ | ||
| Checks the extracted answer against the groundtruth value, | ||
| in self.example[self.answer_key] | ||
| """ | ||
| ``` | ||
|
|
||
| Let's go through an example for `GSM8K`. Our PDL program uses different prompt patterns from the prompt library, and the variables `prompt_pattern`, `question`, `model`, and `demonstrations` are inserted at runtime by the evaluator. | ||
|
|
||
|
|
||
| ```yaml title="examples/optimizer/gsm8k.pdl" linenums="1" | ||
| --8<-- "./examples/optimizer/gsm8k.pdl" | ||
| ``` | ||
|
|
||
| We write a configuration file for the optimizer; see `src/pdl/optimize/config_parser.py` for all fields: | ||
|
|
||
| ``` { .yaml .copy .annotate title="gsm8k_optimizer_config.yml" linenums="1" } | ||
| benchmark: gsm8k # Name our benchmark | ||
| budget: null # Set a budget, can be number of iterations, or a duration string e.g. "2h" | ||
| budget_growth: double # double validation set size each iteration | ||
| # or to_max: reach max_test_set_size by final iteration | ||
| initial_test_set_size: 2 # size of test set in first iteration | ||
| max_test_set_size: 10 # maximum test set size | ||
| num_candidates: 100 # how many candidates to evaluate | ||
| num_demonstrations: 5 # how many demonstrations to include per candidate | ||
| parallelism: 1 # how many threads to run evaluations across | ||
| shuffle_test: false # shuffling of test set | ||
| test_set_name: test # name of test set | ||
| train_set_name: train # name of train set | ||
| validation_set_name: validation # name of validation set | ||
| demonstrations_variable_name: demonstrations # variable name to insert demonstrations into | ||
| variables: # define discrete options to sample from | ||
| model: # set ${ model } variable | ||
| - watsonx/meta-llama/llama-3-1-8b-instruct | ||
| prompt_pattern: # set ${ prompt_pattern } variable to one of these | ||
| - cot | ||
| - react | ||
| - rewoo | ||
| num_demonstrations: # overrides num_demonstrations above | ||
| - 0 | ||
| - 3 | ||
| - 5 | ||
| ``` | ||
|
|
||
|
|
||
| ```python title="examples/optimizer/gsm8k_evaluator.py" linenums="1" | ||
| --8<-- "./examples/optimizer/gsm8k_evaluator.py" | ||
| ``` | ||
|
|
||
| An example script that runs the optimization process is available in `examples/optimizer/optimize.py`. | ||
| Usage: | ||
|
|
||
| ``` | ||
| python optimize.py optimize -h | ||
| usage: optimize.py optimize [-h] --config CONFIG --dataset-path DATASET_PATH [--experiments-path EXPERIMENTS_PATH] | ||
| [--yield_output | --no-yield_output] [--dry | --no-dry] | ||
| pdl_file | ||
| ``` | ||
|
|
||
| We also need a dataset to optimize against, with `train`, `test`, and `validation` splits. To produce such a dataset, we can use the Hugging Face Datasets functions `load_dataset` and `save_to_disk`. This example requires the dataset to have the columns `question`, `reasoning`, and `answer`, which can be created from the original `openai/gsm8k` dataset. Processing scripts are under development and will follow shortly. | ||
|
|
||
| We can run an example like so: | ||
|
|
||
| ``` | ||
| cd examples/optimizer | ||
| python optimize.py optimize --config config.yml --dataset-path datasets/gsm8k gsm8k.pdl | ||
| ``` | ||
|
|
||
| Once the process is complete, a file `optimized_gsm8k.pdl` is written. This file contains the best configuration found by the optimizer and is directly executable by the standard PDL interpreter. |
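The tutorial's dataset requirement (`question`, `reasoning`, and `answer` columns with `train`, `test`, and `validation` splits) can be illustrated with a short preparation sketch. This is not the project's processing script, which the tutorial says is still under development. It assumes the `datasets` package is installed, that each original `openai/gsm8k` record keeps the worked solution and the final answer in a single `answer` field separated by `####`, and that a validation split is carved out of `train`. The helper names, the 10% validation fraction, and the fixed seed are illustrative choices only.

```python
"""Sketch: derive question/reasoning/answer columns from openai/gsm8k."""


def split_gsm8k_answer(raw_answer: str) -> dict:
    """Split a raw GSM8K answer into reasoning and final answer.

    Assumes the record ends with '#### <final answer>'. If the marker is
    missing, the whole text is returned as the answer and reasoning is empty.
    """
    reasoning, _, final = raw_answer.rpartition("####")
    return {"reasoning": reasoning.strip(), "answer": final.strip()}


def add_columns(record: dict) -> dict:
    """Build the three columns the tutorial asks for from one raw record."""
    return {"question": record["question"], **split_gsm8k_answer(record["answer"])}


def build_dataset(
    output_path: str = "datasets/gsm8k", validation_fraction: float = 0.1
) -> None:
    """Download, convert, and save the dataset (requires network access)."""
    # Imported lazily so the helpers above work without the `datasets` package.
    from datasets import DatasetDict, load_dataset

    raw = load_dataset("openai/gsm8k", "main")  # provides `train` and `test`
    # Hold out part of `train` as the validation split (assumed fraction).
    held_out = raw["train"].train_test_split(test_size=validation_fraction, seed=0)
    prepared = DatasetDict(
        {
            "train": held_out["train"],
            "validation": held_out["test"],
            "test": raw["test"],
        }
    ).map(add_columns)
    prepared.save_to_disk(output_path)
```

Calling `build_dataset()` from `examples/optimizer` would write to `datasets/gsm8k`, the path passed as `--dataset-path` in the tutorial's command. Whether the evaluator expects the answer as a string or a number is not stated here, so the sketch leaves it as a string.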
Empty file.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,18 @@ | ||
| benchmark: "gsm8k" | ||
| initial_test_set_size: 1 | ||
| max_test_set_size: 1 | ||
| num_candidates: 5 | ||
| num_demonstrations: 3 | ||
| parallelism: 1 | ||
| shuffle_test: false | ||
| test_set_name: "test" | ||
| train_set_name: "train" | ||
| timeout: 120 | ||
| experiment_prefix: "granite_3_8b_instruct_gsm8k_3_shot_" | ||
| variables: | ||
| model: | ||
| - "watsonx_text/ibm/granite-3-8b-instruct" | ||
| prompt_pattern: | ||
| - "cot" | ||
| num_demonstrations: | ||
| - 3 |
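The `variables` block in this configuration lists discrete options for each variable. As a rough illustration of what drawing one candidate from such a block could look like, the sketch below picks one option per variable independently and uniformly. It does not reproduce the optimizer's actual sampling logic; `sample_candidate` is a hypothetical helper introduced only for this example.

```python
import random


def sample_candidate(variables: dict[str, list], rng: random.Random) -> dict:
    """Draw one option per variable, independently and uniformly (assumed strategy)."""
    return {name: rng.choice(options) for name, options in variables.items()}


# The `variables` block from the configuration above, written as a Python dict.
variables = {
    "model": ["watsonx_text/ibm/granite-3-8b-instruct"],
    "prompt_pattern": ["cot"],
    "num_demonstrations": [3],
}

# A seeded generator keeps the draw reproducible.
candidate = sample_candidate(variables, random.Random(0))
print(candidate)
```

Because every variable in this configuration has exactly one option, every draw yields the same candidate. With the three prompt patterns and three demonstration counts from the tutorial configuration, the same helper would range over nine combinations.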
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,63 @@ | ||
| description: Demo of ReAct template fever | ||
| defs: | ||
| cot: | ||
| import: ../../contrib/prompt_library/CoT | ||
| react: | ||
| import: ../../contrib/prompt_library/ReAct | ||
| rewoo: | ||
| import: ../../contrib/prompt_library/ReWoo | ||
| tools: | ||
| import: ../../contrib/prompt_library/tools | ||
|
|
||
| search_tools: | ||
| data: | ||
| - name: Search | ||
| description: Search Wikipedia for a summary | ||
| parameters: | ||
| type: object | ||
| properties: | ||
| topic: | ||
| type: string | ||
| description: The topic of interest | ||
| required: | ||
| - topic | ||
|
|
||
| task: |- | ||
| Task: On June 2017, the following claim was made: ${ claim } | ||
| Q: Was this claim true or false? | ||
| match: ${ prompt_pattern } | ||
| with: | ||
| # CoT | ||
| - case: cot | ||
| then: | ||
| text: | ||
| call: ${ cot.chain_of_thought } | ||
| args: | ||
| examples: "${ demonstrations }" | ||
| question: "${ task }" | ||
| model: "${ model }" | ||
|
|
||
| # ReAct | ||
| - case: react | ||
| then: | ||
| text: | ||
| call: ${ react.react } | ||
| args: | ||
| task: ${ task } | ||
| model: ${ model } | ||
| tool_schema: ${ search_tools } | ||
| tools: ${ tools.tools } | ||
| trajectories: ${ demonstrations } | ||
|
|
||
| # ReWOO | ||
| - case: rewoo | ||
| then: | ||
| text: | ||
| call: ${ rewoo.rewoo } | ||
| args: | ||
| task: ${ task } | ||
| model: ${ model } | ||
| tool_schema: ${ search_tools } | ||
| tools: ${ tools.tools } | ||
| trajectories: ${ demonstrations } | ||
| show_plans: false |
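Since this program asks whether a claim is true or false, an evaluator for it has to turn the returned document into a verdict and compare it with the ground truth. The following sketch shows one way that could be done. It is not the `FEVEREvaluator` from this pull request: the functions are standalone, their signatures are simplified relative to the `extract_answer` and `answer_correct` methods in the tutorial, and taking the last occurrence of "true" or "false" as the verdict is an assumption.

```python
import re
from typing import Optional


def extract_answer(document: str) -> Optional[bool]:
    """Return the last 'true'/'false' verdict in the document, or None if absent.

    The last match is used because the task text itself contains
    'true or false' and may be echoed before the model's verdict.
    """
    verdicts = re.findall(r"\b(true|false)\b", document, flags=re.IGNORECASE)
    if not verdicts:
        return None
    return verdicts[-1].lower() == "true"


def answer_correct(answer: Optional[bool], truth: bool) -> bool:
    """Compare the extracted verdict with the ground truth; a missing verdict is wrong."""
    return answer is not None and answer == truth
```

For example, `extract_answer("Q: Was this claim true or false?\nThe claim is False.")` yields `False`, and a document with no verdict yields `None`, which `answer_correct` counts as incorrect.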
Without this line, `mypy` reports an error. This is because `examples.optimizer.fever_evaluator` is imported in `test_optimizer.py` as `from examples.optimizer.fever_evaluator import FEVEREvaluator`, but also from within `examples/optimizer/optimize.py` as `from .fever_evaluator import FEVEREvaluator`.