Skip to content

Commit 235431c

Browse files
authored
change the autopdl tutorial (#1159)
Signed-off-by: Mandana Vaziri <[email protected]>
1 parent 6fda16d commit 235431c

File tree

1 file changed

+2
-1
lines changed

1 file changed

+2
-1
lines changed

docs/autopdl.md

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -50,7 +50,8 @@ An AutoPDL configuration file describes the state-space and parameters for the s
5050
--8<-- "./examples/optimizer/grammar_correction.yaml"
5151
```
5252

53-
Field `pdl_path` is the path to the PDL program to optimize. `dataset` points to the dataset to be used. In this case, it's an object with paths for train/validation/test splits. In general, `dataset` could be a string pointing to Huggingface dataset (that would then be automatically downloaded). `demonstrations_variable_name` gives the name of the PDL variable that will hold the demonstrations in the optimized program. `demonstration_columns` indicates the field names in the dataset that will be used to create demonstrations, and `instance_columns` are those fields that will be used to formulate an instance query (see the query in the PDL program above, which uses `input`). The `groundtruth_column` holds the field with the ground truth (in this case `output`). `eval_pdl` is the path of the PDL program that encapsulates the loss function.
53+
Field `pdl_path` is the path to the PDL program to optimize. `dataset` points to the dataset to be used. In this case, it's an object with paths for train/validation/test splits.
54+
`demonstrations_variable_name` gives the name of the PDL variable that will hold the demonstrations in the optimized program. `demonstration_columns` indicates the field names in the dataset that will be used to create demonstrations, and `instance_columns` are those fields that will be used to formulate an instance query (see the query in the PDL program above, which uses `input`). The `groundtruth_column` holds the field with the ground truth (in this case `output`). `eval_pdl` is the path of the PDL program that encapsulates the loss function.
5455

5556
`initial_validation_set_size` is the initial size of the validation set (i.e., the number of tests used initially to validate candidates). `max_validation_set_size` indicates the maximum to which this validation set will grow. For more details on the successive halving algorithm used in AutoPDL see [here](https://arxiv.org/abs/2504.04365). `max_test_set_size: 10` is the maximum of the test set used to evaluate at the end of the evaluation run. `num_candidates` indicates the number of candidates to consider (sampled randomly). `parallelism` indicates the level of parallelism used by the optimizer.
5657

0 commit comments

Comments
 (0)