# Certifying Synthesis Results

This page contains instructions for running and evaluating the certification component of SuSLik, corresponding to the following paper.

**[Certifying the Synthesis of Heap-Manipulating Programs](https://doi.org/10.1145/3473589)**
Yasunari Watanabe, Kiran Gopinathan, George Pîrlea, Nadia Polikarpova, and Ilya Sergey. ICFP'21
- [Artifact, 21 May 2021](http://doi.org/10.5281/zenodo.5005829)

Currently, we support three target verification frameworks in Coq—HTT, VST, and Iris—to generate correctness certificates for synthesized programs.

## Requirements

Follow the instructions in the README of each repository below to install the necessary Coq libraries. Each repository also has a `benchmarks/` directory where you can view examples of generated certificates.

- HTT ([`TyGuS/ssl-htt`](https://github.com/TyGuS/ssl-htt))
- VST ([`TyGuS/ssl-vst`](https://github.com/TyGuS/ssl-vst))
- Iris ([`TyGuS/ssl-iris`](https://github.com/TyGuS/ssl-iris))

## Synthesis with Certification

Add the following flags to run synthesis with certification.

- `--certTarget <value>`: Currently supported values are `htt`, `vst`, and `iris`.
- `--certDest <value>` (optional): Specifies the directory in which to generate a certificate file. If not provided, the certificate is logged to the console.

For example, the following command produces an HTT certificate of the specification `listfree.syn` and logs its contents to the console.

```bash
./suslik examples/listfree.syn --assert false --certTarget htt
```

By providing the `--certDest` flag, SuSLik writes the certificate to a file in the specified directory. The following example command writes an HTT certificate named `listfree.v` to the project root directory.

```bash
./suslik examples/listfree.syn --assert false --certTarget htt --certDest .
```

### Optional Flags

If HTT is chosen as the certification target, you can control additional certification parameters. Note that these flags have no effect if Iris or VST is chosen.

- `--certHammerPure <value>` (boolean; default `false`):
  Controls whether to use CoqHammer to solve pure lemmas.
  If `false`, all pure lemmas in the generated certificate are left `Admitted` instead.
- `--certSetRepr <value>` (boolean; default `false`):
  Controls whether to use SSReflect's `perm_eq` to express multi-set equality.
  If `false`, the generated certificate uses regular equality (`=`) instead.

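For instance, combining both flags with the earlier `listfree.syn` example might look like the following (a sketch; whether CoqHammer can discharge all pure lemmas for a given specification varies):

```bash
./suslik examples/listfree.syn --assert false --certTarget htt --certHammerPure true --certSetRepr true
```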
## Evaluation

### Overview

The paper discusses two classes of benchmarks.

1. **Standard benchmarks** are supported by HTT, VST, and Iris. These
   correspond to Table 1 of the paper and are available in the folder
   `$SUSLIK_ROOT/src/test/resources/synthesis/certification-benchmarks`.
2. **Advanced benchmarks** are supported by HTT only. These include test
   cases that use an alternative representation of multi-set equality and
   require manual editing of pure lemma proofs. They correspond to
   Table 2 of the paper and are available in the folder
   `$SUSLIK_ROOT/src/test/resources/synthesis/certification-benchmarks-advanced`.

The tool `certify-benchmarks` synthesizes proof certificates for both
standard and advanced benchmarks, and compiles the certificates for
standard benchmarks only.

The [`ssl-htt`](https://github.com/TyGuS/ssl-htt) repository contains a benchmarking script to compile the HTT
certificates for the advanced benchmarks.

### Generating All Certificates and Compiling Standard Benchmarks

To synthesize standard and advanced benchmark certificates, and compile
the standard benchmark certificates, execute:

```bash
cd $SUSLIK_ROOT
./certify-benchmarks
```

This runs SuSLik with certificate generation flags enabled for all
specification (`.syn`) files in the standard and advanced benchmarks,
for all three target frameworks.
Then, for the standard benchmarks, it also compiles the generated
certificates.
This is the default evaluation configuration; the exact actions performed
under this configuration are written to timestamped `.log` files in the
`certify` directory before each run. The configuration can be modified
by setting the `--configure` flag. See the section "Customizing Benchmark
Options" for details.

As this script produces verbose output, you may consider teeing its
output to a log file for later viewing and debugging, instead of running
the script directly.

```bash
./certify-benchmarks > >(tee certify.log)
cat certify.log
```

When running the tool, please keep in mind the following.

- **It should take _2-3 hours_ to run this evaluation on all benchmark
  groups and all target frameworks.** If you need to interrupt the evaluation,
  or if the benchmarking encounters a timeout error on a slow machine, please
  refer to the section "Customizing Benchmark Options" on how to resume the
  task with selected benchmark groups/target frameworks only.
- **Files in the `certify` directory are overwritten on each subsequent
  run.**
- Test cases unsupported by certain targets (such as `sll_max` for Iris,
  as indicated in Table 1) are ignored.
- Warnings displayed during proof compilation do not affect the functionality
  of the certificates.

After the script terminates, it creates five output files in
`$SUSLIK_ROOT/certify`.

- `standard-syn.csv` contains the synthesis times reported in Table 1.
- `standard-{HTT,VST,Iris}.csv` contain the proof/spec sizes and compilation
  times for each of the three target frameworks in Table 1.
- `advanced-syn.csv` contains the synthesis times reported in Table 2.

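Since these are plain CSV files, quick sanity checks are easy with standard tools. As a sketch (the file name and column layout below are assumptions for illustration, not the tool's documented schema), averaging a time column with `awk` might look like:

```shell
# Hypothetical two-column layout: benchmark name, synthesis time in seconds.
cat > sample-syn.csv <<'EOF'
name,time
listfree,0.5
listcopy,1.5
EOF

# Average the time column, skipping the header row.
avg=$(awk -F, 'NR > 1 { s += $2; n++ } END { printf "%.1f", s / n }' sample-syn.csv)
echo "average: $avg"
```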
#### Customizing Benchmark Options

You may wish to run the benchmarks with alternative settings.

To produce certificates and stats CSV files in a directory other than
`$SUSLIK_ROOT/certify`, run the tool with the `--outputDir` flag.

```bash
cd $SUSLIK_ROOT
./certify-benchmarks --outputDir <ALTERNATIVE_DIR_PATH>
```

As the benchmark tool takes several hours to run, you may need to interrupt
its execution and resume later on a subset of the benchmarks/frameworks.
If so, run the tool with the `--configure` flag.

```bash
cd $SUSLIK_ROOT
./certify-benchmarks --configure
```

When this flag is set, a configuration wizard runs before the benchmarks,
where you can selectively enable or disable three benchmarking parameters.

- **Compilation:** Choose whether to measure compilation times of the
  generated certificates.
- **Benchmark groups:** Select which benchmark groups to evaluate.
- **Target frameworks:** Select which target frameworks to generate/compile
  certificates for.

At each prompt, press ENTER to select the default option, or type
the desired option (`y` or `n`). The default settings are:

- _For standard benchmarks_: Synthesize programs in all benchmark groups. Then
  generate and compile certificates for all three targets (HTT/VST/Iris).
- _For advanced benchmarks_: Synthesize programs in all benchmark groups. Then
  generate (but _don't_ compile) certificates for HTT only.

Each execution of the script generates a timestamped `.log` file listing
exactly which actions will be performed given the user-specified configuration,
which may be useful for debugging.

### Compiling Advanced Benchmarks (Manually Edited Proofs)

The steps in the previous section do not produce target-specific statistics for
advanced benchmarks. This is because the test cases in Table 2 require manual
editing of the pure lemma proofs (as discussed in Section 8.2 of the paper),
whereas that tool only checks SuSLik-generated certificates.

To compile the advanced benchmark certificates with manually edited pure lemma
proofs, execute the following (`$SSL_HTT_ROOT` refers to the local `ssl-htt` repository).

```bash
cd $SSL_HTT_ROOT/benchmarks/advanced
python3 benchmark.py --diffSource $SUSLIK_ROOT/certify/HTT/certification-benchmarks-advanced
```

**It should take roughly _10 minutes_ to run this evaluation.**

After the script terminates, it creates two output files in
`$SSL_HTT_ROOT/benchmarks/advanced`.

- `advanced-HTT.csv` contains the proof/spec sizes and compilation times for
  HTT in Table 2.
- `advanced-HTT.diff` compares the original SuSLik-generated certificates
  (stored in `$SUSLIK_ROOT/certify/HTT/certification-benchmarks-advanced`)
  with the manually edited ones (stored in `$SSL_HTT_ROOT/benchmarks/advanced`),
  showing which lines have been modified. The diff file is only created if
  the original SuSLik-generated certificates exist.

The number of pure lemmas, and which of them have manual proofs, can be
verified by inspecting the files in `$SSL_HTT_ROOT/benchmarks/advanced`.
You can also view the diff file to verify which parts of the proofs have been
edited.

```bash
less advanced-HTT.diff
```
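To tally admitted versus completed pure lemma proofs, a rough `grep`-based count over the certificate files can help. This is only a sketch: the stand-in file below is fabricated for illustration, whereas the real certificates are the `.v` files in `$SSL_HTT_ROOT/benchmarks/advanced`.

```shell
# Stand-in certificate file (real ones are the .v files in the benchmarks directory).
mkdir -p demo && cat > demo/example.v <<'EOF'
Lemma pure1 : True. Admitted.
Lemma pure2 : True. Proof. exact I. Qed.
EOF

# Count lines ending a lemma with Admitted vs. a completed proof.
admitted=$(grep -c 'Admitted' demo/example.v)
proved=$(grep -c 'Qed' demo/example.v)
echo "$admitted admitted, $proved proved"
```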

You can expect to see the following differences between the two proof versions.

- _Proofs for the pure lemmas:_ In the generated scripts, pure lemmas have
  their `hammer` proofs replaced with `Admitted` statements. In the manually
  edited scripts, these are replaced with either a call to `hammer` when that
  is sufficient, or with a proof manually constructed by the authors. Note
  that in the diff file, pure lemmas may be ordered differently between the
  two versions, due to non-determinism in the way Scala accumulates hints
  during execution.
- _Administrative renaming statements in the main proof:_ These are
  identifiable by the `try rename ...` statements; they may occur when
  two variable renaming statements resulting from the same proof step
  are applied in different orders between the compared proofs, again
  depending on Scala's execution.

You should _not_ see edits that modify the main proof structure (aside from
renamed variables).

To use this tool with custom settings, run the script with the `--help` flag
to see all available options.