Skip to content

Commit 2990f7b

Browse files
authored
Merge pull request #8 from EngineeringSoftware/readme-update
Readme update
2 parents 3a1c8fd + 2a4ece9 commit 2990f7b

File tree

4 files changed

+184
-62
lines changed

4 files changed

+184
-62
lines changed

README.md

Lines changed: 72 additions & 58 deletions
Original file line numberDiff line numberDiff line change
@@ -8,6 +8,7 @@ section below.
88

99
## Requirements
1010

11+
- Linux or macOS
1112
- [OCaml 4.07.1](https://ocaml.org)
1213
- [SerAPI 0.7.1](https://github.com/ejgallego/coq-serapi)
1314
- [Coq 8.10.2](https://coq.inria.fr/download)
@@ -16,7 +17,19 @@ section below.
1617

1718
## Installation
1819

19-
### Installation of OCaml, Coq, and SerAPI
20+
You can install Roosterize from source code by cloning this GitHub
21+
repository and setting up the dependencies following steps 1 & 2.
22+
(Alternatively, you can download the a [binary
23+
distribution](https://github.com/EngineeringSoftware/roosterize/releases)
24+
which already contains the Python dependencies, and then you only need
25+
step 1.)
26+
27+
```
28+
git clone https://github.com/EngineeringSoftware/roosterize.git
29+
cd roosterize
30+
```
31+
32+
### 1. Installation of OCaml, Coq, and SerAPI
2033

2134
We strongly recommend installing the required versions of OCaml, Coq,
2235
and SerAPI via the [OPAM package manager](https://opam.ocaml.org),
@@ -35,93 +48,94 @@ opam pin add coq 8.10.2
3548
opam pin add coq-serapi 8.10.0+0.7.1
3649
```
3750

38-
### Installation of PyTorch and Python libraries
51+
### 2. Installation of PyTorch and Python libraries
52+
53+
We strongly recommend installing the required versions of Python and
54+
PyTorch using [Conda](https://docs.conda.io/en/latest/miniconda.html).
3955

40-
We strongly recommend installing the required versions of Python and PyTorch
41-
using [Conda](https://docs.conda.io/en/latest/miniconda.html).
56+
To set up the Conda environment, use one of the following command
57+
suitable for your operating system and whether you want to use it on a
58+
CPU or GPU.
4259

43-
First, create a Python 3.7 environment using Conda:
60+
- Linux, CPU:
4461
```
45-
conda create --name roosterize python=3.7 pip
46-
conda activate roosterize
62+
conda env create --name roosterize --file conda-envs/cpu.yml
4763
```
4864

49-
Then, install PyTorch 1.1.0 by following the
50-
[official instructions](https://pytorch.org/get-started/previous-versions/#conda-5),
51-
using the command suitable for your operating system and whether you want to use
52-
it on a CPU or GPU. For example, the following command installs PyTorch for
53-
CPU-only use on Linux or Windows:
65+
- Linux, GPU w/ CUDA 10.0:
5466
```
55-
conda install pytorch-cpu==1.1.0 torchvision-cpu==0.3.0 cpuonly -c pytorch
67+
conda env create --name roosterize --file conda-envs/gpu-cuda10.yml
5668
```
57-
Finally, install the required Python libraries:
69+
70+
- Linux, GPU w/ CUDA 9.0:
5871
```
59-
conda install configargparse nltk future seutil==0.4.12 six torchtext==0.4.0 tqdm==4.30.*
72+
conda env create --name roosterize --file conda-envs/gpu-cuda9.yml
6073
```
6174

62-
### Installation of Roosterize and trained models
75+
- Mac, CPU:
76+
```
77+
conda env create --name roosterize --file conda-envs/mac-cpu.yml
78+
```
6379

64-
To install Roosterize itself, clone the GitHub repository and enter the root directory:
80+
Finally, activate the Conda environment before using Roosterize:
6581
```
66-
git clone https://github.com/EngineeringSoftware/roosterize.git
67-
cd roosterize
82+
conda activate roosterize
6883
```
6984

70-
Next, you need to obtain pre-trained models that capture
71-
naming conventions. We have trained several models using our
72-
[corpus][math-comp-corpus], which follows the conventions
73-
used in the [Mathematical Components][math-comp-website]
74-
family of projects. These models are available for separate
75-
[download][models-link].
85+
### Installation of trained models
86+
87+
Next, you need to obtain a pre-trained model that capture naming
88+
conventions. The default pre-trained model, which was trained using
89+
our [corpus][math-comp-corpus] and follows the conventions used in the
90+
[Mathematical Components][math-comp-website] family of projects, can
91+
be obtained by running the command:
7692

77-
To use our pre-trained models, go to the Roosterize
78-
root directory, and download and unpack the archive (`models.tgz`):
7993
```
80-
wget https://github.com/EngineeringSoftware/roosterize/releases/download/v8.10.0/models.tgz
81-
tar xzf models.tgz
94+
./bin/roosterize download_global_model
8295
```
8396

84-
[models-link]: https://github.com/EngineeringSoftware/roosterize/releases/download/v8.10.0/models.tgz
97+
The model will be downloaded to `$HOME/.roosterize/`. To use a
98+
different model (that we [released][latest-release] or you trained),
99+
simply put it in `$HOME/.roosterize/`.
85100

86101
## Usage
87102

88103
To use Roosterize on a Coq verification project, you first need to
89-
build the Coq project using a command provided by the project
90-
(usually `make`). Then, at the root directory of
91-
the Roosterize project repository, run the command
92-
```
93-
python -m roosterize.main suggest_lemmas \
94-
--project=$PATH_TO_PROJECT \
95-
--serapi-options=$SERAPI_OPTIONS \
96-
--model-dir=./models/roosterize-ta \
97-
--output=./output
98-
```
99-
where `$PATH_TO_PROJECT` should be replaced with the path to the
100-
Coq project, and `$SERAPI_OPTIONS` should be replaced with the SerAPI
101-
command line options for mapping logical paths to directories
102-
(see [SerAPI's documentation][serapi-faq-link]). For example,
103-
if the logical path (inside Coq) for the project is `Verified`,
104-
you should set `SERAPI_OPTIONS="-R $PATH_TO_PROJECT,Verified"`.
105-
106-
The above command extracts all lemmas from the project, uses Roosterize's
107-
pre-trained model (at `./models/roosterize-ta`) to predict a lemma name
108-
for each lemma, and finally prints the lemma name update suggestions,
109-
i.e., the predicted lemma names that are different from to the existing ones.
110-
Below is an example of printed suggestions:
111-
```
112-
>>>>> Suggestions:
113-
infotheo/ecc_classic/bch.v: infotheo.ecc_classic.bch.BCH.BCH_PCM_altP1 -> inde_F2
114-
infotheo/ecc_classic/bch.v: infotheo.ecc_classic.bch.BCH.BCH_PCM_altP2 -> inde_mul
115-
infotheo/ecc_classic/bch.v: infotheo.ecc_classic.bch.BCH.PCM_altP -> F2_eq0
116-
...
104+
build the Coq project using a command provided by the project (usually
105+
`make`). Then, run this command to get the lemma name suggestions for
106+
the lemmas in a Coq document (.v file):
107+
108+
```
109+
python -m roosterize.main suggest_naming --file=PATH_TO_FILE
117110
```
118111

112+
Roosterize automatically infers the root directory of the project by
113+
finding `_CoqProject`, and reads `_CoqProject` to infer the SerAPI
114+
command line options (for mapping logical paths to directories, see
115+
[SerAPI's documentation][serapi-faq-link]). If you don't have a
116+
`_CoqProject` file, you need to provide an additional argument
117+
`--project_root=PATH_TO_PROJECT_ROOT`.
118+
119+
<!-- where `$PATH_TO_PROJECT` should be replaced with the path to the -->
120+
<!-- Coq project, and `$SERAPI_OPTIONS` should be replaced with the SerAPI -->
121+
<!-- command line options for mapping logical paths to directories -->
122+
<!-- (see [SerAPI's documentation][serapi-faq-link]). For example, -->
123+
<!-- if the logical path (inside Coq) for the project is `Verified`, -->
124+
<!-- you should set `SERAPI_OPTIONS="-R $PATH_TO_PROJECT,Verified"`. -->
125+
126+
The above command extracts all lemmas from the file, uses the
127+
pre-trained model to generate likely lemma names for each lemma, and
128+
finally prints the lemma name update suggestions, i.e., the generated
129+
lemma names that are different from to the existing ones. See an
130+
[example suggestion report](./docs/example-suggestion.txt).
131+
119132
For other usages and command line interfaces of Roosterize, please
120133
check the help:
121134
```
122135
python -m roosterize.main help
123136
```
124137

138+
[latest-release]: https://github.com/EngineeringSoftware/roosterize/releases/latest
125139
[serapi-faq-link]: https://github.com/ejgallego/coq-serapi/blob/v8.10/FAQ.md#does-serapi-support-coqs-command-line-flags
126140

127141
## Technique

docs/example-suggestion.txt

Lines changed: 108 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,108 @@
1+
$ ~/projects/roosterize/bin/roosterize suggest_naming --file=$PWD/finmap/finmap.v
2+
== Analyzed 110 lemma names, 8 (7.3%) conform to the learned naming conventions.
3+
==========
4+
== 21 can be improved and here are Roosterize's suggestions:
5+
Line 851: fcatsK => eq_fcat (likelihood: 0.45)
6+
Line 822: fcatC => eq_fcat (likelihood: 0.44)
7+
Line 862: fcatKs => eq_fcat (likelihood: 0.43)
8+
Line 1118: map_key_zip' => eq_zip (likelihood: 0.31)
9+
Line 1178: zip_supp' => eq_zip (likelihood: 0.31)
10+
Line 1258: zunit_fcat => zip_fcat (likelihood: 0.30)
11+
Line 769: disjC => eq_disj (likelihood: 0.30)
12+
Line 962: mapf_disj => eq_map (likelihood: 0.29)
13+
Line 526: fcats0 => fcat_nil (likelihood: 0.28)
14+
Line 1273: zunit_disj => disj_zip (likelihood: 0.27)
15+
Line 1186: zip_supp => eq_zip (likelihood: 0.27)
16+
Line 937: mapf_ins => map_ins (likelihood: 0.26)
17+
Line 525: fcat0s => fcat_nil (likelihood: 0.25)
18+
Line 443: seqof_ins => path_ordP (likelihood: 0.24)
19+
Line 945: mapf_fcat => map_fcat (likelihood: 0.23)
20+
Line 1139: zip_sorted' => map_zip (likelihood: 0.23)
21+
Line 1006: path_mapk => path_ord_supp (likelihood: 0.23)
22+
Line 953: mapf_disjL => eq_map (likelihood: 0.23)
23+
Line 158: fmapP => eq_fnd (likelihood: 0.22)
24+
Line 1075: zipC' => eq_zip (likelihood: 0.21)
25+
Line 982: foldf_ins => path_foldfmap (likelihood: 0.20)
26+
==========
27+
== 81 can be improved but Roosterize cannot provide good suggestion:
28+
Line 1082: zipA' (best guess: zip_bind; likelihood: 0.20)
29+
Line 716: kfilter_pred0 (best guess: kfilter0s; likelihood: 0.19)
30+
Line 462: path_supp_ins_inv (best guess: path_ord_supp; likelihood: 0.19)
31+
Line 719: kfilter_predT (best guess: kfilterTE; likelihood: 0.18)
32+
Line 1017: mapk_ins (best guess: path_ord; likelihood: 0.17)
33+
Line 528: fcat_inss (best guess: eq_fcat; likelihood: 0.17)
34+
Line 1194: zip_filter' (best guess: filter_zip; likelihood: 0.16)
35+
Line 447: path_supp_ins (best guess: path_ord1; likelihood: 0.16)
36+
Line 532: fcat_sins (best guess: eq_fcat; likelihood: 0.15)
37+
Line 209: suppE (best guess: eq_suppP; likelihood: 0.15)
38+
Line 1215: zip_fnd (best guess: eq_zip; likelihood: 0.14)
39+
Line 979: foldf_nil (best guess: foldfmap_nil; likelihood: 0.14)
40+
Line 606: supp_eq_ins (best guess: eq_supp; likelihood: 0.13)
41+
Line 549: fcat_rems (best guess: eq_fcat; likelihood: 0.13)
42+
Line 97: sorted_filter' (best guess: map_filter; likelihood: 0.13)
43+
Line 650: sorted_kfilter (best guess: map_kfilter; likelihood: 0.13)
44+
Line 665: kfilt_nil (best guess: nil_kfilter; likelihood: 0.12)
45+
Line 120: last_ins' (best guess: path_ord; likelihood: 0.12)
46+
Line 319: ins_ins (best guess: eq_ins; likelihood: 0.12)
47+
Line 86: sorted_ins' (best guess: map_ins; likelihood: 0.12)
48+
Line 657: supp_kfilt (best guess: supp_filter; likelihood: 0.12)
49+
Line 922: map_key_mapf (best guess: map_inj; likelihood: 0.11)
50+
Line 134: notin_path (best guess: path_ord; likelihood: 0.11)
51+
Line 903: feqP (best guess: feqmP; likelihood: 0.11)
52+
Line 1164: zip_unitE (best guess: existsP; likelihood: 0.11)
53+
Line 358: rem_empty (best guess: nil_rem; likelihood: 0.11)
54+
Line 233: supp_nilE (best guess: nilP; likelihood: 0.10)
55+
Line 518: fcat_nil' (best guess: fcatP; likelihood: 0.09)
56+
Line 253: fnd_empty (best guess: fnd_eq; likelihood: 0.09)
57+
Line 262: fnd_ins (best guess: eq_fnd; likelihood: 0.09)
58+
Line 830: fcatA (best guess: eq_fcat; likelihood: 0.09)
59+
Line 1153: zip_unit_sorted' (best guess: zip_unit; likelihood: 0.09)
60+
Line 709: kfilt_fcat (best guess: fcat_kfilter; likelihood: 0.08)
61+
Line 668: fnd_kfilt (best guess: fndP; likelihood: 0.08)
62+
Line 844: fcatCA (best guess: eq_fcat; likelihood: 0.08)
63+
Line 558: fcat_srem (best guess: eq_fcat; likelihood: 0.08)
64+
Line 621: fmap_ind2 (best guess: eq_supp; likelihood: 0.08)
65+
Line 415: cancel_ins (best guess: eq_ins; likelihood: 0.08)
66+
Line 1250: zunit_ins (best guess: zip_unit; likelihood: 0.07)
67+
Line 76: path_ins' (best guess: map_ord; likelihood: 0.07)
68+
Line 510: fcat_ins' (best guess: eq_ins'; likelihood: 0.07)
69+
Line 63: sorted_nil (best guess: ord_sorted; likelihood: 0.07)
70+
Line 776: disj_nil (best guess: disj_fin; likelihood: 0.07)
71+
Line 255: fnd_rem (best guess: optio_fnd; likelihood: 0.07)
72+
Line 142: path_supp_ord (best guess: path_ord; likelihood: 0.07)
73+
Line 56: fmapE (best guess: prodP; likelihood: 0.07)
74+
Line 1001: sorted_mapk (best guess: ord_supp; likelihood: 0.06)
75+
Line 839: fcatAC (best guess: eq_fcat; likelihood: 0.06)
76+
Line 793: disj_rem (best guess: eqseq_disj; likelihood: 0.05)
77+
Line 760: disjP (best guess: eq_spec; likelihood: 0.05)
78+
Line 722: kfilter_predI (best guess: kfilterI1; likelihood: 0.05)
79+
Line 925: sorted_map (best guess: map_sorted; likelihood: 0.05)
80+
Line 680: kfilt_ins (best guess: ins_kfilter; likelihood: 0.04)
81+
Line 726: kfilter_predU (best guess: kfilter_fcat; likelihood: 0.04)
82+
Line 1206: zip_rem (best guess: fin_zip; likelihood: 0.04)
83+
Line 586: supp_fcat (best guess: eq_fcat; likelihood: 0.04)
84+
Line 1115: zip_unitL' (best guess: zip_unit; likelihood: 0.04)
85+
Line 236: supp_rem (best guess: meC_supp1; likelihood: 0.04)
86+
Line 277: ins_rem (best guess: eq_ins; likelihood: 0.04)
87+
Line 1028: map_id (best guess: fin_mapk; likelihood: 0.04)
88+
Line 698: kfilt_rem (best guess: rem_kfilter; likelihood: 0.03)
89+
Line 1126: zip_unitE' (best guess: eq_zipP; likelihood: 0.03)
90+
Line 370: rem_ins (best guess: eq_rem; likelihood: 0.03)
91+
Line 1033: map_comp (best guess: mapkC_comp; likelihood: 0.03)
92+
Line 800: disj_remE (best guess: eq_disj; likelihood: 0.03)
93+
Line 870: disj_kfilt (best guess: eqseq_kfilter; likelihood: 0.03)
94+
Line 1267: zunit_supp (best guess: zip_unit; likelihood: 0.03)
95+
Line 243: supp_ins (best guess: mort_supp; likelihood: 0.03)
96+
Line 689: rem_kfilt (best guess: rem_kfilter; likelihood: 0.03)
97+
Line 779: disj_ins (best guess: eqseq_disj; likelihood: 0.03)
98+
Line 471: fmap_ind' (best guess: fin_supp; likelihood: 0.03)
99+
Line 150: notin_filter (best guess: filter_predk; likelihood: 0.03)
100+
Line 411: fnd_supp_in (best guess: mex_exists; likelihood: 0.03)
101+
Line 191: suppP (best guess: supp_spec; likelihood: 0.03)
102+
Line 361: rem_rem (best guess: eqseq_rem; likelihood: 0.02)
103+
Line 1247: zunit0 (best guess: zip_comp; likelihood: 0.02)
104+
Line 124: first_ins' (best guess: map_ord; likelihood: 0.02)
105+
Line 181: predkN (best guess: prod0_predk; likelihood: 0.02)
106+
Line 398: rem_supp (best guess: min_rem; likelihood: 0.02)
107+
Line 483: fmap_ind'' (best guess: min_ord; likelihood: 0.01)
108+
Line 881: in_disj_kfilt (best guess: in_supp; likelihood: 0.01)

roosterize/interface/CommandLineInterface.py

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -61,7 +61,7 @@ def __init__(self):
6161
self.exclude_files = None
6262
self.exclude_pattern = None
6363
self.serapi_options = None
64-
self.model_url = "https://github.com/EngineeringSoftware/roosterize/releases/download/v1.1.0+8.10.0/roosterize-model-t1.tgz"
64+
self.model_url = "https://github.com/EngineeringSoftware/roosterize/releases/download/v1.1.0+8.10.2-beta/roosterize-model-t1.tgz"
6565
self.compile_cmd = None
6666
self.loaded_config_prj: Path = None
6767

@@ -241,15 +241,15 @@ def report_predictions(self, data: ProcessedFile, candidates_logprobs: List[List
241241
# Print suggestions
242242
total = len(good_names) + len(bad_names_and_suggestions) + len(bad_names_no_suggestion)
243243
print(f"== Analyzed {total} lemma names, "
244-
f"{len(good_names)} ({len(good_names)/total:.1%}) look good.")
244+
f"{len(good_names)} ({len(good_names)/total:.1%}) conform to the learned naming conventions.")
245245
if len(bad_names_and_suggestions) > 0:
246246
print(f"==========")
247247
print(f"== {len(bad_names_and_suggestions)} can be improved and here are Roosterize's suggestions:")
248248
for lemma, suggestion, score in sorted(bad_names_and_suggestions, key=lambda x: x[2], reverse=True):
249249
print(f"Line {lemma.vernac_command[0].lineno}: {lemma.name} => {suggestion} (likelihood: {score:.2f})")
250250
if len(bad_names_no_suggestion) > 0:
251251
print(f"==========")
252-
print(f"== {len(bad_names_no_suggestion)} can be improved but Roosterize cannot provide good suggestion (please consider improve_project_model):")
252+
print(f"== {len(bad_names_no_suggestion)} can be improved but Roosterize cannot provide good suggestion:")
253253
for lemma, suggestion, score in sorted(bad_names_no_suggestion, key=lambda x: x[2], reverse=True):
254254
print(f"Line {lemma.vernac_command[0].lineno}: {lemma.name} (best guess: {suggestion}; likelihood: {score:.2f})")
255255

roosterize/interface/VSCodeInterface.py

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -56,7 +56,7 @@ def report_predictions(self, data: ProcessedFile, candidates_logprobs: List[List
5656

5757
total = len(good_names) + len(bad_names_and_suggestions) + len(bad_names_no_suggestion)
5858
self.show_message(f"{data.path}: Analyzed {total} lemma names, "
59-
f"{len(good_names)} ({len(good_names)/total:.1%}) look good. "
59+
f"{len(good_names)} ({len(good_names)/total:.1%}) conform to the learned naming conventions. "
6060
f"Roosterize made {len(bad_names_and_suggestions)} suggestions.")
6161

6262
# Publish suggestions as diagnostics

0 commit comments

Comments
 (0)