Skip to content

Commit 27e416c

Browse files
authored
Merge pull request #38 from YangChenyuan/verusage-code
Upload the code of VeruSAGE, refact the dir structure, and update README
2 parents 02b5b21 + dabecc9 commit 27e416c

File tree

22,546 files changed

+58820
-1471
lines changed

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

22,546 files changed

+58820
-1471
lines changed

.pre-commit-config.yaml

Lines changed: 70 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,70 @@
1+
# See https://pre-commit.com for more information
2+
# See https://pre-commit.com/hooks.html for more hooks
3+
repos:
4+
# General file checks
5+
- repo: https://github.com/pre-commit/pre-commit-hooks
6+
rev: v5.0.0
7+
hooks:
8+
- id: trailing-whitespace
9+
exclude: ^(generated/|benchmarks/)
10+
- id: end-of-file-fixer
11+
exclude: ^(generated/|benchmarks/)
12+
- id: check-yaml
13+
- id: check-json
14+
- id: check-added-large-files
15+
args: ['--maxkb=1000']
16+
exclude: ^(verusage-code/vstd_library/.*\.json|verusage/vstd_library/.*\.json|assets/)$
17+
- id: check-merge-conflict
18+
- id: check-case-conflict
19+
- id: detect-private-key
20+
- id: mixed-line-ending
21+
args: ['--fix=lf']
22+
exclude: ^(generated/|benchmarks/)
23+
24+
# Python code formatting with Black
25+
- repo: https://github.com/psf/black
26+
rev: 24.10.0
27+
hooks:
28+
- id: black
29+
language_version: python3
30+
args: ['--line-length=100']
31+
exclude: ^(generated/|benchmarks/)
32+
33+
# Import sorting with isort
34+
- repo: https://github.com/pycqa/isort
35+
rev: 5.13.2
36+
hooks:
37+
- id: isort
38+
args: ['--profile=black', '--line-length=100']
39+
exclude: ^(generated/|benchmarks/)
40+
41+
# Python linting with ruff (fast, modern linter)
42+
- repo: https://github.com/astral-sh/ruff-pre-commit
43+
rev: v0.8.4
44+
hooks:
45+
- id: ruff
46+
args: ['--fix', '--exit-non-zero-on-fix']
47+
exclude: ^(generated/|benchmarks/)
48+
49+
# Type checking with mypy (optional, commented out by default)
50+
# - repo: https://github.com/pre-commit/mirrors-mypy
51+
# rev: v1.13.0
52+
# hooks:
53+
# - id: mypy
54+
# additional_dependencies: []
55+
# exclude: ^(generated/|benchmarks/)
56+
57+
# Configuration
58+
default_language_version:
59+
python: python3
60+
61+
# CI settings
62+
ci:
63+
autofix_commit_msg: |
64+
[pre-commit.ci] auto fixes from pre-commit hooks
65+
autofix_prs: true
66+
autoupdate_branch: ''
67+
autoupdate_commit_msg: '[pre-commit.ci] pre-commit autoupdate'
68+
autoupdate_schedule: weekly
69+
skip: []
70+
submodules: false

README-artifact-evaluation.md

Lines changed: 15 additions & 15 deletions
Original file line numberDiff line numberDiff line change
@@ -1,15 +1,15 @@
1-
# <img src="assets/logo.png" alt="Project logo" width="40" /> AutoVerus - Artifact Evaluation
1+
# <img src="assets/autoverus-logo.png" alt="AutoVerus logo" width="40" /> AutoVerus - Artifact Evaluation
22

3-
![Framework](assets/framework.png)
3+
![Framework](assets/autoverus-framework.png)
44

55
This repository contains code and artifacts for the paper "**AutoVerus: Automated Proof Generation for Rust Code**". This README guides you through reproducing the experimental results from our paper.
66

77
## 📁 Repository Structure
88

99
* **`benchmarks/`** - 150 Rust/Verus proof tasks (Verus-Bench) used in our evaluation
10-
* `CloverBench/` - Clover benchmark tasks
10+
* `CloverBench/` - Clover benchmark tasks
1111
* `Diffy/` - Diffy benchmark tasks
12-
* `MBPP/` - MBPP benchmark tasks
12+
* `MBPP/` - MBPP benchmark tasks
1313
* `Misc/` - Miscellaneous benchmark tasks
1414
* `ablation/` - Tasks for ablation studies
1515
* See [benchmarks README](benchmarks/README.md) for details
@@ -18,7 +18,7 @@ This repository contains code and artifacts for the paper "**AutoVerus: Automate
1818
* **`utils/lynette/`** - Verus parser supporting proof synthesis
1919
* **`generated/`** - Pre-generated proof results
2020
* `autoverus-generated/` - Results from our AutoVerus approach
21-
* `baseline-generated/` - Results from baseline LLM approach
21+
* `baseline-generated/` - Results from baseline LLM approach
2222
* `ablation-study/` - Results from ablation experiments
2323

2424
## 🚀 Quick Start (5 minutes)
@@ -51,13 +51,13 @@ The `generated/` directory is organized into three main categories:
5151
```
5252
generated/
5353
├── autoverus-generated/ # Results from our AutoVerus approach
54-
│ ├── gpt4o-clover-1.0/ # AutoVerus on Clover benchmarks
54+
│ ├── gpt4o-clover-1.0/ # AutoVerus on Clover benchmarks
5555
│ ├── gpt4o-diffy-1.0/ # AutoVerus on Diffy benchmarks
5656
│ ├── gpt4o-mbpp-1.0/ # AutoVerus on MBPP benchmarks
5757
│ └── gpt4o-misc-1.0/ # AutoVerus on Misc benchmarks
5858
├── baseline-generated/ # Results from baseline LLM approach
5959
│ ├── baseline-clover-1.0/ # Baseline on Clover benchmarks
60-
│ ├── baseline-diffy-1.0/ # Baseline on Diffy benchmarks
60+
│ ├── baseline-diffy-1.0/ # Baseline on Diffy benchmarks
6161
│ ├── baseline-mbpp-1.0/ # Baseline on MBPP benchmarks
6262
│ └── baseline-misc-1.0/ # Baseline on Misc benchmarks
6363
└── abalation-study/ # Results from ablation experiments
@@ -116,15 +116,15 @@ For the verified proofs, the score should be `(N, 0)` and `Safe: True`.
116116
**3. Reproduce Paper Results:**
117117
The generated results directly support the claims in our paper:
118118
- **Section 7.1 & 7.3** (Overall Results): Compare success rates across `autoverus-generated/` directories
119-
- **Section 7.2** (Baseline Comparison): Compare `autoverus-generated/` vs `baseline-generated/`
119+
- **Section 7.2** (Baseline Comparison): Compare `autoverus-generated/` vs `baseline-generated/`
120120
- **Section 7.4** (Ablation Studies): Analyze results in `abalation-study/` directories
121121

122122
**4. Verification:**
123123
All generated correct proofs have been verified by Verus to ensure correctness.
124124

125125
## 🧪 Full Experimental Reproduction
126126

127-
> **⚠️ Important:**
127+
> **⚠️ Important:**
128128
> - All experiments run inside Docker under `/home/appuser/verus-proof-synthesis/code/`
129129
> - Full reproduction requires time and OpenAI API costs
130130
> - Consider running experiments in parallel or on subsets for faster evaluation
@@ -163,7 +163,7 @@ python verify.py --name baseline-misc-simple --is-baseline
163163
```bash
164164
# Test impact of removing different few-shot examples
165165
python verify.py --name few-shot-ab-inference-without-3 --phase1-examples 6 7 --repair-num 0
166-
python verify.py --name few-shot-ab-inference-without-6 --phase1-examples 3 7 --repair-num 0
166+
python verify.py --name few-shot-ab-inference-without-6 --phase1-examples 3 7 --repair-num 0
167167
python verify.py --name few-shot-ab-inference-without-7 --phase1-examples 3 6 --repair-num 0
168168
```
169169

@@ -215,7 +215,7 @@ Reset config to use `"gpt-4o"`, then:
215215
```bash
216216
# Test different temperature settings
217217
python verify.py --name temp-ab-sampled --temp 0.1 --config-file config-artifact-openai.json
218-
python verify.py --name temp-ab-sampled --temp 0.4 --config-file config-artifact-openai.json
218+
python verify.py --name temp-ab-sampled --temp 0.4 --config-file config-artifact-openai.json
219219
python verify.py --name temp-ab-sampled --temp 0.7 --config-file config-artifact-openai.json
220220
```
221221

@@ -250,7 +250,7 @@ python verify.py --name temp-ab-sampled --temp 0.7 --config-file config-artifact
250250
git clone <repository-url>
251251
cd verus-proof-synthesis
252252

253-
# Install Python dependencies
253+
# Install Python dependencies
254254
pip install -r requirements.txt
255255

256256
# Set API key
@@ -266,11 +266,11 @@ python main.py --input <input_file.rs> --output <output_file.rs> --config <confi
266266

267267
**Key Parameters:**
268268
- `--input` - Input Rust file needing Verus proofs (default: `input.rs`)
269-
- `--output` - Output file with generated proofs (default: `output.rs`)
269+
- `--output` - Output file with generated proofs (default: `output.rs`)
270270
- `--config` - Configuration file (default: `config.json`)
271271
- `--repair` - Max debugging rounds (default: 10)
272272

273-
**Output:**
273+
**Output:**
274274
- Final proof in specified output file
275275
- `intermediate-<timestamp>/` folder with all intermediate files
276276
- Detailed logs showing the proof generation process
@@ -285,7 +285,7 @@ Error: OpenAI API rate limit exceeded
285285
```
286286
**Solution:** Wait and retry, or use multiple API keys in config file.
287287

288-
**2. Verus Path Issues**
288+
**2. Verus Path Issues**
289289
```
290290
Error: Verus binary not found
291291
```

0 commit comments

Comments
 (0)