
Commit 7d8fad7

Bench
1 parent 523c9d2 commit 7d8fad7

File tree

9 files changed: +538 -19 lines changed

BENCHMARK.md

Lines changed: 83 additions & 0 deletions
@@ -0,0 +1,83 @@

# Benchmarking Cipherscope

This document explains how to read the micro-benchmark results and what they do (and do not) measure.

## What the micro-benchmark measures

The benchmark is a full end-to-end scan using the compiled `cipherscope` binary. Each iteration:
- Walks the roots and discovers files (respecting ignore rules).
- Runs a fast regex anchor hint to skip files with no matching library/API patterns.
- Parses files into ASTs.
- Finds library anchors and algorithm hits.
- Writes JSONL output to a temp file.

This is an integrated measurement of scanner performance, not a unit benchmark of a single stage.
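
From the outside, one timed iteration is a single invocation of the binary over the dataset roots, with JSONL written to a temp file. The sketch below shows that shape; the `--threads` and `--output` flag names and the `run_one_scan` helper are assumptions for illustration, not the binary's documented CLI.

```rust
use std::process::Command;

// Minimal sketch of one timed iteration (flag names are hypothetical; the real CLI may differ).
fn run_one_scan(binary: &str, roots: &[&str], threads: usize) -> std::io::Result<()> {
    // JSONL goes to a temp file so stdout formatting does not dominate the timing.
    let out = tempfile::NamedTempFile::new()?;

    let status = Command::new(binary)
        .args(roots)                               // roots to walk, e.g. "fixtures/"
        .arg("--threads").arg(threads.to_string()) // hypothetical flag
        .arg("--output").arg(out.path())           // hypothetical flag
        .status()?;

    assert!(status.success(), "scan failed");
    Ok(())
}
```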

## Datasets used

The current benchmark runs two small fixed datasets:
- `fixtures`: `fixtures/` only (26 files).
- `repo_mix`: `fixtures/` + `src/` + `tests/` (30 files).

These datasets are intentionally small and fast to run. They are useful for regression tracking but not representative of large codebases.

## Threading variants

Each dataset is benchmarked with:
- `1` thread.
- `num_cpus::get()` threads (full CPU on the current machine).

This shows scaling behavior on the same workload.
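
As a sketch (assuming the `num_cpus` crate, which `num_cpus::get()` implies), the two variants can be derived at runtime and turned into per-variant Criterion case names:

```rust
fn main() {
    // Thread-count variants per dataset: single-threaded, plus every logical CPU.
    let thread_variants = [1, num_cpus::get()];

    for threads in thread_variants {
        // Each variant becomes its own Criterion case, e.g. "scan/fixtures/1".
        let bench_id = format!("scan/{}/{}", "fixtures", threads);
        println!("{bench_id}");
    }
}
```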

## Interpreting numbers

Criterion reports a time range per benchmark, e.g.:
```
scan/fixtures/1 time: [209.72 ms 210.81 ms 211.61 ms]
```

The three values are Criterion's lower bound, middle estimate, and upper bound for the typical runtime across samples.
For quick intuition, you can estimate throughput:
- `files/sec ≈ file_count / mid_time_seconds`

Example:
- 26 files / 0.210 s ≈ 124 files/sec.
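
The same arithmetic as a tiny sketch, using the example numbers above:

```rust
// Rough throughput estimate: files scanned per second from a Criterion mid estimate.
fn files_per_sec(file_count: u64, mid_time_seconds: f64) -> f64 {
    file_count as f64 / mid_time_seconds
}

fn main() {
    // 26 files at ~0.210 s per scan ≈ 124 files/sec.
    println!("{:.0} files/sec", files_per_sec(26, 0.210));
}
```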

## Methodology summary

The benchmark:
- Uses `cargo bench --bench scan_bench`.
- Warms up for ~3 seconds.
- Collects 10 samples over ~10 seconds per case.
- Shells out to the compiled binary and writes JSONL to a temp file.

This keeps the timing focused on real scanning work while avoiding stdout overhead.
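
A minimal sketch of how a Criterion harness with these settings can be wired up. The binary path, dataset root, and flag names are assumptions for illustration; the real `benches/scan_bench.rs` may differ in detail.

```rust
use std::time::Duration;
use criterion::{criterion_group, criterion_main, Criterion};

fn scan_benchmarks(c: &mut Criterion) {
    let mut group = c.benchmark_group("scan");
    // Match the settings described above: ~3 s warm-up, 10 samples, ~10 s window.
    group.warm_up_time(Duration::from_secs(3));
    group.sample_size(10);
    group.measurement_time(Duration::from_secs(10));

    for threads in [1, num_cpus::get()] {
        group.bench_function(format!("fixtures/{threads}"), |b| {
            b.iter(|| {
                // Shell out to the compiled binary; JSONL goes to a temp file
                // (same hypothetical flag names as the iteration sketch above).
                let out = tempfile::NamedTempFile::new().expect("temp file");
                std::process::Command::new("target/release/cipherscope") // assumed path
                    .arg("fixtures/")
                    .arg("--threads").arg(threads.to_string())
                    .arg("--output").arg(out.path())
                    .status()
                    .expect("run scan")
            });
        });
    }
    group.finish();
}

criterion_group!(benches, scan_benchmarks);
criterion_main!(benches);
```

Keeping the process launch inside `b.iter` means process startup is counted in every sample, which matches the end-to-end framing above.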

## Large-scale benchmark

For a more realistic scan, the `scan_large_bench` benchmark targets a folder containing multiple large repositories. It is opt-in and can be run with:
```
CIPHERSCOPE_BENCH_FIXTURE=/path/to/fixture cargo bench --bench scan_large_bench
```

If `CIPHERSCOPE_BENCH_FIXTURE` is not set, the benchmark defaults to `../cipherscope-paper/fixture` relative to the `cipherscope` repo. The large benchmark uses fewer samples and a longer measurement window to accommodate large repos.
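
The fixture-resolution behaviour described above is small enough to sketch directly (a sketch of the documented behaviour, not necessarily the exact code in the benchmark):

```rust
use std::path::PathBuf;

// Resolve the large-benchmark fixture root: CIPHERSCOPE_BENCH_FIXTURE wins,
// otherwise fall back to the documented default relative to the repo.
fn large_fixture_root() -> PathBuf {
    std::env::var_os("CIPHERSCOPE_BENCH_FIXTURE")
        .map(PathBuf::from)
        .unwrap_or_else(|| PathBuf::from("../cipherscope-paper/fixture"))
}
```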

## Limitations and caveats

- Results are machine- and filesystem-dependent.
- Small datasets can exaggerate overhead and reduce signal.
- OS caching can make repeated scans faster than cold-cache runs.
- The cost of writing output (to a temp file) is included.

## When to extend the benchmark

For larger or more realistic measurements, consider:
- Adding a larger repo checkout as an additional dataset.
- Reporting total bytes scanned to compute MB/sec (see the sketch after this list).
- Running explicit cold-cache tests.
- Adding a "no-output" mode for pure scanning cost.
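
For the MB/sec idea, a std-only sketch of summing bytes under a dataset root (ignore rules and symlink handling are deliberately omitted here, so it will overcount relative to what the scanner actually reads):

```rust
use std::{fs, io, path::Path};

// Recursively sum file sizes under `root`; divide by the measured scan time
// to report MB/sec alongside files/sec.
fn total_bytes(root: &Path) -> io::Result<u64> {
    let mut sum = 0;
    for entry in fs::read_dir(root)? {
        let entry = entry?;
        let meta = entry.metadata()?;
        if meta.is_dir() {
            sum += total_bytes(&entry.path())?;
        } else {
            sum += meta.len();
        }
    }
    Ok(sum)
}
```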

Cargo.lock

Lines changed: 182 additions & 0 deletions
Some generated files are not rendered by default.

Cargo.toml

Lines changed: 9 additions & 0 deletions

```
@@ -49,3 +49,12 @@ lang-rust = ["tree-sitter-rust"]
 [dev-dependencies]
 assert_cmd = "2.0"
 tempfile = "3.10"
+criterion = "0.5"
+
+[[bench]]
+name = "scan_bench"
+harness = false
+
+[[bench]]
+name = "scan_large_bench"
+harness = false
```
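
Because each `[[bench]]` target sets `harness = false`, Cargo expects the bench files (by default `benches/scan_bench.rs` and `benches/scan_large_bench.rs`) to provide their own `main`. With Criterion that is typically just the two macros; a minimal skeleton, with a placeholder benchmark function:

```rust
// benches/scan_bench.rs — `harness = false` disables the default libtest harness,
// so Criterion's macros generate `main` for this bench target.
use criterion::{criterion_group, criterion_main, Criterion};

fn scan_benchmarks(c: &mut Criterion) {
    // Placeholder: register the "scan/<dataset>/<threads>" cases here.
    let _ = c;
}

criterion_group!(benches, scan_benchmarks);
criterion_main!(benches);
```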
