# Benchmarking Cipherscope

This document explains how to read the micro-benchmark results and what the benchmarks do (and do not) measure.

## What the micro-benchmark measures

The benchmark is a full end-to-end scan using the compiled `cipherscope` binary. Each iteration:
- Walks the roots and discovers files (respecting ignore rules).
- Runs a fast regex anchor hint to skip files with no matching library/API patterns.
- Parses files into ASTs.
- Finds library anchors and algorithm hits.
- Writes JSONL output to a temp file.

This is an integrated measurement of scanner performance, not a unit benchmark of a single stage.
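
To make the anchor-hint stage concrete, here is a minimal sketch of such a pre-filter, assuming the `regex` crate; the pattern list and function names are illustrative, not the scanner's actual API:

```rust
use regex::RegexSet;

/// Return true when a file is worth parsing: at least one anchor
/// pattern matches somewhere in the source text.
fn worth_parsing(source: &str, anchors: &RegexSet) -> bool {
    anchors.is_match(source)
}

fn main() {
    // Illustrative anchor patterns; the real scanner's list differs.
    let anchors = RegexSet::new([
        r"openssl",
        r"ring::",
        r"AES|RSA|SHA-?\d+",
    ])
    .expect("valid patterns");

    let source = std::fs::read_to_string("src/main.rs").unwrap_or_default();
    if worth_parsing(&source, &anchors) {
        println!("anchor hint matched: parse into an AST and look for hits");
    } else {
        println!("no anchor hint: skip parsing entirely");
    }
}
```

A cheap `RegexSet` pass over raw text lets the scanner skip the much more expensive AST parse for files that cannot contain a hit.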

## Datasets used

The current benchmark runs two small fixed datasets:
- `fixtures`: `fixtures/` only (26 files).
- `repo_mix`: `fixtures/` + `src/` + `tests/` (30 files).

These datasets are intentionally small and fast to run. They are useful for regression tracking but not representative of large codebases.

## Threading variants

Each dataset is benchmarked with:
- `1` thread.
- `num_cpus::get()` threads (full CPU on the current machine).

This shows scaling behavior on the same workload.
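
For illustration, the two variants could be derived like this; `num_cpus::get()` is the call named above, while the `--threads` flag is a hypothetical placeholder for whatever option the binary actually exposes:

```rust
// Derive the two thread-count variants benchmarked per dataset.
// `--threads` is a hypothetical flag, used only for illustration.
fn main() {
    let variants = [1, num_cpus::get()];
    for threads in variants {
        println!("cipherscope fixtures --threads {threads}");
    }
}
```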

## Interpreting numbers

Criterion reports a time range per benchmark, e.g.:
```
scan/fixtures/1 time: [209.72 ms 210.81 ms 211.61 ms]
```

The three values are Criterion's lower bound, point estimate, and upper bound (a confidence interval) for the mean iteration time.
For quick intuition, you can estimate throughput from the middle value:
- `files/sec ≈ file_count / point_estimate_seconds`

Example:
- 26 files / 0.210 s ≈ 124 files/sec.
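
The same arithmetic as a tiny Rust check, with values copied from the example output above:

```rust
// Throughput estimate from a Criterion point estimate.
fn main() {
    let file_count = 26.0_f64;
    let point_estimate_secs = 0.210; // 210.81 ms, rounded
    println!("≈ {:.0} files/sec", file_count / point_estimate_secs); // ≈ 124
}
```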

## Methodology summary

The benchmark:
- Uses `cargo bench --bench scan_bench`.
- Warms up for ~3 seconds.
- Collects 10 samples over ~10 seconds per case.
- Shells out to the compiled binary and writes JSONL to a temp file.

This keeps the timing focused on real scanning work while avoiding stdout overhead.
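
A minimal harness sketch matching those settings, assuming the `criterion` and `tempfile` crates and Cargo's `CARGO_BIN_EXE_*` variables; the `--threads` and `--output` flags are hypothetical stand-ins for the binary's real options:

```rust
use std::process::Command;
use std::time::Duration;

use criterion::{criterion_group, criterion_main, Criterion};

// Only the warm-up, sample count, and measurement window come from this
// document; the CLI flags and temp-file handling are assumptions.
fn scan_bench(c: &mut Criterion) {
    let mut group = c.benchmark_group("scan");
    group.warm_up_time(Duration::from_secs(3));
    group.sample_size(10);
    group.measurement_time(Duration::from_secs(10));

    group.bench_function("fixtures/1", |b| {
        b.iter(|| {
            // Write JSONL to a temp file so stdout formatting is not timed.
            let out = tempfile::NamedTempFile::new().expect("temp file");
            let status = Command::new(env!("CARGO_BIN_EXE_cipherscope"))
                .arg("fixtures")
                .args(["--threads", "1"]) // hypothetical flag
                .arg("--output") // hypothetical flag
                .arg(out.path())
                .status()
                .expect("scanner runs");
            assert!(status.success());
        });
    });

    group.finish();
}

criterion_group!(benches, scan_bench);
criterion_main!(benches);
```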

## Large-scale benchmark

For a more realistic scan, the `scan_large_bench` benchmark targets a folder containing multiple large repositories. It is opt-in and can be run with:
```
CIPHERSCOPE_BENCH_FIXTURE=/path/to/fixture cargo bench --bench scan_large_bench
```

If `CIPHERSCOPE_BENCH_FIXTURE` is not set, the benchmark defaults to `../cipherscope-paper/fixture` relative to the `cipherscope` repo. The large benchmark uses fewer samples and a longer measurement window to accommodate large repos.
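
The path resolution described above amounts to an environment-variable lookup with a fallback; a sketch:

```rust
use std::path::PathBuf;

// Resolve the large-benchmark fixture root: prefer the environment
// variable, fall back to the default sibling checkout.
fn fixture_root() -> PathBuf {
    std::env::var("CIPHERSCOPE_BENCH_FIXTURE")
        .map(PathBuf::from)
        .unwrap_or_else(|_| PathBuf::from("../cipherscope-paper/fixture"))
}

fn main() {
    println!("fixture root: {}", fixture_root().display());
}
```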

## Limitations and caveats

- Results are machine- and filesystem-dependent.
- Small datasets can exaggerate overhead and reduce signal.
- OS caching can make repeated scans faster than cold-cache runs.
- The cost of writing output to the temp file is included.

## When to extend the benchmark

For larger or more realistic measurements, consider:
- Adding a larger repo checkout as an additional dataset.
- Reporting total bytes scanned to compute MB/sec (see the sketch below).
- Running explicit cold-cache tests.
- Adding a "no-output" mode for pure scanning cost.
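
As a starting point for the MB/sec idea, a sketch of totaling bytes under a root; ignore rules are omitted for brevity, so this slightly overcounts relative to a real scan:

```rust
use std::{fs, io, path::Path};

// Recursively total file sizes under a root so a scan time can be
// converted into throughput.
fn total_bytes(root: &Path) -> io::Result<u64> {
    let mut sum = 0;
    for entry in fs::read_dir(root)? {
        let path = entry?.path();
        if path.is_dir() {
            sum += total_bytes(&path)?;
        } else {
            sum += fs::metadata(&path)?.len();
        }
    }
    Ok(sum)
}

fn main() -> io::Result<()> {
    let bytes = total_bytes(Path::new("fixtures"))?;
    let secs = 0.210; // example point estimate from above
    println!("{:.2} MB/sec", bytes as f64 / 1e6 / secs);
    Ok(())
}
```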