Commit dce091c

Pattern improvements
1 parent 1bfbc85 commit dce091c

3 files changed: +2024 -113 lines changed
README.md

Lines changed: 23 additions & 1 deletion
@@ -10,7 +10,7 @@ Fast cryptographic inventory generator. Scans codebases to identify cryptographi
 
 ```bash
 cargo build --release
-./target/release/cipherscope /path/to/scan
+./target/release/cipherscope --patterns patterns.toml --progress /path/to/scan [... paths]
 ```
 
 ## What It Does
@@ -51,6 +51,28 @@ C, C++, Go, Java, Kotlin, Python, Rust, Swift, Objective-C, PHP, Erlang
 
 Edit `patterns.toml` to add new libraries or algorithms. No code changes needed.
 
+## How It Works (High-Level)
+
+1. Workspace discovery and prefilter
+   - Walks files respecting .gitignore
+   - Cheap Aho-Corasick prefilter using language-specific substrings derived from patterns
+2. Language detection and comment stripping
+   - Detects language by extension; strips comments once for fast regex matching
+3. Library identification (anchors)
+   - Per-language detector loads compiled patterns for that language (from `patterns.toml`)
+   - Looks for include/import/namespace/API anchors to confirm a library is present in a file
+4. Algorithm matching
+   - For each identified library, matches algorithm `symbol_patterns` (regex) against the file
+   - Extracts parameters via `parameter_patterns` (e.g., key size, curve) with defaults when absent
+   - Emits findings with file, line/column, library, algorithm, primitive, and NIST quantum level
+5. Deep static analysis (fallback/enrichment)
+   - For small scans, analyzes files directly with the registry to find additional algorithms even if no library finding was produced
+6. CBOM generation
+   - Findings are deduplicated and merged
+   - Final MV-CBOM JSON is printed or written per CLI options
+
+All behavior is driven by `patterns.toml` — adding new libraries/algorithms is a data-only change.
+
 ## Testing
 
 ```bash

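Reviewer sketch of steps 2 to 4 and the deduplication in step 6 from the new README section. It is a minimal illustration under stated assumptions, not the code in this commit. Every type and function name below is hypothetical. Plain substring search stands in for both the Aho-Corasick prefilter and the regex `symbol_patterns` so that the block needs only the standard library. Parameter extraction is reduced to "first integer after the symbol". Step 5 (the deep-analysis fallback) is not modeled.

```rust
use std::collections::BTreeSet;

/// Hypothetical, simplified stand-in for one algorithm entry in `patterns.toml`.
struct AlgorithmPattern {
    name: &'static str,
    symbol: &'static str,  // substring standing in for a `symbol_patterns` regex
    default_key_size: u32, // used when no parameter can be extracted
}

/// Hypothetical, simplified stand-in for one library entry in `patterns.toml`.
struct LibraryPattern {
    name: &'static str,
    anchors: Vec<&'static str>, // include/import substrings
    algorithms: Vec<AlgorithmPattern>,
}

#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
struct Finding {
    library: String,
    algorithm: String,
    line: usize, // 1-based
    key_size: u32,
}

/// Step 2 (simplified): drop everything after `//` on each line.
/// Block comments and `//` inside string literals are not handled here.
/// Line count is preserved so reported line numbers stay valid.
fn strip_line_comments(src: &str) -> String {
    src.lines()
        .map(|l| l.split("//").next().unwrap_or(""))
        .collect::<Vec<_>>()
        .join("\n")
}

/// Parameter extraction (simplified): first run of ASCII digits in `text`.
fn first_number(text: &str) -> Option<u32> {
    let digits: String = text
        .chars()
        .skip_while(|c| !c.is_ascii_digit())
        .take_while(|c| c.is_ascii_digit())
        .collect();
    digits.parse().ok()
}

/// Steps 3, 4 and the dedup part of step 6 for a single file.
fn scan(src: &str, libs: &[LibraryPattern]) -> Vec<Finding> {
    let code = strip_line_comments(src);
    let mut found = BTreeSet::new(); // identical findings collapse into one

    for lib in libs {
        // Step 3: only match algorithms for libraries anchored in this file.
        if !lib.anchors.iter().any(|a| code.contains(a)) {
            continue;
        }
        // Step 4: look for each algorithm symbol, line by line.
        for alg in &lib.algorithms {
            for (idx, line) in code.lines().enumerate() {
                if let Some(pos) = line.find(alg.symbol) {
                    let rest = &line[pos + alg.symbol.len()..];
                    found.insert(Finding {
                        library: lib.name.to_string(),
                        algorithm: alg.name.to_string(),
                        line: idx + 1,
                        key_size: first_number(rest).unwrap_or(alg.default_key_size),
                    });
                }
            }
        }
    }
    found.into_iter().collect()
}
```

With a hypothetical OpenSSL entry anchored on `#include <openssl/` and an `RSA_generate_key` symbol, calling `scan(source, &libs)` on a C file reports one finding per matching line, skips calls that appear only in `//` comments, and returns nothing for a file that never includes the library. Gating on the anchor first is what keeps the per-algorithm matching cheap, which is the design the README list describes.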
crates/cbom-generator/src/algorithm_detector.rs

Lines changed: 1 addition & 1 deletion
@@ -279,7 +279,7 @@ impl AlgorithmDetector {
 if matches!(
     ext,
     // Existing languages
-    "rs" | "java" | "go" | "py" | "c" | "cpp" | "swift" | "js" | "php" | "m" | "mm"
+    "rs" | "java" | "go" | "py" | "c" | "cpp" | "cxx" | "cc" | "hpp" | "hxx" | "swift" | "js" | "php" | "m" | "mm"
     // Added: Kotlin and Erlang
     | "kt" | "kts" | "erl"
 ) {

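Reviewer sketch of the extension check after this change, pulled out into a standalone function so the new C++ extensions can be exercised in isolation. The function names and the `Path` wrapper are hypothetical. Only the list of extensions is taken from the hunk above, and the surrounding `AlgorithmDetector` logic is not shown in this diff.

```rust
use std::path::Path;

/// True when `ext` (no leading dot) appears in the `matches!` list from the diff.
/// Matching is case-sensitive, as it is for string patterns in `matches!`.
fn is_scannable_extension(ext: &str) -> bool {
    matches!(
        ext,
        // Existing languages, including the C++ variants added in this commit
        "rs" | "java" | "go" | "py" | "c" | "cpp" | "cxx" | "cc" | "hpp" | "hxx"
            | "swift" | "js" | "php" | "m" | "mm"
            // Kotlin and Erlang
            | "kt" | "kts" | "erl"
    )
}

/// Hypothetical convenience wrapper: extract the extension from a path, then check it.
/// Paths without an extension, or with a non-UTF-8 one, are rejected.
fn is_scannable_path(path: &Path) -> bool {
    path.extension()
        .and_then(|e| e.to_str())
        .map(is_scannable_extension)
        .unwrap_or(false)
}
```

For example, `is_scannable_path(Path::new("src/aes.cc"))` is true after this commit, while a plain `.h` header is still not in the list as shown, so C headers would not pass this particular check.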