Commit e0ddc49

feat: Add default glob patterns and support for Swift, Objective-C, Kotlin

- Add comprehensive default glob patterns for all supported languages
- Add support for Swift (.swift), Objective-C (.m, .mm, .M), and Kotlin (.kt, .kts)
- Implement glob-based file filtering to only process source files
- Update language detection to handle new file extensions
- Add --patterns CLI argument for specifying patterns file path
- Update README with new language support and performance optimizations
- Optimize file discovery by pre-filtering with glob patterns

Performance improvements:
- Only processes relevant source files, skipping docs/images/binaries
- Significant speedup on large repositories with many non-source files
- Maintains accuracy while reducing unnecessary file processing

1 parent c12454d commit e0ddc49

5 files changed, 350 additions and 59 deletions

README.md

Lines changed: 23 additions & 1 deletion
@@ -1,6 +1,6 @@
 ## cryptofind

-Fast, low-false-positive static scanner that finds third-party cryptographic libraries and call sites across Go, Java, C, C++, Rust, Python, and PHP codebases.
+Fast, low-false-positive static scanner that finds third-party cryptographic libraries and call sites across Go, Java, C, C++, Rust, Python, PHP, Swift, Objective-C, and Kotlin codebases.

 ### Install & Run

@@ -20,6 +20,7 @@ Key flags:
 - `--min-confidence 0.9`: filter low-confidence hits
 - `--threads N`: set thread pool size
 - `--max-file-size MB`: skip large files (default 2)
+- `--patterns PATH`: specify patterns file (default: `patterns.toml`)
 - `--include-glob GLOB` / `--exclude-glob GLOB`
 - `--allow LIB` / `--deny LIB`
 - `--deterministic`: stable output ordering
@@ -55,6 +56,27 @@ SARIF snippet:

 Patterns are loaded from `patterns.toml` (and optional `patterns.local.toml`, if you add it). The schema supports per-language `include`/`import`/`namespace`/`apis` anchored regexes. The engine strips comments and avoids string literals to reduce false positives.

+#### Supported Languages & File Extensions
+
+The scanner automatically detects and processes files with these extensions:
+
+- **C/C++**: `.c`, `.h`, `.cc`, `.cpp`, `.cxx`, `.c++`, `.hpp`, `.hxx`, `.h++`, `.hh`
+- **Java**: `.java`
+- **Go**: `.go`
+- **Rust**: `.rs`
+- **Python**: `.py`, `.pyw`, `.pyi`
+- **PHP**: `.php`, `.phtml`, `.php3`, `.php4`, `.php5`, `.phps`
+- **Swift**: `.swift`
+- **Objective-C**: `.m`, `.mm`, `.M`
+- **Kotlin**: `.kt`, `.kts`
+
+#### Performance Optimizations
+
+- **Default Glob Filtering**: Only processes source files, skipping documentation, images, and binaries
+- **Pattern Caching**: Compiled patterns are cached per language for faster lookups
+- **Aho-Corasick Prefiltering**: Fast substring matching before expensive regex operations
+- **Parallel Processing**: Multi-threaded file scanning using Rayon
+
 ### Extending Detectors

 Detectors are plugin-like. Add a new crate under `crates/` implementing the `Detector` trait, or extend the `patterns.toml` to cover additional libraries. See `crates/scanner-core/src/lib.rs` for the trait and pattern-driven detector.
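
Note on the README change: the scanner's actual detection code is not part of the hunks shown here, so the following is only a minimal sketch of how the extension list above could map to languages and double as a source-file pre-filter. The `Language` enum and `detect_language` function are hypothetical names, C and C++ are folded into one `CFamily` variant because the README groups them, and matching is case-sensitive on the assumption that the explicit `.M` entry implies it.

```rust
use std::path::Path;

/// Hypothetical language tag; the real scanner's type may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Language {
    CFamily,
    Java,
    Go,
    Rust,
    Python,
    Php,
    Swift,
    ObjectiveC,
    Kotlin,
}

/// Map a path to a language using the extensions listed in the README.
/// Returns `None` for anything else, so callers can skip the file early.
fn detect_language(path: &Path) -> Option<Language> {
    // `extension()` yields the text after the last dot, e.g. "kts" or "c++".
    let ext = path.extension()?.to_str()?;
    let lang = match ext {
        "c" | "h" | "cc" | "cpp" | "cxx" | "c++" | "hpp" | "hxx" | "h++" | "hh" => {
            Language::CFamily
        }
        "java" => Language::Java,
        "go" => Language::Go,
        "rs" => Language::Rust,
        "py" | "pyw" | "pyi" => Language::Python,
        "php" | "phtml" | "php3" | "php4" | "php5" | "phps" => Language::Php,
        "swift" => Language::Swift,
        // Case-sensitive: ".M" is listed explicitly alongside ".m".
        "m" | "mm" | "M" => Language::ObjectiveC,
        "kt" | "kts" => Language::Kotlin,
        _ => return None,
    };
    Some(lang)
}

fn main() {
    // Illustrative paths only; none of them come from the repository.
    let candidates = [
        "src/lib.rs",
        "App/Crypto.swift",
        "ios/Keychain.mm",
        "build.gradle.kts",
        "docs/logo.png",
        "README.md",
        "Makefile",
    ];
    for candidate in candidates {
        match detect_language(Path::new(candidate)) {
            Some(lang) => println!("scan {candidate} ({lang:?})"),
            None => println!("skip {candidate}"),
        }
    }
}
```

Filtering on the extension before opening a file is what lets documentation, images, and binaries be skipped without any I/O; the commit itself describes doing this with glob patterns rather than a `match`.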

crates/cli/src/main.rs

Lines changed: 6 additions & 2 deletions
@@ -64,6 +64,10 @@ struct Args {
     /// Dry-run: list files that would be scanned
     #[arg(long, action = ArgAction::SetTrue)]
     dry_run: bool,
+
+    /// Path to patterns file
+    #[arg(long, value_name = "FILE", default_value = "patterns.toml")]
+    patterns: PathBuf,
 }

 fn main() -> Result<()> {
@@ -75,8 +79,8 @@ fn main() -> Result<()> {
         .ok();
     }

-    // Load patterns: patterns.toml + optional patterns.local.toml
-    let base = fs::read_to_string("patterns.toml").context("read patterns.toml")?;
+    // Load patterns from specified file
+    let base = fs::read_to_string(&args.patterns).with_context(|| format!("read patterns file: {}", args.patterns.display()))?;
     let reg = PatternRegistry::load(&base)?;
     let reg = Arc::new(reg);
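
Note on the `--patterns` change: the behavior introduced here (default to `patterns.toml`, otherwise read the given path, and name the path in the error) can be illustrated without the CLI crate. The sketch below uses only the standard library and hand-rolled argument parsing as a stand-in for the derive-based `#[arg(...)]` attribute in the diff; `patterns_path_from_args` and `read_patterns` are hypothetical helpers, not functions from this repository.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper: pick the patterns path from raw CLI arguments,
/// accepting `--patterns FILE` or `--patterns=FILE`.
/// Falls back to `patterns.toml`, the default shown in the diff.
fn patterns_path_from_args<I: IntoIterator<Item = String>>(args: I) -> PathBuf {
    let mut iter = args.into_iter();
    while let Some(arg) = iter.next() {
        if arg == "--patterns" {
            if let Some(value) = iter.next() {
                return PathBuf::from(value);
            }
        } else if let Some(value) = arg.strip_prefix("--patterns=") {
            return PathBuf::from(value);
        }
    }
    PathBuf::from("patterns.toml")
}

/// Hypothetical helper: read the file and attach the path to any I/O error,
/// mirroring the intent of the `with_context` call in the diff.
fn read_patterns(path: &Path) -> io::Result<String> {
    fs::read_to_string(path).map_err(|e| {
        io::Error::new(
            e.kind(),
            format!("read patterns file: {}: {e}", path.display()),
        )
    })
}

fn main() {
    let path = patterns_path_from_args(std::env::args().skip(1));
    match read_patterns(&path) {
        Ok(text) => println!("loaded {} bytes from {}", text.len(), path.display()),
        Err(err) => eprintln!("error: {err}"),
    }
}
```

Including the path in the error matters more now that it is user-supplied: a message that only said `patterns.toml` would be misleading when a different file was requested.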

crates/cli/tests/integration.rs

Lines changed: 6 additions & 0 deletions
@@ -50,6 +50,12 @@ fn scan_fixtures() {
     let fixtures = workspace.join("fixtures");
     let findings = scanner.run(&[fixtures.clone()]).unwrap();

+    // Debug: print all findings
+    println!("Found {} findings:", findings.len());
+    for f in &findings {
+        println!(" {:?} | {} | {}:{}", f.language, f.library, f.file.display(), f.span.line);
+    }
+
     // Expect at least one hit per language category in positive fixtures
     let has_rust = findings
         .iter()
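
Note on the test change: the hunk is cut off before the per-language assertions, so it is not visible whether the new languages are checked. As a rough sketch of the "at least one hit per language" idea, the helper below reports which expected languages produced no findings. The `Language` and `Finding` types here are simplified, hypothetical stand-ins for the scanner's own types, and the library names are illustrative data only.

```rust
use std::collections::BTreeSet;

/// Simplified stand-in for the scanner's language tag (assumption).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Language {
    Rust,
    Swift,
    ObjectiveC,
    Kotlin,
}

/// Simplified stand-in for a finding; the real struct has more fields.
struct Finding {
    language: Language,
    library: String,
}

/// Return the expected languages that have no finding, in `expected` order.
fn missing_languages(findings: &[Finding], expected: &[Language]) -> Vec<Language> {
    let seen: BTreeSet<Language> = findings.iter().map(|f| f.language).collect();
    expected
        .iter()
        .copied()
        .filter(|lang| !seen.contains(lang))
        .collect()
}

fn main() {
    // Illustrative findings, not real scanner output.
    let findings = vec![
        Finding { language: Language::Rust, library: "ring".to_string() },
        Finding { language: Language::Swift, library: "CryptoKit".to_string() },
    ];
    for f in &findings {
        println!("{:?} | {}", f.language, f.library);
    }

    let expected = [
        Language::Rust,
        Language::Swift,
        Language::ObjectiveC,
        Language::Kotlin,
    ];
    let missing = missing_languages(&findings, &expected);
    println!("languages without findings: {missing:?}");
}
```

A single assertion that `missing_languages(...)` is empty would fail with the list of uncovered languages, which may make the debug `println!` loop unnecessary once the fixtures are stable.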
