Commit d713d0d

cursoragent and script3r committed
Update README and disable unsupported languages
Co-authored-by: script3r <[email protected]>
1 parent e64a691 commit d713d0d

5 files changed: +87 -3 lines changed

Cargo.toml

Lines changed: 4 additions & 0 deletions
@@ -34,4 +34,8 @@ tree-sitter-python = "0.21"
 tree-sitter-javascript = "0.21"
 tree-sitter-java = "0.21"
 tree-sitter-go = "0.21"
+# tree-sitter-php = "0.22"
+# tree-sitter-swift = "0.7"
+# tree-sitter-kotlin = "0.3"
+# Note: Additional languages disabled due to inconsistent tree-sitter APIs
 
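
The note above attributes the disabled dependencies to inconsistent grammar-crate APIs. The status document added in this commit reports, for example, that tree-sitter-swift exposes a `LANGUAGE` constant while the enabled grammars are loaded through a `language()` function. The following is a minimal sketch of how the two shapes could be bridged behind one conversion. `Language`, `LanguageFn`, and both `grammar_*` modules are stand-ins defined here for illustration, not the real `tree_sitter` types or grammar crates, and the `From` conversion is an assumption about how such a constant would be consumed.

```rust
// Stand-in for a parser language handle (hypothetical, not `tree_sitter::Language`).
#[derive(Debug, Clone, PartialEq)]
pub struct Language(pub &'static str);

// Stand-in for a constant-style export that wraps a constructor function.
pub struct LanguageFn(pub fn() -> Language);

// One conversion lets call sites treat both export styles uniformly.
impl From<LanguageFn> for Language {
    fn from(f: LanguageFn) -> Self {
        (f.0)()
    }
}

// Function-style export, the shape used in this commit for Python, Java, and Go.
mod grammar_function_style {
    use super::Language;

    pub fn language() -> Language {
        Language("python")
    }
}

// Constant-style export, the shape the status document reports for Swift.
mod grammar_constant_style {
    use super::{Language, LanguageFn};

    fn make() -> Language {
        Language("swift")
    }

    pub const LANGUAGE: LanguageFn = LanguageFn(make);
}

// Load both grammars through a single return type.
pub fn load_all() -> Vec<Language> {
    vec![
        grammar_function_style::language(),
        grammar_constant_style::LANGUAGE.into(),
    ]
}

fn main() {
    for lang in load_all() {
        println!("loaded grammar: {}", lang.0);
    }
}
```

Whether the real crates can be bridged this simply depends on the versions pinned in the workspace, which this sketch does not model.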

LANGUAGE_COVERAGE_STATUS.md

Lines changed: 74 additions & 0 deletions
@@ -0,0 +1,74 @@
+# Language Coverage Status
+
+## Currently Supported Languages (AST-based detection)
+
+**C** - 9 fixture files, OpenSSL library detection
+**C++** - 12 fixture files, OpenSSL/Botan/CryptoPP library detection
+**Rust** - 13 fixture files, Ring/RustCrypto library detection
+**Python** - 16 fixture files, Cryptography library detection
+**Java** - 13 fixture files, JCA/BouncyCastle library detection
+**Go** - 12 fixture files, std-crypto/x-crypto library detection
+
+**Total**: 37 ground truth files generated for supported languages
+
+## Languages with Fixtures but No AST Support Yet
+
+**PHP** - 9 fixture files (OpenSSL/Sodium libraries)
+**Swift** - 9 fixture files (CryptoKit/CommonCrypto libraries)
+**Kotlin** - 5 fixture files (JCA library)
+**Objective-C** - 9 fixture files (CommonCrypto/OpenSSL libraries)
+**Erlang** - 5 fixture files (OTP-crypto library)
+
+**Total**: 37 fixture files without AST support
+
+## Why Some Languages Aren't Supported Yet
+
+The missing languages have inconsistent or incompatible tree-sitter parser APIs:
+
+- **tree-sitter-php**: Uses different function naming convention
+- **tree-sitter-swift**: Uses `LANGUAGE` constant instead of `language()` function
+- **tree-sitter-kotlin**: Different API structure
+- **Objective-C**: No stable tree-sitter parser available
+- **Erlang**: No stable tree-sitter parser available
+
+## Detection Quality by Language
+
+### High Quality Detection
+- **C/C++**: Detects include statements and function calls accurately
+- **Python**: Detects import statements and algorithm usage
+- **Java**: Detects import statements for crypto APIs
+- **Go**: Detects import statements for crypto packages
+
+### Needs Refinement
+- **Rust**: Currently generates many false positives (51 findings for a simple file)
+  - AST patterns are too broad and match every identifier
+  - Need more specific patterns for crypto-specific usage
+
+## How to Add New Language Support
+
+1. **Add tree-sitter parser dependency** to Cargo.toml
+2. **Add parser initialization** in `AstDetector::new()`
+3. **Add language matching** in `find_matches()` method
+4. **Define AST patterns** for the language in `default_patterns()` or patterns.toml
+5. **Add detector** in CLI main.rs
+6. **Test and validate** with existing fixtures
+
+## Future Improvements
+
+1. **Refine Rust patterns** to reduce false positives
+2. **Add support for missing languages** with stable tree-sitter parsers
+3. **Enhance algorithm detection** with parameter extraction
+4. **Add more library patterns** for comprehensive coverage
+5. **Consider fallback regex detection** for languages without AST support
+
+## Current Tool Usage
+
+The tool currently works well for the 6 supported languages:
+
+```bash
+./target/release/cipherscope fixtures/python/cryptography/ --patterns patterns.toml
+./target/release/cipherscope fixtures/c/openssl/ --patterns patterns.toml
+./target/release/cipherscope fixtures/java/jca/ --patterns patterns.toml
+```
+
+For unsupported languages, the tool will simply skip the files (no errors, just no output).
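
The "Needs Refinement" entry in the file above says the Rust patterns match every identifier and need to be narrowed to crypto-specific usage. The sketch below illustrates that difference under simplifying assumptions: it works on raw text rather than a tree-sitter AST, and `CRYPTO_CRATE_ROOTS`, `broad_matches`, and `narrow_matches` are hypothetical names that do not appear in the repository.

```rust
// Hypothetical allow-list of crate roots treated as crypto libraries.
const CRYPTO_CRATE_ROOTS: [&str; 3] = ["ring", "sha2", "aes"];

/// Broad rule: report every identifier-like token, keywords included.
/// This mimics a pattern that matches any identifier node.
pub fn broad_matches(source: &str) -> Vec<String> {
    source
        .split(|c: char| !(c.is_alphanumeric() || c == '_'))
        .filter(|tok| {
            tok.chars()
                .next()
                .map_or(false, |c| c.is_alphabetic() || c == '_')
        })
        .map(str::to_string)
        .collect()
}

/// Narrow rule: report only the root of a `use` path, and only when that
/// root is in the allow-list.
pub fn narrow_matches(source: &str) -> Vec<String> {
    source
        .lines()
        .filter_map(|line| line.trim_start().strip_prefix("use "))
        .filter_map(|path| path.split("::").next())
        .map(|root| root.trim_end_matches(';').trim())
        .filter(|root| CRYPTO_CRATE_ROOTS.contains(root))
        .map(str::to_string)
        .collect()
}

fn main() {
    let source = "use ring::digest;\nuse std::fmt;\nfn f() { let total = 1 + 2; }\n";
    println!("broad rule:  {} findings", broad_matches(source).len());
    println!("narrow rule: {:?}", narrow_matches(source));
}
```

An AST-based version would presumably express the same restriction as a query over `use` declarations instead of string prefixes; the allow-list idea carries over either way.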

README.md

Lines changed: 3 additions & 1 deletion
@@ -45,7 +45,9 @@ cargo build --release
 
 ## Languages Supported
 
-C, C++, Go, Java, Python, Rust (AST-based detection)
+**AST-based detection**: C, C++, Go, Java, Python, Rust
+
+**Note**: Additional languages like PHP, Swift, Kotlin, Objective-C, and Erlang have fixture files but are not yet supported by AST-based detection due to tree-sitter parser compatibility issues. These can be added in future versions.
 
 ## How It Works (High-Level)

crates/scanner-core/Cargo.toml

Lines changed: 3 additions & 0 deletions
@@ -24,6 +24,9 @@ tree-sitter-python = { workspace = true }
 tree-sitter-javascript = { workspace = true }
 tree-sitter-java = { workspace = true }
 tree-sitter-go = { workspace = true }
+# tree-sitter-php = { workspace = true }
+# tree-sitter-swift = { workspace = true }
+# tree-sitter-kotlin = { workspace = true }
 
 [dev-dependencies]
 criterion = "0.5"

crates/scanner-core/src/ast.rs

Lines changed: 3 additions & 2 deletions
@@ -55,7 +55,7 @@ impl AstDetector {
         parsers.insert(ScanLanguage::Python, Self::create_parser(tree_sitter_python::language())?);
         parsers.insert(ScanLanguage::Java, Self::create_parser(tree_sitter_java::language())?);
         parsers.insert(ScanLanguage::Go, Self::create_parser(tree_sitter_go::language())?);
-        // Note: JavaScript parser can be used for basic JS-like syntax if needed
+        // Note: PHP, Swift, Kotlin, Objective-C and Erlang disabled due to inconsistent tree-sitter APIs
 
         Ok(Self {
             parsers,
@@ -297,6 +297,7 @@ impl AstDetector {
                 match_type: AstMatchType::Library { name: "std-crypto".to_string() },
                 metadata: HashMap::new(),
             },
+
         ]
     }
 
@@ -321,7 +322,7 @@ impl AstDetector {
             ScanLanguage::Python => tree_sitter_python::language(),
             ScanLanguage::Java => tree_sitter_java::language(),
             ScanLanguage::Go => tree_sitter_go::language(),
-            _ => return Ok(matches), // Skip unsupported languages
+            _ => return Ok(matches), // Skip unsupported languages (PHP, Swift, Kotlin, ObjC, Erlang)
         };
 
         // Execute each pattern that matches this language
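
The last hunk keeps the catch-all arm that returns the still-empty `matches` for any language without a grammar, which is why unsupported files yield no output and no error. A minimal sketch of that control flow follows. It assumes a reduced `ScanLanguage` enum (the `Php` and `Swift` variants are hypothetical here), a simplified `find_matches` signature, and plain prefix matching standing in for the tree-sitter queries the real method would run.

```rust
// Reduced stand-in for the scanner's language enum; variants are illustrative.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ScanLanguage {
    Python,
    Java,
    Go,
    Php,
    Swift,
}

/// Simplified stand-in for `find_matches()`: supported languages are scanned,
/// everything else falls through to an empty, successful result.
pub fn find_matches(lang: ScanLanguage, source: &str) -> Result<Vec<String>, String> {
    let mut matches = Vec::new();

    // Stand-in for grammar selection: pick the import keyword per language.
    let import_prefix = match lang {
        ScanLanguage::Python | ScanLanguage::Java | ScanLanguage::Go => "import ",
        _ => return Ok(matches), // Skip unsupported languages
    };

    // Stand-in for pattern execution: collect lines that start with the keyword.
    for line in source.lines() {
        let trimmed = line.trim_start();
        if trimmed.starts_with(import_prefix) {
            matches.push(trimmed.to_string());
        }
    }
    Ok(matches)
}

fn main() {
    let python = find_matches(ScanLanguage::Python, "import hashlib\nx = 1\n");
    let php = find_matches(ScanLanguage::Php, "<?php openssl_encrypt();\n");
    println!("python: {:?}", python);
    println!("php:    {:?}", php);
}
```

With this shape, a caller cannot tell "scanned, nothing found" apart from "not scanned" by looking at the result alone; the diff does not show whether the surrounding code reports skipped files separately.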
