Commit 480cbc8

Now CBOMs, Readme
1 parent aa9198a commit 480cbc8

107 files changed: +259 -1078 lines changed

README.md

Lines changed: 28 additions & 137 deletions
@@ -1,171 +1,62 @@
-## CipherScope
+# CipherScope

 <div align="center">
   <img src="cipherscope.png" alt="CipherScope Logo" width="350" height="350">
 </div>

-**Cryptographic Bill of Materials (MV-CBOM) Generator** for Post-Quantum Cryptography (PQC) readiness assessment.
+Fast cryptographic inventory generator. Scans codebases to identify cryptographic algorithms and assess quantum resistance.

-Analyzes codebases across 11 programming languages (Go, Java, C, C++, Rust, Python, PHP, Swift, Objective-C, Kotlin, Erlang) and generates machine-readable JSON inventories of cryptographic assets with NIST quantum security levels.
-
-### Install & Run
+## Quick Start

 ```bash
 cargo build --release
-
-# Generate MV-CBOM for current directory
-./target/release/cipherscope .
-
-# Generate MV-CBOMs recursively for all discovered projects
-./target/release/cipherscope . --recursive
+./target/release/cipherscope /path/to/scan
 ```

-Key flags:
-- `--recursive`: generate MV-CBOMs recursively for all discovered projects
-- `--threads N`: set thread pool size
-- `--max-file-size MB`: skip large files (default 2)
-- `--patterns PATH`: specify patterns file (default: `patterns.toml`)
-- `--progress`: show progress bar during scanning
-- `--print-config`: print loaded `patterns.toml`
-
-### Output
-
-**MV-CBOM JSON files** written to each project directory for comprehensive Post-Quantum Cryptography (PQC) readiness assessment.
-
-#### MV-CBOM (Minimal Viable Cryptographic Bill of Materials)
-
-CipherScope generates a comprehensive cryptographic inventory in JSON format that follows the MV-CBOM specification. This enables:
+## What It Does

-- **Post-Quantum Cryptography (PQC) Risk Assessment**: Identifies algorithms vulnerable to quantum attacks (NIST Quantum Security Level 0)
-- **Crypto-Agility Planning**: Provides detailed algorithm parameters and usage patterns
-- **Supply Chain Security**: Maps dependencies between components and cryptographic assets
+- **Detects** cryptographic usage across 11 languages
+- **Identifies** many cryptographic algorithms (AES, SHA, RSA, ECDSA, ChaCha20, etc.)
+- **Outputs** JSON inventory with NIST quantum security levels
+- **Runs fast** - GiB/s throughput with parallel scanning

-The MV-CBOM includes:
-- **Cryptographic Assets**: Algorithms, certificates, and related crypto material with NIST security levels
-- **Dependency Relationships**: Distinguishes between "uses" (actively called) vs "implements" (available but unused)
-- **Parameter Extraction**: Key sizes, curves, and other algorithm-specific parameters
-- **Recursive Project Discovery**: Automatically discovers and analyzes nested projects (BUCK, Bazel, Maven modules, etc.)
+## Example Output

-Example MV-CBOM snippet:
 ```json
 {
   "bomFormat": "MV-CBOM",
   "specVersion": "1.0",
-  "cryptoAssets": [
-    {
-      "bom-ref": "uuid-1234",
-      "assetType": "algorithm",
-      "name": "RSA",
-      "assetProperties": {
-        "primitive": "signature",
-        "parameterSet": {"keySize": 2048},
-        "nistQuantumSecurityLevel": 0
-      }
+  "cryptoAssets": [{
+    "name": "RSA",
+    "assetProperties": {
+      "primitive": "signature",
+      "parameterSet": {"keySize": 2048},
+      "nistQuantumSecurityLevel": 0
     }
-  ],
-  "dependencies": [
-    {
-      "ref": "main-component",
-      "dependsOn": ["uuid-1234"],
-      "dependencyType": "uses"
-    }
-  ]
+  }]
 }
 ```

-### Configuration
-
-Algorithm and library detection patterns are defined in `patterns.toml`. The schema supports:
-- **Library Detection**: `include`/`import`/`namespace`/`apis` patterns per language
-- **Algorithm Definitions**: Each library defines supported algorithms with NIST quantum security levels
-- **Parameter Extraction**: Patterns for extracting key sizes, curves, and algorithm parameters
-
-**Supported Languages**: C, C++, Java, Go, Rust, Python, PHP, Swift, Objective-C, Kotlin, Erlang
-
-#### High-Performance Architecture
-
-- **Parallel Processing**: Producer-consumer model with `rayon` thread pools
-- **Smart Filtering**: Respects `.gitignore`, early language detection, Aho-Corasick prefiltering
-- **Scalable**: 4+ GiB/s throughput, linear scaling with CPU cores
-
-### Architecture
-
-**Modular MV-CBOM Generation Pipeline**:
-1. **Project Discovery**: Recursive scanning for project files (BUILD, pom.xml, Cargo.toml, etc.)
-2. **Static Analysis**: Pattern-driven cryptographic library detection
-3. **Algorithm Detection**: Extract algorithms and parameters using `patterns.toml` definitions
-4. **Certificate Parsing**: X.509 certificate analysis with signature algorithms
-5. **Dependency Analysis**: "Uses" vs "implements" relationship detection
-6. **CBOM Generation**: Standards-compliant JSON with NIST quantum security levels
-
-**Key Innovation**: Algorithm detection moved from hardcoded Rust to configurable `patterns.toml` - new algorithms added by editing patterns, not code.
-
-### Tests & Benchmarks
-
-Run unit tests and integration tests (fixtures):
-
-```bash
-cargo test
-```
-
-#### Comprehensive Fixtures for MV-CBOM Testing
-
-The `fixtures/` directory contains rich, realistic examples for testing MV-CBOM generation across multiple languages and build systems:
+## Options

-**Rust Fixtures:**
-- **`rust/rsa-vulnerable`**: RSA 2048-bit usage (PQC vulnerable, "uses" relationship)
-- **`rust/aes-gcm-safe`**: Quantum-safe algorithms (AES-256-GCM, ChaCha20Poly1305, SHA-3, BLAKE3)
-- **`rust/implements-vs-uses`**: SHA2 "uses" vs P256 "implements" distinction
-- **`rust/mixed-crypto`**: Complex multi-algorithm project (RSA, AES, SHA2, Ed25519, Ring)
+- `--patterns PATH` - Custom patterns file (default: `patterns.toml`)
+- `--progress` - Show progress bar
+- `--deterministic` - Reproducible output for testing

-**Java Fixtures:**
-- **`java/maven-bouncycastle`**: Maven project with BouncyCastle RSA/ECDSA
-- **`java/bazel-tink`**: Bazel project with Google Tink and BouncyCastle
-- **`java/jca-standard`**: Standard JCA/JCE without external dependencies
+## Languages Supported

-**C/C++ Fixtures:**
-- **`c/openssl-mixed`**: OpenSSL + libsodium with RSA, ChaCha20Poly1305, AES
-- **`c/libsodium-modern`**: Modern libsodium with quantum-safe and vulnerable algorithms
-- **`c/makefile-crypto`**: Basic OpenSSL usage with Makefile dependency detection
-- **`cpp/botan-modern`**: Botan library with RSA, AES-GCM, SHA-3, BLAKE2b
-- **`cpp/cryptopp-legacy`**: Crypto++ library with RSA, AES-GCM, SHA-256/512
+C, C++, Go, Java, Kotlin, Python, Rust, Swift, Objective-C, PHP, Erlang

-**Go Fixtures:**
-- **`go/stdlib-crypto`**: Standard library crypto (RSA, ECDSA, AES-GCM, SHA-256/512)
-- **`go/x-crypto-extended`**: Extended crypto with golang.org/x/crypto dependencies
+## Configuration

-**Python Fixtures:**
-- **`python/cryptography-mixed`**: PyCA Cryptography with RSA, AES, PBKDF2
-- **`python/pycryptodome-legacy`**: PyCryptodome with RSA signatures and AES
-- **`python/requirements-basic`**: Basic requirements.txt with Fernet and hashing
+Edit `patterns.toml` to add new libraries or algorithms. No code changes needed.

-**Certificate Fixtures:**
-- **`certificates/x509-rsa-ecdsa`**: X.509 certificates with RSA and ECDSA signatures
-
-Run fixture tests:
-```bash
-# Test RSA vulnerability detection
-./target/release/cipherscope fixtures/rust/rsa-vulnerable
-jq '.cryptoAssets[] | select(.assetProperties.nistQuantumSecurityLevel == 0)' fixtures/rust/rsa-vulnerable/mv-cbom.json
-
-# Test multi-language support
-./target/release/cipherscope fixtures/java/maven-bouncycastle
-./target/release/cipherscope fixtures/go/stdlib-crypto
-./target/release/cipherscope fixtures/python/cryptography-mixed
-
-# Test recursive project discovery
-./target/release/cipherscope fixtures/buck-nested --recursive
-./target/release/cipherscope fixtures/bazel-nested --recursive
-```
-
-Benchmark performance:
+## Testing

 ```bash
 cargo test
-cargo bench
 ```

-### Contributing
-
-See `CONTRIBUTING.md` for guidelines on adding languages, libraries, and improving performance.
+## License

+MIT

crates/cbom-generator/src/algorithm_detector.rs

Lines changed: 2 additions & 5 deletions
@@ -107,7 +107,7 @@ impl AlgorithmDetector {
                 .symbol_patterns
                 .iter()
                 .any(|pattern| pattern.is_match(&finding.snippet));
-
+
             if symbol_match || snippet_match {
                 // Extract parameters from the finding
                 let parameters = self.extract_parameters_from_finding(finding, algorithm)?;
@@ -368,10 +368,7 @@ impl AlgorithmDetector {
             AssetProperties::Algorithm(props) => {
                 // Deduplicate by algorithm name, primitive, and source library to avoid merging
                 // different libraries' detections of the same algorithm (e.g., OpenSSL vs CommonCrypto).
-                let library = asset
-                    .source_library
-                    .as_deref()
-                    .unwrap_or("unknown-library");
+                let library = asset.source_library.as_deref().unwrap_or("unknown-library");
                 format!(
                     "{}:{}:{}",
                     asset.name.as_deref().unwrap_or("unknown"),
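
The second hunk only reflows the `library` binding; the deduplication key is unchanged. A minimal sketch of that keying idea follows, assuming a simplified, hypothetical `Asset` struct (the real `CryptoAsset` has more fields, and the primitive is shown here as a plain string). The `deduplicate` helper is also an assumption, added to show how such a key might be used.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified stand-in for the crate's asset type.
#[derive(Debug, Clone)]
pub struct Asset {
    pub name: Option<String>,
    pub primitive: String,
    pub source_library: Option<String>,
}

/// Build a "name:primitive:library" key, falling back to placeholders
/// when the name or the library is missing.
pub fn dedup_key(asset: &Asset) -> String {
    let library = asset.source_library.as_deref().unwrap_or("unknown-library");
    format!(
        "{}:{}:{}",
        asset.name.as_deref().unwrap_or("unknown"),
        asset.primitive,
        library
    )
}

/// Keep the first asset seen for each key, preserving input order.
pub fn deduplicate(assets: Vec<Asset>) -> Vec<Asset> {
    let mut seen = HashSet::new();
    assets
        .into_iter()
        .filter(|a| seen.insert(dedup_key(a)))
        .collect()
}
```

Because the library is part of the key, the same algorithm detected through OpenSSL and through CommonCrypto stays as two separate entries, while repeated detections within one library collapse to one.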

crates/cbom-generator/src/lib.rs

Lines changed: 2 additions & 26 deletions
@@ -19,7 +19,7 @@ pub mod certificate_parser;

 use algorithm_detector::AlgorithmDetector;
 use certificate_parser::CertificateParser;
-
+
 /// The main MV-CBOM document structure
 #[derive(Debug, Clone, Serialize, Deserialize)]
 pub struct MvCbom {
@@ -38,9 +38,6 @@ pub struct MvCbom {

     #[serde(rename = "cryptoAssets")]
     pub crypto_assets: Vec<CryptoAsset>,
-
-    #[serde(skip_serializing_if = "Vec::is_empty", default)]
-    pub libraries: Vec<LibrarySummary>,
 }

 /// Metadata about the BOM's creation
@@ -149,12 +146,6 @@ pub struct AssetEvidence {
     pub column: usize,
 }

-#[derive(Debug, Clone, Serialize, Deserialize)]
-pub struct LibrarySummary {
-    pub name: String,
-    pub count: usize,
-}
-
 /// Classification of cryptographic primitives
 #[derive(Debug, Clone, Copy, Serialize, Deserialize)]
 #[serde(rename_all = "lowercase")]
@@ -228,18 +219,6 @@ impl CbomGenerator {
         crypto_assets.extend(algorithms);
         crypto_assets.extend(certificates);

-        let mut lib_counts: std::collections::BTreeMap<String, usize> =
-            std::collections::BTreeMap::new();
-        for asset in &crypto_assets {
-            if let Some(ref lib) = asset.source_library {
-                *lib_counts.entry(lib.clone()).or_insert(0) += 1;
-            }
-        }
-        let libraries: Vec<LibrarySummary> = lib_counts
-            .into_iter()
-            .map(|(name, count)| LibrarySummary { name, count })
-            .collect();
-
         let cbom = MvCbom {
             bom_format: "MV-CBOM".to_string(),
             spec_version: "1.0".to_string(),
@@ -265,7 +244,6 @@ impl CbomGenerator {
                 }],
             },
             crypto_assets: { crypto_assets },
-            libraries,
         };

         Ok(cbom)
@@ -282,8 +260,7 @@ impl CbomGenerator {
             .with_context(|| format!("Failed to canonicalize path: {}", scan_path.display()))?;

         // Project discovery removed; just generate one CBOM for the root
-        let mut cboms = Vec::new();
-        cboms.push((scan_path.clone(), self.generate_cbom(&scan_path, findings)?));
+        let cboms = vec![(scan_path.clone(), self.generate_cbom(&scan_path, findings)?)];
         Ok(cboms)
     }

@@ -338,7 +315,6 @@ mod tests {
                 }],
             },
             crypto_assets: vec![],
-            libraries: vec![],
         };

         let json = serde_json::to_string_pretty(&cbom).unwrap();
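
With the `libraries` summary removed from `MvCbom`, a consumer that still wants per-library counts could derive them from the asset list. The sketch below mirrors the deleted aggregation using only the standard library; the `Asset` struct is a hypothetical reduction that keeps just `source_library`, and `library_counts` is an illustrative name rather than part of the crate.

```rust
use std::collections::BTreeMap;

/// Hypothetical reduction of an asset to the one field the summary needs.
pub struct Asset {
    pub source_library: Option<String>,
}

/// Count assets per source library. Assets without a library are skipped,
/// and the BTreeMap keeps library names in sorted order.
pub fn library_counts(assets: &[Asset]) -> BTreeMap<String, usize> {
    let mut counts = BTreeMap::new();
    for asset in assets {
        if let Some(lib) = &asset.source_library {
            *counts.entry(lib.clone()).or_insert(0) += 1;
        }
    }
    counts
}
```

Sorted keys give a stable ordering, which matters when generated JSON is compared against checked-in fixtures.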

crates/cli/tests/ground_truth.rs

Lines changed: 9 additions & 3 deletions
@@ -20,7 +20,7 @@ fn normalize(v: &mut Value) {
         for a in assets.iter_mut() {
             if let Some(obj) = a.as_object_mut() {
                 obj.remove("bom-ref");
-
+
                 // Normalize file paths in evidence to be relative
                 if let Some(Value::Object(evidence)) = obj.get_mut("evidence") {
                     if let Some(Value::String(file_path)) = evidence.get_mut("file") {
@@ -37,10 +37,16 @@ fn normalize(v: &mut Value) {
         // Sort assets by name, sourceLibrary, then assetType for stable comparisons
         assets.sort_by(|a, b| {
             let an = a.get("name").and_then(|x| x.as_str()).unwrap_or("");
-            let as_ = a.get("sourceLibrary").and_then(|x| x.as_str()).unwrap_or("");
+            let as_ = a
+                .get("sourceLibrary")
+                .and_then(|x| x.as_str())
+                .unwrap_or("");
             let at = a.get("assetType").and_then(|x| x.as_str()).unwrap_or("");
             let bn = b.get("name").and_then(|x| x.as_str()).unwrap_or("");
-            let bs = b.get("sourceLibrary").and_then(|x| x.as_str()).unwrap_or("");
+            let bs = b
+                .get("sourceLibrary")
+                .and_then(|x| x.as_str())
+                .unwrap_or("");
             let bt = b.get("assetType").and_then(|x| x.as_str()).unwrap_or("");
             (an, as_, at).cmp(&(bn, bs, bt))
         });
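
This reformatting leaves the sort key unchanged: assets are compared as a `(name, sourceLibrary, assetType)` tuple, with missing fields treated as empty strings. A std-only sketch of that ordering follows, using a hypothetical `Row` struct in place of `serde_json::Value`; `key` and `sort_rows` are illustrative names.

```rust
/// Hypothetical flattened view of one asset; the real test reads these
/// fields out of a JSON value instead.
#[derive(Debug, Clone, PartialEq)]
pub struct Row {
    pub name: Option<String>,
    pub source_library: Option<String>,
    pub asset_type: Option<String>,
}

/// Tuple sort key; a missing field sorts as the empty string.
fn key(r: &Row) -> (&str, &str, &str) {
    (
        r.name.as_deref().unwrap_or(""),
        r.source_library.as_deref().unwrap_or(""),
        r.asset_type.as_deref().unwrap_or(""),
    )
}

/// Sort in place by name, then source library, then asset type.
pub fn sort_rows(rows: &mut [Row]) {
    rows.sort_by(|a, b| key(a).cmp(&key(b)));
}
```

Tuples of `&str` compare lexicographically field by field, so the later fields act only as tie-breakers.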

crates/scanner-core/src/lib.rs

Lines changed: 6 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -468,8 +468,7 @@ fn derive_prefilter_substrings(p: &LibraryPatterns) -> Vec<String> {
468468
let mut push_tokens = |s: &str| {
469469
// Remove common regex anchors that pollute tokens
470470
let cleaned = s.replace("\\b", "");
471-
for tok in cleaned
472-
.split(|c: char| !c.is_alphanumeric() && c != '.' && c != '/' && c != '_')
471+
for tok in cleaned.split(|c: char| !c.is_alphanumeric() && c != '.' && c != '/' && c != '_')
473472
{
474473
let t = tok.trim();
475474
if t.len() >= 4 {
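
The hunk above shows only the start of the token loop. As a rough illustration of the splitting rule, the sketch below applies the same anchor stripping, separator predicate, and minimum length of 4. What the real function does with each token is not visible in the diff, so collecting into a sorted, de-duplicated `Vec` is an assumption, as is the name `prefilter_tokens`.

```rust
/// Split a regex-like pattern into literal-looking tokens that could feed a
/// substring prefilter. The sorted, de-duplicated Vec is an assumption; the
/// diff does not show how the real code stores the tokens.
pub fn prefilter_tokens(pattern: &str) -> Vec<String> {
    // Drop word-boundary anchors so "\b" does not leave a stray 'b' on a token.
    let cleaned = pattern.replace("\\b", "");
    let mut out: Vec<String> = cleaned
        // Keep alphanumerics plus '.', '/' and '_'; everything else separates tokens.
        .split(|c: char| !c.is_alphanumeric() && c != '.' && c != '/' && c != '_')
        .map(str::trim)
        // Very short tokens are skipped, matching the `>= 4` check in the hunk.
        .filter(|t| t.len() >= 4)
        .map(str::to_string)
        .collect();
    out.sort();
    out.dedup();
    out
}
```

For a hypothetical pattern such as `\bEVP_EncryptInit_ex\s*\(`, only `EVP_EncryptInit_ex` survives: the escape sequences split into fragments shorter than four characters.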
@@ -1430,7 +1429,11 @@ impl PatternDetector {
         // Require anchor only if patterns define any; always require at least one API hit
         let has_anchor_patterns =
             !lib.include.is_empty() || !lib.import.is_empty() || !lib.namespace.is_empty();
-        let anchor_satisfied = if has_anchor_patterns { matched_import } else { true };
+        let anchor_satisfied = if has_anchor_patterns {
+            matched_import
+        } else {
+            true
+        };
         let should_report = anchor_satisfied && api_hits > 0;
         if should_report {
             let finding = Finding {
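
The reporting rule in this hunk reduces to a small boolean function. The sketch below isolates it for illustration; the `LibraryPatterns` struct is a hypothetical subset containing only the three anchor lists referenced in the hunk, and `should_report` as a free function is an assumption (in the source it is a local variable).

```rust
/// Hypothetical subset of a library's pattern lists (anchor patterns only).
pub struct LibraryPatterns {
    pub include: Vec<String>,
    pub import: Vec<String>,
    pub namespace: Vec<String>,
}

/// Report a library only if it has at least one API hit and, when any
/// anchor patterns are defined, an anchor (include/import/namespace) matched.
pub fn should_report(lib: &LibraryPatterns, matched_import: bool, api_hits: usize) -> bool {
    let has_anchor_patterns =
        !lib.include.is_empty() || !lib.import.is_empty() || !lib.namespace.is_empty();
    // Equivalent to `if has_anchor_patterns { matched_import } else { true }`.
    let anchor_satisfied = !has_anchor_patterns || matched_import;
    anchor_satisfied && api_hits > 0
}
```

A library with no anchor patterns is therefore reported on API hits alone, while one with anchors needs both an anchor match and an API hit.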

fixtures/c/libsodium/aes-gcm/mv-cbom.json

Lines changed: 0 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -30,11 +30,5 @@
         "column": 23
       }
     }
-  ],
-  "libraries": [
-    {
-      "name": "libsodium",
-      "count": 1
-    }
   ]
 }

fixtures/c/libsodium/hmac-sha256/mv-cbom.json

Lines changed: 0 additions & 6 deletions
@@ -30,11 +30,5 @@
         "column": 23
       }
     }
-  ],
-  "libraries": [
-    {
-      "name": "libsodium",
-      "count": 1
-    }
   ]
 }

fixtures/c/libsodium/rsa-sign/mv-cbom.json

Lines changed: 0 additions & 6 deletions
@@ -30,11 +30,5 @@
         "column": 5
       }
     }
-  ],
-  "libraries": [
-    {
-      "name": "libsodium",
-      "count": 1
-    }
   ]
 }

fixtures/c/libsodium/sha256/mv-cbom.json

Lines changed: 0 additions & 6 deletions
@@ -30,11 +30,5 @@
         "column": 24
       }
     }
-  ],
-  "libraries": [
-    {
-      "name": "libsodium",
-      "count": 1
-    }
   ]
 }
