A Rust implementation of MOSS (Measure Of Software Similarity) for detecting code plagiarism through document fingerprinting using the winnowing algorithm.
- Fast: Parallel processing with Rayon, efficient rolling hash (O(1) per window)
- Language-agnostic preprocessing: Supports 9 programming languages
- Multiple output formats: Console table, JSON, HTML network visualization
- Configurable: Adjustable k-gram size, window size, similarity threshold
Supported languages: C, C++, Go, Java, JavaScript, Python, Ruby, Rust, TypeScript
```bash
cargo install --git https://github.com/Devin-Yeung/fuscum
```

Or build from source:

```bash
cargo build --release
```

```bash
# Basic usage
fuscum-cli --dir ./src --pat "**/*.py" --lang Python

# With output files
fuscum-cli --dir ./src --pat "**/*.rs" --lang Rust --json results.json --network network.html

# Configure detection sensitivity
fuscum-cli --dir ./src --pat "**/*.js" --lang JavaScript --kgram 30 --window 50 --threshold 0.4
```

| Option | Description | Default |
|---|---|---|
| `--dir` | Directory to scan | required |
| `--pat` | Glob pattern for files | required |
| `--lang` | Language for preprocessing | required |
| `--kgram` | K-gram size (characters) | 35 |
| `--window` | Window size for winnowing | 40 |
| `--threshold` | Minimum similarity (0-1) | 0.5 |
| `--top` | Top-K matches per file | 5 |
| `--json` | Export results to JSON | - |
| `--network` | Generate HTML visualization | - |
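To see how `--kgram` and `--window` interact, recall the guarantee from the winnowing paper cited at the end of this README: with k-gram size `k` and window size `w`, any shared substring of at least `t = w + k - 1` characters is guaranteed to contribute a fingerprint, while matches shorter than `k` are never detected. The sketch below only does that arithmetic. It assumes fuscum's options map directly onto the paper's `k` and `w`, and that lengths are measured on the preprocessed text; `guarantee_threshold` is a hypothetical helper, not part of the CLI.

```rust
/// Shortest match length (in characters of preprocessed text) that winnowing
/// is guaranteed to detect, per the paper's relation t = w + k - 1.
/// Hypothetical helper for illustration only; not part of fuscum.
fn guarantee_threshold(kgram: usize, window: usize) -> usize {
    (window + kgram).saturating_sub(1)
}

fn main() {
    // Defaults from the options table: --kgram 35, --window 40.
    let t = guarantee_threshold(35, 40);
    println!("noise threshold k = 35, guarantee threshold t = {}", t); // t = 74
}
```

Lowering `--kgram` makes shorter matches count (more sensitivity, more noise), while lowering `--window` keeps more fingerprints per file at the cost of more comparison work.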
- Preprocessing: Parse source code with AST, remove comments, normalize identifiers and strings
- K-gram generation: Split preprocessed code into overlapping k-length windows, hash each using Rabin-Karp rolling hash
- Winnowing: Slide a window over the hash sequence and keep the minimum hash of each window as a fingerprint (the rightmost one on ties)
- Similarity: Compute pairwise similarity as `|intersection| / |base fingerprint|`
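The last three steps above can be sketched in a few dozen lines. This is a minimal illustration under stated assumptions, not fuscum's actual code: the function names, the hash constants `BASE` and `MODULUS`, and the byte-level k-grams are all choices made for the example, and the AST-based preprocessing step is skipped entirely. The winnowing loop is the straightforward O(n·w) version rather than an optimized one.

```rust
use std::collections::HashSet;

// Illustrative constants; any reasonable base/modulus pair would do.
const BASE: u64 = 257;
const MODULUS: u64 = 1_000_000_007;

/// Hash every overlapping k-byte window with a polynomial rolling hash.
/// After the first window, each hash is derived from the previous one in O(1).
fn kgram_hashes(text: &[u8], k: usize) -> Vec<u64> {
    if k == 0 || text.len() < k {
        return Vec::new();
    }
    // BASE^(k-1) mod MODULUS: the weight of the byte leaving the window.
    let mut high = 1u64;
    for _ in 1..k {
        high = high * BASE % MODULUS;
    }
    let mut hash = 0u64;
    for &b in &text[..k] {
        hash = (hash * BASE + b as u64) % MODULUS;
    }
    let mut hashes = Vec::with_capacity(text.len() - k + 1);
    hashes.push(hash);
    for i in k..text.len() {
        let outgoing = text[i - k] as u64 * high % MODULUS;
        hash = (hash + MODULUS - outgoing) % MODULUS; // drop the oldest byte
        hash = (hash * BASE + text[i] as u64) % MODULUS; // append the new byte
        hashes.push(hash);
    }
    hashes
}

/// Keep the minimum hash of each window of `w` hashes (rightmost on ties),
/// recording each selected position only once. Returns (position, hash) pairs.
/// Inputs with fewer than `w` hashes yield no fingerprints in this sketch.
fn winnow(hashes: &[u64], w: usize) -> Vec<(usize, u64)> {
    let mut picked: Vec<(usize, u64)> = Vec::new();
    if w == 0 {
        return picked;
    }
    for (start, window) in hashes.windows(w).enumerate() {
        let mut best = 0;
        for (i, &h) in window.iter().enumerate() {
            if h <= window[best] {
                best = i; // `<=` makes ties resolve to the rightmost minimum
            }
        }
        let pos = start + best;
        // Selected positions never move left, so comparing with the last pick suffices.
        if picked.last().map(|&(p, _)| p) != Some(pos) {
            picked.push((pos, window[best]));
        }
    }
    picked
}

/// Fingerprint set of a (already preprocessed) text.
fn fingerprint(text: &str, k: usize, w: usize) -> HashSet<u64> {
    winnow(&kgram_hashes(text.as_bytes(), k), w)
        .into_iter()
        .map(|(_, h)| h)
        .collect()
}

/// |intersection| / |base fingerprint|; 0.0 when the base has no fingerprints.
fn similarity(base: &HashSet<u64>, other: &HashSet<u64>) -> f64 {
    if base.is_empty() {
        return 0.0;
    }
    base.intersection(other).count() as f64 / base.len() as f64
}

fn main() {
    // Hash sequence from the worked example in the winnowing paper, with w = 4.
    let hashes = [77, 74, 42, 17, 98, 50, 17, 98, 8, 88, 67, 39, 77, 74, 42, 17, 98];
    println!("{:?}", winnow(&hashes, 4));
    // prints [(3, 17), (6, 17), (8, 8), (11, 39), (15, 17)]

    let a = fingerprint("let total = price * count + tax;", 5, 4);
    let b = fingerprint("let sum = price * count + tax; // copied", 5, 4);
    println!("similarity(a, b) = {:.2}", similarity(&a, &b));
}
```

Because the denominator is the base file's fingerprint count, the score is asymmetric: a small file fully contained in a large one scores high with the small file as base and low the other way around.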
The name fuscum is derived from Sphagnum fuscum (rusty peat moss), a nod to both the original MOSS system and Rust.
Schleimer, S., Wilkerson, D. S., & Aiken, A. (2003). Winnowing: local algorithms for document fingerprinting. Proceedings of the 2003 ACM SIGMOD, 76–85. https://doi.org/10.1145/872757.872770
Licensed under MIT OR Apache-2.0.