Commit 85b76c5

feat: Implement DSLLVM General-Purpose Fuzzing Foundation
Co-authored-by: intel <[email protected]>
1 parent 4720333 commit 85b76c5

16 files changed: +3501 -0 lines changed

dsmil/DSMIL-GENERAL-FUZZING-FOUNDATION-COMPLETE.md

Lines changed: 438 additions & 0 deletions
Large diffs are not rendered by default.
Lines changed: 179 additions & 0 deletions
@@ -0,0 +1,179 @@
1+
# DSLLVM General-Purpose Fuzzing Foundation Summary
2+
3+
## Overview
4+
5+
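The DSLLVM General-Purpose Fuzzing Foundation is a **target-agnostic** fuzzing infrastructure that can be applied to **any** codebase, not just crypto/TLS. It combines source attributes, a telemetry runtime, an LLVM instrumentation pass, a harness generator, and configuration templates, and serves as the base for grammar-based, ML-guided, structure-aware, and distributed fuzzing.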
6+
7+
## Generalization Changes
8+
9+
### Renamed Components
10+
11+
- `dsssl_*` → `dsmil_fuzz_*` (general-purpose naming)
12+
- `DSSSL_*` → `DSMIL_FUZZ_*` (attribute macros)
13+
- `Dsssl*Pass` → `DsmilFuzz*Pass` (LLVM passes)
14+
15+
### Generic APIs
16+
17+
All APIs are now target-agnostic:
18+
- `dsmil_fuzz_cov_hit()` - Works for any coverage site
19+
- `dsmil_fuzz_state_transition()` - Works for any state machine
20+
- `dsmil_fuzz_metric_record()` - Works for any operation
21+
- `dsmil_fuzz_api_misuse_report()` - Works for any API
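
To make the four calls listed above concrete, here is a minimal sketch of how they might be placed in target code. The function names come from the list, but every signature, the counter variables, and the `read_record` target are assumptions made for illustration. The real declarations in `dsmil/include/dsmil_fuzz_telemetry.h` may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ASSUMPTION: illustrative stubs only. The real declarations live in
 * dsmil/include/dsmil_fuzz_telemetry.h and their signatures may differ. */
static uint64_t g_cov_hits, g_transitions, g_metrics, g_misuses;

static void dsmil_fuzz_cov_hit(uint32_t site_id) {
    (void)site_id;
    g_cov_hits++;
}

static void dsmil_fuzz_state_transition(uint16_t sm_id, uint16_t from, uint16_t to) {
    (void)sm_id; (void)from; (void)to;
    g_transitions++;
}

static void dsmil_fuzz_metric_record(const char *op, uint64_t value) {
    assert(op != NULL);
    (void)value;
    g_metrics++;
}

static void dsmil_fuzz_api_misuse_report(const char *api, const char *reason) {
    (void)api; (void)reason;
    g_misuses++;
}

enum { ST_HEADER, ST_BODY, ST_DONE };

/* Hypothetical target: one length byte followed by that many body bytes. */
static int read_record(const uint8_t *data, size_t len) {
    dsmil_fuzz_cov_hit(1);                      /* function entry site */
    if (data == NULL) {
        dsmil_fuzz_api_misuse_report("read_record", "null buffer");
        return -1;
    }
    if (len < 1) return -1;
    dsmil_fuzz_state_transition(0, ST_HEADER, ST_BODY);
    size_t body_len = data[0];
    if (body_len > len - 1) {
        dsmil_fuzz_cov_hit(2);                  /* truncated-body branch */
        return -1;
    }
    dsmil_fuzz_metric_record("read_record.body_len", body_len);
    dsmil_fuzz_state_transition(0, ST_BODY, ST_DONE);
    return (int)body_len;
}
```

Nothing in the instrumentation calls refers to a protocol or a crypto primitive, so the same four calls could be applied to a parser, a state machine, or a library entry point.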
22+
23+
### Flexible Configuration
24+
25+
Configuration supports any target type:
26+
- **generic** - Any codebase
27+
- **protocol** - Network protocols
28+
- **parser** - Text/binary parsers
29+
- **api** - Library APIs
30+
31+
## Components
32+
33+
### 1. General-Purpose Attributes
34+
35+
**File**: `dsmil/include/dsmil_fuzz_attributes.h`
36+
37+
- `DSMIL_FUZZ_COVERAGE` - Coverage tracking
38+
- `DSMIL_FUZZ_ENTRY_POINT` - Mark primary targets
39+
- `DSMIL_FUZZ_STATE_MACHINE(name)` - State machines
40+
- `DSMIL_FUZZ_CRITICAL_OP(name)` - Operation metrics
41+
- `DSMIL_FUZZ_API_MISUSE_CHECK(name)` - API misuse
42+
- `DSMIL_FUZZ_CONSTANT_TIME_LOOP` - Constant-time loops
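
The commit page does not show how these macros are defined. One plausible approach, sketched below, is to expand them to Clang's `annotate` attribute so that an LLVM pass can find the marked functions. The annotation strings, the non-Clang fallback, and the `checksum8` function are all hypothetical. The real definitions in `dsmil_fuzz_attributes.h` may use a different mechanism.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ASSUMPTION: annotation strings are invented for this sketch. */
#if defined(__clang__)
#define DSMIL_FUZZ_COVERAGE \
    __attribute__((annotate("dsmil.fuzz.coverage")))
#define DSMIL_FUZZ_ENTRY_POINT \
    __attribute__((annotate("dsmil.fuzz.entry_point")))
#define DSMIL_FUZZ_CRITICAL_OP(name) \
    __attribute__((annotate("dsmil.fuzz.critical_op=" name)))
#else
/* Other compilers: expand to nothing so annotated code still builds. */
#define DSMIL_FUZZ_COVERAGE
#define DSMIL_FUZZ_ENTRY_POINT
#define DSMIL_FUZZ_CRITICAL_OP(name)
#endif

/* Hypothetical annotated function: 8-bit additive checksum. */
DSMIL_FUZZ_ENTRY_POINT
DSMIL_FUZZ_COVERAGE
DSMIL_FUZZ_CRITICAL_OP("checksum8")
int checksum8(const uint8_t *data, size_t len) {
    assert(data != NULL || len == 0);
    unsigned sum = 0;
    for (size_t i = 0; i < len; i++) {
        sum += data[i];
    }
    return (int)(sum & 0xFFu);
}
```

With a definition along these lines, the annotations have no effect on generated code by themselves. They only take effect when an instrumentation pass consumes them.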
43+
44+
### 2. General Runtime API
45+
46+
**File**: `dsmil/include/dsmil_fuzz_telemetry.h`
47+
48+
Target-agnostic telemetry API for any fuzzing scenario.
49+
50+
### 3. Advanced Runtime API
51+
52+
**File**: `dsmil/include/dsmil_fuzz_telemetry_advanced.h`
53+
54+
Advanced features:
55+
- Performance counters
56+
- Coverage maps
57+
- ML integration
58+
- Distributed fuzzing
59+
60+
### 4. LLVM Passes
61+
62+
**File**: `dsmil/lib/Passes/DsmilFuzzCoveragePass.cpp`
63+
64+
General-purpose instrumentation pass that works for any target.
65+
66+
### 5. Harness Generator
67+
68+
**File**: `dsmil/tools/dsmil-gen-fuzz-harness/dsmil-gen-fuzz-harness.cpp`
69+
70+
Generates harnesses for:
71+
- Generic targets
72+
- Protocol targets
73+
- Parser targets
74+
- API targets
75+
76+
### 6. Runtime Implementation
77+
78+
**File**: `dsmil/runtime/dsmil_fuzz_telemetry.c`
79+
80+
General-purpose telemetry runtime.
81+
82+
## Use Cases
83+
84+
### HTTP Parser
85+
86+
```c
87+
DSMIL_FUZZ_STATE_MACHINE("http_parser")
88+
DSMIL_FUZZ_COVERAGE
89+
int http_parse(const uint8_t *data, size_t len);
90+
```
91+
92+
### JSON Parser
93+
94+
```c
95+
DSMIL_FUZZ_CRITICAL_OP("json_parse")
96+
DSMIL_FUZZ_COVERAGE
97+
int json_parse(const char *json);
98+
```
99+
100+
### Network Protocol
101+
102+
```c
103+
DSMIL_FUZZ_STATE_MACHINE("protocol_sm")
104+
DSMIL_FUZZ_COVERAGE
105+
int process_protocol(const uint8_t *msg, size_t len);
106+
```
107+
108+
### File Format
109+
110+
```c
111+
DSMIL_FUZZ_ENTRY_POINT
112+
DSMIL_FUZZ_COVERAGE
113+
int parse_format(const uint8_t *data, size_t len);
114+
```
115+
116+
### Kernel Driver
117+
118+
```c
119+
DSMIL_FUZZ_ENTRY_POINT
120+
DSMIL_FUZZ_API_MISUSE_CHECK("ioctl")
121+
int driver_ioctl(unsigned long cmd, void *arg);
122+
```
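
The annotated entry points above still need a driver that feeds them fuzzer input. The output of `dsmil-gen-fuzz-harness` is not shown in this commit view, so the following is only a sketch of what a harness for the File Format case could look like, using the standard libFuzzer entry point `LLVMFuzzerTestOneInput`. The `parse_format` body and the `MAX_INPUT_SIZE` constant are stand-ins invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the annotated target: accepts the magic
 * bytes "FMT" followed by one version byte and returns the version. */
int parse_format(const uint8_t *data, size_t len) {
    assert(data != NULL || len == 0);
    if (len < 4) return -1;
    if (data[0] != 'F' || data[1] != 'M' || data[2] != 'T') return -1;
    return data[3];
}

/* ASSUMPTION: size cap chosen for the sketch, not taken from the tool. */
#define MAX_INPUT_SIZE 65536u

/* libFuzzer calls this once per generated input. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    if (size > MAX_INPUT_SIZE) {
        return 0;               /* skip oversized inputs */
    }
    (void)parse_format(data, size);
    return 0;                   /* 0 tells libFuzzer the input was processed */
}
```

The harness ignores the parser's return value on purpose: rejecting malformed input is expected behavior, and only crashes, hangs, or sanitizer reports count as findings.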
123+
124+
## Files Created
125+
126+
### Headers
127+
- `dsmil/include/dsmil_fuzz_telemetry.h`
128+
- `dsmil/include/dsmil_fuzz_telemetry_advanced.h`
129+
- `dsmil/include/dsmil_fuzz_attributes.h`
130+
131+
### Passes
132+
- `dsmil/lib/Passes/DsmilFuzzCoveragePass.cpp`
133+
134+
### Runtime
135+
- `dsmil/runtime/dsmil_fuzz_telemetry.c`
136+
- `dsmil/runtime/dsmil_fuzz_telemetry_advanced.c` (from earlier work)
137+
138+
### Tools
139+
- `dsmil/tools/dsmil-gen-fuzz-harness/dsmil-gen-fuzz-harness.cpp`
140+
141+
### Configs
142+
- `dsmil/config/fuzz_telemetry_generic.yaml`
143+
- `dsmil/config/fuzz_target_http_parser.yaml`
144+
- `dsmil/config/fuzz_target_json_parser.yaml`
145+
146+
### Examples
147+
- `dsmil/examples/generic_fuzz_example.c`
148+
149+
### Docs
150+
- `dsmil/docs/DSMIL-GENERAL-FUZZING-GUIDE.md`
151+
- `dsmil/docs/DSMIL-GENERAL-FUZZING-QUICKREF.md`
152+
153+
## Key Features
154+
155+
✅ **Target-Agnostic** - Works for any codebase
156+
✅ **Advanced Techniques** - Grammar, ML, structure-aware
157+
✅ **Rich Telemetry** - Coverage, performance, security
158+
✅ **High Performance** - Optimized for 1+ petaops
159+
✅ **Distributed** - Multi-worker support
160+
✅ **Flexible** - Configurable for any use case
161+
162+
## Migration from DSSSL-Specific
163+
164+
If you have DSSSL-specific code:
165+
166+
1. Replace `dsssl_*` with `dsmil_fuzz_*`
167+
2. Replace `DSSSL_*` attributes with `DSMIL_FUZZ_*`
168+
3. Update config files to use generic format
169+
4. Regenerate harnesses with generic generator
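
For codebases that cannot rename every call site at once, step 1 could be staged with a small compatibility shim. The sketch below is an assumption throughout: the legacy name `dsssl_cov_hit` is a hypothetical instance of the `dsssl_*` pattern, the `dsmil_fuzz_cov_hit` signature is a guess, and the commit itself does not ship such a shim.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the new runtime entry point. ASSUMPTION: the real
 * signature in dsmil_fuzz_telemetry.h may differ. */
static uint32_t g_last_site;
static void dsmil_fuzz_cov_hit(uint32_t site_id) {
    g_last_site = site_id;
}

/* Hypothetical legacy name redirected to the new one. */
#define dsssl_cov_hit dsmil_fuzz_cov_hit

/* Unmodified legacy call site: still compiles, now reaches the new API. */
static void legacy_caller(void) {
    dsssl_cov_hit(7u);
    assert(g_last_site == 7u);
}
```

A shim like this is only a transition aid. Removing it once all call sites use the `dsmil_fuzz_*` names keeps the old prefix from lingering in new code.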
170+
171+
## Summary
172+
173+
The foundation is now **completely general-purpose** and can be used for:
174+
- **Any protocol** (HTTP, FTP, SMTP, custom)
175+
- **Any parser** (JSON, XML, binary formats)
176+
- **Any API** (libraries, kernels, drivers)
177+
- **Any codebase** (with appropriate annotations)
178+
179+
All advanced features (grammar-based, ML-guided, distributed, etc.) work for any target type.
Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
1+
# HTTP Parser Fuzzing Configuration
2+
target:
3+
name: "http_parser"
4+
type: "parser"
5+
input_format: "text"
6+
max_input_size: 65536
7+
8+
enable_structure_aware: true
9+
enable_dictionary: true
10+
dictionary:
11+
- "GET"
12+
- "POST"
13+
- "PUT"
14+
- "DELETE"
15+
- "HTTP/1.1"
16+
- "HTTP/1.0"
17+
- "Content-Length"
18+
- "Content-Type"
19+
- "Host"
20+
21+
enable_grammar_fuzzing: true
22+
grammar_file: "http_grammar.bnf"
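
To illustrate what the `dictionary` entries in this configuration are for, here is a minimal sketch of a dictionary-insert mutation in C. The token list is copied from the configuration. The function name, its parameters, and the policy of leaving the buffer unchanged when a token does not fit are assumptions, not the project's mutator.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Tokens copied from the HTTP parser configuration. */
static const char *const kHttpDict[] = {
    "GET", "POST", "PUT", "DELETE", "HTTP/1.1", "HTTP/1.0",
    "Content-Length", "Content-Type", "Host",
};
#define HTTP_DICT_LEN (sizeof kHttpDict / sizeof kHttpDict[0])

/* Insert dictionary token tok_idx at byte offset pos of buf, which
 * currently holds len bytes and can hold cap bytes. Returns the new
 * length, or len unchanged if the arguments are out of range or the
 * token does not fit. */
static size_t dict_insert(char *buf, size_t len, size_t cap,
                          size_t pos, size_t tok_idx) {
    assert(buf != NULL);
    if (len > cap || pos > len || tok_idx >= HTTP_DICT_LEN) {
        return len;
    }
    size_t tlen = strlen(kHttpDict[tok_idx]);
    if (tlen > cap - len) {
        return len;
    }
    memmove(buf + pos + tlen, buf + pos, len - pos);  /* open a gap */
    memcpy(buf + pos, kHttpDict[tok_idx], tlen);      /* drop token in */
    return len + tlen;
}
```

A fuzzer would pick `pos` and `tok_idx` from its own random source. Inserting whole tokens such as `HTTP/1.1` lets mutated inputs get past keyword checks that byte-level mutations rarely satisfy.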
Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
1+
# JSON Parser Fuzzing Configuration
2+
target:
3+
name: "json_parser"
4+
type: "parser"
5+
input_format: "text"
6+
max_input_size: 1048576
7+
8+
enable_structure_aware: true
9+
enable_grammar_fuzzing: true
10+
grammar_file: "json_grammar.bnf"
11+
12+
enable_dictionary: true
13+
dictionary:
14+
- "{"
15+
- "}"
16+
- "["
17+
- "]"
18+
- "\""
19+
- "null"
20+
- "true"
21+
- "false"
Lines changed: 117 additions & 0 deletions
@@ -0,0 +1,117 @@
1+
# DSLLVM General-Purpose Fuzzing & Telemetry Configuration
2+
# Generic configuration template for any fuzzing target
3+
4+
# Target configuration
5+
target:
6+
name: "generic_target"
7+
type: "generic" # generic, protocol, parser, api, etc.
8+
input_format: "binary" # binary, text, structured
9+
max_input_size: 1048576 # 1MB default
10+
11+
# Advanced fuzzing techniques
12+
enable_grammar_fuzzing: false
13+
grammar_file: ""
14+
enable_structure_aware: false
15+
enable_ml_guided: false
16+
ml_model_path: ""
17+
enable_dictionary: false
18+
dictionary: []
19+
20+
# Distributed fuzzing
21+
enable_distributed: false
22+
worker_id: 0
23+
num_workers: 1
24+
25+
# Performance
26+
enable_perf_counters: false
27+
28+
# Target-specific options (key-value pairs)
29+
options: {}
30+
31+
# Operation budgets (for critical operations)
32+
operation_budgets:
33+
default:
34+
max_branches: 10000
35+
max_loads: 50000
36+
max_stores: 25000
37+
max_delta_cycles: 5000
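
As a rough illustration of how the default budget above could be enforced, the sketch below compares observed counters against the four limits. The struct, the function name, and the choice to treat each limit as inclusive are assumptions. The configuration does not say how `max_delta_cycles` is measured, so it is treated here as an opaque counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Counters observed for one critical operation (hypothetical layout). */
typedef struct {
    uint64_t branches;
    uint64_t loads;
    uint64_t stores;
    uint64_t delta_cycles;
} op_counts;

/* Limits copied from the `default` entry of operation_budgets. */
static const op_counts kDefaultBudget = {10000, 50000, 25000, 5000};

/* True if every observed counter is at or below its limit.
 * ASSUMPTION: limits are inclusive. */
static bool within_budget(const op_counts *seen, const op_counts *max) {
    assert(seen != NULL && max != NULL);
    return seen->branches <= max->branches &&
           seen->loads <= max->loads &&
           seen->stores <= max->stores &&
           seen->delta_cycles <= max->delta_cycles;
}
```

An operation that exceeds any single limit would then be reported, which lets the fuzzer flag inputs that drive a critical operation into unusually expensive paths.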
38+
39+
# API misuse policies
40+
api_misuse_policies:
41+
buffer_write:
42+
check_bounds: true
43+
check_null: true
44+
abort_on_violation: false
45+
46+
memory_alloc:
47+
check_size: true
48+
check_overflow: true
49+
abort_on_violation: true
50+
51+
# Telemetry settings
52+
telemetry:
53+
ring_buffer_size: 1048576 # 1MB
54+
flush_on_exit: true
55+
output_file: "fuzz_telemetry.bin"
56+
enable_timing: false
57+
enable_perf_counters: false
58+
enable_ml_integration: false
59+
60+
# Compression
61+
compress_output: true
62+
compression_level: 6
63+
64+
# Export formats
65+
export_formats:
66+
- "json"
67+
- "protobuf"
68+
69+
# State machine budgets
70+
state_machine_budgets:
71+
default:
72+
max_transitions: 100
73+
max_depth: 20
74+
75+
# Mutation strategies
76+
mutation_strategies:
77+
bit_flip:
78+
enabled: true
79+
probability: 0.3
80+
81+
byte_insert:
82+
enabled: true
83+
probability: 0.2
84+
85+
byte_delete:
86+
enabled: true
87+
probability: 0.2
88+
89+
crossover:
90+
enabled: true
91+
probability: 0.1
92+
93+
dictionary_insert:
94+
enabled: true
95+
probability: 0.1
96+
97+
ml_guided:
98+
enabled: false
99+
probability: 0.1
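
The enabled strategies above have probabilities that sum to 0.9, because `ml_guided` is disabled. The sketch below shows one way a scheduler could select a strategy from these values: it renormalizes over the enabled entries and walks the cumulative weights. Renormalization is an assumption, since the configuration does not state how disabled entries are handled. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    const char *name;
    int enabled;
    double probability;
} strategy;

/* Values copied from mutation_strategies. */
static const strategy kStrategies[] = {
    {"bit_flip", 1, 0.3},          {"byte_insert", 1, 0.2},
    {"byte_delete", 1, 0.2},       {"crossover", 1, 0.1},
    {"dictionary_insert", 1, 0.1}, {"ml_guided", 0, 0.1},
};
#define N_STRATEGIES (sizeof kStrategies / sizeof kStrategies[0])

/* Map a uniform draw u in [0, 1) to the index of an enabled strategy.
 * Returns -1 if no strategy is enabled. */
static int pick_strategy(double u) {
    assert(u >= 0.0 && u < 1.0);
    double total = 0.0;
    for (size_t i = 0; i < N_STRATEGIES; i++) {
        if (kStrategies[i].enabled) total += kStrategies[i].probability;
    }
    if (total <= 0.0) return -1;

    double target = u * total;   /* rescale the draw to the enabled mass */
    double acc = 0.0;
    int last = -1;
    for (size_t i = 0; i < N_STRATEGIES; i++) {
        if (!kStrategies[i].enabled) continue;
        acc += kStrategies[i].probability;
        last = (int)i;
        if (target < acc) return (int)i;
    }
    return last;                 /* guards against rounding at the top end */
}
```

The caller supplies `u` from whatever random source the fuzzer already uses, which keeps the selection logic deterministic and easy to test.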
100+
101+
# Coverage feedback
102+
coverage_feedback:
103+
interestingness_threshold: 0.7
104+
use_ml_scoring: false
105+
coverage_map_size: 1048576
106+
edge_map_size: 524288
107+
state_map_size: 65536
108+
feedback_interval: 1000
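
The `coverage_map_size` value above is a power of two, which suggests a fixed-size hit-count map indexed by masking. The sketch below shows that common technique with saturating 8-bit counters. It is an assumption about how such a map could work, not a description of the DSMIL runtime, and the names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define COVERAGE_MAP_SIZE 1048576u   /* coverage_map_size from the config */

static_assert((COVERAGE_MAP_SIZE & (COVERAGE_MAP_SIZE - 1u)) == 0,
              "map size must be a power of two for mask indexing");

static uint8_t g_cov_map[COVERAGE_MAP_SIZE];

/* Record a hit for site_id. Returns true the first time its slot is
 * touched, which a fuzzer can treat as new coverage. Counters saturate
 * at 255 instead of wrapping back to zero. */
static bool cov_record(uint32_t site_id) {
    uint32_t slot = site_id & (COVERAGE_MAP_SIZE - 1u);
    bool is_new = (g_cov_map[slot] == 0);
    if (g_cov_map[slot] != UINT8_MAX) {
        g_cov_map[slot]++;
    }
    return is_new;
}
```

Two site IDs that differ by a multiple of the map size share a slot, so a larger map trades memory for fewer collisions.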
109+
110+
# Performance optimization
111+
performance:
112+
enable_parallel: false
113+
num_threads: 1
114+
enable_batch: false
115+
batch_size: 1000
116+
preallocate_buffers: true
117+
buffer_size: 1048576
