Merged
100 changes: 64 additions & 36 deletions fuzz/FUZZING.md
@@ -2,14 +2,12 @@

## Install AFL++

For effective fuzzing with crash replay support, AFL++ must be built from source with `AFL_PERSISTENT_RECORD` enabled.
AFL++ must be built from source with `AFL_PERSISTENT_RECORD` enabled for crash replay.

```bash
# Install dependencies
sudo apt update
sudo apt install llvm-18-dev clang-18 lld-18 gcc-13-plugin-dev

# Build AFL++ with AFL_PERSISTENT_RECORD support
git clone --depth=1 --branch v4.34c https://github.com/AFLplusplus/AFLplusplus.git
cd AFLplusplus

@@ -26,7 +24,7 @@ sudo make install
sudo afl-system-config
```

Sets core_pattern and CPU governors for optimal AFL++ performance.
`run_fuzzer.sh` also runs these checks automatically (core_pattern, CPU governor).

## Build Dragonfly

@@ -42,60 +40,90 @@ cd fuzz
./run_fuzzer.sh
```

## AFL_PERSISTENT_RECORD (Stateful Crash Replay)
Configuration via environment variables:

Dragonfly uses AFL++ persistent mode for performance. This means multiple fuzzing iterations run within the same process, and the server accumulates state between iterations.
| Variable | Default | Description |
|----------|---------|-------------|
| `AFL_PROACTOR_THREADS` | `1` | Server threads (1 = most stable coverage) |
| `AFL_LOOP_LIMIT` | `10000` | Iterations before server restart (= `AFL_PERSISTENT_RECORD`) |
| `BUILD_DIR` | `build-dbg` | Path to build directory |

**Problem:** When a crash occurs, AFL++ only saves the last input. But the crash may depend on state accumulated from previous inputs.
## Custom Mutator

**Solution:** `AFL_PERSISTENT_RECORD` saves the last N inputs before a crash, enabling replay of the full sequence.
`resp_mutator.py` is a custom AFL++ mutator that operates at the RESP protocol
level. Instead of flipping random bytes (which mostly breaks RESP framing and
gets rejected by the parser), it:

### Enable Recording
- Parses input into a list of Redis commands
- Mutates at the command/argument level (replace command, change argument,
insert/remove commands, wrap in MULTI/EXEC, swap order)
- Serializes back to valid RESP

```bash
# Set number of inputs to record before crash (e.g., last 100 inputs)
AFL_PERSISTENT_RECORD=100 ./run_fuzzer.sh
```
The mutator is loaded automatically by `run_fuzzer.sh`. AFL++'s built-in
byte-level mutations also run alongside it (useful for parser edge cases).

To use only the custom mutator: `export AFL_CUSTOM_MUTATOR_ONLY=1`.
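
The command-level approach described above can be illustrated with a minimal sketch. This is not the code in `resp_mutator.py`; the function names (`parse_resp`, `serialize_resp`, `wrap_in_multi`, `swap_two`, `mutate`) are hypothetical, and only two of the listed mutations are shown. It assumes inputs are RESP arrays of bulk strings and leaves anything it cannot parse untouched, so that byte-level mutation can still handle it.

```python
import random


def parse_resp(data: bytes) -> list[list[bytes]]:
    """Parse consecutive RESP arrays of bulk strings into commands.

    Parsing stops at the first malformed or truncated element.
    """
    commands = []
    pos = 0
    while data[pos:pos + 1] == b"*":
        try:
            end = data.index(b"\r\n", pos)
            argc = int(data[pos + 1:end])
            if argc < 0:
                raise ValueError("negative array length")
            pos = end + 2
            args = []
            for _ in range(argc):
                if data[pos:pos + 1] != b"$":
                    raise ValueError("expected bulk string")
                end = data.index(b"\r\n", pos)
                length = int(data[pos + 1:end])
                start = end + 2
                arg = data[start:start + length]
                tail = data[start + length:start + length + 2]
                if length < 0 or len(arg) != length or tail != b"\r\n":
                    raise ValueError("truncated bulk string")
                args.append(arg)
                pos = start + length + 2
            commands.append(args)
        except ValueError:
            break
    return commands


def serialize_resp(commands: list[list[bytes]]) -> bytes:
    """Serialize commands back to valid RESP framing."""
    out = bytearray()
    for args in commands:
        out += b"*%d\r\n" % len(args)
        for arg in args:
            out += b"$%d\r\n%s\r\n" % (len(arg), arg)
    return bytes(out)


def wrap_in_multi(commands: list[list[bytes]]) -> list[list[bytes]]:
    """Wrap the whole sequence in a MULTI/EXEC transaction."""
    return [[b"MULTI"], *commands, [b"EXEC"]]


def swap_two(commands: list[list[bytes]], rng: random.Random) -> list[list[bytes]]:
    """Swap the order of two randomly chosen commands."""
    out = list(commands)
    if len(out) >= 2:
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out


def mutate(data: bytes, rng: random.Random) -> bytes:
    """Apply one command-level mutation; return input unchanged if unparseable."""
    commands = parse_resp(data)
    if not commands:
        return data
    op = rng.choice([wrap_in_multi, lambda c: swap_two(c, rng)])
    return serialize_resp(op(commands))
```

Because every mutation goes through `serialize_resp`, the output always has consistent length prefixes, which is what keeps it from being rejected at the framing stage.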

## Crash Replay

Dragonfly uses AFL++ persistent mode — the server accumulates state across
iterations. A crash at iteration N depends on state built by inputs 1..N-1.

`run_fuzzer.sh` syncs `AFL_PERSISTENT_RECORD` with `afl_loop_limit`
so the full state history is always available on crash.

When a crash occurs, AFL++ saves files in the crashes directory:
When a crash occurs, AFL++ saves:
```
crashes/RECORD:000000,cnt:000000 (input N-99)
crashes/RECORD:000000,cnt:000001 (input N-98)
crashes/id:000000,sig:06,... # the crashing input
crashes/RECORD:000000,cnt:000000 # first input after server start
crashes/RECORD:000000,cnt:000001 # second input
...
crashes/RECORD:000000,cnt:000099 (crashing input)
crashes/RECORD:000000,cnt:NNNNNN # input before the crash
```

### Replay Recorded Crash
### Replay

```bash
# Set directory containing RECORD files
export AFL_PERSISTENT_DIR=./artifacts/resp/default/crashes
# Start dragonfly (non-AFL build)
./build/dragonfly --port 6379 --logtostderr --proactor_threads 1 --dbfilename=""

# Replay specific record (e.g., record 000000)
AFL_PERSISTENT_REPLAY=000000 ./build-dbg/dragonfly --port=6379
# Replay crash 000000
python3 fuzz/replay_crash.py fuzz/artifacts/resp/default/crashes 000000
```
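
For orientation, the replay order implied by the file listing above can be sketched as follows. This is not the contents of `replay_crash.py`; `replay_order` and `replay` are hypothetical names. The sketch assumes the `RECORD:` id and the `id:` prefix of the crashing input share the same number, and that one TCP connection per input is acceptable.

```python
import socket
from pathlib import Path


def replay_order(crash_dir, crash_id):
    """Return RECORD files sorted by their cnt value, then the crashing input."""
    crash_dir = Path(crash_dir)
    prefix = f"RECORD:{crash_id},cnt:"
    records = [p for p in crash_dir.iterdir() if p.name.startswith(prefix)]
    # Sort numerically on the counter rather than relying on zero padding.
    records.sort(key=lambda p: int(p.name[len(prefix):]))
    crashing = sorted(
        p for p in crash_dir.iterdir() if p.name.startswith(f"id:{crash_id},")
    )
    return records + crashing


def replay(crash_dir, crash_id, host="127.0.0.1", port=6379, timeout=1.0):
    """Send each input in order; return the file at which the server stopped responding."""
    for path in replay_order(crash_dir, crash_id):
        try:
            with socket.create_connection((host, port), timeout=timeout) as sock:
                sock.sendall(path.read_bytes())
                try:
                    sock.recv(65536)  # drain one reply chunk, best effort
                except socket.timeout:
                    pass  # no reply within the timeout; move on
        except OSError:
            # Connection refused or reset: the server has likely gone down.
            return path
    return None
```

A non-`None` return value from `replay` names the input after which the server could no longer be reached, which is a reasonable starting point when reducing the sequence.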

This replays all recorded inputs in sequence, reproducing the exact state that led to the crash.
### Package crash for sharing

### Manual Replay (Alternative)
```bash
cd fuzz
./package_crash.sh 000000
```

If AFL_PERSISTENT_REPLAY doesn't work, replay manually:
Creates `crash-000000.tar.gz` containing crash data and `replay_crash.py`.
The recipient runs:

```bash
# Start dragonfly
./build-dbg/dragonfly --port=6379 &
./build/dragonfly --port 6379 --logtostderr --proactor_threads 1 --dbfilename=""

# Send each recorded input in order
for f in $(ls crashes/RECORD:000000,cnt:* | sort); do
nc localhost 6379 < "$f"
done
tar xzf crash-000000.tar.gz
cd crash-000000
python3 replay_crash.py crashes 000000
```

## Replay Simple Crash
## Seed Corpus

For crashes that don't depend on accumulated state:
`seeds/resp/` contains 79 seed files covering all major command families:
string, list, hash, set, sorted set, stream, transactions, pub/sub, geo,
HyperLogLog, Bloom filter, bitops, JSON, search, scripting, ACL, and
server introspection.

```bash
./build-dbg/dragonfly --port=6379 &
nc localhost 6379 < artifacts/resp/default/crashes/id:000000,...
To add a new seed, create a file with RESP-encoded commands:

```
*3
$3
SET
$3
key
$5
value
```
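
RESP requires each line to end in CRLF, which is easy to lose in a text editor. As a hedged sketch, a small helper can generate seed files with correct framing; `encode_command` and `write_seed` are hypothetical names and not part of the repository, and the output file name in the usage comment is only an example.

```python
from pathlib import Path


def encode_command(*args) -> bytes:
    """Encode one command as a RESP array of bulk strings with CRLF endings."""
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        raw = arg if isinstance(arg, bytes) else str(arg).encode()
        parts.append(b"$%d\r\n%s\r\n" % (len(raw), raw))
    return b"".join(parts)


def write_seed(path, commands) -> None:
    """Write a sequence of commands (each a tuple of arguments) to a seed file."""
    Path(path).write_bytes(b"".join(encode_command(*c) for c in commands))


# Example usage (hypothetical file name):
# write_seed("seeds/resp/set_get", [("SET", "key", "value"), ("GET", "key")])
```

Lengths are computed from the encoded bytes, so multi-byte UTF-8 arguments get the correct `$` prefix.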
219 changes: 209 additions & 10 deletions fuzz/dict/resp.dict
@@ -165,12 +165,9 @@
"1"
"-1"

# Number patterns
"0"
"1"
# Number patterns (0, 1, -1 already above)
"100"
"1000"
"-1"
"-100"

# Special arguments
@@ -185,12 +182,10 @@
"COUNT"
"MATCH"

# Common RESP command patterns
"*1\x0d\x0a$4\x0d\x0aPING\x0d\x0a"
"*2\x0d\x0a$3\x0d\x0aGET\x0d\x0a$3\x0d\x0akey\x0d\x0a"
"*3\x0d\x0a$3\x0d\x0aSET\x0d\x0a$3\x0d\x0akey\x0d\x0a$5\x0d\x0avalue\x0d\x0a"
"*2\x0d\x0a$3\x0d\x0aDEL\x0d\x0a$3\x0d\x0akey\x0d\x0a"
"*2\x0d\x0a$6\x0d\x0aEXISTS\x0d\x0a$3\x0d\x0akey\x0d\x0a"
# Small RESP framing patterns (larger patterns removed — AFL++ warned about >33B tokens)
"*1\x0d\x0a$"
"*2\x0d\x0a$"
"*3\x0d\x0a$"

# Scripting commands
"EVAL"
@@ -274,3 +269,207 @@
"\x00"
"\xff"
"\x00\x00\x00\x00"

# --- Additional commands for broader coverage ---

# Missing key operations
"COPY"
"SORT"
"SORT_RO"
"UNLINK"
"TOUCH"
"OBJECT"
"RANDOMKEY"
"DUMP"
"RESTORE"
"WAIT"
"EXPIREAT"
"PEXPIRE"
"PEXPIREAT"
"PEXPIRETIME"
"EXPIRETIME"
"PTTL"

# String commands
"GETDEL"
"GETEX"
"INCRBYFLOAT"
"DECRBY"
"INCRBY"
"MSETNX"
"PSETEX"
"SUBSTR"

# List commands
"LPOS"
"LMPOP"
"LMOVE"
"BLMOVE"
"BLMPOP"
"BLPOP"
"BRPOP"
"LPUSHX"
"RPUSHX"
"RPOPLPUSH"

# Set commands
"SRANDMEMBER"
"SMOVE"
"SMISMEMBER"
"SINTERCARD"
"SDIFFSTORE"
"SINTERSTORE"
"SUNIONSTORE"

# Sorted set commands
"ZDIFF"
"ZDIFFSTORE"
"ZLEXCOUNT"
"ZRANGEBYLEX"
"ZRANGESTORE"
"ZRANDMEMBER"
"ZREVRANGE"
"ZREVRANGEBYLEX"
"ZREVRANGEBYSCORE"
"ZREVRANK"
"ZMSCORE"
"ZREMRANGEBYLEX"
"ZREMRANGEBYRANK"
"ZREMRANGEBYSCORE"
"BZMPOP"
"BZPOPMIN"
"BZPOPMAX"

# Hash commands
"HRANDFIELD"
"HSCAN"
"HSETEX"
"HSETNX"
"HSTRLEN"
"HINCRBYFLOAT"
"HEXPIRE"

# Server/client commands
"CLIENT"
"CONFIG"
"MEMORY"
"ACL"
"HELLO"
"COMMAND"
"LATENCY"
"SLOWLOG"
"BGSAVE"
"LASTSAVE"
"ROLE"

# Subcommands
"OBJECT ENCODING"
"OBJECT HELP"
"OBJECT FREQ"
"OBJECT IDLETIME"
"CLIENT SETNAME"
"CLIENT GETNAME"
"CLIENT LIST"
"CLIENT ID"
"CLIENT INFO"
"CONFIG GET"
"CONFIG SET"
"MEMORY USAGE"
"MEMORY DOCTOR"
"ACL LIST"
"ACL WHOAMI"
"ACL SETUSER"
"COMMAND COUNT"
"COMMAND INFO"

# Scan operations (HSCAN already above)
"SSCAN"
"ZSCAN"

# Function/script commands
"FUNCTION"
"FUNCTION LOAD"
"FUNCTION LIST"
"FUNCTION DELETE"

# More JSON commands
"JSON.ARRINSERT"
"JSON.ARRTRIM"
"JSON.ARRPOP"
"JSON.ARRINDEX"
"JSON.OBJKEYS"
"JSON.OBJLEN"
"JSON.STRAPPEND"
"JSON.STRLEN"
"JSON.TOGGLE"
"JSON.CLEAR"
"JSON.MERGE"
"JSON.MGET"
"JSON.MSET"
"JSON.DEBUG"
"JSON.RESP"

# More Geo commands
"GEOPOS"
"GEOHASH"
"GEOSEARCHSTORE"
"GEORADIUSBYMEMBER"

# Search commands
"FT.CREATE"
"FT.SEARCH"
"FT.DROPINDEX"
"FT.INFO"
"FT.ALTER"

# Additional arguments
"REPLACE"
"ABSTTL"
"IDLETIME"
"FREQ"
"LEFT"
"RIGHT"
"BEFORE"
"AFTER"
"BY"
"ASC"
"DESC"
"ALPHA"
"STORE"
"REV"
"BYSCORE"
"BYLEX"
"CH"
"KEEPTTL"
"EXAT"
"PXAT"
"ENCODING"
"REFCOUNT"

# Malformed RESP for edge-case testing
"*-2\x0d\x0a"
"*999999\x0d\x0a"
"$-2\x0d\x0a"
"$999999999\x0d\x0a"
"*\x0d\x0a"
"$\x0d\x0a"
"+\x0d\x0a"
"-\x0d\x0a"
":\x0d\x0a"

# Inline commands (no RESP framing)
"PING\x0d\x0a"
"PING\x0a"
"SET key value\x0d\x0a"
"GET key\x0a"
"QUIT\x0d\x0a"

# More binary patterns
"\xfe\xff\x00\x01"
"\x0d\x0a\x0d\x0a"
"\x0d\x0d\x0a\x0a"
"\x00\x01\x02\x03"

# RESP edge cases (small fragments only)
"$0\x0d\x0a\x0d\x0a"
"$-1\x0d\x0a"