Commit c291725

Merge pull request #62 from trailofbits/mschwager-ruby-python-fuzzing
Add Ruby and Python fuzzing sections
2 parents 12ca7e8 + 76ce73a commit c291725

2 files changed: +447 -6 lines changed

content/docs/fuzzing/3-python.md

Lines changed: 226 additions & 3 deletions
@@ -7,13 +7,236 @@ weight: 3
77

88
# Python
99

10-
Coming soon...
10+
We recommend using [Atheris](https://github.com/google/atheris) to fuzz Python code.
1111

12-
Until then, we recommend using [Atheris](https://github.com/google/atheris) to fuzz Python code.
12+
## Installation
1313

14-
Check out the following resources to learn more about fuzzing Python code with Atheris:
14+
Atheris supports Linux (32-bit and 64-bit) and macOS. We recommend fuzzing on Linux because it's simpler to manage and often faster. If you'd like to run Atheris in a Linux environment on a Mac or Windows system, we recommend using [Docker Desktop](https://www.docker.com/products/docker-desktop/).
15+
16+
If you'd like a fully operational Linux environment, see the [`Dockerfile`](#dockerfile) section below.
17+
18+
If you'd like to install Atheris locally, first install a recent version of `clang`, preferably the [latest release](https://github.com/llvm/llvm-project/releases), then run the following command:
19+
20+
```bash
21+
python -m pip install atheris
22+
```
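
Before installing, it can help to confirm that a `clang` binary is actually on your `PATH`. The following is a minimal sketch using only the standard library. The `find_clang` helper and the versioned names it probes (such as `clang-19`) are illustrative assumptions, not part of Atheris.

```python
import shutil
from typing import Optional

def find_clang(candidates=("clang", "clang-19", "clang-18", "clang-17")) -> Optional[str]:
    """Return the path of the first candidate binary found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path is not None:
            return path
    return None

if __name__ == "__main__":
    clang = find_clang()
    print(clang or "clang not found; install it before installing Atheris")
```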
23+
24+
{{< hint info >}}
25+
Atheris is built on libFuzzer, so our [libFuzzer section]({{% ref "docs/fuzzing/c-cpp/10-libfuzzer/index.md" %}}) is worth reading too.
26+
{{< /hint >}}
27+
28+
## Usage
29+
30+
### Fuzzing pure Python code
31+
32+
With a working Atheris environment, let's fuzz some Python code.
33+
34+
Start by saving the following as `fuzz.py`:
35+
36+
```python
import sys
import atheris

@atheris.instrument_func
def test_one_input(data: bytes):
    if len(data) == 4:
        if data[0] == 0x46:  # "F"
            if data[1] == 0x55:  # "U"
                if data[2] == 0x5A:  # "Z"
                    if data[3] == 0x5A:  # "Z"
                        raise RuntimeError("You caught me")

def main():
    atheris.Setup(sys.argv, test_one_input)
    atheris.Fuzz()

if __name__ == "__main__":
    main()
```
56+
57+
Then run Atheris with the following command:
58+
59+
```bash
60+
python fuzz.py
61+
```
62+
63+
It should quickly produce a crash like the following:
64+
65+
```bash
INFO: Using preloaded libfuzzer
INFO: Running with entropic power schedule (0xFF, 100).
INFO: Seed: 3701051567
INFO: Loaded 2 modules (15334 inline 8-bit counters): 9595 [0xffff951f58e0, 0xffff951f7e5b), 5739 [0xffff94f843e0, 0xffff94f85a4b),
INFO: Loaded 2 PC tables (15334 PCs): 9595 [0xffff951f7e60,0xffff9521d610), 5739 [0xffff94f85a50,0xffff94f9c100),
INFO: -max_len is not provided; libFuzzer will not generate inputs larger than 4096 bytes
INFO: A corpus is not provided, starting from an empty corpus
#2 INITED cov: 882 ft: 883 corp: 1/1b exec/s: 0 rss: 66Mb
#3 NEW cov: 882 ft: 885 corp: 2/3b lim: 4 exec/s: 0 rss: 66Mb L: 2/2 MS: 1 CopyPart-
#13 NEW cov: 884 ft: 1257 corp: 3/7b lim: 4 exec/s: 0 rss: 67Mb L: 4/4 MS: 5 CrossOver-ChangeBinInt-ChangeByte-ChangeBinInt-CopyPart-
#65536 pulse cov: 884 ft: 1257 corp: 3/7b lim: 652 exec/s: 21845 rss: 98Mb
#71462 NEW cov: 886 ft: 1379 corp: 4/11b lim: 706 exec/s: 17865 rss: 101Mb L: 4/4 MS: 4 ChangeBinInt-ChangeByte-ChangeBit-CopyPart-
#131072 pulse cov: 886 ft: 1379 corp: 4/11b lim: 1290 exec/s: 21845 rss: 130Mb
#230788 NEW cov: 888 ft: 1691 corp: 5/15b lim: 2281 exec/s: 25643 rss: 177Mb L: 4/4 MS: 1 ChangeByte-
#262144 pulse cov: 888 ft: 1691 corp: 5/15b lim: 2589 exec/s: 26214 rss: 194Mb
#287560 NEW cov: 890 ft: 1704 corp: 6/19b lim: 2842 exec/s: 26141 rss: 208Mb L: 4/4 MS: 2 InsertByte-EraseBytes-

 === Uncaught Python exception: ===
RuntimeError: You caught me
Traceback (most recent call last):
  File "/app/fuzz.py", line 11, in test_one_input
    raise RuntimeError("You caught me")
RuntimeError: You caught me

==399== ERROR: libFuzzer: fuzz target exited
    #0 0xffff989df9b8 in __sanitizer_print_stack_trace (/opt/venv/lib/python3.11/site-packages/asan_with_fuzzer.so+0x11f9b8) (BuildId: b12d6567a22f7311b104efa346c5035b6837d8d1)
    #1 0xffff989344cc in fuzzer::PrintStackTrace() (/opt/venv/lib/python3.11/site-packages/asan_with_fuzzer.so+0x744cc) (BuildId: b12d6567a22f7311b104efa346c5035b6837d8d1)
    #2 0xffff9891a7c8 in fuzzer::Fuzzer::ExitCallback() (/opt/venv/lib/python3.11/site-packages/asan_with_fuzzer.so+0x5a7c8) (BuildId: b12d6567a22f7311b104efa346c5035b6837d8d1)
    #3 0xffff9822ce88 (/lib/aarch64-linux-gnu/libc.so.6+0x3ce88) (BuildId: 918ff46614b9808b05f1e29a9914132def52f69e)
    #4 0xffff9822cf5c in exit (/lib/aarch64-linux-gnu/libc.so.6+0x3cf5c) (BuildId: 918ff46614b9808b05f1e29a9914132def52f69e)
    #5 0xffff9852eba8 in Py_Exit (/usr/local/bin/../lib/libpython3.11.so.1.0+0x18eba8) (BuildId: 3e68e83acff0ce909056da94a6b647416bc78ec5)
    #6 0xffff9852ebd8 (/usr/local/bin/../lib/libpython3.11.so.1.0+0x18ebd8) (BuildId: 3e68e83acff0ce909056da94a6b647416bc78ec5)
    #7 0xffff9852ec34 (/usr/local/bin/../lib/libpython3.11.so.1.0+0x18ec34) (BuildId: 3e68e83acff0ce909056da94a6b647416bc78ec5)
    #8 0xffff98601af8 in _PyRun_SimpleFileObject (/usr/local/bin/../lib/libpython3.11.so.1.0+0x261af8) (BuildId: 3e68e83acff0ce909056da94a6b647416bc78ec5)
    #9 0xffff9860172c in _PyRun_AnyFileObject (/usr/local/bin/../lib/libpython3.11.so.1.0+0x26172c) (BuildId: 3e68e83acff0ce909056da94a6b647416bc78ec5)
    #10 0xffff985fa04c in Py_RunMain (/usr/local/bin/../lib/libpython3.11.so.1.0+0x25a04c) (BuildId: 3e68e83acff0ce909056da94a6b647416bc78ec5)
    #11 0xffff985a1ef4 in Py_BytesMain (/usr/local/bin/../lib/libpython3.11.so.1.0+0x201ef4) (BuildId: 3e68e83acff0ce909056da94a6b647416bc78ec5)
    #12 0xffff9821773c (/lib/aarch64-linux-gnu/libc.so.6+0x2773c) (BuildId: 918ff46614b9808b05f1e29a9914132def52f69e)
    #13 0xffff98217814 in __libc_start_main (/lib/aarch64-linux-gnu/libc.so.6+0x27814) (BuildId: 918ff46614b9808b05f1e29a9914132def52f69e)
    #14 0xaaaae095086c in _start (/usr/local/bin/python3.11+0x86c) (BuildId: 4556bff17c135ffcd799fb46df15a21e0c671da8)

SUMMARY: libFuzzer: fuzz target exited
MS: 1 CopyPart-; base unit: cc3a45e08551b2e1d4f50d233a2a1b6c24f6dee8
0x46,0x55,0x5a,0x5a,
FUZZ
artifact_prefix='./'; Test unit written to ./crash-aea2e3923af219a8956f626558ef32f30a914ebc
Base64: RlVaWg==
```
114+
115+
As you can see, it found the input that triggers the exception: `"FUZZ"`. This example highlights Atheris' ability to instrument and track coverage in pure Python code. More typically, you'll use [`atheris.instrument_imports` or `atheris.instrument_all`](https://github.com/google/atheris#python-coverage) to instrument broader parts of an application or library.
116+
117+
To fuzz your own target, modify the `test_one_input` function to call your target function.
118+
119+
### Fuzzing Python C extensions
120+
121+
Fuzzing Python C extensions requires a bit more work because the extension must be compiled with the correct compiler flags. If you're using the provided [`Dockerfile`](#dockerfile), the required environment variables (`CC`, `CFLAGS`, `LD_PRELOAD`, etc.) are already set for you.
122+
123+
Let's fuzz the [`cbor2`](https://github.com/agronholm/cbor2) project as an example. It includes a Python C extension component and binary data parsing functionality, which is particularly amenable to fuzzing.
124+
125+
First, install the package:
126+
127+
```bash
128+
CBOR2_BUILD_C_EXTENSION=1 python -m pip install --no-binary cbor2 cbor2==5.6.4
129+
```
130+
131+
The `CBOR2_BUILD_C_EXTENSION` environment variable and `--no-binary` flag ensure that the C extension is compiled locally rather than installed from a pre-compiled binary. Compiling locally lets the compiler flags build fuzzing instrumentation and [`AddressSanitizer`](https://clang.llvm.org/docs/AddressSanitizer.html) into the extension.
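
If you want to double-check that the install produced a compiled module rather than a pure-Python fallback, one option is to inspect where the module would be loaded from. This is a minimal sketch using the standard library; `is_native_extension` is a hypothetical helper name, and it only checks the file suffix, not whether the module was built with sanitizer flags.

```python
import importlib.machinery
import importlib.util

def is_native_extension(module_name: str) -> bool:
    """Return True if module_name resolves to a compiled extension file."""
    try:
        spec = importlib.util.find_spec(module_name)
    except (ImportError, ValueError):
        return False
    if spec is None or not spec.origin:
        return False
    # EXTENSION_SUFFIXES lists the platform's extension-module suffixes.
    return spec.origin.endswith(tuple(importlib.machinery.EXTENSION_SUFFIXES))
```

After the install command above, `is_native_extension("_cbor2")` should return `True`; a pure-Python package such as `json` returns `False`.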
132+
133+
Start by saving the following as `cbor2-fuzz.py`:
134+
135+
```python
import sys
import atheris

# Importing from _cbor2 (rather than cbor2) ensures the C extension is used
from _cbor2 import loads

def test_one_input(data: bytes):
    try:
        loads(data)
    except Exception:
        # We're searching for memory corruption, not Python exceptions
        pass

def main():
    atheris.Setup(sys.argv, test_one_input)
    atheris.Fuzz()

if __name__ == "__main__":
    main()
```
156+
157+
Then run Atheris with the following command:
158+
159+
```bash
160+
python cbor2-fuzz.py
161+
```
162+
163+
This will start fuzzing `cbor2`, but you should not expect a crash unless you get lucky and find a bug. This example serves as a demonstration of fuzzing an existing Python C extension.
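
Starting from a few valid inputs usually helps a fuzzer reach deeper parsing code sooner than an empty corpus does. The sketch below writes a handful of hand-encoded CBOR values to a directory. The `write_seeds` helper, the seed names, and the `corpus` directory name are all assumptions made for illustration.

```python
from pathlib import Path

# Hand-encoded CBOR values (see RFC 8949 for the encoding rules).
SEEDS = {
    "int_zero": bytes([0x00]),      # unsigned integer 0
    "text_a": bytes([0x61, 0x61]),  # text string "a"
    "empty_array": bytes([0x80]),   # []
    "empty_map": bytes([0xA0]),     # {}
    "true": bytes([0xF5]),          # true
}

def write_seeds(corpus_dir: str) -> int:
    """Write each seed to its own file and return the number of files written."""
    directory = Path(corpus_dir)
    directory.mkdir(parents=True, exist_ok=True)
    for name, payload in SEEDS.items():
        (directory / name).write_bytes(payload)
    return len(SEEDS)

if __name__ == "__main__":
    print(write_seeds("corpus"))
```

libFuzzer treats positional directory arguments as corpus directories, so running `python cbor2-fuzz.py corpus/` should pick up these seeds.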
164+
165+
{{< hint info >}}
166+
Remember, if you're running this locally and not in the provided Docker image, then you'll need to [set `LD_PRELOAD` manually](https://github.com/google/atheris/blob/master/native_extension_fuzzing.md#option-a-sanitizerlibfuzzer-preloads).
167+
{{< /hint >}}
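
A fuzzing script can fail fast when the preload is missing instead of running without the sanitizer runtime. The following is a sketch under assumptions: `sanitizer_preloaded` is a hypothetical helper, and it only checks that `LD_PRELOAD` names a file called `asan_with_fuzzer.so`, not that the library actually loaded.

```python
import os

def sanitizer_preloaded(environ=os.environ, library="asan_with_fuzzer.so") -> bool:
    """Return True if LD_PRELOAD lists a path whose file name equals `library`."""
    value = environ.get("LD_PRELOAD", "")
    # LD_PRELOAD entries may be separated by spaces or colons.
    entries = value.replace(":", " ").split()
    return any(os.path.basename(entry) == library for entry in entries)

if __name__ == "__main__":
    if not sanitizer_preloaded():
        print("LD_PRELOAD does not include asan_with_fuzzer.so")
```

Note that `LD_PRELOAD` is read by the dynamic loader when the process starts, so it must be set in the shell before launching Python; assigning it to `os.environ` inside the script is too late for the current process.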
168+
169+
## Additional resources
15170
16171
- [Continuously fuzzing Python C extensions](https://blog.trailofbits.com/2024/02/23/continuously-fuzzing-python-c-extensions/)
17172
- [Fuzzing pure Python code](https://github.com/google/atheris#using-atheris)
18173
- [Fuzzing Python C extensions](https://github.com/google/atheris/blob/master/native_extension_fuzzing.md)
19174
- [Fuzzing Python in CI](https://google.github.io/clusterfuzzlite//build-integration/python-lang/)
175+
176+
### Dockerfile
177+
178+
To use Atheris in a Docker environment, save the following as `Dockerfile`:
179+
180+
```dockerfile
# https://hub.docker.com/_/python
ARG PYTHON_VERSION=3.11

FROM python:$PYTHON_VERSION-slim-bookworm

RUN python --version

RUN apt update && apt install -y \
    ca-certificates \
    wget \
    && rm -rf /var/lib/apt/lists/*

# LLVM builds version 15-19 for Debian 12 (Bookworm)
# https://apt.llvm.org/bookworm/dists/
ARG LLVM_VERSION=19

RUN echo "deb http://apt.llvm.org/bookworm/ llvm-toolchain-bookworm-$LLVM_VERSION main" > /etc/apt/sources.list.d/llvm.list
RUN echo "deb-src http://apt.llvm.org/bookworm/ llvm-toolchain-bookworm-$LLVM_VERSION main" >> /etc/apt/sources.list.d/llvm.list
RUN wget -qO- https://apt.llvm.org/llvm-snapshot.gpg.key > /etc/apt/trusted.gpg.d/apt.llvm.org.asc

RUN apt update && apt install -y \
    build-essential \
    clang-$LLVM_VERSION \
    && rm -rf /var/lib/apt/lists/*

ENV APP_DIR="/app"
RUN mkdir $APP_DIR
WORKDIR $APP_DIR

ENV VIRTUAL_ENV="/opt/venv"
RUN python -m venv $VIRTUAL_ENV
ENV PATH="$VIRTUAL_ENV/bin:$PATH"

# https://github.com/google/atheris/blob/master/native_extension_fuzzing.md#step-1-compiling-your-extension
ENV CC="clang-$LLVM_VERSION"
ENV CFLAGS="-fsanitize=address,fuzzer-no-link"
ENV CXX="clang++-$LLVM_VERSION"
ENV CXXFLAGS="-fsanitize=address,fuzzer-no-link"
ENV LDSHARED="clang-$LLVM_VERSION -shared"
ENV LDSHAREDXX="clang++-$LLVM_VERSION -shared"
ENV ASAN_SYMBOLIZER_PATH="/usr/bin/llvm-symbolizer-$LLVM_VERSION"

# Allow Atheris to find fuzzer sanitizer shared libs
# https://github.com/google/atheris#building-from-source
RUN LIBFUZZER_LIB=$($CC -print-file-name=libclang_rt.fuzzer_no_main-$(uname -m).a) \
    python -m pip install --no-binary atheris atheris

# https://github.com/google/atheris/blob/master/native_extension_fuzzing.md#option-a-sanitizerlibfuzzer-preloads
# NOTE: the python3.11 path segment assumes PYTHON_VERSION=3.11; update it if you change PYTHON_VERSION
ENV LD_PRELOAD="$VIRTUAL_ENV/lib/python3.11/site-packages/asan_with_fuzzer.so"

# 1. Skip memory allocation failures for now, they are common, and low impact (DoS)
# 2. https://github.com/google/atheris/blob/master/native_extension_fuzzing.md#leak-detection
ENV ASAN_OPTIONS="allocator_may_return_null=1,detect_leaks=0"

CMD ["/bin/bash"]
```
237+
238+
Then run the following commands to build and run the container:
239+
- `docker build -t atheris .`
240+
- `docker run -it atheris`
241+
242+
Note that you may need to modify `CFLAGS` and `CXXFLAGS` if you'd like to [use UBSan](https://llvm.org/docs/LibFuzzer.html#fuzzer-usage).