content/docs/fuzzing/3-python.md
Lines changed: 226 additions & 3 deletions
@@ -7,13 +7,236 @@ weight: 3
7
7
8
8
# Python
9
9
10
-
Coming soon...
10
+
We recommend using [Atheris](https://github.com/google/atheris) to fuzz Python code.
11
11
12
-
Until then, we recommend using [Atheris](https://github.com/google/atheris) to fuzz Python code.
12
+
## Installation
13
13
14
-
Check out the following resources to learn more about fuzzing Python code with Atheris:
14
+
Atheris supports 32-bit and 64-bit Linux, and macOS. We recommend fuzzing on Linux because it's simpler to manage and often faster. If you'd like to run Atheris in a Linux environment on a Mac or Windows system, we recommend using [Docker Desktop](https://www.docker.com/products/docker-desktop/).
15
+
16
+
If you'd like a fully operational Linux environment, see the [`Dockerfile`](#dockerfile) section below.
17
+
18
+
If you'd like to install Atheris locally, first install a recent version of `clang`, preferably the [latest release](https://github.com/llvm/llvm-project/releases), then run the following command:
19
+
20
+
```bash
python -m pip install atheris
```
23
+
24
+
{{< hint info >}}
25
+
Atheris is built on libFuzzer, so consider reading [our section]({{% ref "docs/fuzzing/c-cpp/10-libfuzzer/index.md" %}}) on that too.
26
+
{{< /hint >}}
27
+
28
+
## Usage
29
+
30
+
### Fuzzing pure Python code
31
+
32
+
With a working Atheris environment, let's fuzz some Python code.
33
+
34
+
Start by saving the following as `fuzz.py`:
35
+
36
+
```python
import sys
import atheris

@atheris.instrument_func
def test_one_input(data: bytes):
    if len(data) == 4:
        if data[0] == 0x46:  # "F"
            if data[1] == 0x55:  # "U"
                if data[2] == 0x5A:  # "Z"
                    if data[3] == 0x5A:  # "Z"
                        raise RuntimeError("You caught me")

def main():
    atheris.Setup(sys.argv, test_one_input)
    atheris.Fuzz()

if __name__ == "__main__":
    main()
```
56
+
57
+
Then run Atheris with the following command:
58
+
59
+
```bash
python fuzz.py
```
62
+
63
+
Relatively quickly, it should produce a crash like the following:
64
+
65
+
```bash
INFO: Using preloaded libfuzzer
INFO: Running with entropic power schedule (0xFF, 100).
#13 0xffff98217814 in __libc_start_main (/lib/aarch64-linux-gnu/libc.so.6+0x27814) (BuildId: 918ff46614b9808b05f1e29a9914132def52f69e)
#14 0xaaaae095086c in _start (/usr/local/bin/python3.11+0x86c) (BuildId: 4556bff17c135ffcd799fb46df15a21e0c671da8)

SUMMARY: libFuzzer: fuzz target exited
MS: 1 CopyPart-; base unit: cc3a45e08551b2e1d4f50d233a2a1b6c24f6dee8
0x46,0x55,0x5a,0x5a,
FUZZ
artifact_prefix='./'; Test unit written to ./crash-aea2e3923af219a8956f626558ef32f30a914ebc
Base64: RlVaWg==
```
114
+
115
+
As you can see, Atheris found the input that raises the exception: `"FUZZ"`. This example highlights Atheris' ability to instrument and track coverage in pure Python code. More typically, you will want to use [`atheris.instrument_imports` or `atheris.instrument_all`](https://github.com/google/atheris#python-coverage) to fuzz broader parts of an application or library.
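The `Base64:` line and the comma-separated byte list in the crash report both encode the crashing input. As a quick sanity check, the following sketch (standard library only, not part of Atheris) decodes the Base64 string from the report and confirms that it matches the reported bytes:

```python
import base64

# Value copied from the "Base64:" line of the crash report above.
reported = "RlVaWg=="

reproducer = base64.b64decode(reported)

# The decoded bytes match the hex dump (0x46,0x55,0x5a,0x5a) in the report.
assert reproducer == bytes([0x46, 0x55, 0x5A, 0x5A])
print(reproducer)  # b'FUZZ'
```

The same bytes are stored in the `./crash-aea2e3923af219a8956f626558ef32f30a914ebc` artifact. libFuzzer-based fuzzers normally replay a single input when given its path as an argument, so `python fuzz.py ./crash-aea2e3923af219a8956f626558ef32f30a914ebc` should reproduce the exception without fuzzing.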
116
+
117
+
To fuzz your own target, modify the `test_one_input` function to call your target function.
118
+
119
+
### Fuzzing Python C extensions
120
+
121
+
Fuzzing Python C extensions requires a bit more work, because the extension itself must be compiled with the correct compiler flags. If you're using the provided [`Dockerfile`](#dockerfile), the relevant variables (`CC`, `CFLAGS`, `LD_PRELOAD`, etc.) should already be set for you.
122
+
123
+
Let's fuzz the [`cbor2`](https://github.com/agronholm/cbor2) project as an example. It includes a Python C extension component and binary data parsing functionality, which is particularly amenable to fuzzing.
The `CBOR2_BUILD_C_EXTENSION` environment variable and `--no-binary` flag ensure that the C extension code is compiled locally rather than installed from pre-compiled binaries. Compiling locally lets us build fuzzing instrumentation and [`AddressSanitizer`](https://clang.llvm.org/docs/AddressSanitizer.html) into the compiled object.
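Before fuzzing, it can help to confirm that the module you are about to import really is a compiled extension and not a pure Python fallback. The helper below is a sketch using only the standard library; `is_native_extension` is a name made up for this example, and it only inspects the file suffix of the module that Python would load:

```python
import importlib.machinery
import importlib.util


def is_native_extension(module_name: str) -> bool:
    """Return True if the named top-level module resolves to a compiled extension file."""
    spec = importlib.util.find_spec(module_name)
    if spec is None or spec.origin is None:
        # Module not found, or it has no file on disk (e.g. a namespace package).
        return False
    # EXTENSION_SUFFIXES lists the suffixes this interpreter accepts for
    # extension modules, such as ".so" or ".pyd" variants.
    return spec.origin.endswith(tuple(importlib.machinery.EXTENSION_SUFFIXES))
```

If the local build succeeded, `is_native_extension("_cbor2")` should return `True`. Note that this check says nothing about whether the extension was built with sanitizer or fuzzing instrumentation; it only confirms that a compiled module is present.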
132
+
133
+
Start by saving the following as `cbor2-fuzz.py`:
134
+
135
+
```python
import sys
import atheris

# _cbor2 ensures the C library is imported
from _cbor2 import loads

def test_one_input(data: bytes):
    try:
        loads(data)
    except Exception:
        # We're searching for memory corruption, not Python exceptions
        pass

def main():
    atheris.Setup(sys.argv, test_one_input)
    atheris.Fuzz()

if __name__ == "__main__":
    main()
```
156
+
157
+
Then run Atheris with the following command:
158
+
159
+
```bash
python cbor2-fuzz.py
```
162
+
163
+
This will start fuzzing `cbor2`, but you should not expect a crash unless you get lucky and find a bug. This example serves as a demonstration of fuzzing an existing Python C extension.
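Starting from a few valid inputs usually helps a fuzzer reach deeper parser code sooner. The following sketch writes a handful of minimal, well-formed CBOR items to a seed directory. The helper name `write_seed_corpus`, the file names, and the choice of seeds are all assumptions made for this example; the byte values are the standard CBOR encodings of the items named in the comments:

```python
from pathlib import Path

# Minimal well-formed CBOR items, keyed by (arbitrary) seed file name.
SEEDS = {
    "uint_zero": b"\x00",    # unsigned integer 0
    "empty_array": b"\x80",  # []
    "empty_map": b"\xa0",    # {}
    "text_foo": b"\x63foo",  # text string "foo" (major type 3, length 3)
    "null": b"\xf6",         # null
}


def write_seed_corpus(directory: str) -> list[Path]:
    """Write each seed to its own file in `directory` and return the paths."""
    out = Path(directory)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for name, payload in SEEDS.items():
        path = out / name
        path.write_bytes(payload)
        paths.append(path)
    return paths
```

After calling `write_seed_corpus("corpus")`, the directory can be passed as a positional argument, for example `python cbor2-fuzz.py corpus/`. libFuzzer treats directory arguments as corpus directories, and Atheris forwards `sys.argv` to it through `atheris.Setup`.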
164
+
165
+
{{< hint info >}}
166
+
Remember, if you're running this locally and not in the provided Docker image, then you'll need to [set `LD_PRELOAD` manually](https://github.com/google/atheris/blob/master/native_extension_fuzzing.md#option-a-sanitizerlibfuzzer-preloads).
167
+
{{< /hint >}}
168
+
169
+
## Additional resources
15
170
16
171
- [Continuously fuzzing Python C extensions](https://blog.trailofbits.com/2024/02/23/continuously-fuzzing-python-c-extensions/)
17
172
- [Fuzzing pure Python code](https://github.com/google/atheris#using-atheris)
18
173
- [Fuzzing Python C extensions](https://github.com/google/atheris/blob/master/native_extension_fuzzing.md)
19
174
- [Fuzzing Python in CI](https://google.github.io/clusterfuzzlite//build-integration/python-lang/)
175
+
176
+
### Dockerfile
177
+
178
+
To use Atheris in a Docker environment, save the following as `Dockerfile`:
179
+
180
+
```dockerfile
# https://hub.docker.com/_/python
ARG PYTHON_VERSION=3.11

FROM python:$PYTHON_VERSION-slim-bookworm

RUN python --version

RUN apt update && apt install -y \
    ca-certificates \
    wget \
    && rm -rf /var/lib/apt/lists/*

# LLVM builds version 15-19 for Debian 12 (Bookworm)
# https://apt.llvm.org/bookworm/dists/
ARG LLVM_VERSION=19

RUN echo "deb http://apt.llvm.org/bookworm/ llvm-toolchain-bookworm-$LLVM_VERSION main" > /etc/apt/sources.list.d/llvm.list
RUN echo "deb-src http://apt.llvm.org/bookworm/ llvm-toolchain-bookworm-$LLVM_VERSION main" >> /etc/apt/sources.list.d/llvm.list
RUN wget -qO- https://apt.llvm.org/llvm-snapshot.gpg.key > /etc/apt/trusted.gpg.d/apt.llvm.org.asc