Commit fb34b98

Merge pull request #48 from healeycodes/post/ptrace
add making python less random post
2 parents 36dd605 + 5c8d169 commit fb34b98

3 files changed

+246
-0
lines changed

data/posts.ts

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -7,6 +7,7 @@ export const popularPosts = [
77

88
// Good posts/highly viewed posts (not in any specific order)
99
export const postStars = [
10+
"making-python-less-random",
1011
"2d-multiplayer-from-scratch",
1112
"lisp-compiler-optimizations",
1213
"lisp-to-javascript-compiler",

data/projects.ts

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -53,6 +53,12 @@ export default [
5353
desc: "A minimal programming language inspired by Ink, JavaScript, and Python.",
5454
to: "/creating-the-golfcart-programming-language",
5555
},
56+
{
57+
name: "unrandom",
58+
link: "https://github.com/healeycodes/unrandom",
59+
desc: "Intercept and modify getrandom syscalls from a process (x86-64 Linux).",
60+
to: "/making-python-less-random",
61+
},
5662
{
5763
name: "bitcask-lite",
5864
link: "https://github.com/healeycodes/bitcask-lite",

posts/making-python-less-random.md

Lines changed: 239 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,239 @@
1+
---
2+
title: "Making Python Less Random"
3+
date: "2024-07-08"
4+
tags: ["c"]
5+
description: "Using ptrace to intercept and modify a process's getrandom syscall."
6+
---
7+
8+
I was working on a game prototype written in Python when I came across a tricky bug. I was able to reproduce it (good), but because it depended on randomness, it was hard to iterate on a fix (bad).
9+
10+
I searched and found that my game had two sources of randomness: `os.urandom` and `random.randint`. I tried to mock them like this:
11+
12+
```python
13+
import os
14+
os.urandom = lambda n: b'\x00' * n
15+
import random
16+
random.randint = lambda a, b: a
17+
```
18+
19+
However, I found an imported third-party library was also calling `random` functions. The library's code wasn't well-structured (e.g., importing modules inside functions). I couldn't mock the call sites without altering the dependency locally.
20+
21+
At this point, I should have pulled the code I was using out of that library, and refactored my game so that all sources of randomness came from some kind of pseudorandom generator function. This way, I could provide fixed seeds for deterministic debugging.
22+
23+
Instead, I took a detour to catch and modify syscalls to `getrandom`.
24+
25+
## Where Does Python's Randomness Come From?
26+
27+
We can debug this using [strace](https://man7.org/linux/man-pages/man1/strace.1.html) to look at the syscalls made by a Python process.
28+
29+
```python
30+
# example.py
31+
32+
import os
33+
os.urandom(8)
34+
```
35+
36+
If we run the above program with `strace python example.py`, we get a fairly verbose output which I've trimmed a bit here:
37+
38+
```bash
39+
# ..
40+
read(3, "import os\nos.urandom(8)\n", 4096) = 24
41+
read(3, "", 4096) = 0
42+
close(3) = 0
43+
getrandom("\x58\x54\x9d\x43\xbf\x4f\xae\x75", 8, 0) = 8
44+
# ..
45+
```
46+
47+
Every time `os.urandom(n)` is called, `n` bytes are requested from [getrandom](https://man7.org/linux/man-pages/man2/getrandom.2.html).
48+
49+
> The getrandom() system call fills the buffer pointed to by buf with up to buflen random bytes. These bytes can be used to seed user-space random number generators or for cryptographic purposes.
50+
51+
However, `random.randint` is a little different; the module is seeded when it's *imported*.
52+
53+
```python
54+
# example2.py
55+
56+
import random
57+
```
58+
59+
The above program requests `2496` bytes from `getrandom` (even though we haven't actually requested any random numbers yet) in order to seed the module. See the trimmed output from `strace python example2.py` below:
60+
61+
```bash
62+
# ..
63+
getrandom("\xcf\xf3\x34\xf4\x65\x49\xd2\xab\xc2\x65\x26\x0
64+
\xd6\x59\xdd\x4f\x5c\xf5\xa5\x2d\xe7\x65\x25\xca\x0b\x74
65+
\xd3\x40\x94\x8a\xe0\x4f"..., 2496, GRND_NONBLOCK) = 2496
66+
# ..
67+
```
68+
69+
Either way, to achieve deterministic randomness, I need to get between my program and these syscalls to `getrandom`. I'll list all the methods I've heard of, ordered from most tricky to least tricky:
70+
71+
- Compile the Linux kernel with an altered `getrandom` function. Downside: my computer becomes hilariously, obscurely, insecure and vulnerable
72+
- Use [kprobes](https://www.kernel.org/doc/Documentation/kprobes.txt) (kernel probes; see also the [Kprobes docs](https://docs.kernel.org/trace/kprobes.html)) to hook into the kernel's existing `getrandom` function (ideally with some filtering so it's not *all* calls to `getrandom`)
73+
- Compile a Python binary with a modified [py_urandom function](https://github.com/python/cpython/blob/3bddd07c2ada7cdadb55ea23a15037bd650e20ef/Python/bootstrap_hash.c#L477)
75+
- Use `LD_PRELOAD` to alter the call that Python makes to `libc`'s `getrandom` (read more on this in [LD_PRELOAD: The Hero We Need and Deserve](https://blog.jessfraz.com/post/ld_preload/))
76+
- Use [ptrace](https://man7.org/linux/man-pages/man2/ptrace.2.html) (process trace) to intercept and modify the return value of the `getrandom` syscall
77+
78+
Note: I've not included methods like monkey patching (e.g. with [MagicMock](https://docs.python.org/3/library/unittest.mock.html#unittest.mock.MagicMock)) because that requires a code change and doesn't count.
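
For comparison, the `LD_PRELOAD` option from the list above can be sketched as a tiny shared library. This is a minimal sketch and not part of `unrandom`: the file name `fakerandom.c` is hypothetical, and it only has an effect if the Python build calls libc's `getrandom` wrapper rather than issuing the syscall directly through `syscall(2)`.

```c
// fakerandom.c (hypothetical file name, for illustration only)
#define _GNU_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

// Same signature as glibc's getrandom(3) wrapper. When this library is
// preloaded, the dynamic linker resolves calls to getrandom() here first.
ssize_t getrandom(void *buf, size_t buflen, unsigned int flags) {
    (void)flags;             // flags are ignored in this sketch
    memset(buf, 0, buflen);  // fill the caller's buffer with zero bytes
    return (ssize_t)buflen;  // report that every requested byte was written
}
```

Assuming the file name above, it could be built with `gcc -shared -fPIC -o fakerandom.so fakerandom.c` and used with `LD_PRELOAD=./fakerandom.so python example.py`.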
79+
80+
## Modifying System Calls With ptrace
81+
82+
Given the constraint of no code changes allowed, [ptrace](https://man7.org/linux/man-pages/man2/ptrace.2.html) is well suited for this job. It only affects a specific process and I don't need to recompile my dependencies. About 20 or so lines of C will do it.
83+
84+
> The ptrace() system call provides a means by which one process (the "tracer") may observe and control the execution of another process (the "tracee"), and examine and change the tracee's memory and registers. It is primarily used to implement breakpoint debugging and system call tracing.
85+
86+
87+
First, I need to find the process ID (PID) of a running Python program, so with bash:
88+
89+
```bash
90+
$ ps aux | grep python
91+
andrew 9792 0.0 0.4 16468 8264 pts/5 S+ 16:33 0:00 python
92+
# ^ PID
93+
```
94+
95+
Here, `9792` is the PID. Then, I want to call my `unrandom` program like this: `./unrandom <pid>`, so my C program starts by reading from `argv`:
96+
97+
```c
98+
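// unrandom.c

#include <stdio.h>
#include <stdlib.h>
#include <sys/ptrace.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/user.h>
#include <sys/wait.h>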
99+
100+
int main(int argc, char *argv[]) {
101+
if (argc < 2) {
102+
fprintf(stderr, "Usage: %s <pid>\n", argv[0]);
103+
return 1;
104+
}
105+
106+
pid_t pid = atoi(argv[1]);
107+
108+
// ..
109+
}
110+
```
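
`atoi` has no way to report errors: non-numeric input silently becomes `0` and trailing junk is ignored. A stricter parser could look like the sketch below. The helper name `parse_pid` is hypothetical and not part of `unrandom`.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <sys/types.h>

// Parse a decimal PID from `s` into `*out`.
// Returns 0 on success, -1 if `s` is empty, has trailing characters,
// is out of range, or is not a positive number.
int parse_pid(const char *s, pid_t *out) {
    char *end;
    errno = 0;
    long v = strtol(s, &end, 10);
    if (errno != 0 || end == s || *end != '\0' || v <= 0 || v > INT_MAX) {
        return -1;
    }
    *out = (pid_t)v;
    return 0;
}
```

In `main`, this would replace the `atoi` call: if `parse_pid(argv[1], &pid)` returns `-1`, print the usage message and exit.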
111+
112+
Next, we need to attach to the Python process (the tracee) so that `unrandom.c` (the tracer) can gain control.
113+
114+
```c
115+
// Attach to the process with the given PID and initiate tracing. This sends a
116+
// SIGSTOP to the tracee to halt its execution.
117+
if (ptrace(PTRACE_ATTACH, pid, NULL, NULL) == -1) {
118+
perror("ptrace attach");
119+
return 1;
120+
}
121+
122+
// Wait for the tracee to stop and become ready for further tracing.
123+
waitpid(pid, 0, 0);
124+
```
125+
126+
The main part of `unrandom` is a loop where we intercept the entry and exit of each syscall.
127+
128+
On the entry, we'll read the tracee's register values and check if the syscall is `getrandom`; if so, then on the exit, we will write to the buffer that the Python process passed as a reference (it expects random bytes to be inside this buffer).
129+
130+
Let's start by debug logging to see what's going on.
131+
132+
```c
133+
for (;;) {
134+
// Restart the tracee and stop at the next system call entry or exit. Here,
135+
// we enter the syscall.
136+
if (ptrace(PTRACE_SYSCALL, pid, 0, 0) == -1) {
137+
perror("ptrace syscall enter");
138+
break;
139+
}
140+
waitpid(pid, 0, 0);
141+
142+
// Retrieve the tracee's register values.
143+
struct user_regs_struct regs;
144+
if (ptrace(PTRACE_GETREGS, pid, 0, &regs) == -1) {
145+
perror("ptrace getregs");
146+
break;
147+
}
148+
149+
// Check if the syscall being traced is SYS_getrandom.
150+
int intercepted = 0;
151+
if (regs.orig_rax == SYS_getrandom) {
152+
intercepted = 1;
153+
}
154+
155+
// Exit the syscall and wait for the tracee to stop again.
156+
if (ptrace(PTRACE_SYSCALL, pid, 0, 0) == -1) {
157+
perror("ptrace syscall exit");
158+
break;
159+
}
160+
waitpid(pid, 0, 0);
161+
162+
if (intercepted) {
163+
fprintf(stderr,
164+
"intercepted getrandom call: regs.rdi = %llu, regs.rsi = %llu\n",
165+
regs.rdi, regs.rsi);
166+
}
167+
}
168+
```
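
The loop above passes `0` as the status pointer to `waitpid`, so it treats every stop as a syscall stop and cannot tell when the tracee has exited. A more defensive variant could inspect the status. The sketch below assumes the tracer has first called `ptrace(PTRACE_SETOPTIONS, pid, 0, PTRACE_O_TRACESYSGOOD)`, which makes syscall stops report `SIGTRAP | 0x80` as the stop signal. The names `stop_kind` and `classify_status` are hypothetical.

```c
#include <signal.h>
#include <sys/wait.h>

enum stop_kind { STOP_EXITED, STOP_SYSCALL, STOP_OTHER };

// Classify a status value filled in by waitpid(pid, &status, 0).
// Assumes PTRACE_O_TRACESYSGOOD is set on the tracee.
enum stop_kind classify_status(int status) {
    if (WIFEXITED(status) || WIFSIGNALED(status)) {
        return STOP_EXITED;   // tracee is gone; the loop should end
    }
    if (WIFSTOPPED(status) && WSTOPSIG(status) == (SIGTRAP | 0x80)) {
        return STOP_SYSCALL;  // syscall-enter or syscall-exit stop
    }
    return STOP_OTHER;        // e.g. a signal-delivery stop
}
```

With this, the loop could break on `STOP_EXITED` and only read registers on `STOP_SYSCALL`.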
169+
170+
I compiled this with `gcc -o unrandom unrandom.c`, started a Python REPL, grabbed the pid, and ran `./unrandom <pid>` in a different session.
171+
172+
My `unrandom` program didn't print anything initially; it let all the non-getrandom syscalls through to the kernel, and back, without interference. But when I ran `os.urandom(8)` in the REPL, `unrandom` logged this:
173+
174+
```bash
175+
intercepted getrandom call: regs.rdi = 140219284068912, regs.rsi = 8
176+
```
177+
178+
If we look up a [system call table](https://blog.rchapman.org/posts/Linux_System_Call_Table_for_x86_64/) for x86-64, we can check what these register values mean:
179+
180+
- rdi: `char __user *buf`
181+
- rsi: `size_t count`
182+
183+
We need to write `count` zero bytes to `*buf` after the syscall exits. It's important that it's *after*; otherwise the kernel will overwrite our modifications when it fills the buffer during the syscall.
184+
185+
```c
186+
if (intercepted) {
187+
fprintf(stderr,
188+
"intercepted getrandom call: regs.rdi = %llu, regs.rsi = %llu\n",
189+
regs.rdi, regs.rsi);
190+
191+
unsigned long long buf = regs.rdi;
192+
size_t count = regs.rsi;
193+
194+
// Overwrite the buffer contents with zeroes.
195+
for (size_t i = 0; i < count; i += sizeof(long)) {
196+
if (ptrace(PTRACE_POKEDATA, pid, buf + i, 0) == -1) {
197+
perror("ptrace pokedata");
198+
break;
199+
}
200+
}
201+
202+
// Set the return value to indicate the amount of data written.
203+
regs.rax = count;
204+
205+
// Modify the tracee's registers to reflect the changes made.
206+
if (ptrace(PTRACE_SETREGS, pid, 0, &regs) == -1) {
207+
perror("ptrace setregs");
208+
break;
209+
}
210+
}
211+
```
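
One caveat with the loop above: `PTRACE_POKEDATA` always writes a full word (`sizeof(long)` bytes, 8 on x86-64), so it writes past the end of the buffer whenever `count` is not a multiple of the word size, e.g. for `os.urandom(5)`. One way to avoid that, sketched here with the hypothetical helpers `zero_leading_bytes` and `zero_tracee_buffer`, is to read the final partial word with `PTRACE_PEEKDATA`, zero only the bytes that belong to the buffer, and write the merged word back.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/ptrace.h>
#include <sys/types.h>

// Return `word` with its first `n` bytes (in memory order) set to zero.
// `n` is clamped to sizeof(long); the remaining bytes are left untouched.
long zero_leading_bytes(long word, size_t n) {
    if (n > sizeof(long)) {
        n = sizeof(long);
    }
    memset(&word, 0, n);
    return word;
}

// Zero exactly `count` bytes at address `buf` in the tracee.
// Returns 0 on success, -1 if any ptrace call fails.
int zero_tracee_buffer(pid_t pid, unsigned long long buf, size_t count) {
    size_t i = 0;

    // Whole words can be overwritten directly.
    for (; count - i >= sizeof(long); i += sizeof(long)) {
        if (ptrace(PTRACE_POKEDATA, pid, (void *)(uintptr_t)(buf + i),
                   (void *)0) == -1) {
            return -1;
        }
    }

    // A trailing partial word must keep the bytes that follow the buffer.
    size_t tail = count - i;
    if (tail > 0) {
        void *addr = (void *)(uintptr_t)(buf + i);
        errno = 0;  // PEEKDATA returns the data itself, so check errno
        long word = ptrace(PTRACE_PEEKDATA, pid, addr, NULL);
        if (word == -1 && errno != 0) {
            return -1;
        }
        word = zero_leading_bytes(word, tail);
        if (ptrace(PTRACE_POKEDATA, pid, addr, (void *)word) == -1) {
            return -1;
        }
    }
    return 0;
}
```

The `for` loop inside the `if (intercepted)` block could then be replaced by a single call to `zero_tracee_buffer(pid, buf, count)`.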
212+
213+
When a Python process is the tracee of `unrandom`, all `getrandom` syscalls will return zeroes. This means that `os.urandom` returns as many `\x00` bytes as requested, and `random.randint` returns a deterministic sequence of numbers (the same series every time the process restarts, because the module is always seeded with zeroes; internally, it uses the [Mersenne Twister](https://en.wikipedia.org/wiki/Mersenne_Twister) as the core generator).
214+
215+
This is what it looks like in a traced REPL:
216+
217+
```bash
218+
Python 3.11.2 (main, May 2 2024, 6:59:08) [GCC 12.2.0] on linux
219+
Type "help", "copyright", "credits" or "license" for more information.
220+
>>> import os
221+
>>> os.urandom(8)
222+
b'\x00\x00\x00\x00\x00\x00\x00\x00'
223+
>>> os.urandom(8)
224+
b'\x00\x00\x00\x00\x00\x00\x00\x00'
225+
>>> import random
226+
>>> random.randint(0, 10)
227+
5
228+
>>> random.randint(0, 10)
229+
8
230+
>>> random.randint(0, 10)
231+
0
232+
# these last three numbers are the same every time the process restarts!
233+
```
234+
235+
Detour complete. The source code for `unrandom` is [on GitHub](https://github.com/healeycodes/unrandom). I imagine it will run on most x86-64 Linux distributions.
236+
237+
My main resource was the [man page for ptrace](https://man7.org/linux/man-pages/man2/ptrace.2.html). These two blog posts also have helpful code examples and some fun ideas: [Intercepting and Emulating Linux System Calls with Ptrace](https://nullprogram.com/blog/2018/06/23/) and [Modifying System Call Arguments With ptrace](https://www.alfonsobeato.net/c/modifying-system-call-arguments-with-ptrace/).
238+
239+
It was fun digging into system call tracing, so I'm going to do some more research into how the tracing tools I use work under the hood!
