Towards soundness of PyByteArray::to_vec #4742
In free-threaded Python, to_vec needs to make sure to run inside a critical section so that no other Python thread is mutating the bytearray, causing UB. See also PyO3#4736
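For background on why concurrent mutation during a read is dangerous, note that CPython itself already refuses to resize a bytearray while its buffer is exported. A minimal pure-Python illustration (not PyO3 code; the critical section in this PR guards against the analogous hazard where the mutator is a concurrent thread and no such check can run):

```python
ba = bytearray(b"hello")
view = memoryview(ba)  # exports the bytearray's internal buffer

try:
    ba.extend(b" world")  # resizing while a buffer is exported is rejected
except BufferError as exc:
    print("refused:", exc)

view.release()
ba.extend(b" world")  # after releasing the view, resizing succeeds
print(ba)
```

On the free-threaded build there is no equivalent guard between a Rust-side read of the backing storage and a Python thread calling, say, `extend`, which is exactly the window `to_vec` needs to close.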
Thanks for the PR! We actually do have tests running for the free-threaded build, I would have been unhappy to declare support running without them! Similarly I have had virtualenv working just fine with 3.13t (haven't tried windows, though).
I think we could write a test which spawns a thread which does something to attempt to invalidate the data (maybe write to it using py.run or PySequenceMethods::set_slice) and confirm that the data read is the original data inserted, not the conflicting data (which should hopefully now block on either the GIL or the critical section depending on the build).
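A pure-Python sketch of that test shape (hypothetical code, not the PR's actual Rust test): a writer thread flips the bytearray between two complete payloads while the main thread repeatedly snapshots it, the analogue of PyByteArray::to_vec. On a GIL build each slice assignment and each bytes() copy is a single C-level call, so no torn read should be observed; on an unprotected free-threaded build the assertion could fail.

```python
import threading

N = 1024
ba = bytearray(b"A" * N)
stop = threading.Event()

def writer():
    # Alternate between two complete payloads; each whole-slice assignment
    # is one C-level call, so it is effectively atomic under the GIL.
    while not stop.is_set():
        ba[:] = b"B" * N
        ba[:] = b"A" * N

t = threading.Thread(target=writer)
t.start()
try:
    for _ in range(2000):
        snapshot = bytes(ba)  # the Python-level analogue of to_vec
        assert snapshot in (b"A" * N, b"B" * N), "torn read observed"
finally:
    stop.set()
    t.join()
print("no torn reads observed")
```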
Co-authored-by: David Hewitt <[email protected]>
@davidhewitt I tried to write a test along those lines. Not sure where to go from here, however: no matter how hard I tried, I couldn't get it to segfault. So maybe there's something more to it that I'm not aware of.
I think it's no surprise that it's hard to segfault; you'd have to do something like turn the uninitialized read into a cast on the bytes to create a structure in an invalid state. Nevertheless, invalid reads alone are a clear security issue. This problem clearly gets a lot worse in free-threaded Python. My knee-jerk reaction is to make all bytearray methods in PyO3 unsafe. cc @ngoldbaum @colesbury is there any upstream opinion on how to handle bytearray objects on the free-threaded build?
I can't find any discussion about bytearray and free-threading in the CPython issue tracker; you may want to file an issue, especially if you can make a pure-Python reproducer.
Thanks for that. I'm a bit unsure what the way forward here is. Without upstream also using critical sections, as you observe, adding the single section here seems a bit moot. I think we cannot change our API in a patch release, so I think the likely path at the moment is that we make all the methods unsafe in PyO3 0.24?
Seems right
…On Tue, Dec 3, 2024, 9:54 AM David Hewitt wrote:

> Thanks for that. I'm a bit unsure what the way forward here is. Without upstream also using critical sections, as you observe, adding the single section here seems a bit moot. I think we cannot change our API in a patch release so I think the likely path at the moment is that we make all the methods unsafe in PyO3 0.24?
Hopefully a future Python release will fix the thread safety issues you identified and we can at least give the free-threaded build guarantees similar to the GIL-enabled build.
I can see arguments for and against that. I'm slightly gravitating towards not doing it, though. The way I see it is that PyO3's memory safety stands and falls with the soundness of the linked Python implementation. You have to assume that it's sound; if you don't, every PyO3 API would be unsafe (which I guess is Rust's standpoint, as every FFI call is unsafe). Just my 2 cents, and you're much deeper into the world of this wonderful crate, so ofc. it's up to you to decide 😇

I guess it is 🫤 Feel free to close the PR ⚰️
It seems a fix in CPython is being worked on 👀 If that gets merged, I'll pick up this PR again.
bytearray is now thread-safe, but the fixes are going to have to wait until 3.14, I think. There are a number of thread-safety fixes that only live in 3.14. Once 3.14b1 comes out, we'll merge the open PR adding 3.14 support to PyO3 and then I think we can move this forward. We'd probably also need to do a little thinking about how to handle 3.13 having known thread safety issues, if anything.
You are right, guess I was a bit too eager 😅 (unless the fixes get backported to 3.13).
Not sure there's much to be done. a) 3.13t is an experimental CPython build, so users can't expect a fully working and stable piece of software, and b) PyO3 is just as memory-unsafe here as plain Python, so one could argue "PyO3 is not making it any worse, so it's good enough from our perspective". However, a note in the API docs around this might be appropriate.
With 3.14 now well into the beta phase, I think we should proceed with merging this. @robsdedude do you have any remnants of that test you previously wrote?
I had a quick look and it seems I do indeed have some things left on my machine. Let me see if I can polish it up and push it.
While doing so, I realized that it's incredibly hard to provoke the data race in to_vec itself. Here's an alternative suggestion that will be easier to test:
Thanks, this looks great to me. Just two small refinements, then let's merge.
src/types/bytearray.rs
Outdated
```rust
let mut handles = [&mut handle1, &mut handle2, &mut handle3];
// ...
let t0 = std::time::Instant::now();
while t0.elapsed() < Duration::from_secs(10) {
```
It might be helpful to reduce this to 0.5 secs (as long as that's still sufficient iterations), else this will have a noticeable impact on CI times.
Yeah, this is a good point. The problem with race conditions is that they're inherently non-deterministic, so there is no definitive number of "sufficient iterations". This is 100% a trade-off between CI time and how likely you want the test to be to catch the issue. Further, reducing this number to the smallest possible value on my machine such that it still satisfies my feeling of sufficient accuracy is absolutely no guarantee that this number will be a good fit for other machines (e.g., the CI runners).
That being said, on my machine it usually takes 1-2 seconds for it to fail against CPython 3.13t. But that number of course also depends on how fast a machine can actually run the iterations. So maybe I should just put a cap on the number of iterations and call it a day. Again, on my machine, each iteration amounts to just shy of 0.2 seconds.
So after a fair bit of fiddling, I saw one pass that took 17 rounds for the bug to occur. So I pinned the max number of iterations to 25 (adding some margin of error). This, however, still amounts to roughly 5 seconds. 🤷 I don't think this can be reasonably reduced without accepting that the test might not actually test what it sets out to test :/
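Combining both caps, iterations and wall-clock time, can be sketched like this (illustrative Python, not the actual Rust test; the numbers are the machine-dependent ones discussed above, and attempt_round is a made-up stand-in for one reader/writer race attempt):

```python
import time

MAX_ITERS = 25       # hard cap on rounds (assumed, with margin of error)
TIME_BUDGET = 5.0    # seconds; stop early even if MAX_ITERS is not reached

def attempt_round() -> bool:
    # Placeholder for one round of the real race attempt.
    return False  # pretend the race was not triggered this round

deadline = time.monotonic() + TIME_BUDGET
rounds = 0
triggered = False
for _ in range(MAX_ITERS):
    if time.monotonic() >= deadline:
        break  # out of time budget
    rounds += 1
    if attempt_round():
        triggered = True
        break
print(rounds, triggered)
```

Whichever limit is hit first bounds the cost, so a fast machine burns at most MAX_ITERS rounds and a slow one at most TIME_BUDGET seconds.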
Does it reproduce with smaller test data? 200 MiB seems like a lot.
Same trade-off. The smaller the size, the faster the extend, and the lower the chance of intercepting it and catching the race condition. I'll try to see if I can lower the size further without lowering the accuracy too much. Again: all of it will be tailored to my specific machine. I have no idea how well the numbers I'll end up with will generalize.
As long as there's enough iterations that eventually a CI run would trigger it (doesn't need to be every CI run), then I'm happy. If CI running all month would have zero chance of triggering it, then I'd think it wasn't enough :)
Thanks for following up, I think this is good to merge now 👍
src/types/bytearray.rs
Outdated
```rust
}
// ...
// CPython 3.13t is unsound => test fails
#[cfg(any(Py_3_14, not(all(Py_3_13, Py_GIL_DISABLED))))]
```
also needs to be skipped on emscripten
this failure mode happens often enough and wastes human time - it's probably worth adding a linter checking that all tests using threads are disabled on platforms that don't support spawning threads.
> it's probably worth adding a linter
Do you mean upstream in clippy or something else?
I'm not sure - can you write custom clippy lints like that? If you can that's probably the easiest way. You might also be able to do it with a sufficiently advanced regex...
Also just to be clear my comment above wasn't an ask for you to do that, if it came across that way 😄
I don't think we can write custom lints, so regex might be the most feasible option albeit painful. I might just pretend the lint is harder to write than just losing the time repeatedly for now 🙈
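For what it's worth, a sufficiently simple version of that regex check could look like this (a hypothetical sketch, not PyO3's actual CI tooling; the patterns and function name are assumptions):

```python
import re

# Flag Rust test sources that spawn threads without a cfg gate mentioning
# a target that cannot spawn threads (wasm32 / emscripten).
SPAWN_RE = re.compile(r"std::thread::spawn")
GATE_RE = re.compile(
    r'#\[cfg\([^)]*(?:target_arch\s*=\s*"wasm32"|target_os\s*=\s*"emscripten")'
)

def thread_test_is_gated(source: str) -> bool:
    """True if the source spawns no threads, or spawns them behind a gate."""
    if not SPAWN_RE.search(source):
        return True
    return bool(GATE_RE.search(source))

gated = '#[cfg(not(target_arch = "wasm32"))]\nfn t() { std::thread::spawn(|| {}); }'
ungated = "fn t() { std::thread::spawn(|| {}); }"
print(thread_test_is_gated(gated), thread_test_is_gated(ungated))
```

This is deliberately coarse (it doesn't check that the gate actually covers the spawning test), but it would catch the common failure mode of forgetting the gate entirely.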
Unfortunately it seems I can't write proper tests for this as Python 3.13t is not yet part of the test matrix. I'm aware that support for testing with 3.13 and 3.13t is still in its early stages and, for instance, virtualenv does not yet support it.