Add support for free-threading builds of CPython #243
dpdani
commented
Nov 17, 2024
- Compile fails on 3.14 no-GIL #231
# Conflicts: # .github/workflows/test.yml # pyproject.toml
In addition to the failure you left a comment for, I also see a different failure which is more obviously a thread safety issue in hypothesis itself:
Hypothesis even warns about this:
So I suspect the other failure has a similar cause. And indeed, according to the hypothesis docs, running hypothesis simultaneously in multiple threads is not supported: https://hypothesis.readthedocs.io/en/latest/details.html#thread-safety-policy I'll open an issue to document this in the free-threading porting guide, and an issue on pytest-run-parallel to hopefully detect this and warn about it. I think the fix for the python-zstandard tests is to mark any tests using hypothesis as thread-unsafe.
It's also probably worth adding some tests for sharing a compressor or decompressor between threads. It looks like the GIL does get released when calling into the C zstd library, so any C-level thread safety issues from sharing zstd contexts between threads are probably also present in the GIL-enabled build, and no one has reported them yet.
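A minimal sketch of the kind of harness such a shared-object test could use, assuming nothing about the project's actual test suite: `run_concurrently` is a hypothetical helper that releases all threads from a barrier at once and records what each one returned or raised.

```python
import threading


def run_concurrently(worker, n_threads=4):
    """Call worker(index) from n_threads threads released together by a barrier.

    Hypothetical helper: returns a list holding each thread's return value,
    or the exception it raised.
    """
    barrier = threading.Barrier(n_threads)
    results = [None] * n_threads

    def target(index):
        barrier.wait()  # line the threads up so their calls overlap
        try:
            results[index] = worker(index)
        except Exception as exc:  # record it instead of losing it in the thread
            results[index] = exc

    threads = [threading.Thread(target=target, args=(i,)) for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

In a real test the worker would close over one shared compressor or decompressor and call into it. Note that this only captures Python-level exceptions; a segfault in the C layer still takes down the whole test process.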
Hmm, it's weird that I'm not seeing that. How are you running the tests?
I'm running it all locally on my Mac dev machine. I installed the library with
I've added I'm seeing a lot of warnings with the annotations, am I using it wrong?
I've also added a test for a compressor object that is shared between several threads, and it does cause a segmentation fault in both the free-threading and the default builds. This mode of passing data to de/compression contexts for python-zstd does not seem to make sense anyway.
You can make it thread-safe by adding a per-decompressor lock, something like this: https://py-free-threading.github.io/porting/#dealing-with-thread-unsafe-libraries. Of course that won't scale well, but as you said it's a weird thing to do. Does the test segfault with the GIL too? I wouldn't be surprised if it does. If so, the fact that no one has reported this issue means it's not a big problem in practice, and maybe you can just document the thread safety caveats? That said, I suspect Python threading will spike in popularity soon as people adopt free-threading, so it was probably inevitable that someone would hit this eventually, in which case it probably is worth adding the locking.
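For illustration, the per-object lock idea can be sketched at the Python level. This is an assumption-laden stand-in, not python-zstandard code: `LockedCompressor` is a hypothetical wrapper, and the standard library's `zlib` plays the role of a zstd context.

```python
import threading
import zlib


class LockedCompressor:
    """Serialize access to one streaming compressor with a per-object lock.

    Hypothetical wrapper; zlib stands in for a zstd compression context.
    """

    def __init__(self, level=6):
        self._lock = threading.Lock()
        self._compressor = zlib.compressobj(level)
        self._chunks = []

    def write(self, data):
        # Compress and record the output under the same lock so that the
        # chunk order always matches the order of the compress calls.
        with self._lock:
            self._chunks.append(self._compressor.compress(data))

    def finish(self):
        with self._lock:
            self._chunks.append(self._compressor.flush())
            return b"".join(self._chunks)
```

Every call waits for the lock, so threads sharing one object run one at a time. That is the scaling cost mentioned above, and it is paid only by code that actually shares a context.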
Oh, you said it's a bug in the default build; I missed that. We've generally been trying to fix pre-existing thread safety issues that can be triggered in the default build when we can, but we don't see them as blockers for shipping wheels.
Maybe @andfoy or @lysnikolaou know what's up with the warning.
The docs in We need to ensure we don't crash if someone violates the "concurrent operations on multiple threads" rule. I'm fine with undefined behavior if someone attempts operations on the same zstd context from multiple threads. But I would prefer detecting and raising an error if this can be implemented with minimal runtime cost.
You could have an atomic flag that a thread sets when it acquires the context; if another thread tries to acquire a context with the flag set, that would be an error. It's a little obnoxious to write cross-platform C code that uses atomics (see
One fairly easy cross-platform option is to just use PyMutex behind some essentially, for Python >= 3.13 we could guarantee that an exception is raised on concurrent use, and for prior versions we could retain the current behavior.
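The detect-and-raise behavior discussed in the last few comments can be sketched in Python. This is only a model of the semantics, not the C implementation: `ExclusiveContext` and `ConcurrentUseError` are hypothetical names, and a non-blocking `threading.Lock.acquire` stands in for an atomic test-and-set.

```python
import threading


class ConcurrentUseError(RuntimeError):
    """Hypothetical error type for overlapping use of one context."""


class ExclusiveContext:
    """Raise instead of blocking when a context is already in use."""

    def __init__(self):
        self._busy = threading.Lock()

    def __enter__(self):
        # The non-blocking acquire plays the role of an atomic test-and-set:
        # it either claims the flag or reports that it was already set.
        if not self._busy.acquire(blocking=False):
            raise ConcurrentUseError("context is already in use")
        return self

    def __exit__(self, exc_type, exc, tb):
        self._busy.release()
        return False  # never swallow exceptions from the guarded block
```

Because the flag is not reentrant, nested use from the same thread also raises in this sketch. An uncontended acquire is cheap, which is the property that matters for the "minimal runtime cost" requirement.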
Not sure if I should do it in this PR or open a new one.
You can also use PyThread_type_lock on older Python versions. It's sadly undocumented in CPython, but you can take a look at what I did to make NumPy's use of lapack_lite thread safe to see how to conditionally use either depending on Python version: https://github.com/numpy/numpy/blob/main/numpy/linalg/umath_linalg.cpp. Grep for HAVE_EXTERNAL_LAPACK and LOCK_LAPACK_LITE. The main differences from PyMutex are that it's slower, supports a try-lock API (which you probably don't need), and requires a heap allocation.
@ngoldbaum I've ended up opting for an atomic flag, partly using your @indygreg the modifications I just pushed make it so that when a I believe the performance impact is as small as it can possibly be.
Just a note that you're not doing a relaxed read - at least for the C11 atomics case you would need to use atomic_load_explicit. Right now as I understand it this is using SeqCst ordering for all operations. It's possible to write something that will likely be fast and scale better, but given that this implements a flag that triggers an error case on multithreaded access to shared resources, I doubt that matters.
c-ext/_pyzstd_atomics.h
Outdated
I think in the long-term there should probably be some sort of C threading utility library that C extensions can use, including a full copy of the CPython atomics headers. Something that can be easily vendored as a submodule and ideally header-only. That said, that project doesn't yet exist so copying numpy's header for this purpose is probably the most practical solution for python-zstandard
nice catch! thanks 🙏
yeah, this is not the ideal solution. I hoped that CPython would expose them, but it seems unlikely it will.
a header-only library would be very nice, there may be other people interested in having it, out there in the C world.
maybe we could eventually push for CPython to create a separate C package and consume it itself, too.
ping @SonicField (author of ft_utils), we were talking about this last week.
didn't know about that library, looks very similar to my https://dpdani.github.io/cereggii/, maybe we could collaborate in the future!
@indygreg we'd appreciate it if you could give us your opinion on this approach and/or some code review. It would be really helpful to be able to have cp313t wheels available. One particularly disruptive impact of python-zstandard failing to build on the free-threaded Python at the moment is that There are probably things that could happen inside hatch to avoid this issue, but I think the "right" fix is for python-zstandard to help out a bit.
Sorry it took so long to look at this. I've been exceptionally busy.
I suspect the level of support for free-threaded builds has improved since this PR was authored.
Please refresh and try to use the latest versions of things (with presumed FT compatibility) so we don't have to hack around missing support when 3.13 initially shipped.
Please split out the @pytest.mark.thread_unsafe annotations to their own PR along with adding the initial FT coverage to CI. I want to see the main branch running FT builds successfully, even if like 90% of tests are skipped.
Then we can revisit the meat of this change, which is adding the free-threaded detection to the C code.
Is that reasonable?
.github/workflows/test.yml
Outdated
@@ -59,47 +60,60 @@ jobs:
 PYTHONDEVMODE: '1'
 steps:
 - name: Set up Python
-  uses: actions/setup-python@v5
+  uses: Quansight-Labs/setup-python@v5
Why do we need to adopt a less official action? Does actions/setup-python not support the free-threaded builds?
Unfortunately, GitHub has been quite unresponsive to this: actions/setup-python#771
Though, there seems to be some recent activity: actions/setup-python#973
The fork is tracking upstream and has the open PR to add free-threading support applied.
You could also use setup-uv, which can be used as a drop-in replacement if you install pip into the uv environment.
Well, my (former) project provides the Python builds for uv. I actually trust those Python builds and the people behind uv. So astral-sh/setup-uv would be my preference.
 # TODO enable once PyO3 supports 3.13.
 - name: Build (Rust)
-  if: matrix.arch == 'x64' && matrix.py != '3.13'
+  if: matrix.arch == 'x64' && !startsWith(matrix.py, '3.13')
I'm guessing this is supported now. But it would be scope bloat to resolve it in this PR.
I opened a follow-up issue to track this: #251
.github/workflows/test.yml
Outdated
- name: Test CFFI Backend
  if: "!startsWith(matrix.py, '3.13')" # see pyproject.toml:4
Presumably this limitation no longer holds.
Unfortunately no, CFFI is one of the last low-level major dependencies without support. Recently the maintainers asked our team to work on a fork with free-threading support so they can review one big PR:
Hi Gregory 👋 Thanks for coming back to us! Tomorrow I'll take a look at bumping numbers 👍
Do you mean you would like the annotations merged beforehand?
Do you have any thoughts on this?
Yesterday I made the change that I just pushed, but didn't have the time to go further than that.
@dpdani I disabled the cffi backend in See https://github.com/ngoldbaum/python-zstandard/tree/feature/3.13t. Also all the tests pass with no skips if I set
@ngoldbaum invitation sent 👍
I opened #257 to update to PyO3 0.22, which if merged would allow this PR to only turn off the Rust backend for the free-threaded build.
tests/test_compressor_fuzzing.py
Outdated
@@ -2,6 +2,9 @@
 import os
 import unittest

+import pytest
these imports and the comment can probably be reverted as well @ngoldbaum
ci/requirements.freethreading.in
Outdated
# This is a dependency of pytest on Windows but isn't picked up by pip-compile.
atomicwrites
cibuildwheel
#cffi
Given that we're disabling cffi with setup.py, maybe it makes sense to uncomment this line.
Probably fewer headaches in the future.
IIRC this was the only difference, so these special requirements files for free threading can be removed.
(I'm on my phone, sorry I can't check this right now.)
@indygreg can you trigger the CI here? 🙏
Thanks! It seems there's a problem with the modifications in setup.py when building for PyPy 3.9 on Windows @ngoldbaum
Actually I don't think it's related to this PR
I opened #258 to avoid the CI failures.
With the newer version of
ping @indygreg, we'd appreciate it if you could give this another pass
Sorry for the repeated ping @indygreg, can you please trigger the CI and give the PR another look? 🙏
FWIW, CFFI 2.0.0b1 added support for the free-threaded build, so any hacks for the lack of CFFI support on the free-threaded build can be removed now.
Closing since it's been resolved in other PRs.
I'm not sure whether @indygreg agrees, but now that supporting 3.14t is much more straightforward, I think it's a good time to raise the possibility of adding a lock or an atomic boolean flag that raises an error under shared multithreaded use. @dpdani, I think you might want to take that on, since you've been trying to get that working for a while now in this PR. That would require updating the docs that describe the thread safety guarantees to cover whatever behavior gets added. At the same time, you could add tests to make sure the error is raised appropriately under multithreaded use. IMO if you opened a followup that did that, and then a second followup that updates the docs and metadata to declare free-threaded support, that would be a much nicer story for officially supporting the free-threaded build here than when we tried last year, before CFFI and PyO3 had good support.
I'll take a look this weekend 👀