-
Notifications
You must be signed in to change notification settings - Fork 1
Proof of Concept of Rust in CPython #11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
This commit updates the build system to automatically detect cargo and enable/disable _base64 without needing to pass a flag. If cargo is unavailable, _base64 is disabled. It also updates cpython-sys to use a hand written header (which is what Linux seems to do) and splits off the parser bindings to be handled in the future (since the files are included differently).
There are still some issues with compilation, but those can be sorted out in a future PR.
Also make PyObject an UnsafeCell<ffi::_object> so it can be passed around by & reference
alex
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few thoughts. I'm very excited!
| } | ||
|
|
||
| let input_len = view_len as usize; | ||
| let input = unsafe { slice::from_raw_parts(buffer.as_ptr(), input_len) }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately this is technically unsound. See https://alexgaynor.net/2022/oct/23/buffers-on-the-edge/.
In pyca/cryptography we just accept the issue (https://github.com/pyca/cryptography/blob/main/src/rust/src/buf.rs#L114-L122), but I think its worth a comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose to be fully safe we'd need to copy the input buffer if the reference count was >1?
Even that probably isn't enough, since new references could be made concurrently. Yeah I guess we should just add a comment here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://docs.rs/pyo3/latest/pyo3/buffer/struct.PyBuffer.html#method.as_slice is how we model it in pyo3 -- we use a type that does atomic reads.
Like I said, in pyca/cryptography we just accept the issue. (Someone somewhere along the line expressed interest in a PEP to improve the bufffer interface for this, but it never happened.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I think this is reinforcing to me that we should focus on safe stable API abstractions first then build on those. I'll probably start working on those in another branch to have something to propose.
| } | ||
| return Err(()); | ||
| } | ||
| let dest = unsafe { slice::from_raw_parts_mut(dest_ptr.cast::<u8>(), output_len) }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is sound -- PyBytes_FromStringAndSize isn't guaranteed to initialize the values, and &[u8] is required to be initialized bytes.
We need to either use MaybeUninit or to initialize the values.
| edition = "2024" | ||
|
|
||
| [dependencies] | ||
| cpython-sys ={ path = "../cpython-sys" } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should specify this as a workspace dep for simplicity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed!
| @@ -0,0 +1,2 @@ | |||
| [toolchain] | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to specify a default toolchain?
What we do for pyca/cryptography is specify an MSRV and then people can bring whatever version of rust they link. (And in CI we test on MSRV, possible-next-MSRV, stable, beta, and nightly.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, we should just specify an MSRV, this should probably be removed now. Thanks for the review!
This PR introduces a plausible approach for introducing Rust as an optional build dependency for CPython. A few changes are necessary to accomplish this:
Modules/cpython-sys. This isn't super well integrated into the Makefile right now. It requires runningmake regen-rust-wrapper-hto generate the wrapper first, then you can run make normally._base64is added, which has a single functionstandard_b64encodeThere are also a lot of other caveats and limitations. I tested this with gcc and clang on a Linux system, with a GIL-ful build. There's no Mac or Windows support yet, or support for other platforms, but I believe Mac WASI Android and iOS may not be too hard since they share the same build system as Linux..
Windows will require a different approach. I'm not sure yet on how exactly we should do that.
Rough series of commands to get things working:
Install Rust and make sure
cargois available.