PEP 797: Shared Object Proxies for Subinterpreters #4536

@@ -0,0 +1,271 @@
PEP: 797
Title: Shared Object Proxies for Subinterpreters
Author: Peter Bierma <[email protected]>
Discussions-To: Pending
Status: Draft
Type: Standards Track
Created: 08-Aug-2025
Python-Version: 3.15
Post-History: `01-Jul-2025 <https://discuss.python.org/t/97306>`__


Abstract
========

This PEP introduces a new :func:`~concurrent.interpreters.share` function to
the :mod:`concurrent.interpreters` module, which allows *any* arbitrary object
to be shared for a period of time.

.. Review comment: Shared between what? Processes, threads, interpreters, over the network, etc?

For example::

    from concurrent import interpreters

    with open("spanish_inquisition.txt", "w") as unshareable:
        interp = interpreters.create()
        with interpreters.share(unshareable) as proxy:
            interp.prepare_main(file=proxy)
            interp.exec("file.write(\"I didn't expect the Spanish Inquisition\")")

.. Review comment (on the ``open()`` line): always use an encoding!


Motivation
==========

Many Objects Cannot be Shared Between Subinterpreters
-----------------------------------------------------

In Python 3.14, the new :mod:`concurrent.interpreters` module can be used to
create multiple interpreters in a single Python process. This works well for
stateless code (that is, code that doesn't need any state from a caller) and
objects that can be serialized, but it is fairly common for applications to
want to use highly complex data structures (that cannot be serialized) with
their concurrency.

.. Review comment: In general I would use fewer parenthetical asides, though this is style. Here, I don't think the comment about serialisation adds much, it's clear from context we're talking about non-serialisable data (or even data that could be serialised but that is too expensive to do so)

Currently, :mod:`!concurrent.interpreters` can only share
:ref:`a handful of types <interp-object-sharing>` natively, and then falls back
to the :mod:`pickle` module for other types. This can be very limiting, as many
types of objects cannot be pickled. For example, file objects returned by
:func:`open` cannot be serialized through ``pickle``.
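
The pickling limitation is easy to confirm with the standard library alone. The snippet below is a minimal sketch: it opens ``os.devnull`` only to have a real text file object at hand, and the variable name ``error_name`` is purely illustrative.

```python
import os
import pickle

# Open a throwaway text file object; os.devnull exists on every platform.
with open(os.devnull, "w", encoding="utf-8") as handle:
    try:
        pickle.dumps(handle)
    except TypeError as exc:
        # pickle refuses file objects, so they cannot cross interpreters
        # through the serialization fallback.
        error_name = type(exc).__name__
    else:
        error_name = None
```

Because the fallback path relies on ``pickle``, any object that fails this way currently has no route into another interpreter.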

Rationale
=========

A Fallback for Object Sharing
-----------------------------

A shared object proxy is designed to be a fallback for sharing an object
between interpreters, because it's generally slow and causes increased memory
usage (due to :term:`immortality <immortal>`, which will be discussed more
later). As such, this PEP does not make other mechanisms for sharing objects
(namely, serialization) obsolete. A shared object proxy should only be used as
a last resort for highly complex objects that cannot be serialized or shared
in any other way.

.. Review comment (diff lines +57 to +58): Which part/section are you referring to here?

.. Review comment (diff lines +55 to +61): The PEP might benefit from an extended discussion of the performance trade-offs. E.g. serialisation may have a higher start-up/init cost, but is faster otherwise, whereas I assume these proxies are cheap to create but more expensive to use.


Specification
=============

The ``SharedObjectProxy`` Type
------------------------------

.. class:: concurrent.interpreters.SharedObjectProxy

   A proxy type that allows thread-safe access to an object across multiple
   interpreters. This cannot be constructed from Python; instead, use the
   :func:`~concurrent.interpreters.share` function.

   When interacting with the wrapped object, the proxy will switch to the
   interpreter in which the object was created. Arguments passed to anything
   on the proxy are also wrapped in a new shared object proxy if the type
   isn't natively shareable (so, for example, strings would not be wrapped
   in an object proxy, but file objects would). The same goes for return
   values.

   For thread-safety purposes, an instance of ``SharedObjectProxy`` is
   always :term:`immortal`. This means that it won't be deallocated for the
   lifetime of the interpreter. When an object proxy is done being used, it
   clears its reference to the object that it wraps and allows itself to be
   reused. This prevents extreme memory accumulation.

.. Review comment: "done being used" is imprecise, is this at object destruction time?

.. Review comment (diff lines +88 to +89, the paragraph below): It might be good to talk about why this was chosen, as this could(?) be surprising behaviour. I don't know enough to be sure, though. 'Explicit is better than implicit'!

In addition, all object proxies have an implicit context that manages them.
This context is determined by the most recent call to
:func:`~concurrent.interpreters.share` in the current thread. When the context
finishes, all object proxies created under that context are cleared, allowing
them to be reused in a new context.
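
The lifecycle described above (a context owns its proxies, clears them when it finishes, and dead proxies are reused) can be modelled in plain Python. This is only a toy sketch of the bookkeeping, not the reference implementation: ``ToyProxy``, ``ToyContext``, ``toy_share`` and ``_pool`` are hypothetical names, and real proxies are immortal C-level objects rather than ordinary instances.

```python
from contextlib import contextmanager

_pool = []  # dead proxies waiting to be reused (hypothetical bookkeeping)


class ToyProxy:
    """Stand-in for a proxy: holds one reference until it is cleared."""

    def __init__(self):
        self.wrapped = None  # None marks a dead proxy
        self.context = None


class ToyContext:
    """Owns every proxy created while it is active."""

    def __init__(self):
        self.proxies = []

    def wrap(self, obj):
        if obj is None:
            raise ValueError("None is reserved for dead proxies")
        # Prefer a dead proxy over allocating a new one.
        proxy = _pool.pop() if _pool else ToyProxy()
        proxy.wrapped = obj
        proxy.context = self
        self.proxies.append(proxy)
        return proxy

    def close(self):
        # Clear every proxy so the wrapped objects can be freed,
        # then return the proxies to the pool.
        for proxy in self.proxies:
            proxy.wrapped = None
            proxy.context = None
        _pool.extend(self.proxies)
        self.proxies.clear()


@contextmanager
def toy_share(obj):
    context = ToyContext()
    try:
        yield context.wrap(obj)
    finally:
        context.close()


with toy_share(["spam"]) as proxy:
    # A proxy derived from the first one joins the same context.
    derived = proxy.context.wrap({"eggs": 1})
    live_inside = proxy.wrapped is not None and derived.wrapped is not None

cleared_after = proxy.wrapped is None and derived.wrapped is None
pooled = len(_pool)

with toy_share(("ham",)) as again:
    reused = again is proxy or again is derived
```

In this model the number of proxy objects is bounded by the largest number alive at once, which is the property the PEP relies on to limit memory growth.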

Thread State Switching
**********************

.. Review comment (diff lines +94 to +95): This stanza could probably go into Rationale, I think it's too detailed for Specification.

At the C level, all objects in Python's C API are interacted with through their
type (a pointer to a :c:type:`PyTypeObject`). For example, to call an object,
the interpreter will access the :c:member:`~PyTypeObject.tp_call` field on the
object's type. This is where the magic of a shared object proxy can happen.

.. Review comment (diff lines +97 to +98): I don't like this phrasing. Maybe 'all interactions occur via the type' or etc?

The :c:type:`!PyTypeObject` for a shared object proxy must implement
wrapping behavior for every single field on the type object structure.
So, going back to ``tp_call``, an object proxy must be able to "intercept" the
call in such a way that the wrapped object's ``tp_call`` slot can be executed
without thread-safety issues. This is done by switching the
:term:`attached thread state`.
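
The rule that calls go through the type rather than the instance can be observed from pure Python, where ``tp_call`` corresponds to ``__call__`` defined on the class. The sketch below uses a hypothetical ``Greeter`` class to show that an instance attribute named ``__call__`` does not affect what happens when the object is called.

```python
class Greeter:
    def __call__(self):
        return "called through the type"


greeter = Greeter()

# Shadow __call__ on the instance. The call machinery ignores this,
# because it looks the slot up on type(greeter), not on the instance.
greeter.__call__ = lambda: "called through the instance"

result = greeter()
explicit = type(greeter).__call__(greeter)
```

Since every interaction is routed through the type, a proxy type that wraps each slot sees every operation performed on it.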

In the C API, a :term:`thread state` belongs to a certain interpreter, and by
holding an attached thread state, the thread may interact with any object
belonging to its interpreter. This is because holding an attached thread state
implies things like holding the :term:`GIL`, which make object access thread-safe.

.. note::

   On the :term:`free threaded <free threading>` build, it is still required
   to hold an :term:`attached thread state` to interact with objects in the
   C API.

.. Review comment (diff lines +116 to +118): Active voice, but also I wonder if this note is needed? This is a technical PEP, we are allowed to assume the reader is competent.

So, with that in mind, the only thing that the object proxy has to do to call
a type slot is hold an attached thread state for the object's interpreter.
This is the fundamental idea of how a shared object proxy works: allow access
from any interpreter, but switch to the wrapped object's interpreter when a type
slot is called.
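
As a loose analogy only, the "switch to the owner before touching the object" idea can be imitated with threads and a lock. In the sketch below a lock stands in for the owning interpreter's thread state; ``ToyInterpreter`` and ``SwitchingProxy`` are hypothetical names, and the real mechanism swaps attached thread states in C rather than taking a Python-level lock.

```python
import threading


class ToyInterpreter:
    """Stand-in for an interpreter: one lock plays the role of its GIL."""

    def __init__(self, name):
        self.name = name
        self.lock = threading.RLock()


class SwitchingProxy:
    """Forward calls to the wrapped object while holding its owner's lock."""

    def __init__(self, wrapped, owner):
        self._wrapped = wrapped
        self._owner = owner

    def __call__(self, *args, **kwargs):
        # "Switch" to the owner before running the wrapped object's code.
        with self._owner.lock:
            return self._wrapped(*args, **kwargs)


owner = ToyInterpreter("main")
counter = {"value": 0}


def bump():
    # Not safe to run concurrently without the owner's lock.
    counter["value"] += 1
    return counter["value"]


proxy = SwitchingProxy(bump, owner)


def worker():
    for _ in range(1000):
        proxy()


threads = [threading.Thread(target=worker) for _ in range(4)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

total = counter["value"]
```

Every caller is serialized on the owner's lock, which mirrors the scaling cost the PEP notes for builds where the GIL is enabled.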

Sharing Arguments and Return Values
***********************************

Once the attached thread state has been switched to match a wrapped object's
interpreter, arguments and the return value (if it's a ``PyObject *``) of the
slot need to be shared back to the caller. This is done by first attempting to
share them natively (for example, with ``pickle``), and then falling back to
creating a new shared object proxy if all else fails. The new proxy is given
the same context as the current proxy, meaning the newly wrapped object will
be able to be freed once the :func:`~concurrent.interpreters.share` context
is closed.
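
The "copy if possible, otherwise proxy" rule can be sketched in a few lines. The example below is an illustration under stated assumptions: ``pickle`` stands in for all native sharing, and ``FallbackProxy`` and ``share_value`` are hypothetical names rather than part of the proposed API.

```python
import pickle
import threading


class FallbackProxy:
    """Hypothetical wrapper used when a value cannot be copied."""

    def __init__(self, wrapped):
        self.wrapped = wrapped


def share_value(value):
    """Return ("copied", data) when pickle works, else ("proxied", proxy)."""
    try:
        return "copied", pickle.dumps(value)
    except (TypeError, AttributeError, pickle.PicklingError):
        # The value cannot be serialized, so wrap it instead of copying it.
        return "proxied", FallbackProxy(value)


kind_for_text, payload = share_value("hello")

lock = threading.Lock()  # locks cannot be pickled
kind_for_lock, wrapper = share_value(lock)
```

A string travels as a copy, while the lock ends up behind a wrapper that still refers to the original object.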

The Sharing APIs
----------------

.. function:: concurrent.interpreters.share(obj)

   Wrap *obj* in a :class:`~concurrent.interpreters.SharedObjectProxy`,
   allowing it to be used in other interpreter APIs as if it were natively shareable.

   This returns a :term:`context manager`. The resulting object from the
   context is the proxy that can be shared. After the context is closed, the
   proxy will release its reference to *obj* and allow itself to be reused
   for a future call to ``share``.

   If this function is used on an existing shared object proxy, it is assigned
   a new context, preventing it from being cleared when the parent ``share``
   context finishes.

   For example:

   .. code-block:: python

      from concurrent import interpreters

      with open("spanish_inquisition.txt", "w") as unshareable:
          interp = interpreters.create()
          with interpreters.share(unshareable) as proxy:
              interp.prepare_main(file=proxy)
              interp.exec("file.write(\"I didn't expect the Spanish Inquisition\")")

   .. note::

      ``None`` cannot be used with this function, as ``None`` is a special
      value reserved for dead object proxies. Since ``None`` is natively
      shareable, there's no need to pass it to this function anyway.

.. Review comment (diff lines +168 to +172): This should be in the main prose, not a note.

.. function:: concurrent.interpreters.share_forever(obj)

   Similar to :func:`~concurrent.interpreters.share`, but *does not* give the resulting
   proxy a context, meaning it will live forever (unless a call to ``share``
   explicitly gives the proxy a new lifetime). As such, this function does not
   return a :term:`context manager`.

   For example:

   .. code-block:: python

      from concurrent import interpreters

      with open("spanish_inquisition.txt", "w") as unshareable:
          interp = interpreters.create()
          proxy = interpreters.share_forever(unshareable)
          interp.prepare_main(file=proxy)
          # Note: the bound method object for file.write() will also live
          # forever in a proxy.
          interp.exec("file.write(\"I didn't expect the Spanish Inquisition\")")

   .. warning::

      Proxies created as a result of the returned proxy (for example, bound
      method objects) will also exist for the lifetime of the interpreter,
      which can lead to high memory usage.


Multithreaded Scaling
---------------------

Since an object proxy mostly interacts with an object normally, there shouldn't
be much additional overhead on using the object once the thread state has been
switched. However, this means that when the :term:`GIL` is enabled, you may lose
some of the concurrency benefits from subinterpreters, because threads will be
stuck waiting on the GIL of a wrapped object's interpreter.


Backwards Compatibility
=======================

In order to implement the immortality mechanism used by shared object proxies,
several assumptions had to be made about the object lifecycle in the C API.
So, some best practices in the C API (such as using the object allocator for
objects) are made harder requirements by the implementation of this PEP.

.. Review comment (diff lines +214 to +217): Be explicit about what those assumptions are. In general, it is hard to make assumptions about what code in the wild is or is not doing (Hyrum's Law etc)

The author of this PEP believes it is unlikely that this will cause breakage,
as he has never seen code in the wild that violates the assumptions made
about the object lifecycle as required by the reference implementation.

.. Review comment (diff lines +219 to +221): There are millions of lines of proprietary code that you & I haven't seen ;-) More generally just re-emphasising the point on explicitly enumerating the changes in semantics that you propose.


Security Implications
=====================

The largest issue with shared object proxies is that in order to have
thread-safe reference counting operations, they must be :term:`immortal`,
which prevents any concurrent modification to their reference count.
This can cause them to take up very large amounts of memory if mismanaged.

.. Review comment: Large amounts of memory -> denial of service attack, which is the actual security concern here

The :func:`~concurrent.interpreters.share` context manager does its best
to avoid this issue by manually clearing references at the end of an object
proxy's usage (allowing mortal objects to be freed), as well as avoiding
the allocation of new object proxies by reusing dead ones (that is, object
proxies with a cleared reference).


How to Teach This
=================

New APIs and important information about how to use them will be added to the
:mod:`concurrent.interpreters` documentation. An informational PEP regarding
the new immortality mechanisms included in the reference implementation will
be written if this PEP is accepted.

Reference Implementation
========================

The reference implementation of this PEP can be found
`here <https://github.com/python/cpython/compare/main...ZeroIntensity:cpython:shared-object-proxy>`_.


Rejected Ideas
==============

Why Not Atomic Reference Counting?
----------------------------------

Immortality seems to be the driver for a lot of complexity in this proposal;
why not use atomic reference counting instead?

Atomic reference counting has been tried before in previous :term:`GIL`
removal attempts, but unfortunately added too much overhead to CPython to be
feasible, because atomic "add" operations are much slower than their non-atomic
counterparts. Immortality, while complex, has the benefit of being efficient
and thread-safe without needing to slow down single-threaded performance with
reference counting.


Copyright
=========

This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.