PEP: 797
Title: Shared Object Proxies
Author: Peter Bierma <[email protected]>
Discussions-To: Pending
Status: Draft
Type: Standards Track
Created: 08-Aug-2025
Python-Version: 3.15
Post-History: `01-Jul-2025 <https://discuss.python.org/t/97306>`__


Abstract
========

This PEP introduces a new :func:`~concurrent.interpreters.share` function to
the :mod:`concurrent.interpreters` module, which allows an arbitrary object
to be shared across interpreters through an object proxy, at the cost of
less efficient concurrent access from multiple interpreters.

For example:

.. code-block:: python

    from concurrent import interpreters

    with open("spanish_inquisition.txt", "w") as unshareable:
        interp = interpreters.create()
        proxy = interpreters.share(unshareable)
        interp.prepare_main(file=proxy)
        interp.exec("file.write(\"I didn't expect the Spanish Inquisition\")")

Motivation
==========

Many Objects Cannot Be Shared Between Subinterpreters
------------------------------------------------------

In Python 3.14, the new :mod:`concurrent.interpreters` module can be used to
create multiple interpreters in a single Python process. This works well for
code without shared state, but since one of the primary applications of
subinterpreters is to bypass the :term:`global interpreter lock`, it is
fairly common for programs to require highly complex data structures that are
not easily shareable. In turn, this limits the practicality of
subinterpreters for concurrency.

As of writing, subinterpreters can only share :ref:`a handful of types
<interp-object-sharing>` natively, relying on the :mod:`pickle` module
for other types. This is quite limiting, as many kinds of objects cannot be
serialized with ``pickle`` (such as file objects returned by :func:`open`).
Additionally, serialization can be a very expensive operation, which is not
ideal for multithreaded applications.
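
For instance, an open file object cannot be pickled at all; the following
snippet (a minimal illustration of the limitation, not part of the proposed
API) raises a :exc:`TypeError`:

.. code-block:: python

    import pickle

    with open("spanish_inquisition.txt", "w") as file:
        # Fails with a TypeError along the lines of
        # "cannot pickle '_io.TextIOWrapper' object".
        pickle.dumps(file)
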
Rationale
=========

A Fallback for Object Sharing
-----------------------------

A shared object proxy is designed to be a fallback for sharing an object
between interpreters. A shared object proxy should only be used as
a last resort for highly complex objects that cannot be serialized or shared
in any other way.

This means that even if this PEP is accepted, there is still benefit in
implementing other methods to share objects between interpreters.


Specification
=============

.. class:: concurrent.interpreters.SharedObjectProxy

   A proxy type that allows access to an object across multiple interpreters.
   This cannot be constructed from Python; instead, use the
   :func:`~concurrent.interpreters.share` function.


.. function:: concurrent.interpreters.share(obj)

   Wrap *obj* in a :class:`~concurrent.interpreters.SharedObjectProxy`,
   allowing it to be used in other interpreter APIs as if it were natively
   shareable.

   If *obj* is natively shareable, this function does not create a proxy and
   simply returns *obj*.
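
For example, strings are already natively shareable, so ``share`` returns
them unchanged, while something like a file object gets wrapped in a proxy.
The snippet below is a sketch of the intended behavior under this proposal,
not an excerpt from existing documentation:

.. code-block:: python

    from concurrent import interpreters

    # Strings are natively shareable, so no proxy is created.
    word = "hello"
    assert interpreters.share(word) is word

    # A file object is neither shareable nor picklable, so a proxy is
    # returned instead.
    with open("data.txt", "w") as file:
        proxy = interpreters.share(file)
        assert isinstance(proxy, interpreters.SharedObjectProxy)
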
Interpreter Switching
---------------------

When interacting with the wrapped object, the proxy will switch to the
interpreter in which the object was created. This must happen for any access
to the object, such as accessing attributes. To visualize, ``foo`` in the
following code is only ever called in the main interpreter, despite being
accessed in subinterpreters through a proxy:

.. code-block:: python

    from concurrent import interpreters

    def foo():
        assert interpreters.get_current() == interpreters.get_main()

    interp = interpreters.create()
    proxy = interpreters.share(foo)
    interp.prepare_main(foo=proxy)
    interp.exec("foo()")


Multithreaded Scaling
---------------------

To switch to a wrapped object's interpreter, an object proxy must swap the
:term:`attached thread state` of the current thread, which will in turn wait
on the :term:`GIL` of the target interpreter, if it is enabled. This means that
a shared object proxy will experience contention when accessed concurrently,
but it is still useful for multicore threading, since other threads in the
calling interpreter are free to execute while one thread waits on the GIL of
the target interpreter.

As an example, imagine that multiple interpreters want to write to a log through
a proxy for the main interpreter, but don't want to constantly wait on the log.
By accessing the proxy in a separate thread for each interpreter, the thread
performing the computation can keep executing while the proxy is being accessed.

.. code-block:: python

    from concurrent import interpreters

    def write_log(message):
        print(message)

    def execute(n, write_log):
        from threading import Thread
        from queue import Queue

        log = Queue()

        # By performing this in a separate thread, 'execute' can still run
        # while the log is being accessed by the main interpreter.
        def log_queue_loop():
            while True:
                message = log.get()
                if message is None:
                    # Sentinel value: the computation is done.
                    break
                write_log(message)

        thread = Thread(target=log_queue_loop)
        thread.start()

        for i in range(100000):
            n ** i
            log.put(f"Completed an iteration: {i}")

        # Signal the logging thread to finish, then wait for it.
        log.put(None)
        thread.join()

    proxy = interpreters.share(write_log)
    for n in range(4):
        interp = interpreters.create()
        interp.call_in_thread(execute, n, proxy)


Proxy Copying
-------------

Contrary to what one might think, a shared object proxy itself can only be used
in one interpreter, because the proxy's reference count is not thread-safe
(and thus the proxy cannot be safely accessed from multiple interpreters).
Instead, when crossing an interpreter boundary, a new proxy is created for the
target interpreter that wraps the same object as the original proxy.

For example, in the following code, two proxies are created, not just one:

.. code-block:: python

    from concurrent import interpreters

    interp = interpreters.create()
    foo = object()
    proxy = interpreters.share(foo)

    # The proxy crosses an interpreter boundary here. 'proxy' is *not* directly
    # sent to 'interp'. Instead, a new proxy is created for 'interp', and the
    # reference to 'foo' is merely copied. Thus, each interpreter has its
    # own proxy wrapping the same object.
    interp.prepare_main(proxy=proxy)

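Because both proxies wrap the very same object, a mutation made through one
interpreter's proxy is visible through the other. The snippet below is a
sketch of that behavior under the proposed API:

.. code-block:: python

    from concurrent import interpreters

    interp = interpreters.create()
    items = []
    proxy = interpreters.share(items)

    # 'interp' receives its own copy of the proxy, but it wraps the same
    # list object owned by this interpreter.
    interp.prepare_main(items=proxy)
    interp.exec("items.append('from the subinterpreter')")

    # The append happened on the original list.
    assert items == ["from the subinterpreter"]

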
Thread-local State
------------------

Accessing an object proxy will retain information stored on the current
:term:`thread state`, such as thread-local variables stored by
:class:`threading.local` and context variables stored by :mod:`contextvars`.
This allows the following case to work correctly:

.. code-block:: python

    from concurrent import interpreters
    from threading import local

    thread_local = local()
    thread_local.value = 1

    def foo():
        assert thread_local.value == 1

    interp = interpreters.create()
    proxy = interpreters.share(foo)
    interp.prepare_main(foo=proxy)
    interp.exec("foo()")

In order to retain thread-local data when accessing an object proxy, each
thread will have to keep track of the last used thread state for
each interpreter. In C, this behavior looks like this:

.. code-block:: c

    // Error checking has been omitted for brevity
    PyThreadState *tstate = PyThreadState_New(interp);

    // By swapping the current thread state to 'interp', 'tstate' will be
    // associated with 'interp' for the current thread. That means that accessing
    // a shared object proxy will use 'tstate' instead of creating its own
    // thread state.
    PyThreadState *save = PyThreadState_Swap(tstate);

    // 'save' is now the most recently used thread state, so shared object
    // proxies in this thread will use it instead of 'tstate' when accessing
    // 'interp'.
    PyThreadState_Swap(save);

In the event that no thread state exists for an interpreter in a given thread,
a shared object proxy will create its own thread state, owned by
the interpreter (meaning it will not be destroyed until interpreter
finalization), which will persist across all shared object proxy accesses in
that thread. In other words, a shared object proxy ensures that thread-local
variables and similar state will not disappear between accesses.
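
As a rough sketch of what this guarantees in Python (the ``store``, ``load``,
and ``worker`` helpers below are purely illustrative, not part of the
proposal), thread-local state written during one proxy access is still
visible during a later access from the same thread:

.. code-block:: python

    from concurrent import interpreters
    from threading import local

    state = local()

    def store(value):
        # The first proxy access from the worker thread has no thread state
        # for the main interpreter yet, so one is created and kept around.
        state.value = value

    def load():
        # A later proxy access from the same thread reuses that thread
        # state, so the thread-local data written by 'store' is still there.
        return state.value

    def worker(store, load):
        store(42)
        assert load() == 42

    interp = interpreters.create()
    thread = interp.call_in_thread(
        worker, interpreters.share(store), interpreters.share(load)
    )
    thread.join()
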

Memory Management
-----------------

All proxy objects hold a :term:`strong reference` to the object that they
wrap. As such, destruction of a shared object proxy may trigger destruction
of the wrapped object if the proxy holds the last reference to it, even if
the proxy belongs to a different interpreter. For example:

.. code-block:: python

    from concurrent import interpreters

    interp = interpreters.create()
    foo = object()
    proxy = interpreters.share(foo)
    interp.prepare_main(proxy=proxy)
    del proxy, foo

    # 'foo' is still alive at this point, because the proxy in 'interp' still
    # holds a reference to it. Destruction of 'interp' will then trigger the
    # destruction of that proxy, and subsequently the destruction of 'foo'.
    interp.close()


Shared object proxies support the garbage collector protocol, but will only
traverse the object that they wrap if the garbage collection is occurring
in the wrapped object's interpreter. To visualize:

.. code-block:: python

    from concurrent import interpreters
    import gc

    proxy = interpreters.share(object())

    # This prints out [<object object at 0x...>], because the object is owned
    # by this interpreter.
    print(gc.get_referents(proxy))

    interp = interpreters.create()
    interp.prepare_main(proxy=proxy)

    # This prints out [], because the wrapped object must be invisible to this
    # interpreter.
    interp.exec("import gc; print(gc.get_referents(proxy))")


Interpreter Lifetimes
*********************

When an interpreter is destroyed, shared object proxies wrapping objects
owned by that interpreter may still exist elsewhere. To prevent this
from causing crashes, an interpreter will invalidate all proxies pointing
to any object it owns by overwriting the proxy's wrapped object with ``None``.

To demonstrate, the following snippet first prints out ``Alive``, and then
``None`` after deleting the interpreter:

.. code-block:: python

    from concurrent import interpreters

    def test():
        from concurrent import interpreters

        class Test:
            def __str__(self):
                return "Alive"

        return interpreters.share(Test())

    interp = interpreters.create()
    wrapped = interp.call(test)
    print(wrapped)  # Alive
    interp.close()
    print(wrapped)  # None

Note that the proxy is not physically replaced (``wrapped`` in the above example
is still a ``SharedObjectProxy`` instance), but instead has its wrapped object
replaced with ``None``.


Backwards Compatibility
=======================

This PEP has no known backwards compatibility issues.

Security Implications
=====================

This PEP has no known security implications.

How to Teach This
=================

New APIs and important information about how to use them will be added to the
:mod:`concurrent.interpreters` documentation.

Reference Implementation
========================

The reference implementation of this PEP can be found
`here <https://github.com/python/cpython/compare/main...ZeroIntensity:cpython:shared-object-proxy>`_.

Rejected Ideas
==============

Directly Sharing Proxy Objects
------------------------------

The initial revision of this proposal took an approach where an instance of
:class:`~concurrent.interpreters.SharedObjectProxy` was :term:`immortal`. This
allowed proxy objects to be directly shared across interpreters, because their
reference count was thread-safe (since it never changed due to immortality).

This approach made the implementation significantly more complicated and
introduced a number of edge cases that would have been a burden on
CPython maintainers.

Acknowledgements
================

This PEP would not have been possible without discussion and feedback from
Eric Snow, Petr Viktorin, Kirill Podoprigora, Adam Turner, and Yury Selivanov.

Copyright
=========

This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.