From 4d0c768a41a13bd06c6c9e3ca0b50917d0c62fc8 Mon Sep 17 00:00:00 2001 From: Mark Shannon Date: Thu, 22 May 2025 10:07:18 +0100 Subject: [PATCH 01/16] Initial draft --- peps/pep-08xx.rst | 578 ++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 578 insertions(+) create mode 100644 peps/pep-08xx.rst diff --git a/peps/pep-08xx.rst b/peps/pep-08xx.rst new file mode 100644 index 00000000000..04535a3a834 --- /dev/null +++ b/peps/pep-08xx.rst @@ -0,0 +1,578 @@ +PEP: 08XX +Title: Safe Parallel Python +Author: Mark Shannon +Discussions-To: discuss.python.org +Status: Draft +Type: Standards Track +Created: 15-Aug-2025 +Post-History: ??? + +Abstract +======== + +This PEP proposes internal changes to CPython and a new API to support safe, parallel execution of Python. +With this PEP, parallel execution of code is race free: objects must be explicitly made safe if they are to be shared between threads, +or such sharing is prohibited. + +This PEP adds some additional state to each object, so that it is possible to check, at runtime and at low cost, +whether any operation is thread safe and raise an exception when it is not. + +Motivation +========== + +Traditionally, CPython has executed in only a single thread of execution at once. +This has always been seen as a limitation of Python and there has been a desire for +Python to support parallel execution for many years. + +PEP 703, Making the Global Interpreter Lock Optional in CPython, and PEP 554, Multiple Interpreters in the Stdlib, offer ways to support parallelism. +Multiple interpreters are both safe and support parallelism, but they are difficult to use and sharing objects +between multiple interpreters without copying is impossible. +PEP 703 supports parallel execution and sharing, but is unsafe as it allows race conditions. +Race conditions allow dangerous and hard to find bugs. In the most extreme example, Therac-25 [1]_, a race condtion bug resulted in several fatalities. 
The trouble with race conditions is not that the bugs they introduce are necessarily worse than other bugs, +but that they can be very hard to detect and may easily slip through testing. + +Parallelism, without strong support from the language and runtime, is extremely difficult to get right: + +    A large fraction of the flaws in software development are due to programmers not fully understanding all the possible states their code may execute in. +    In a multithreaded environment, the lack of understanding and the resulting problems are greatly amplified, +    almost to the point of panic if you are paying attention -- John Carmack (Functional Programming in C++) + +Python is used by many technologists and widely in education, not just by professional software engineers. +We cannot expect those users to handle the subtleties of parallel programming using a race-prone model like that of Java or PEP 703. + +Rationale +========= + +We want to allow a familiar model of parallel execution while retaining safety. +Threads, locks, queues and immutability are familiar concepts and provide the building blocks for a safe model of execution. +Objects should either be safe for sharing between threads, or the VM should prevent them from being shared. + +The synchronization quadrant diagram +------------------------------------ + ++-------------------+------------+------------+ +|                   | Unshared   | Shared     | ++===================+============+============+ +| Mutable objects   | 😊         | 🔥😡🔥     | ++-------------------+------------+------------+ +| Immutable objects | 😊         | 😊         | ++-------------------+------------+------------+ + +The table above shows the four synchronization quadrants. It is only when objects can be mutated *and* accessed from parallel threads, that race conditions can occur. 
+This PEP has three main goals: + +* to provide mechanisms to move execution from the dangerous quadrant into either of the adjacent quadrants, +* to guarantee that execution in the dangerous quadrant is properly synchronized, and +* to provide means to move gradually from a single thread to parallel threads, retaining safety. + +Immutability allows safe execution without synchronization, so this PEP provides mechanisms for making objects immutable. +Where immutability is not possible, this PEP offers mechanisms for safe execution by ensuring that the object is visible only +to one thread (the top-left quadrant), or that it is protected by a mutual exclusion lock (mutex). +The PEP also proposes changes to CPython to prevent unsafe execution when mutable objects are shared. +Finally, the PEP provides a generalization of the GIL to allow incrementally moving away from the GIL. + +This PEP is inspired by ideas from Ocaml, specifically Data Freedom à la Mode [2]_, and the Pyrona project [3]_. +Many of the necessary technologies, such as biased and deferred reference counting, have been developed for PEP 703. + +Specification +============= + +Object states +------------- + +All objects will gain a ``__shared__`` state, which will be used by the Python VM to ensure that objects are used safely. +The state can be queried by looking at the ``__shared__`` attribute of an object. + +An object's ``__shared__`` state can be one of the following: + +* Immutable: Cannot be modified, and can be safely shared between threads. +* Local: Only visible to a single thread*, and can be freely mutated by that thread. +* Stop-the-world mutable. Mostly immutable, but can be mutated by acquiring the stop-the-world lock. +* Protected: Object is mutable, and is protected by a mutex. +* Synchronized: A special state for some builtin objects. +  All operations on the object are protected by an internal mutex, so no external synchronization is needed. + +\* Strictly, a single "super thread". 
See :ref:`super-thread` below. + +The ``__shared__`` attribute is read-only. + +All classes, modules and functions will be *stop-the-world mutable* initially, but can be made *immutable*. +Views, iterators and other objects that depend on the internal state of other mutable objects will inherit the state of +those objects. For example, a ``listiterator`` of a *local* ``list`` will be *local*, but a ``listiterator`` of a *protected* +``list`` will be *protected*. Views and iterators of *immutable* (including STW mutable) objects will be *local* when created. + +All objects that not inherently immutable (like tuples or strings) will be created as *local*. +These *local* objects can later be made *immutable* or *protected*. + +Dictionaries and lists will support the ability to be made stop-the-world mutable on a per-object basis. +This is to maintain the necessary immutability for performance and mutability for +backwards compatibility for objects like ``sys.modules`` and ``sys.path``. + +The stop-the-world lock +----------------------- + +A new, internal lock is added to CPython, the "stop the world" lock. When this lock is acquired, all other threads are +stopped until the lock is released. If two threads attempt to acquire the stop-the-world lock at the same time, +in such a way that deadlock would occur, then an exception is raised. + + +.. _super-threads + +SuperThread objects +------------------- + +A new class ``SuperThread`` will be added to help port applications using the GIL +and sub-interpreters. All threads sharing a ``SuperThread`` will be serialized, +in the same way as all threads are currently serialized by the GIL. +``SuperThread``\ s offer much the same capabilities as sub-interpreters, +but more efficiently and with the ability to share more objects. + +There is a many-to-one relationship between threads and ``SuperThread``\ s. 
+If no super thread is explicitly specified when creating a thread, +a new super thread will be created specifically for that thread. +The super thread of a thread cannot be changed. + +All threads that share a ``SuperThread`` are treated as the same thread for *local* objects. +They are still treated as distinct for all locks, and thus for *protected* objects. + +New API +------- + +This PEP proposes adding the following: + +* A ``__freeze__()`` method, added to most classes, which freezes the object making it immutable. +* The ``__protect__(self: Self, obj: T) -> T`` method to mark the ``self`` object as protecting ``obj``. +* The ``__mutex__`` context manager property, added to all objects, for critical sections. +* The ``Channel`` class for passing mutable objects from one thread to another. + +Freezing +'''''''' + +The ``__freeze__()`` method will have the signature ``__freeze__(self: Self) -> Frozen[Self]`` where +``Frozen[T]`` is the frozen class for ``T``. The value returned by ``__freeze__`` is the original object: +``obj.__freeze__() is obj``. Having a return value of a different type can assist type checkers in +tracking which variables refer to frozen objects. + +The ``__freeze__()`` will be added to all pure Python class as well as most standard library builtin collections. +``list``, ``set`` and ``dict`` classes will gain a ``__freeze__()`` method. +For immutable objects like ``tuple``, ``__freeze__()`` will be supported, but will have no effect. + +Note that freezing an object is a shallow operation; ``x.__freeze__()` only freezes ``x`` and not any of the objects +that ``x`` refers to. + +The ``__freeze__`` method can be used to create classes of immutable objects, by calling ``__freeze__`` at the +end of the ``__init__`` method:: + +    class ImmutablePoint: + +        def __init__(self, x, y): +            self.x = x +            self.y = y +            self.__freeze__() + + +... note:: + +   The various ``freeze`` methods have full VM support. 
Immutability is not merely a convention, it will +   be enforced by the VM. + +A ``__deep_freeze__`` method may be added as a :ref:`future enhancement`. + +Mutexes and locking +''''''''''''''''''' + +All objects will gain a ``__mutex__`` context manager for protecting sections of code with a mutual exclusion lock. +These mutual exclusion locks also support the ``+`` operator for locking multiple objects without deadlock. + +``locka + lockb`` creates a new mutex that locks both ``locka`` and ``lockb`` in a globally consistent order. +Addition is commutative, such that:: + +    def func1(a, b): +        with a.__mutex__ + b.__mutex__: +            ... + +    def func2(a, b): +        with b.__mutex__ + a.__mutex__: +            ... + +will not deadlock should ``func1`` and ``func2`` be called concurrently. + +Synchronization +''''''''''''''' + +In order to safely share mutable state between threads, synchronization is needed. + +The ``protected`` state is a way to protect groups of mutable objects when sharing them. + +A ``protected`` object is a mutable object which can be accessed when the +protecting mutex is held by the accessing thread. + +An object ``y`` can be protected by object ``x`` by calling ``x.__protect__(y)`` which prevents ``y`` from +being accessed, unless the accessing thread holds ``x``\ 's mutex, ``x.__mutex__``. +The reference passed to ``__protect__`` must be the sole reference to a *local* object, +or a ``ValueError`` is raised. + +Passing mutable values between threads +'''''''''''''''''''''''''''''''''''''' + +The ``Channel`` class is provided for passing objects from one thread to another. +The ``Channel`` class acts like a ``deque`` but handles transferring of ownership of local objects. +When passing a *local* object, ``channel.put(obj)`` detaches the object ``obj`` from the current thread. +When passing a *local* object, ``Channel.put()`` will fail, raising a ``ValueError``, if there are any other references to the argument. 
+``Channel.get()`` returns the object passed to ``Channel.put()``, making the calling +thread the owner of the *local* object, if the object was *local*. + +Non-*local* objects are passed through ``Channel``\ s unchanged. + +Adding a "deep" put mechanism is a possible :ref:`future enhancement`. + +The GIL +''''''' + +On interpreter startup a ``SuperThread`` named "GIL" will be created and stored in ``sys.gil``. +``sys.gil`` is read-only and the GIL ``SuperThread`` will outlive all mortal objects even if +the ``sys`` module is deleted. The main thread's ``SuperThread`` will be the GIL. + +If the environment variable ``PYTHONGIL=1`` is set, then all new threads will default to +``super_thread = sys.gil``. Otherwise all new threads will default to ``super_thread = None``. +Explicitly setting the ``super_thread`` argument when creating a thread will override these defaults. + +Deadlock detection +------------------ + +The addition of the stop-the-world lock, and the requirements for locking on all synchronized objects, +may lead to more deadlocks. +Since it is the goal of this PEP to avoid confusing behavior, a deadlock detector will be added to CPython. +There are well known techniques for detecting deadlocks and they can be implemented without undue overhead. + +Semantics +--------- + +Although it is performing operations on an object that leads to race conditions, checking every operation +on every object would be prohibitively expensive. Instead, the cost can be reduced hugely by preventing +threads having any access to objects which could cause race conditions. This means that an operation +needs to be checked only when a thread reference is created from a heap reference. +If we do that, then all other operations become safe. 
+ ++------------------------+-----------+-----------------+-----------------+---------------+----------------+ +| Object state           | Immutable | Local = thread  | Local ≠ thread  | Protected     | Synchronized   | ++========================+===========+=================+=================+===============+================+ +| Acquire reference      | Yes       | Yes             | No              | Yes\ :sup:`1` | Yes            | ++------------------------+-----------+-----------------+-----------------+---------------+----------------+ +| ``freeze()``           | No effect | Yes\ :sup:`2`   | N/A             | Yes\ :sup:`2` | Yes\ :sup:`2`  | ++------------------------+-----------+-----------------+-----------------+---------------+----------------+ +| ``__protect__()``      | No        | Yes\ :sup:`2,3` | N/A             | No            | No             | ++------------------------+-----------+-----------------+-----------------+---------------+----------------+ +| All other operations   | Yes       | Yes             | N/A             | Yes           | Yes            | ++------------------------+-----------+-----------------+-----------------+---------------+----------------+ + +1. If the mutex held by the thread matches the mutex that protects the object +2. If supported for that class. +3. The argument to ``__protect__`` must be the sole reference to the object. + +ABI breakage +------------ + +This PEP will require a one time ABI breakage, much like PEP 703, as the ``PyObject`` struct will need to be changed. + +Deferred reclamation +-------------------- + +Immutable (including stop-the-world mutable) objects may have their reclamation deferred. +In other words, they may not be reclaimed immediately that there are no more references to them. + +This is because these objects may be referred to from several threads simultaneously, and the overhead +of serializing the reference count operations would be too high. +PEP 703 also does this. 
+ +Local objects, visible to only one thread, will still be reclaimed immediately that they are no longer referenced. 
+ +New Exceptions +-------------- + +Two new exception classes will be added: + +* ``IllegalThreadAccessException`` for when a thread attempts to acquire a reference to a *local* object belonging to another thread. +* ``UnprotectedAccessException`` for when a thread attempts to acquire a reference to a *protected* object without holding the necessary mutex. + +Backwards Compatibility +======================= + +It is expected that this PEP is mostly backwards compatible, with the exception of mutexes. +Code using mutexes will now need to be more explicit about which objects are protected by which mutexes. + +There may be some cases where mutation of modules causes poor performance or even deadlocks, +but these should be rare. + +Setting ``PYTHONGIL=1`` ensures that all threads are serialized by the GIL, providing backwards compatibility, +and allowing a gradual path to parallelism by setting ``super_thread = None`` for new threads. + +Performance +=========== + +The key to getting good performance out of any dynamic language, including Python, is to specialize code +according to the most likely types or values. Rather than perform an expensive, general operation, a cheap +guard is performed to see that the expectations are met, then an efficient tailored operation is performed. + +Take the example of indexing into a list: ``l[x]``. +With the GIL, this can be done by first checking that ``l`` is a list, ``x`` is an int, and that ``x`` is in-bounds. +Then the value can be read out of the list's array directly. +However, in the free-threading build this approach doesn't work as another thread may have mutated the list at the same +time as it was being indexed. +This PEP restores good performance by adding an additional check to the guard: that the list is *local*. +Since ``l`` is likely stored in a local variable, it must already be *local* and no additional check is needed. + +However, additional checks will still be needed. 
Whenever a reference owned by a thread is created, then a check will +be needed that it is legal. Since it is necessary to check that an object is *local* to the thread, +or that it is *immutable*, or that it is *protected* and the correct mutex is held, these checks could be relatively expensive. +However, the specializing adaptive interpreter and JIT can specialize, or sometimes eliminate, these operations. + +The general check:: + +    if obj.__state__ == LOCAL and obj.__owner__ == current_tid: +        pass # Good +    elif obj.__state__ == IMMUTABLE or obj.__state__ == STW_MUTABLE: +        pass # Good +    elif obj.__state__ == PROTECTED and obj.__owner__ in currently_held_mutexes(): +        pass # Good +    else: +        raise ... # Bad + +is expensive, but by specializing for the expected case, the check can be made cheap. +For example, if we expect a *local* object, we can do a much cheaper check:: + +    if obj.__owner__ == current_tid: +        pass # Good +    else: +        do_general_check(obj) + +This cheaper check is valid provided that thread ids and mutex ids are distinct. + + +With the GIL enabled +-------------------- + +If all threads belong to the GIL ``SuperThread`` then the JIT can completely eliminate checks +for *local* objects (as these checks will always pass), resulting in performance very close to the current with-gil build. + +With threads +------------ + +Single threaded performance should exceed that of free-threading. +Expected performance is within two or three percent of the with-gil build. + +Security Implications +===================== + +The purpose of this PEP is to provide stronger security by eliminating most race conditions. + + +How to Teach This +================= + +In order to run code in parallel, some understanding of the model of execution will +be needed. Writing unsafe code is much harder than under PEP 703, but the +new exceptions may surprise users. Extensive documentation will be needed. 
+ +Examples +-------- + +Example 1: Thread safe tuple iterator +''''''''''''''''''''''''''''''''''''' + +:: + +    class ThreadSafeIter: +        "For thread-safe iterables, only" + +        __slots__ = "_iterator", + +        def __init__(self, iterable): +            self._iterator = self.__protect__(iter(iterable)) +            self.__freeze__() + +        def __iter__(self): +            return self + +        def __next__(self): +            with self.__mutex__: +                return self._iterator.__next__() + + +Example 2: Thread safe list +''''''''''''''''''''''''''' +:: + +    class ThreadSafeList: + +        __slots__ = "_list", + +        def __init__(self): +            self._list = self.__protect__([]) +            self.__freeze__() + +        def append(self, val): +            with self.__mutex__: +                self._list.append(val) + +        def __getitem__(self, index): +            with self.__mutex__: +                return self._list[index] + +        def __iter__(self): +            with self.__mutex__: +                return ThreadSafeListIterator(self) + +        ... + +    class ThreadSafeListIterator: + +        __slots__ = "_list", "_iter" + +        def __init__(self, the_list): +            self._list = the_list +            with the_list.__mutex__: +                self._iter = the_list._list.__iter__() + +        def __next__(self): +            with self._list.__mutex__: +                return self._iter.__next__() + + +Note how the iterator uses the mutex of the ``ThreadSafeList``, ``self._list``, as the ``listiterator`` +inherits the protection from the ``list`` object. 
+ +Comparison to PEP 703 (Making the Global Interpreter Lock Optional in CPython) +============================================================================== + +This PEP should be thought of as building on PEP 703, rather than competing with or replacing it. +Many of the mechanisms needed to implement this PEP have been developed for PEP 703. + +What PEP 703 lacks is well defined semantics for what happens when race conditions are present, +or the means to avoid race conditions other than unverified locking. + +PEP 703 attempts to provide single-threaded performance for lists, dictionaries, +and other mutable objects while providing "reasonable" thread safety. 
Unfortunately, +no formal definition of expected behavior is provided, which leads to issues like these: + +* +* +* +* + +It is the author's opinion that attempting to preserve single-threaded performance +for mutable objects *and* any sort of thread safe parallel execution for the same object is wishful thinking. + +This PEP provides well defined semantics, single-threaded performance, *and* thread safety for lists and dicts. +It does this by partitioning objects into local objects and shared objects, and enforcing the necessary synchronization. + +Implementation +============== + +There is no actual implementation as yet, so this section outlines how this PEP could be implemented. + +Object state +------------ + +Recording object state requires space in the object header, at least 3 bits but no more than a byte. +Each object also needs additional space to refer to its thread, or protecting mutex. +With these fields, the ``PyObject`` header should be the smaller than is currently implemented for PEP 703, +although larger than for the default (with GIL) build. + +A possible object header: +.. code-block:: C + uint32_t ref_count_local; + uint32_t ref_count_shared; // For biased reference counting + uint32_t owner_id; + uint8_t shared_state; + uint8_t other_flags; + uint8_t mutex; + uint8_t gc_bits; + PyTypeObject *ob_type; + +or if we use atomic reference counting for non-local objects, we can use pointers for owners: +.. code-block:: C + uintptr_t owner; + uint32_t ref_count; + uint8_t shared_state; + uint8_t other_flags; + uint8_t mutex; + uint8_t gc_bits; + PyTypeObject *ob_type; + +Reference counting +------------------ + +Local objects (including immutable objects that are known to be only locally referenced) +can use non-atomic reference counting for speed. +Any object shared between threads would use either biased or atomic reference counting. + +Shared objects will use deferred reference counting where possible. 
+ +Checking object states +---------------------- + +CPython is a stack machine. That means that for a thread to acquire a reference to an object, +that object must come from the heap or an API call and be pushed to the stack. +In order to prevent C extensions seeing objects they should not, all C API calls will need to +check their return value. In addition, the interpreter will need to check any values it gets +direct from the heap before pushing them to the stack. + +This is potentially a lot of new checks so, to avoid a large performance impact, +we need to keep the cost of these checks down. We can do that by: + +* Making the checks cheap. Checks should consist of only one or two simple comparisons with minimal memory accesses. +* Removing as many checks as possible in both the specializing interpreter and the JIT compiler. + +Specialization means that we can perform only one check for the most likely state, rather than checking all legal states. +If we expect a local object, we just check the object's thread ID against the current thread ID. +If, instead, we expect an immutable object, we can just check that the object is immutable. + +The JIT compiler can remove redundant checks on the same object. + +.. _future-enhancements + +Possible future enhancements +============================ + +Deep freezing and deep transfers +-------------------------------- + +Freezing a single object could leave a frozen object with references to mutable objects, and transferring of single objects could leave an object local to one thread, while other objects that it refers to are local to a different thread. +Either of these scenarios is likely to lead to runtime errors. To avoid that problem we need "deep" freezing. + +Deep freezing an object would freeze that object and the transitive closure of other mutable objects referred to by that object. 
+Deep transferring an object would transfer that object and the transitive closure of other local objects referred to by that object, +but would raise an exception if one of those objects belonged to a different thread. + +Similar to freezing, a "deep" put mechanism could be added to ``Channel``\ s to move a whole graph of objects from one thread +to another. + +Rejected Ideas +============== + +[Why certain ideas that were brought while discussing this PEP were not ultimately pursued.] + + +Open Issues +=========== + +[Any points that are still being decided/discussed.] + + +Footnotes +========= + +.. [1] https://en.wikipedia.org/wiki/Therac-25 + +.. [2] https://dl.acm.org/doi/10.1145/3704859 + +.. [3] https://wrigstad.com/pldi2025.pdf + + + +Copyright +========= + +This document is placed in the public domain or under the +CC0-1.0-Universal license, whichever is more permissive. From 8943f8371afbf66a92d1095c3a68e31f058d585f Mon Sep 17 00:00:00 2001 From: Mark Shannon Date: Mon, 8 Sep 2025 17:25:30 +0100 Subject: [PATCH 02/16] Fix format --- peps/pep-08xx.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/peps/pep-08xx.rst b/peps/pep-08xx.rst index 04535a3a834..dc9796d2f4d 100644 --- a/peps/pep-08xx.rst +++ b/peps/pep-08xx.rst @@ -171,7 +171,7 @@ end of the ``__init__`` method::              self.__freeze__()   -... note:: +.. note::      The various ``freeze`` methods have full VM support. Immutability is not merely a convention, it will    be enforced by the VM. 
From 45e783f8c10c5f0a2b41e1818b5f8d915227fd1d Mon Sep 17 00:00:00 2001 From: Mark Shannon Date: Mon, 8 Sep 2025 17:30:39 +0100 Subject: [PATCH 03/16] Give PEP a number --- peps/{pep-08xx.rst => pep-0804.rst} | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) rename peps/{pep-08xx.rst => pep-0804.rst} (99%) diff --git a/peps/pep-08xx.rst b/peps/pep-0804.rst similarity index 99% rename from peps/pep-08xx.rst rename to peps/pep-0804.rst index dc9796d2f4d..2456227c315 100644 --- a/peps/pep-08xx.rst +++ b/peps/pep-0804.rst @@ -1,4 +1,4 @@ -PEP: 08XX +PEP: 0804 Title: Safe Parallel Python Author: Mark Shannon Discussions-To: discuss.python.org From 58b5a31be243b10e4c8397af7dd050132135e7b8 Mon Sep 17 00:00:00 2001 From: Mark Shannon Date: Mon, 8 Sep 2025 17:35:04 +0100 Subject: [PATCH 04/16] Today's date --- peps/pep-0804.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/peps/pep-0804.rst b/peps/pep-0804.rst index 2456227c315..9788a34fafc 100644 --- a/peps/pep-0804.rst +++ b/peps/pep-0804.rst @@ -4,7 +4,7 @@ Author: Mark Shannon Discussions-To: discuss.python.org Status: Draft Type: Standards Track -Created: 15-Aug-2025 +Created: 08-Sep-2025 Post-History: ??? Abstract From a22470391ac4a11132e4be58ba4c282709cf9bb9 Mon Sep 17 00:00:00 2001 From: Mark Shannon Date: Mon, 8 Sep 2025 17:38:23 +0100 Subject: [PATCH 05/16] Fix links --- peps/pep-0804.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/peps/pep-0804.rst b/peps/pep-0804.rst index 9788a34fafc..d35f0c34c4b 100644 --- a/peps/pep-0804.rst +++ b/peps/pep-0804.rst @@ -116,7 +116,7 @@ stopped until the lock is released. If two threads attempt to acquire the stop-t in such a way that deadlock would occur, then an exception is raised. -.. _super-threads +.. _super-threads: SuperThread objects ------------------- @@ -530,7 +530,7 @@ If, instead, we expect an immutable object, we can just check that the object is The JIT compiler can remove redundant checks on the same object. -.. 
_future-enhancements +.. _future-enhancements: Possible future enhancements ============================ From c5759cc8837c47f515bcf25f4829a11ce7960cd9 Mon Sep 17 00:00:00 2001 From: Mark Shannon Date: Mon, 8 Sep 2025 17:40:36 +0100 Subject: [PATCH 06/16] Attempt to fix code blocks --- peps/pep-0804.rst | 2 ++ 1 file changed, 2 insertions(+) diff --git a/peps/pep-0804.rst b/peps/pep-0804.rst index d35f0c34c4b..5f056a1ceef 100644 --- a/peps/pep-0804.rst +++ b/peps/pep-0804.rst @@ -481,6 +481,7 @@ although larger than for the default (with GIL) build. A possible object header: .. code-block:: C + uint32_t ref_count_local; uint32_t ref_count_shared; // For biased reference counting uint32_t owner_id; @@ -492,6 +493,7 @@ A possible object header: or if we use atomic reference counting for non-local objects, we can use pointers for owners: .. code-block:: C + uintptr_t owner; uint32_t ref_count; uint8_t shared_state; From 21b8d4f0410b10d524658c5cf4791f9cd39b6490 Mon Sep 17 00:00:00 2001 From: Mark Shannon Date: Mon, 8 Sep 2025 17:48:20 +0100 Subject: [PATCH 07/16] Now 805 --- peps/{pep-0804.rst => pep-0805.rst} | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) rename peps/{pep-0804.rst => pep-0805.rst} (99%) diff --git a/peps/pep-0804.rst b/peps/pep-0805.rst similarity index 99% rename from peps/pep-0804.rst rename to peps/pep-0805.rst index 5f056a1ceef..ae21de11c7d 100644 --- a/peps/pep-0804.rst +++ b/peps/pep-0805.rst @@ -1,4 +1,4 @@ -PEP: 0804 +PEP: 805 Title: Safe Parallel Python Author: Mark Shannon Discussions-To: discuss.python.org From 1805d9ec8295268ea75423103f1ddffca1bef8ce Mon Sep 17 00:00:00 2001 From: Mark Shannon Date: Mon, 8 Sep 2025 17:52:36 +0100 Subject: [PATCH 08/16] Fix header until discuss links is available --- peps/pep-0805.rst | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/peps/pep-0805.rst b/peps/pep-0805.rst index ae21de11c7d..ff1ee28d09a 100644 --- a/peps/pep-0805.rst +++ b/peps/pep-0805.rst @@ -1,11 +1,10 @@ PEP: 805 
Title: Safe Parallel Python Author: Mark Shannon -Discussions-To: discuss.python.org +Discussions-To: Pending Status: Draft Type: Standards Track Created: 08-Sep-2025 -Post-History: ??? Abstract ======== @@ -491,6 +490,7 @@ A possible object header: uint8_t gc_bits; PyTypeObject *ob_type; + or if we use atomic reference counting for non-local objects, we can use pointers for owners: .. code-block:: C @@ -502,6 +502,7 @@ or if we use atomic reference counting for non-local objects, we can use pointer uint8_t gc_bits; PyTypeObject *ob_type; + Reference counting ------------------ From 40cc8f6f4a620e68b10777a49216b092422bc92b Mon Sep 17 00:00:00 2001 From: Mark Shannon Date: Mon, 8 Sep 2025 18:11:40 +0100 Subject: [PATCH 09/16] Add newlines --- peps/pep-0805.rst | 2 ++ 1 file changed, 2 insertions(+) diff --git a/peps/pep-0805.rst b/peps/pep-0805.rst index ff1ee28d09a..b05649d9e4c 100644 --- a/peps/pep-0805.rst +++ b/peps/pep-0805.rst @@ -479,6 +479,7 @@ With these fields, the ``PyObject`` header should be the smaller than is current although larger than for the default (with GIL) build. A possible object header: + .. code-block:: C uint32_t ref_count_local; @@ -492,6 +493,7 @@ A possible object header: or if we use atomic reference counting for non-local objects, we can use pointers for owners: + .. code-block:: C uintptr_t owner; From 2f49abfbb9e216aea319af0a28111cf0b599d3e2 Mon Sep 17 00:00:00 2001 From: Mark Shannon Date: Mon, 8 Sep 2025 18:15:54 +0100 Subject: [PATCH 10/16] Fix ref --- peps/pep-0805.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/peps/pep-0805.rst b/peps/pep-0805.rst index b05649d9e4c..11cdcfb5f44 100644 --- a/peps/pep-0805.rst +++ b/peps/pep-0805.rst @@ -91,7 +91,7 @@ An object's ``__shared__`` state can be one of the following: * Synchronized: A special state for some builtin objects. All operations on the object are protected by an internal mutex, so no external synchronization is needed. 
-\* Strictly, a single "super thread". See :ref:`super-thread` below. +\* Strictly, a single "super thread". See :ref:`super-threads` below. The ``__shared__`` attribute is read-only. From 2fa8f80685c9e6709c449c49b070946e521a6166 Mon Sep 17 00:00:00 2001 From: Mark Shannon Date: Mon, 8 Sep 2025 22:00:46 +0100 Subject: [PATCH 11/16] Apply suggestions from code review Co-authored-by: Jelle Zijlstra --- peps/pep-0805.rst | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/peps/pep-0805.rst b/peps/pep-0805.rst index 11cdcfb5f44..3dbe8862167 100644 --- a/peps/pep-0805.rst +++ b/peps/pep-0805.rst @@ -27,7 +27,7 @@ PEP 703, Making the Global Interpreter Lock Optional in CPython, and PEP 554, Mu Multiple interpreters are both safe and support parallelism, but they are difficult to use and sharing objects between multiple interpreters without copying is impossible. PEP 703 supports parallel execution and sharing, but is unsafe as it allows race conditions. -Race conditions allow dangerous and hard to find bugs. In the most extreme example, Therac-25 [1]_, a race condtion bug resulted in several fatalities. The trouble with race conditions is not that the bugs they introduce are necessarily worse than other bugs, +Race conditions allow dangerous and hard to find bugs. In the most extreme example, Therac-25 [1]_, a race condition bug resulted in several fatalities. The trouble with race conditions is not that the bugs they introduce are necessarily worse than other bugs, but that they can be very hard to detect and may easily slip through testing. Parallelism, without strong support from the language and runtime, is extremely difficult to get right: @@ -70,7 +70,7 @@ to one thread (the top-left quadrant), or that it is protected by a mutual exclu The PEP also proposes changes to CPython to prevent unsafe execution when mutable objects are shared. Finally, the PEP provides a generalization of the GIL to allow incrementally moving away from the GIL. 
-This PEP is inspired by ideas from Ocaml, specifically Data Freedom à la Mode [2]_, and the Pyrona project [3]_. +This PEP is inspired by ideas from OCaml, specifically Data Freedom à la Mode [2]_, and the Pyrona project [3]_. Many of the necessary technologies, such as biased and deferred reference counting, have been developed for PEP 703. Specification @@ -100,7 +100,7 @@ Views, iterators and other objects that depend on the internal state of other mu those objects. For example, a ``listiterator`` of a *local* ``list`` will be *local*, but a ``listiterator`` of a *protected* ``list`` will be *protected*. Views and iterators of *immutable* (including STW mutable) objects will be *local* when created. -All objects that not inherently immutable (like tuples or strings) will be created as *local*. +All objects that are not inherently immutable (like tuples or strings) will be created as *local*. These *local* objects can later be made *immutable* or *protected*. Dictionaries and lists will support the ability to be made stop-the-world mutable on a per-object basis. @@ -156,7 +156,7 @@ The ``__freeze__()`` will be added to all pure Python class as well as most stan ``list``, ``set`` and ``dict`` classes will gain a ``__freeze__()`` method. For immutable objects like ``tuple``, ``__freeze__()`` will be supported, but will have no effect. -Note that freezing an object is a shallow operation; ``x.__freeze__()` only freezes ``x`` and not any of the objects +Note that freezing an object is a shallow operation; ``x.__freeze__()`` only freezes ``x`` and not any of the objects that ``x`` refers to. 
The ``__freeze__`` method can be used to create classes of immutable objects, by calling ``__freeze__`` at the From 49a2e664e2bc87844ecf6d442fb016d0ff8c8245 Mon Sep 17 00:00:00 2001 From: Mark Shannon Date: Tue, 9 Sep 2025 15:55:10 +0100 Subject: [PATCH 12/16] Apply Hugo's format fixes and spelling corrections Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com> --- peps/pep-0805.rst | 57 +++++++++++++++++++---------------------------- 1 file changed, 23 insertions(+), 34 deletions(-) diff --git a/peps/pep-0805.rst b/peps/pep-0805.rst index 3dbe8862167..8f1d8552bfd 100644 --- a/peps/pep-0805.rst +++ b/peps/pep-0805.rst @@ -4,7 +4,7 @@ Author: Mark Shannon Discussions-To: Pending Status: Draft Type: Standards Track -Created: 08-Sep-2025 +Python-Version: 3.15 Abstract ======== @@ -23,11 +23,11 @@ Traditionally, CPython has executed in only a single thread of execution at once This has always been seen as a limitation of Python and there has been a desire for Python to support parallel execution for many years. -PEP 703, Making the Global Interpreter Lock Optional in CPython, and PEP 554, Multiple Interpreters in the Stdlib, offer ways to support parallelism. +:pep:`703`, Making the Global Interpreter Lock Optional in CPython, and :pep:`554`, Multiple Interpreters in the Stdlib, offer ways to support parallelism. Multiple interpreters are both safe and support parallelism, but they are difficult to use and sharing objects between multiple interpreters without copying is impossible. PEP 703 supports parallel execution and sharing, but is unsafe as it allows race conditions. -Race conditions allow dangerous and hard to find bugs. In the most extreme example, Therac-25 [1]_, a race condition bug resulted in several fatalities. The trouble with race conditions is not that the bugs they introduce are necessarily worse than other bugs, +Race conditions allow dangerous and hard to find bugs. 
In the most extreme example, `Therac-25 <https://en.wikipedia.org/wiki/Therac-25>`__, a race condition bug resulted in several fatalities. The trouble with race conditions is not that the bugs they introduce are necessarily worse than other bugs, but that they can be very hard to detect and may easily slip through testing. Parallelism, without strong support from the language and runtime, is extremely difficult to get right: @@ -71,7 +71,7 @@ The PEP also proposes changes to CPython to prevent unsafe execution when mutabl Finally, the PEP provides a generalization of the GIL to allow incrementally moving away from the GIL. This PEP is inspired by ideas from OCaml, specifically Data Freedom à la Mode [2]_, and the Pyrona project [3]_. -Many of the necessary technologies, such as biased and deferred reference counting, have been developed for PEP 703. +Many of the necessary technologies, such as biased and deferred reference counting, have been developed for :pep:`703`. Specification ============= @@ -85,7 +85,7 @@ The state can be queried by looking at the ``__shared__`` attribute of an object An object's ``__shared__`` state can be one of the following: * Immutable: Cannot be modified, and can be safely shared between threads. -* Local: Only visible to a single thread*, and can be freely mutated by that thread. +* Local: Only visible to a single thread\ :sup:`*`, and can be freely mutated by that thread. * Stop-the-world mutable. Mostly immutable, but can be mutated by acquiring the stop-the-world lock. * Protected: Object is mutable, and is protected by a mutex. * Synchronized: A special state for some builtin objects. @@ -98,14 +98,14 @@ The ``__shared__`` attribute is read-only. All classes, modules and functions will be *stop-the-world mutable* initially, but can be made *immutable*. Views, iterators and other objects that depend on the internal state of other mutable objects will inherit the state of those objects. 
For example, a ``listiterator`` of a *local* ``list`` will be *local*, but a ``listiterator`` of a *protected* -``list`` will be *protected*. Views and iterators of *immutable* (including STW mutable) objects will be *local* when created. +``list`` will be *protected*. Views and iterators of *immutable* (including stop-the-world mutable) objects will be *local* when created. All objects that are not inherently immutable (like tuples or strings) will be created as *local*. These *local* objects can later be made *immutable* or *protected*. Dictionaries and lists will support the ability to be made stop-the-world mutable on a per-object basis. This is to maintain the necessary immutability for performance and mutability for -backwards campatibility for objects like ``sys.modules`` and ``sys.path``. +backwards compatibility for objects like ``sys.modules`` and ``sys.path``. The stop-the-world lock ----------------------- @@ -123,11 +123,11 @@ SuperThread objects A new class ``SuperThread`` will be added to help port applications using the GIL and sub-interpreters. All threads sharing a ``SuperThread`` will be serialized, in the same way as all threads are currently serialized by the GIL. -``SuperThread``\ s offer much the same capabilites as sub-interpreters, +``SuperThread``\ s offer much the same capabilities as sub-interpreters, but more efficiently and with the ability to share more objects. There is a many-to-one relationship between threads and ``SuperThread``\ s. -If no super thread is explictly specified when creating a thread, +If no super thread is explicitly specified when creating a thread, a new super thread will be created specifically for that thread. The super thread of a thread cannot be changed. @@ -215,7 +215,7 @@ Passing mutable values between threads '''''''''''''''''''''''''''''''''''''' The ``Channel`` class is provided for passing objects from one thread to another. 
-The ``Channel`` class acts like a ``deque`` but handles tranferring of ownership local objects. +The ``Channel`` class acts like a ``deque`` but handles transferring of ownership local objects. When passing a *local* object, ``channel.put(obj)`` detaches the object ``obj`` from the current thread. When passing a *local* object, ``Channel.put()`` will fail, raising a ``ValueError``, if there are any other references to the argument. ``Channel.get()`` returns the object passed but to ``Channel.put()``, making the calling @@ -234,7 +234,7 @@ the ``sys`` module is deleted. The main thread's ``SuperThread`` will be the GIL If the environment variable ``PYTHONGIL=1`` is set, then all new threads will default to ``super_thread = sys.gil``. Otherwise all new threads will default to ``super_thread = None``. -Explictly setting the ``super_thread`` argument when creating a thread will override these defaults. +Explicitly setting the ``super_thread`` argument when creating a thread will override these defaults. Deadlock detection ------------------ @@ -265,14 +265,14 @@ If we do that, then all other operations become safe. | All other operations | Yes | Yes | N/A | Yes | Yes | +------------------------+-----------+-----------------+-----------------+---------------+----------------+ -1. If the mutex held by the thread matches the mutex that protects the object +1. If the mutex held by the thread matches the mutex that protects the object. 2. If supported for that class. 3. The argument to ``__protect__`` must the sole reference to the object. ABI breakage ------------ -This PEP will require a one time ABI breakage, much like PEP 703, as the ``PyObject`` struct will need to be changed. +This PEP will require a one time ABI breakage, much like :pep:`703`, as the ``PyObject`` struct will need to be changed. 
Deferred reclamation -------------------- @@ -282,7 +282,7 @@ In other words, they may not be reclaimed immediately that their are no more ref This is because these objects may be referred to from several threads simultaneously, and the overhead of serializing the reference count operations would be too high. -PEP 703 also does this. +:pep:`703` also does this. Local objects, visible to only one thread, will still be reclaimed immediately that they are no longer referenced. @@ -370,7 +370,7 @@ How to Teach This ================= In order to run code in parallel, some understanding of the model of execution will -be needed. Writing unsafe code is much harder than under PEP 703, but the +be needed. Writing unsafe code is much harder than under :pep:`703`, but the new exceptions may surprise users. Extensive documentation will be needed. Examples @@ -444,7 +444,7 @@ inherits the protection from the ``list`` object. Comparison to PEP 703 (Making the Global Interpreter Lock Optional in CPython) ============================================================================== -This PEP should be thought of as building on PEP 703, rather than competing with or replacing it. +This PEP should be thought of as building on :pep:`703`, rather than competing with or replacing it. Many of the mechanisms needed to implement this PEP have been developed for PEP 703. What PEP 703 lacks is well defined semantics for what happens when race conditions are present, @@ -454,10 +454,10 @@ PEP 703 attempts to provide single-threaded performance for lists, dictionaries, and other mutable objects while providing "reasonable" thread safety. 
Unfortunately, no formal definition of expected behavior is provided, which leads to issues like these: -* -* -* -* +* `python/cpython#129619 `__ +* `python/cpython#129139 `__ +* `python/cpython#126559 `__ +* `python/cpython#130744 `__ It is the author's opinion that attempting to preserve single-threaded performance for mutable objects *and* any sort of thread safe parallel execution for the same object is wishful thinking. @@ -475,7 +475,7 @@ Object state Recording object state requires space in the object header, at least 3 bits but no more than a byte. Each object also needs additional space to refer to its thread, or protecting mutex. -With these fields, the ``PyObject`` header should be the smaller than is currently implemented for PEP 703, +With these fields, the ``PyObject`` header should be the smaller than is currently implemented for :pep:`703`, although larger than for the default (with GIL) build. A possible object header: @@ -543,11 +543,11 @@ Possible future enhancements Deep freezing and deep transfers -------------------------------- -Freezing a single object could leave a frozen object with references to mutable objects, and transfering of single objects could leave an object local to one thread, while other objects that it refers to are local to a different thread. +Freezing a single object could leave a frozen object with references to mutable objects, and transferring of single objects could leave an object local to one thread, while other objects that it refers to are local to a different thread. Either of these scanarios are likely to lead to runtime errors. To avoid that problem we need "deep" freezing. Deep freezing an object would freeze that object and the transitive closure of other mutable objects referred to by that object. 
-Deep transfering an object would transfer that object and the transitive closure of other local objects referred to by that object, +Deep transferring an object would transfer that object and the transitive closure of other local objects referred to by that object, but would raise an exception if one of the those objects belonged to a different thread. Similar to freezing, a "deep" put mechanism could be added to ``Channel``\ s to move a whole graph of objects from one thread @@ -565,17 +565,6 @@ Open Issues [Any points that are still being decided/discussed.] -Footnotes -========= - -.. [1] https://en.wikipedia.org/wiki/Therac-25 - -.. [2] https://dl.acm.org/doi/10.1145/3704859 - -.. [3] https://wrigstad.com/pldi2025.pdf - - - Copyright ========= From 3fdd98766817da3dc920d42d688048350e0c651a Mon Sep 17 00:00:00 2001 From: Mark Shannon Date: Tue, 9 Sep 2025 15:58:04 +0100 Subject: [PATCH 13/16] Fix links --- peps/pep-0805.rst | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/peps/pep-0805.rst b/peps/pep-0805.rst index 8f1d8552bfd..290081ac8d3 100644 --- a/peps/pep-0805.rst +++ b/peps/pep-0805.rst @@ -70,7 +70,8 @@ to one thread (the top-left quadrant), or that it is protected by a mutual exclu The PEP also proposes changes to CPython to prevent unsafe execution when mutable objects are shared. Finally, the PEP provides a generalization of the GIL to allow incrementally moving away from the GIL. -This PEP is inspired by ideas from OCaml, specifically Data Freedom à la Mode [2]_, and the Pyrona project [3]_. +This PEP is inspired by ideas from OCaml, specifically `Data Freedom à la Mode<https://dl.acm.org/doi/10.1145/3704859>`__, +and the `Pyrona project<https://wrigstad.com/pldi2025.pdf>`__. Many of the necessary technologies, such as biased and deferred reference counting, have been developed for :pep:`703`. 
Specification From 6ef78e1847d82815e053df2ff789a1a390aa6fb0 Mon Sep 17 00:00:00 2001 From: Mark Shannon Date: Tue, 9 Sep 2025 16:18:33 +0100 Subject: [PATCH 14/16] Add back creation date --- peps/pep-0805.rst | 1 + 1 file changed, 1 insertion(+) diff --git a/peps/pep-0805.rst b/peps/pep-0805.rst index 290081ac8d3..bc53e52b6e3 100644 --- a/peps/pep-0805.rst +++ b/peps/pep-0805.rst @@ -4,6 +4,7 @@ Author: Mark Shannon Discussions-To: Pending Status: Draft Type: Standards Track +Created: 08-Sep-2025 Python-Version: 3.15 Abstract From 7d99716dde4ecbbd415cc0a49c16cb47d961acee Mon Sep 17 00:00:00 2001 From: Mark Shannon Date: Tue, 9 Sep 2025 16:32:54 +0100 Subject: [PATCH 15/16] Add some spaces. Sphinx likes spaces --- peps/pep-0805.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/peps/pep-0805.rst b/peps/pep-0805.rst index bc53e52b6e3..51525908e49 100644 --- a/peps/pep-0805.rst +++ b/peps/pep-0805.rst @@ -71,8 +71,8 @@ to one thread (the top-left quadrant), or that it is protected by a mutual exclu The PEP also proposes changes to CPython to prevent unsafe execution when mutable objects are shared. Finally, the PEP provides a generalization of the GIL to allow incrementally moving away from the GIL. -This PEP is inspired by ideas from OCaml, specifically `Data Freedom à la Mode<https://dl.acm.org/doi/10.1145/3704859>`__, -and the `Pyrona project<https://wrigstad.com/pldi2025.pdf>`__. +This PEP is inspired by ideas from OCaml, specifically `Data Freedom à la Mode <https://dl.acm.org/doi/10.1145/3704859>`__, +and the `Pyrona project <https://wrigstad.com/pldi2025.pdf>`__. Many of the necessary technologies, such as biased and deferred reference counting, have been developed for :pep:`703`. 
Specification From 78261bd4d389d80e0de9375331b146327d0c6a27 Mon Sep 17 00:00:00 2001 From: Pablo Galindo Salgado Date: Thu, 2 Oct 2025 23:17:57 +0100 Subject: [PATCH 16/16] Update peps/pep-0805.rst Co-authored-by: Truls Henriksson --- peps/pep-0805.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/peps/pep-0805.rst b/peps/pep-0805.rst index 51525908e49..206e758491c 100644 --- a/peps/pep-0805.rst +++ b/peps/pep-0805.rst @@ -45,7 +45,7 @@ Rationale We want to allow a familiar model of parallel execution while retaining safety. Threads, locks, queues and immutability are familiar concepts and provide the building blocks for a safe model of execution. -Objects should either be safe for sharing between trhreads, or the VM should prevent them from being shared. +Objects should either be safe for sharing between threads, or the VM should prevent them from being shared. The synchronization quadrant diagram ------------------------------------