gh-140374: Add glossary entries related to multithreading #140375
Changes from all commits
64157d0
0825a98
3f44be0
568d201
22cf4f6
bd7b731
| Original file line number | Diff line number | Diff line change | ||||||||
|---|---|---|---|---|---|---|---|---|---|---|
|
|
@@ -134,6 +134,13 @@ Glossary | |||||||||
| iterator's :meth:`~object.__anext__` method until it raises a | ||||||||||
| :exc:`StopAsyncIteration` exception. Introduced by :pep:`492`. | ||||||||||
|
|
||||||||||
| atomic operation | ||||||||||
| An operation that completes as a single indivisible unit without | ||||||||||
| interruption from other threads. Atomic operations are critical for | ||||||||||
| :term:`thread-safe` programming because they cannot be observed in a | ||||||||||
| partially completed state by other threads. See also | ||||||||||
| :term:`race condition` and :term:`data race`. | ||||||||||
|
|
||||||||||
| attached thread state | ||||||||||
|
|
||||||||||
| A :term:`thread state` that is active for the current OS thread. | ||||||||||
|
|
@@ -289,6 +296,21 @@ Glossary | |||||||||
| advanced mathematical feature. If you're not aware of a need for them, | ||||||||||
| it's almost certain you can safely ignore them. | ||||||||||
|
|
||||||||||
| concurrency | ||||||||||
| The ability of different parts of a program to be executed out-of-order | ||||||||||
| or in partial order without affecting the outcome. This allows for | ||||||||||
| multiple tasks to make progress during overlapping time periods, though | ||||||||||
| not necessarily simultaneously. In Python, concurrency can be achieved | ||||||||||
| through :mod:`threading` (using OS threads), :mod:`asyncio` (cooperative | ||||||||||
| multitasking), or :mod:`multiprocessing` (separate processes). | ||||||||||
| See also :term:`parallelism`. | ||||||||||
|
I don't like the "out-of-order" definition. Concurrency is about things happening or being performed at the same time (in both computing and non-computing contexts). Something like:
The ability of a computer program to perform multiple tasks at the same time. Python provides libraries for writing programs that make use of different forms of concurrency. :mod:
||||||||||
|
|
||||||||||
| concurrent modification | ||||||||||
| When multiple threads modify shared data at the same time without | ||||||||||
| proper synchronization. Concurrent modification can cause | ||||||||||
| :term:`race conditions <race condition>`, and might also trigger a | ||||||||||
| :term:`data race <data race>`, data corruption, or both. | ||||||||||
|
|
||||||||||
| context | ||||||||||
| This term has different meanings depending on where and how it is used. | ||||||||||
| Some common meanings: | ||||||||||
|
|
@@ -343,6 +365,15 @@ Glossary | |||||||||
| :keyword:`async with` keywords. These were introduced | ||||||||||
| by :pep:`492`. | ||||||||||
|
|
||||||||||
| critical section | ||||||||||
|
I don't think we should define critical section, at least not for now. The way we use I'd rather talk about those details in the C API docs instead of the Python glossary.
||||||||||
| A section of code that accesses shared resources and must not be | ||||||||||
| executed by multiple threads simultaneously. Critical sections are | ||||||||||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. this should emphasize that critical sections are purely a concept in the C API and aren't exposed in the python language There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||||||
| typically protected using :term:`locks <lock>` or other | ||||||||||
| :term:`synchronization primitives <synchronization primitive>` to | ||||||||||
| ensure :term:`thread-safe` access. Critical sections are purely a concept | ||||||||||
| in the C API and are not exposed in Python. See also :term:`lock` and | ||||||||||
| :term:`race condition`. | ||||||||||
|
|
||||||||||
| CPython | ||||||||||
| The canonical implementation of the Python programming language, as | ||||||||||
| distributed on `python.org <https://www.python.org>`_. The term "CPython" | ||||||||||
|
|
@@ -363,6 +394,27 @@ Glossary | |||||||||
| the :term:`cyclic garbage collector <garbage collection>` is to identify these groups and break the reference | ||||||||||
| cycles so that the memory can be reclaimed. | ||||||||||
|
|
||||||||||
| data race | ||||||||||
|
If we are only talking about native code (C or C++ or Rust or whatever), maybe we should leave this out of the glossary for now.
||||||||||
| A situation where multiple threads access the same memory location | ||||||||||
| concurrently, at least one of the accesses is a write, and the threads | ||||||||||
| do not use any synchronization to control their access. Data races | ||||||||||
| lead to :term:`non-deterministic` behavior and can cause data corruption. | ||||||||||
| Proper use of :term:`locks <lock>` and other :term:`synchronization primitives | ||||||||||
| <synchronization primitive>` prevents data races. Note that data races | ||||||||||
| can only happen in native code, but that :term:`native code` might be | ||||||||||
| exposed in a Python API. See also :term:`race condition` and | ||||||||||
| :term:`thread-safe`. | ||||||||||
|
|
||||||||||
| deadlock | ||||||||||
| A situation where two or more threads are unable to proceed because | ||||||||||
| each is waiting for the other to release a resource. For example, | ||||||||||
| if thread A holds lock 1 and waits for lock 2, while thread B holds | ||||||||||
| lock 2 and waits for lock 1, both threads will wait indefinitely. Any | ||||||||||
| program that makes blocking calls using more than one lock is possibly | ||||||||||
| susceptible to deadlocks. Deadlocks can be avoided by always acquiring | ||||||||||
| multiple :term:`locks <lock>` in a consistent order or by using | ||||||||||
| timeout-based locking. See also :term:`lock` and :term:`reentrant`. | ||||||||||
|
First sentence:
A situation in which two or more tasks (threads, processes, or coroutines) wait indefinitely for each other to release resources or complete actions, preventing any from making progress.
And maybe:
In Python this often arises from acquiring multiple locks in conflicting orders or from circular join()/await dependencies.
This is too strong.
I'd get rid of this. It may be true in some sense, but I don't know of a situation where that would actually be useful advice. |
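For reference, the two avoidance techniques named in the proposed deadlock entry (consistent lock ordering and timeout-based locking) can be sketched as follows. This is a minimal illustration, not part of the PR: `lock_1`, `lock_2`, `ordered`, and `worker` are hypothetical names, and sorting by `id()` is just one assumed way to impose a single global order.

```python
import threading

lock_1 = threading.Lock()
lock_2 = threading.Lock()
results = []

def ordered(first, second):
    """Return both locks in one global order (assumed here: by id())."""
    return sorted((first, second), key=id)

def worker(name, a, b):
    # Each thread asks for the locks in a different order, but always
    # acquires them in the same global order, so no wait cycle can form.
    low, high = ordered(a, b)
    with low:
        with high:
            results.append(name)

t_a = threading.Thread(target=worker, args=("A", lock_1, lock_2))
t_b = threading.Thread(target=worker, args=("B", lock_2, lock_1))
t_a.start()
t_b.start()
t_a.join(timeout=5)
t_b.join(timeout=5)

# Timeout-based locking: acquire() returns False instead of blocking forever.
held = threading.Lock()
held.acquire()
got_it = held.acquire(timeout=0.1)  # the lock is already held, so this times out
held.release()
```

Without the call to `ordered`, thread A could hold `lock_1` while thread B holds `lock_2`, which is the wait cycle described in the entry.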
||||||||||
|
|
||||||||||
| decorator | ||||||||||
| A function returning another function, usually applied as a function | ||||||||||
| transformation using the ``@wrapper`` syntax. Common examples for | ||||||||||
|
|
@@ -662,6 +714,17 @@ Glossary | |||||||||
| requires the GIL to be held in order to use it. This refers to having an | ||||||||||
| :term:`attached thread state`. | ||||||||||
|
|
||||||||||
| global state | ||||||||||
| Data that is accessible throughout a program, such as module-level | ||||||||||
| variables, class variables, or C static variables in :term:`extension modules | ||||||||||
| <extension module>`. In multi-threaded programs, global state shared | ||||||||||
| between threads typically requires synchronization to avoid | ||||||||||
| :term:`race conditions <race condition>` and | ||||||||||
| :term:`data races <data race>`. In the | ||||||||||
| :term:`free-threaded <free threading>` build, :term:`per-module state` | ||||||||||
| is often preferred over global state for C extension modules. | ||||||||||
| See also :term:`per-module state`. | ||||||||||
|
|
||||||||||
| hash-based pyc | ||||||||||
| A bytecode cache file that uses the hash rather than the last-modified | ||||||||||
| time of the corresponding source file to determine its validity. See | ||||||||||
|
|
@@ -706,7 +769,9 @@ Glossary | |||||||||
| tuples. Such an object cannot be altered. A new object has to | ||||||||||
| be created if a different value has to be stored. They play an important | ||||||||||
| role in places where a constant hash value is needed, for example as a key | ||||||||||
| in a dictionary. | ||||||||||
| in a dictionary. Immutable objects are inherently :term:`thread-safe` | ||||||||||
| because their state cannot be modified after creation, eliminating concerns | ||||||||||
| about :term:`concurrent modification`. | ||||||||||
|
|
||||||||||
| import path | ||||||||||
| A list of locations (or :term:`path entries <path entry>`) that are | ||||||||||
|
|
@@ -796,8 +861,9 @@ Glossary | |||||||||
|
|
||||||||||
| CPython does not consistently apply the requirement that an iterator | ||||||||||
| define :meth:`~iterator.__iter__`. | ||||||||||
| And also please note that the free-threading CPython does not guarantee | ||||||||||
| the thread-safety of iterator operations. | ||||||||||
| And also please note that :term:`free-threaded <free threading>` | ||||||||||
| CPython does not guarantee :term:`thread-safe` behavior of iterator | ||||||||||
| operations. | ||||||||||
|
|
||||||||||
|
|
||||||||||
| key function | ||||||||||
|
|
@@ -835,10 +901,11 @@ Glossary | |||||||||
| :keyword:`if` statements. | ||||||||||
|
|
||||||||||
| In a multi-threaded environment, the LBYL approach can risk introducing a | ||||||||||
| race condition between "the looking" and "the leaping". For example, the | ||||||||||
| code, ``if key in mapping: return mapping[key]`` can fail if another | ||||||||||
| :term:`race condition` between "the looking" and "the leaping". For example, | ||||||||||
| the code, ``if key in mapping: return mapping[key]`` can fail if another | ||||||||||
| thread removes *key* from *mapping* after the test, but before the lookup. | ||||||||||
| This issue can be solved with locks or by using the EAFP approach. | ||||||||||
| This issue can be solved with :term:`locks <lock>` or by using the | ||||||||||
| :term:`EAFP` approach. See also :term:`thread-safe`. | ||||||||||
|
|
||||||||||
| lexical analyzer | ||||||||||
|
|
||||||||||
|
|
@@ -857,6 +924,20 @@ Glossary | |||||||||
| clause is optional. If omitted, all elements in ``range(256)`` are | ||||||||||
| processed. | ||||||||||
|
|
||||||||||
| lock | ||||||||||
| A :term:`synchronization primitive` that allows only one thread at a | ||||||||||
| time to access a shared resource. A thread must acquire a lock before | ||||||||||
| accessing the protected resource and release it afterward. If a thread | ||||||||||
| attempts to acquire a lock that is already held by another thread, it | ||||||||||
| will block until the lock becomes available. Python's :mod:`threading` | ||||||||||
| module provides :class:`~threading.Lock` (a basic lock) and | ||||||||||
| :class:`~threading.RLock` (a :term:`reentrant` lock). Locks are used | ||||||||||
| to prevent :term:`race conditions <race condition>` and ensure | ||||||||||
| :term:`thread-safe` access to shared data. Alternative design patterns | ||||||||||
| to locks exist such as queues, producer/consumer patterns, and | ||||||||||
| thread-local state. See also :term:`critical section`, :term:`deadlock`, | ||||||||||
| and :term:`reentrant`. | ||||||||||
|
|
||||||||||
| loader | ||||||||||
| An object that loads a module. | ||||||||||
| It must define the :meth:`!exec_module` and :meth:`!create_module` methods | ||||||||||
|
|
@@ -942,8 +1023,11 @@ Glossary | |||||||||
| See :term:`method resolution order`. | ||||||||||
|
|
||||||||||
| mutable | ||||||||||
| Mutable objects can change their value but keep their :func:`id`. See | ||||||||||
| also :term:`immutable`. | ||||||||||
| An :term:`object` with state that is allowed to change during the course | ||||||||||
| of the program. In multi-threaded programs, mutable objects that are | ||||||||||
| shared between threads require careful synchronization to avoid | ||||||||||
| :term:`concurrent modification` issues. See also :term:`immutable`, | ||||||||||
| :term:`thread-safe`, and :term:`concurrent modification`. | ||||||||||
|
|
||||||||||
| named tuple | ||||||||||
| The term "named tuple" applies to any type or class that inherits from | ||||||||||
|
|
@@ -995,6 +1079,13 @@ Glossary | |||||||||
|
|
||||||||||
| See also :term:`module`. | ||||||||||
|
|
||||||||||
| native code | ||||||||||
| Code that is compiled to machine instructions and runs directly on the | ||||||||||
| processor, as opposed to code that is interpreted or runs in a virtual | ||||||||||
| machine. In the context of Python, native code typically refers to | ||||||||||
| C, C++, Rust, or Fortran code in :term:`extension modules <extension module>` | ||||||||||
| that can be called from Python. See also :term:`extension module`. | ||||||||||
|
|
||||||||||
| nested scope | ||||||||||
| The ability to refer to a variable in an enclosing definition. For | ||||||||||
| instance, a function defined inside another function can refer to | ||||||||||
|
|
@@ -1011,6 +1102,15 @@ Glossary | |||||||||
| properties, :meth:`~object.__getattribute__`, class methods, and static | ||||||||||
| methods. | ||||||||||
|
|
||||||||||
| non-deterministic | ||||||||||
| Behavior where the outcome of a program can vary between executions with | ||||||||||
| the same inputs. In multi-threaded programs, non-deterministic behavior | ||||||||||
| often results from :term:`race conditions <race condition>` where the | ||||||||||
| relative timing or interleaving of threads affects the result. | ||||||||||
| Proper synchronization using :term:`locks <lock>` and other | ||||||||||
| :term:`synchronization primitives <synchronization primitive>` helps | ||||||||||
| ensure deterministic behavior. | ||||||||||
|
|
||||||||||
| object | ||||||||||
| Any data with state (attributes or value) and defined behavior | ||||||||||
| (methods). Also the ultimate base class of any :term:`new-style | ||||||||||
|
|
@@ -1032,6 +1132,16 @@ Glossary | |||||||||
|
|
||||||||||
| See also :term:`regular package` and :term:`namespace package`. | ||||||||||
|
|
||||||||||
| parallelism | ||||||||||
| The simultaneous execution of operations on multiple processors. | ||||||||||
| True parallelism requires multiple processors or processor cores where | ||||||||||
| operations run at exactly the same time and are not just interleaved. | ||||||||||
| In Python, the :term:`free-threaded <free threading>` build enables | ||||||||||
|
Maybe something like:
Executing multiple operations at the same time (e.g., on multiple CPU cores). In Python builds with the
||||||||||
| parallelism for multi-threaded programs that access state stored in the | ||||||||||
| interpreter by disabling the :term:`global interpreter lock`. The | ||||||||||
| :mod:`multiprocessing` module also enables parallelism by using separate | ||||||||||
| processes. See also :term:`concurrency`. | ||||||||||
|
|
||||||||||
| parameter | ||||||||||
| A named entity in a :term:`function` (or method) definition that | ||||||||||
| specifies an :term:`argument` (or in some cases, arguments) that the | ||||||||||
|
|
@@ -1116,6 +1226,13 @@ Glossary | |||||||||
| :class:`str` or :class:`bytes` result instead, respectively. Introduced | ||||||||||
| by :pep:`519`. | ||||||||||
|
|
||||||||||
| per-module state | ||||||||||
|
Don't we elsewhere define global state as including per-module state?
||||||||||
| State that is stored separately for each instance of a module, rather | ||||||||||
| than in :term:`global state`. Per-module state is accessed through the | ||||||||||
| module object rather than through C static variables. | ||||||||||
| See :ref:`isolating-extensions-howto` for more information. See also | ||||||||||
| :term:`global state`. | ||||||||||
|
|
||||||||||
| PEP | ||||||||||
| Python Enhancement Proposal. A PEP is a design document | ||||||||||
| providing information to the Python community, or describing a new | ||||||||||
|
|
@@ -1206,6 +1323,18 @@ Glossary | |||||||||
| >>> email.mime.text.__name__ | ||||||||||
| 'email.mime.text' | ||||||||||
|
|
||||||||||
| race condition | ||||||||||
| A condition of a program where its behavior | ||||||||||
| depends on the relative timing or ordering of events, particularly in | ||||||||||
| multi-threaded programs. Race conditions can lead to | ||||||||||
| :term:`non-deterministic` behavior and bugs that are difficult to | ||||||||||
| reproduce. A :term:`data race` is a specific type of race condition | ||||||||||
| involving unsynchronized access to shared memory. The :term:`LBYL` | ||||||||||
| coding style is particularly susceptible to race conditions in | ||||||||||
| multi-threaded code. Using :term:`locks <lock>` and other | ||||||||||
| :term:`synchronization primitives <synchronization primitive>` | ||||||||||
| helps prevent race conditions. | ||||||||||
|
|
||||||||||
| reference count | ||||||||||
| The number of references to an object. When the reference count of an | ||||||||||
| object drops to zero, it is deallocated. Some objects are | ||||||||||
|
|
@@ -1227,6 +1356,25 @@ Glossary | |||||||||
|
|
||||||||||
| See also :term:`namespace package`. | ||||||||||
|
|
||||||||||
| reentrant | ||||||||||
| A property of a function or :term:`lock` that allows it to be called or | ||||||||||
| acquired multiple times by the same thread without causing errors or a | ||||||||||
| :term:`deadlock`. | ||||||||||
|
|
||||||||||
| For functions, reentrancy means the function can be safely called again | ||||||||||
| before a previous invocation has completed, which is important when | ||||||||||
| functions may be called recursively or from signal handlers. Thread-unsafe | ||||||||||
| functions may be :term:`non-deterministic` if they're called reentrantly in a | ||||||||||
| multithreaded program. | ||||||||||
|
|
||||||||||
| For locks, Python's :class:`threading.RLock` (reentrant lock) is | ||||||||||
| reentrant, meaning a thread that already holds the lock can acquire it | ||||||||||
| again without blocking. In contrast, :class:`threading.Lock` is not | ||||||||||
| reentrant - attempting to acquire it twice from the same thread will cause | ||||||||||
| a deadlock. | ||||||||||
|
|
||||||||||
| See also :term:`lock` and :term:`deadlock`. | ||||||||||
|
|
||||||||||
| REPL | ||||||||||
| An acronym for the "read–eval–print loop", another name for the | ||||||||||
| :term:`interactive` interpreter shell. | ||||||||||
|
|
@@ -1331,6 +1479,19 @@ Glossary | |||||||||
|
|
||||||||||
| See also :term:`borrowed reference`. | ||||||||||
|
|
||||||||||
| synchronization primitive | ||||||||||
| A basic building block for coordinating the execution of multiple threads | ||||||||||
| to ensure :term:`thread-safe` access to shared resources. Python's | ||||||||||
| :mod:`threading` module provides several synchronization primitives | ||||||||||
| including :class:`~threading.Lock`, :class:`~threading.RLock`, | ||||||||||
| :class:`~threading.Semaphore`, :class:`~threading.Condition`, | ||||||||||
| :class:`~threading.Event`, and :class:`~threading.Barrier`. Additionally, | ||||||||||
| the :mod:`queue` module provides multi-producer, multi-consumer queues | ||||||||||
| that are especially useful in multithreaded programs. These | ||||||||||
| primitives help prevent :term:`race conditions <race condition>` and | ||||||||||
| coordinate thread execution. See also :term:`lock` and | ||||||||||
| :term:`critical section`. | ||||||||||
|
|
||||||||||
| t-string | ||||||||||
| t-strings | ||||||||||
| String literals prefixed with ``t`` or ``T`` are commonly called | ||||||||||
|
|
@@ -1383,6 +1544,19 @@ Glossary | |||||||||
| See :ref:`Thread State and the Global Interpreter Lock <threads>` for more | ||||||||||
| information. | ||||||||||
|
|
||||||||||
| thread-safe | ||||||||||
| Code that functions correctly when accessed by multiple threads | ||||||||||
| concurrently. Thread-safe code uses appropriate | ||||||||||
|
Comment on lines +1548 to +1549
I'd like to rework the first sentence a bit. We talk about thread-safe modules and data structures (classes), not just "code". Maybe something like:
A module, function, or class that behaves correctly when used by multiple threads concurrently.
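To make the "thread-safe class" wording concrete, here is a minimal sketch of a class that protects its shared mutable state with a lock. `SafeCounter` and `hammer` are hypothetical names used only for illustration; they do not appear in the PR or in the standard library.

```python
import threading

class SafeCounter:
    """A counter whose state is only touched while holding a lock."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment(self):
        # The read-modify-write happens entirely under the lock.
        with self._lock:
            self._value += 1

    @property
    def value(self):
        with self._lock:
            return self._value

def hammer(counter, times):
    for _ in range(times):
        counter.increment()

counter = SafeCounter()
threads = [
    threading.Thread(target=hammer, args=(counter, 10_000)) for _ in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter.value)  # 40000
```

Because every increment runs under the lock, the final value does not depend on how the threads interleave.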
||||||||||
| :term:`synchronization primitives <synchronization primitive>` like | ||||||||||
| :term:`locks <lock>` to protect shared mutable state, or is designed | ||||||||||
| to avoid shared mutable state entirely. In the | ||||||||||
| :term:`free-threaded <free threading>` build, built-in types like | ||||||||||
| :class:`dict`, :class:`list`, and :class:`set` use internal locking | ||||||||||
| to make many operations thread-safe, although thread safety is not | ||||||||||
| necessarily guaranteed. Code that is not thread-safe may experience | ||||||||||
| :term:`race conditions <race condition>` and :term:`data races <data race>` | ||||||||||
| when used in multi-threaded programs. | ||||||||||
|
|
||||||||||
| token | ||||||||||
|
|
||||||||||
| A small unit of source code, generated by the | ||||||||||
|
|
||||||||||
I've seen "interrupt" in various definitions online, but I think "interrupt" is confusing in this context. I'm not really sure what it refers to here and the granularity of "atomic" vs. "interrupt" can be different. For example, a thread performing an atomic operation with locks can be rescheduled (interrupted) by the OS without breaking atomicity.
I like this definition (from ChatGPT):
An operation that appears to execute as a single, indivisible step: no other thread can observe it half-done, and its effects become visible all at once. Python does not guarantee that ordinary high-level statements are atomic (for example, x += 1 performs multiple bytecode operations and is not atomic). Atomicity is only guaranteed where explicitly documented (e.g., operations performed while holding a lock, or methods of synchronization primitives such as those in threading and queue).
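The claim that `x += 1` spans several bytecode operations can be checked with the standard `dis` module. The sketch below is illustrative only: the function name `increment` is hypothetical, and the exact opcode names differ between CPython versions, so no specific listing is shown.

```python
import dis

x = 0

def increment():
    global x
    x += 1  # one statement, but several bytecode instructions

# Collect the opcode names the compiler emitted for increment().
ops = [ins.opname for ins in dis.get_instructions(increment)]
print(ops)

# The global is loaded and stored by separate instructions, so another
# thread could run between the load and the store.
load_at = ops.index("LOAD_GLOBAL")
store_at = ops.index("STORE_GLOBAL")
print(load_at < store_at)
```

The gap between the load and the store is where an update from another thread can be lost, which is why the suggested definition says atomicity should not be assumed for ordinary statements.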