Open
Changes from 3 commits
182 changes: 174 additions & 8 deletions Doc/glossary.rst
@@ -134,6 +134,16 @@ Glossary
iterator's :meth:`~object.__anext__` method until it raises a
:exc:`StopAsyncIteration` exception. Introduced by :pep:`492`.

atomic operation
An operation that completes as a single indivisible unit without
interruption from other threads. Atomic operations are critical for
Contributor

I've seen "interrupt" in various definitions online, but I think "interrupt" is confusing in this context. I'm not really sure what it refers to here and the granularity of "atomic" vs. "interrupt" can be different. For example, a thread performing an atomic operation with locks can be rescheduled (interrupted) by the OS without breaking atomicity.

I like this definition (from ChatGPT):

An operation that appears to execute as a single, indivisible step: no other thread can observe it half-done, and its effects become visible all at once. Python does not guarantee that ordinary high-level statements are atomic (for example, x += 1 performs multiple bytecode operations and is not atomic). Atomicity is only guaranteed where explicitly documented (e.g., operations performed while holding a lock, or methods of synchronization primitives such as those in threading and queue).

:term:`thread-safe` programming because they cannot be observed in a
partially completed state by other threads. In the
:term:`free-threaded <free threading>` build, elementary operations
should generally be assumed to be atomic unless the documentation
explicitly states otherwise. See also :term:`race condition` and
:term:`data race`.
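
As a minimal sketch of the point raised in the review comment (that ``x += 1`` is several steps rather than one), the example below guards a read-modify-write update with a lock so that other threads cannot observe it half-done. The names `counter`, `counter_lock`, and `add_many` are hypothetical and chosen only for illustration.

```python
import threading

counter = 0                      # shared state (hypothetical example variable)
counter_lock = threading.Lock()  # guards every update to `counter`


def add_many(n):
    """Increment the shared counter n times, holding the lock per update."""
    global counter
    for _ in range(n):
        # `counter += 1` is a read, an add, and a store. Holding the lock
        # makes those steps appear as one indivisible step to other threads.
        with counter_lock:
            counter += 1


threads = [threading.Thread(target=add_many, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)
```

With the lock in place every increment is preserved, so the final count equals the number of threads times the number of increments per thread.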

attached thread state

A :term:`thread state` that is active for the current OS thread.
@@ -289,6 +299,20 @@ Glossary
advanced mathematical feature. If you're not aware of a need for them,
it's almost certain you can safely ignore them.

concurrency
The ability of different parts of a program to be executed out-of-order
or in partial order without affecting the outcome. This allows for
multiple tasks to make progress during overlapping time periods, though
not necessarily simultaneously. In Python, concurrency can be achieved
through :mod:`threading` (using OS threads), :mod:`asyncio` (cooperative
multitasking), or :mod:`multiprocessing` (separate processes).
See also :term:`parallelism`.
Contributor

I don't like the "out-of-order" definition. Concurrency is about things happening or being performed at the same time (in both computing and non-computing contexts).

Something like:

The ability of a computer program to perform multiple tasks at the same time. Python provides libraries for writing programs that make use of different forms of concurrency. :mod:asyncio is a library for dealing with asynchronous tasks and coroutines. :mod:threading provides access to operating system threads and :mod:multiprocessing to operating system processes. Multi-core processors can execute threads and processes on different CPU cores at the same time (see :term:parallelism).
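
To illustrate the cooperative multitasking that the entry attributes to :mod:`asyncio`, here is a small sketch in which two coroutines make progress during overlapping time periods on a single thread. The names `worker` and `events` are hypothetical.

```python
import asyncio

events = []  # records each step the tasks take


async def worker(name, steps):
    for i in range(steps):
        events.append((name, i))
        # Hand control back to the event loop so the other task can run.
        await asyncio.sleep(0)


async def main():
    # Both coroutines are in progress at once, but only one runs at a time.
    await asyncio.gather(worker("a", 3), worker("b", 3))


asyncio.run(main())
print(events)
```

This is concurrency without parallelism: the tasks overlap in time, yet the event loop only ever executes one of them at any instant.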


concurrent modification
When multiple threads modify shared data at the same time without
proper synchronization. Concurrent modification can lead to
:term:`data races <data race>` and corrupted data.
Contributor

Suggested change
proper synchronization. Concurrent modification can lead to
:term:`data races <data race>` and corrupted data.
proper synchronization. Concurrent modification can cause
:term:`race conditions <race condition>`, and might also trigger a
:term:`data race <data race>`, data corruption, or both.

I think concurrent modification without synchronization is a synonym for a race condition? So I want to link to that term in this term. I also think a data race doesn't necessarily imply data corruption, so I wanted to be a little more measured about that.


context
This term has different meanings depending on where and how it is used.
Some common meanings:
@@ -343,6 +367,15 @@ Glossary
:keyword:`async with` keywords. These were introduced
by :pep:`492`.

critical section
Contributor

I don't think we should define critical section, at least not for now. The way we use Py_BEGIN_CRITICAL_SECTION in the C API strays from the classical definition of a "critical section" because it can be suspended/interrupted.

I'd rather talk about those details in the C API docs instead of the Python glossary.

A section of code that accesses shared resources and must not be
executed by multiple threads simultaneously. Critical sections are
Contributor

this should emphasize that critical sections are purely a concept in the C API and aren't exposed in the python language

Contributor

Suggested change
executed by multiple threads simultaneously. Critical sections are
executed by multiple threads simultaneously. Critical sections are
purely a concept in the C API and are not exposed in Python.
Critical sections are

typically protected using :term:`locks <lock>` or other
:term:`synchronization primitives <synchronization primitive>` to
ensure :term:`thread-safe` access. Critical sections are purely a concept
in the C API and are not exposed in Python. See also :term:`lock` and
:term:`race condition`.

CPython
The canonical implementation of the Python programming language, as
distributed on `python.org <https://www.python.org>`_. The term "CPython"
@@ -363,6 +396,25 @@ Glossary
the :term:`cyclic garbage collector <garbage collection>` is to identify these groups and break the reference
cycles so that the memory can be reclaimed.

data race
Contributor

If we are only talking about native code (C or C++ or Rust or whatever), maybe we should leave this out of the glossary for now.

A situation where multiple threads access the same memory location
concurrently, at least one of the accesses is a write, and the threads
do not use any synchronization to control their access. Data races
lead to :term:`non-deterministic` behavior and can cause data corruption.
Proper use of :term:`locks <lock>` and other :term:`synchronization primitives
<synchronization primitive>` prevents data races. See also
:term:`race condition` and :term:`thread-safe`.
Contributor

This should explain that data races are only possible via extensions and not via pure-python code.

Contributor

We may want to keep the glossary entry generic since adding this may give the false impression that user code can not create a data race.

Member Author

I agree with @willingc that we should keep the glossary entries generic. I've removed references to APIs from other entries as well.

Contributor

I still worry this is a little vague in terms of what precisely causes a data race and we can maybe make that clearer. Maybe we can include the fact that the read and write needs to be in low-level code, although the low-level issue might be triggered by a high-level Python API.

It's certainly hard to be precise without being confusing to someone who isn't familiar with C extensions.

I worry that the current phrasing implies that two threads racing to update an attribute of a python class is a data race. But it's not, it's a race condition, and no data race happens because there is low-level synchronization in the CPython implementation. I realize the current definition includes this, but I worry that a reader might gather that "synchronization" is always via a threading.Lock or another Python API.

Member Author

How about "Note that data races can only happen in native code, but that native code might be exposed in a Python API"?

Contributor

I like that, although maybe "native code" deserves an entry as well...

Member Author

Done.


deadlock
A situation where two or more threads are unable to proceed because
each is waiting for the other to release a resource. For example,
if thread A holds lock 1 and waits for lock 2, while thread B holds
lock 2 and waits for lock 1, both threads will wait indefinitely. Any
program that makes blocking calls using more than one lock is possibly
susceptible to deadlocks. Deadlocks can be avoided by always acquiring
multiple :term:`locks <lock>` in a consistent order or by using
timeout-based locking. See also :term:`lock` and :term:`reentrant`.
Contributor

First sentence:

A situation in which two or more tasks (threads, processes, or coroutines) wait indefinitely for each other to release resources or complete actions, preventing any from making progress.

And maybe:

In Python this often arises from acquiring multiple locks in conflicting orders or from circular join()/await dependencies.

Any
program that makes blocking calls using more than one lock is possibly
susceptible to deadlocks

This is too strong.

or by using timeout-based locking

I'd get rid of this. It may be true in some sense, but I don't know of a situation where that would actually be useful advice.
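
The lock-ordering advice in the deadlock entry can be sketched as follows. This is only an illustration under assumed names (`lock_1`, `lock_2`, `transfer`, `completed`): every thread acquires the two locks in the same order, so no thread can hold the second lock while waiting for the first.

```python
import threading

lock_1 = threading.Lock()
lock_2 = threading.Lock()
completed = []  # names of threads that finished their work


def transfer(name):
    # Every thread takes lock_1 first and lock_2 second. Because the order
    # is the same everywhere, a circular wait between the locks cannot form.
    with lock_1:
        with lock_2:
            completed.append(name)


threads = [threading.Thread(target=transfer, args=(f"t{i}",)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(completed))
```

If one thread instead acquired `lock_2` before `lock_1`, the situation described in the entry (thread A holds lock 1 and waits for lock 2, thread B the reverse) would become possible.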


decorator
A function returning another function, usually applied as a function
transformation using the ``@wrapper`` syntax. Common examples for
@@ -662,6 +714,16 @@ Glossary
requires the GIL to be held in order to use it. This refers to having an
:term:`attached thread state`.

global state
Data that is accessible throughout a program, such as module-level
variables, class variables, or C static variables in :term:`extension modules
<extension module>`. In multi-threaded programs, global state shared
between threads typically requires synchronization to avoid
:term:`race conditions <race condition>`. In the
Contributor

Let's include both terms.

Suggested change
:term:`race conditions <race condition>`. In the
:term:`race conditions <race condition>` and :term:`data races <data race>`. In the

:term:`free-threaded <free threading>` build, :term:`per-module state`
is often preferred over global state for C extension modules.
See also :term:`per-module state`.

hash-based pyc
A bytecode cache file that uses the hash rather than the last-modified
time of the corresponding source file to determine its validity. See
@@ -706,7 +768,9 @@ Glossary
tuples. Such an object cannot be altered. A new object has to
be created if a different value has to be stored. They play an important
role in places where a constant hash value is needed, for example as a key
in a dictionary.
in a dictionary. Immutable objects are inherently :term:`thread-safe`
because their state cannot be modified after creation, eliminating concerns
about :term:`concurrent modification`.

import path
A list of locations (or :term:`path entries <path entry>`) that are
@@ -796,8 +860,9 @@ Glossary

CPython does not consistently apply the requirement that an iterator
define :meth:`~iterator.__iter__`.
And also please note that the free-threading CPython does not guarantee
the thread-safety of iterator operations.
And also please note that :term:`free-threaded <free threading>`
CPython does not guarantee :term:`thread-safe` behavior of iterator
operations.


key function
@@ -835,10 +900,11 @@ Glossary
:keyword:`if` statements.

In a multi-threaded environment, the LBYL approach can risk introducing a
race condition between "the looking" and "the leaping". For example, the
code, ``if key in mapping: return mapping[key]`` can fail if another
:term:`race condition` between "the looking" and "the leaping". For example,
the code, ``if key in mapping: return mapping[key]`` can fail if another
thread removes *key* from *mapping* after the test, but before the lookup.
This issue can be solved with locks or by using the EAFP approach.
This issue can be solved with :term:`locks <lock>` or by using the
:term:`EAFP` approach. See also :term:`thread-safe`.
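
The difference between the two styles mentioned in the LBYL entry can be sketched like this. The helper names `lookup_lbyl` and `lookup_eafp` are hypothetical; the point is that the EAFP version performs a single lookup, leaving no gap between "the looking" and "the leaping".

```python
def lookup_lbyl(mapping, key, default=None):
    # Look before you leap: another thread could remove `key` between the
    # membership test and the subscript, which would raise KeyError.
    if key in mapping:
        return mapping[key]
    return default


def lookup_eafp(mapping, key, default=None):
    # Easier to ask forgiveness: one lookup, with the failure handled.
    try:
        return mapping[key]
    except KeyError:
        return default


cache = {"spam": 1}
print(lookup_eafp(cache, "spam"))
print(lookup_eafp(cache, "eggs", default=0))
```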

lexical analyzer

@@ -857,6 +923,20 @@ Glossary
clause is optional. If omitted, all elements in ``range(256)`` are
processed.

lock
A :term:`synchronization primitive` that allows only one thread at a
time to access a shared resource. A thread must acquire a lock before
accessing the protected resource and release it afterward. If a thread
attempts to acquire a lock that is already held by another thread, it
will block until the lock becomes available. Python's :mod:`threading`
module provides :class:`~threading.Lock` (a basic lock) and
:class:`~threading.RLock` (a :term:`reentrant` lock). Locks are used
to prevent :term:`race conditions <race condition>` and ensure
:term:`thread-safe` access to shared data. Alternative design patterns
to locks exist such as queues, producer/consumer patterns, and
thread-local state. See also :term:`critical section`, :term:`deadlock`,
and :term:`reentrant`.
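
As a sketch of the queue-based alternative that the lock entry mentions, the example below passes work between threads through :class:`queue.Queue` objects instead of sharing a variable guarded by an explicit lock. The names `tasks`, `results`, `STOP`, and `consumer` are hypothetical.

```python
import queue
import threading

tasks = queue.Queue()    # producer -> consumer
results = queue.Queue()  # consumer -> producer
STOP = object()          # sentinel telling the consumer to exit


def consumer():
    while True:
        item = tasks.get()
        if item is STOP:
            break
        results.put(item * item)


worker = threading.Thread(target=consumer)
worker.start()

for n in range(5):
    tasks.put(n)
tasks.put(STOP)
worker.join()

squares = [results.get() for _ in range(5)]
print(squares)
```

The queues handle their own internal locking, so the producer and consumer code contains no explicit ``acquire`` or ``release`` calls.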

loader
An object that loads a module.
It must define the :meth:`!exec_module` and :meth:`!create_module` methods
@@ -942,8 +1022,11 @@ Glossary
See :term:`method resolution order`.

mutable
Mutable objects can change their value but keep their :func:`id`. See
also :term:`immutable`.
An :term:`object` with state that is allowed to change during the course
of the program. In multi-threaded programs, mutable objects that are
shared between threads require careful synchronization to avoid
:term:`concurrent modification` issues. See also :term:`immutable`,
:term:`thread-safe`, and :term:`concurrent modification`.

named tuple
The term "named tuple" applies to any type or class that inherits from
@@ -1011,6 +1094,15 @@ Glossary
properties, :meth:`~object.__getattribute__`, class methods, and static
methods.

non-deterministic
Behavior where the outcome of a program can vary between executions with
the same inputs. In multi-threaded programs, non-deterministic behavior
often results from :term:`race conditions <race condition>` where the
relative timing or interleaving of threads affects the result.
Proper synchronization using :term:`locks <lock>` and other
:term:`synchronization primitives <synchronization primitive>` helps
ensure deterministic behavior.

object
Any data with state (attributes or value) and defined behavior
(methods). Also the ultimate base class of any :term:`new-style
@@ -1032,6 +1124,16 @@ Glossary

See also :term:`regular package` and :term:`namespace package`.

parallelism
The simultaneous execution of multiple operations on different CPU cores.
Contributor

Suggested change
The simultaneous execution of multiple operations on different CPU cores.
The simultaneous execution of operations on multiple processors.

To allow for GPUs

True parallelism requires multiple processors or processor cores where
operations run at exactly the same time and are not just interleaved.
In Python, the :term:`free-threaded <free threading>` build enables
Contributor

Maybe something like:

Executing multiple operations at the same time (e.g., on multiple CPU cores). In Python builds with the :term:global interpreter lock (GIL), only one thread runs Python bytecode at a time, so taking advantage of multiple CPU cores typically involves multiple processes (e.g., :term:multiprocessing) or native extensions that release the GIL. In :term:free-threaded Python, multiple Python threads can run Python code simultaneously on different cores.

parallelism for multi-threaded programs that access state stored in the
interpreter by disabling the :term:`global interpreter lock`. The
:mod:`multiprocessing` module also enables parallelism by using separate
processes. See also :term:`concurrency`.

parameter
A named entity in a :term:`function` (or method) definition that
specifies an :term:`argument` (or in some cases, arguments) that the
@@ -1116,6 +1218,13 @@ Glossary
:class:`str` or :class:`bytes` result instead, respectively. Introduced
by :pep:`519`.

per-module state
Contributor

Don't we elsewhere define global state as including per-module state?

State that is stored separately for each instance of a module, rather
than in :term:`global state`. Per-module state is accessed through the
module object rather than through C static variables.
See :ref:`isolating-extensions-howto` for more information. See also
:term:`global state`.

PEP
Python Enhancement Proposal. A PEP is a design document
providing information to the Python community, or describing a new
@@ -1206,6 +1315,18 @@ Glossary
>>> email.mime.text.__name__
'email.mime.text'

race condition
A condition of a program where its behavior
depends on the relative timing or ordering of events, particularly in
multi-threaded programs. Race conditions can lead to
:term:`non-deterministic` behavior and bugs that are difficult to
reproduce. A :term:`data race` is a specific type of race condition
involving unsynchronized access to shared memory. The :term:`LBYL`
coding style is particularly susceptible to race conditions in
multi-threaded code. Using :term:`locks <lock>` and other
:term:`synchronization primitives <synchronization primitive>`
helps prevent race conditions.

reference count
The number of references to an object. When the reference count of an
object drops to zero, it is deallocated. Some objects are
@@ -1227,6 +1348,25 @@ Glossary

See also :term:`namespace package`.

reentrant
A property of a function or :term:`lock` that allows it to be called or
acquired multiple times by the same thread without causing errors or a
:term:`deadlock`.

For functions, reentrancy means the function can be safely called again
before a previous invocation has completed, which is important when
functions may be called recursively or from signal handlers. Thread-unsafe
functions may be :term:`non-deterministic` if they're called reentrantly in a
multithreaded program.

For locks, Python's :class:`threading.RLock` (reentrant lock) is
reentrant, meaning a thread that already holds the lock can acquire it
again without blocking. In contrast, :class:`threading.Lock` is not
reentrant - attempting to acquire it twice from the same thread will cause
a deadlock.

See also :term:`lock` and :term:`deadlock`.
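
The contrast between the two lock types in the reentrant entry can be sketched as below. The function names `outer` and `inner` are hypothetical. To avoid actually deadlocking, the non-reentrant case is probed with a non-blocking ``acquire(blocking=False)``, which returns ``False`` instead of waiting.

```python
import threading

rlock = threading.RLock()
lock = threading.Lock()


def inner():
    # Same thread acquires rlock a second time; an RLock permits this.
    with rlock:
        return "done"


def outer():
    with rlock:
        return inner()


result = outer()

lock.acquire()
# A blocking second acquire() from this thread would wait forever,
# so probe without blocking to show that the lock is unavailable.
second_attempt = lock.acquire(blocking=False)
lock.release()

print(result, second_attempt)
```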

REPL
An acronym for the "read–eval–print loop", another name for the
:term:`interactive` interpreter shell.
@@ -1331,6 +1471,19 @@ Glossary

See also :term:`borrowed reference`.

synchronization primitive
A basic building block for coordinating the execution of multiple threads
to ensure :term:`thread-safe` access to shared resources. Python's
:mod:`threading` module provides several synchronization primitives
including :class:`~threading.Lock`, :class:`~threading.RLock`,
:class:`~threading.Semaphore`, :class:`~threading.Condition`,
:class:`~threading.Event`, and :class:`~threading.Barrier`. Additionally,
the :mod:`queue` module provides multi-producer, multi-consumer queues
that are especially usedul in multithreaded programs. These
Contributor

useful

primitives help prevent :term:`race conditions <race condition>` and
coordinate thread execution. See also :term:`lock` and
:term:`critical section`.
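
Of the primitives listed in the entry, :class:`threading.Event` is perhaps the simplest to show coordinating two threads. In this sketch (with hypothetical names `ready`, `shared`, `seen`, and `waiter`), one thread waits until another signals that the data has been prepared.

```python
import threading

ready = threading.Event()
shared = {}  # data prepared by the main thread
seen = []    # what the waiting thread observed


def waiter():
    ready.wait()  # blocks until another thread calls ready.set()
    seen.append(shared["value"])


t = threading.Thread(target=waiter)
t.start()

shared["value"] = 42  # prepare the data first...
ready.set()           # ...then signal that it is ready
t.join()

print(seen)
```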

t-string
t-strings
String literals prefixed with ``t`` or ``T`` are commonly called
@@ -1383,6 +1536,19 @@ Glossary
See :ref:`Thread State and the Global Interpreter Lock <threads>` for more
information.

thread-safe
Code that functions correctly when accessed by multiple threads
concurrently. Thread-safe code uses appropriate
Comment on lines +1548 to +1549
Contributor

I'd like to rework the first sentence a bit. We talk about thread-safe modules and data structures (classes), not just "code"

Maybe something like:

A module, function, or class that behaves correctly when used by multiple threads concurrently.

:term:`synchronization primitives <synchronization primitive>` like
:term:`locks <lock>` to protect shared mutable state, or is designed
to avoid shared mutable state entirely. In the
:term:`free-threaded <free threading>` build, built-in types like
:class:`dict`, :class:`list`, and :class:`set` use internal locking
to provide thread-safe operations, though this doesn't guarantee safety
Contributor (@ngoldbaum, Oct 29, 2025)

Suggested change
to provide thread-safe operations, though this doesn't guarantee safety
to make many operations thread-safe, although thread safety is not necessarily guaranteed

The way you have this line now is a little confusing IMO, since it says that everything is thread-safe, but no guarantees. I think the way I rephrased it here is correct and leaves it a little clearer that lots of stuff is thread-safe but there are exceptions.

When eventually we have a full listing of what exactly is thread-safe and thread-unsafe in the APIs of the builtins, we can link to that here, maybe?

for all use patterns. Code that is not thread-safe may experience
:term:`race conditions <race condition>` and :term:`data races <data race>`
when used in multi-threaded programs.
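
One way to read the caveat about use patterns: even if each individual ``dict`` operation is safe, a check followed by an insert is a compound operation that still needs its own protection. The sketch below uses a hypothetical `Registry` class with a lock around the whole check-then-insert step; it is an illustration, not a statement about how any built-in type is implemented.

```python
import threading


class Registry:
    """Hypothetical get-or-create cache guarded by a lock."""

    def __init__(self):
        self._items = {}
        self._lock = threading.Lock()
        self.created = 0  # how many times a factory was actually called

    def get_or_create(self, key, factory):
        # The membership test and the insertion form one compound
        # operation, so both happen while the lock is held.
        with self._lock:
            if key not in self._items:
                self._items[key] = factory()
                self.created += 1
            return self._items[key]


registry = Registry()
results = []


def work():
    results.append(registry.get_or_create("config", dict))


threads = [threading.Thread(target=work) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(registry.created)
```

Without the lock, two threads could both find the key missing and each call the factory, leaving callers with different objects for the same key.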

token

A small unit of source code, generated by the