@@ -403,31 +403,54 @@ and :keyword:`raise` statement in section :ref:`raise`.
403
403
Runtime Components
404
404
==================
405
405
406
- Python's execution model does not operate in a vacuum. It runs on a
407
- computer. When a program runs, the conceptual layers of how it runs
408
- on the computer look something like this::
409
-
410
- host machine and operating system (OS)
411
- process
412
- OS thread (runs machine code)
413
-
414
- Hosts and processes are isolated and independent from one another.
415
- However, threads are not.
416
-
417
- A program always starts with exactly one thread, known as the "main"
418
- thread, it may grow to run in multiple. Not all platforms support
419
- threads, but most do. For those that do, all threads in a process
420
- share all the process' resources, including memory.
421
-
422
- The fundamental point of threads is that each thread does *run*
406
+ General Computing Model
407
+ -----------------------
408
+
409
+ Python's execution model does not operate in a vacuum. It runs on
410
+ a host machine and through that host's runtime environment, including
411
+ its operating system (OS), if there is one. When a program runs,
412
+ the conceptual layers of how it runs on the host look something
413
+ like this::
414
+
415
+ **host machine**
416
+ **process** (global resources)
417
+ **thread** (runs machine code)
418
+
419
+ Each process represents a program running on the host. Think of each
420
+ process itself as the data part of its program. Think of the process'
421
+ threads as the execution part of the program. This distinction will
422
+ be important to understand the conceptual Python runtime.
423
+
424
+ The process, as the data part, is the execution context in which the
425
+ program runs. It mostly consists of the set of resources assigned to
426
+ the program by the host, including memory, signals, file handles,
427
+ sockets, and environment variables.
428
+
429
+ Processes are isolated and independent from one another. (The same
430
+ is true for hosts.) The host manages the process' access to its
431
+ assigned resources, in addition to coordinating between processes.
432
+
433
+ Each thread represents the actual execution of the program's machine
434
+ code, running relative to the resources assigned to the program's
435
+ process. It's strictly up to the host how and when that execution
436
+ takes place.
437
+
438
+ From the point of view of Python, a program always starts with exactly
439
+ one thread. However, the program may grow to run in multiple
440
+ simultaneous threads. Not all hosts support multiple threads per
441
+ process, but most do. Unlike processes, threads in a process are not
442
+ isolated and independent from one another. Specifically, all threads
443
+ in a process share all of the process' resources.
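As a minimal sketch of that sharing, the following uses the :mod:`threading` module to let several threads append to one list. The names ``shared`` and ``worker`` are hypothetical and chosen only for illustration.

```python
import threading

# One list object in the process' memory. Every thread in the
# process sees this same object, not a copy of it.
shared = []

def worker(name):
    # Runs in a separate thread but mutates the same list.
    shared.append(name)

threads = [
    threading.Thread(target=worker, args=(f"worker-{i}",))
    for i in range(3)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

# The initial thread sees everything the other threads added.
shared.append("main")
print(sorted(shared))
```

The order in which the workers append is not guaranteed, which is why the sketch sorts the list before printing it.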
444
+
445
+ The fundamental point of threads is that each one does *run*
423
446
independently, at the same time as the others. That may be only
424
447
conceptually at the same time ("concurrently") or physically
425
448
("in parallel"). Either way, the threads effectively run
426
449
at a non-synchronized rate.
427
450
428
451
.. note::
429
452
430
- That non-synchronized rate means none of the global state is
453
+ That non-synchronized rate means none of the process' memory is
431
454
guaranteed to stay consistent for the code running in any given
432
455
thread. Thus multi-threaded programs must take care to coordinate
433
456
access to intentionally shared resources. Likewise, they must take
@@ -438,70 +461,152 @@ at a non-synchronized rate.
438
461
Python runtime.
439
462
440
463
The cost of this broad, unstructured requirement is the tradeoff for
441
- the concurrency and, especially, parallelism that threads provide.
442
- The alternative generally means dealing with non-deterministic bugs
443
- and data corruption.
444
-
445
- The same layers apply to each Python program, with some extra layers
446
- specific to Python::
447
-
448
- host
449
- process
450
- Python runtime
451
- interpreter
452
- Python thread (runs bytecode)
453
-
454
- When a Python program starts, it looks exactly like that, with one
455
- of each. The process has a single global runtime to manage Python's
456
- process-global resources. The runtime may grow to include multiple
457
- interpreters and each interpreter may grow to include multiple Python
458
- threads. The initial interpreter is known as the "main" interpreter,
459
- and the initial thread, where the runtime was initialized, is known
460
- as the "main" thread.
461
-
462
- An interpreter completely encapsulates all of the non-process-global
463
- runtime state that the interpreter's Python threads share. For example,
464
- all its threads share :data:`sys.modules`, but each interpreter has its
465
- own :data:`sys.modules`.
464
+ the kind of raw concurrency that threads provide. The alternative
465
+ to the required discipline generally means dealing with
466
+ non-deterministic bugs and data corruption.
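One common form of that discipline is guarding shared data with a lock. The following is only a sketch, assuming a simple shared counter; ``counter``, ``lock``, and ``add`` are hypothetical names.

```python
import threading

counter = 0
lock = threading.Lock()

def add(n):
    global counter
    for _ in range(n):
        # "counter += 1" is a read-modify-write sequence. Holding the
        # lock ensures no other thread interleaves with it.
        with lock:
            counter += 1

threads = [threading.Thread(target=add, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)
```

Without the lock, whether updates are lost depends on the implementation and on timing, which is exactly the kind of non-deterministic bug described above.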
467
+
468
+ Python Runtime Model
469
+ --------------------
470
+
471
+ The same conceptual layers apply to each Python program, with some
472
+ extra data layers specific to Python::
473
+
474
+ **host machine**
475
+ **process** (global resources)
476
+ global runtime (*state*)
477
+ interpreter (*state*)
478
+ **thread** (runs "C-API" and Python bytecode)
479
+ thread *state*
480
+
481
+ At the conceptual level: when a Python program starts, it looks exactly
482
+ like that diagram, with one of each. The runtime may grow to include
483
+ multiple interpreters, and each interpreter may grow to include
484
+ multiple thread states.
466
485
467
486
.. note::
468
487
469
- The interpreter here is not the same as the "bytecode interpreter",
470
- which is what regularly runs in threads, executing compiled Python code.
488
+ A Python implementation won't necessarily implement the runtime
489
+ layers distinctly or even concretely. The only exception is places
490
+ where distinct layers are directly specified or exposed to users,
491
+ like through the :mod:`threading` module.
471
492
472
- A Python thread represents the state necessary for the Python runtime
473
- to *run * in an OS thread. It also represents the execution of Python
474
- code (or any supported C-API) in that OS thread. Depending on the
475
- implementation, this probably includes the current exception and
476
- the Python call stack. The Python thread always identifies the
477
- interpreter it belongs to, meaning the state it shares
478
- with other threads.
493
+ .. note::
494
+
495
+ The initial interpreter is typically called the "main" interpreter.
496
+ Some Python implementations, like CPython, assign special roles
497
+ to the main interpreter.
498
+
499
+ Likewise, the host thread where the runtime was initialized is known
500
+ as the "main" thread. It may be different from the process' initial
501
+ thread, though they are often the same. In some cases "main thread"
502
+ may be even more specific and refer to the initial thread state.
503
+ A Python runtime might assign specific responsibilities
504
+ to the main thread, such as handling signals.
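As an illustration of such a responsibility, CPython only allows signal handlers to be installed from the main thread. The sketch below assumes CPython behavior, and the names ``outcome`` and ``try_install_handler`` are hypothetical. It attempts the installation from a worker thread and records the resulting error.

```python
import signal
import threading

outcome = {}

def try_install_handler():
    # Assumption: CPython semantics, where signal.signal() raises
    # ValueError when called outside the main thread.
    try:
        signal.signal(signal.SIGINT, signal.default_int_handler)
    except ValueError as exc:
        outcome["error"] = str(exc)

worker = threading.Thread(target=try_install_handler)
worker.start()
worker.join()

# threading.main_thread() identifies the thread Python treats as "main".
print(threading.main_thread().name)
print(outcome.get("error"))
```

Other implementations may draw this line differently, so treat the restriction as a CPython detail rather than part of the conceptual model.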
505
+
506
+ As a whole, the Python runtime consists of the global runtime state,
507
+ interpreters, and thread states. The runtime ensures all that state
508
+ stays consistent over its lifetime, particularly when used with
509
+ multiple host threads. The runtime also exposes a way for host threads
510
+ to "call into Python", which will be covered in the next subsection.
511
+
512
+ The global runtime, at the conceptual level, is just a set of
513
+ interpreters. While they are otherwise isolated and independent from
514
+ one another, they may share some data or other resources. The runtime
515
+ is responsible for managing these global resources safely. The actual
516
+ nature and management of these resources is implementation-specific.
517
+ Ultimately, the external utility of the global runtime is limited
518
+ to managing interpreters.
519
+
520
+ In contrast, an "interpreter" is conceptually what we would normally
521
+ think of as the (full-featured) "Python runtime". When machine code
522
+ executing in a host thread interacts with the Python runtime, it calls
523
+ into Python in the context of a specific interpreter.
479
524
480
525
.. note::
481
526
482
- Here "Python thread" does not necessarily refer to a thread created
483
- using the :mod:`threading` module.
527
+ The term "interpreter" here is not the same as the "bytecode
528
+ interpreter", which is what regularly runs in threads, executing
529
+ compiled Python code.
530
+
531
+ In an ideal world, "Python runtime" would refer to what we currently
532
+ call "interpreter". However, it's been called "interpreter" at least
533
+ since it was introduced in 1997 (a027efa5b).
534
+
535
+ Each interpreter completely encapsulates all of the non-process-global,
536
+ non-thread-specific state needed for the Python runtime to work.
537
+ Notably, the interpreter's state persists between uses. It includes
538
+ fundamental data like :data:`sys.modules`. The runtime ensures
539
+ multiple threads using the same interpreter will safely
540
+ share it between them.
541
+
542
+ A Python implementation may support using multiple interpreters at the
543
+ same time in the same process. They are independent and isolated from
544
+ one another. For example, each interpreter has its own
545
+ :data:`sys.modules`.
546
+
547
+ For thread-specific runtime state, each interpreter has a set of thread
548
+ states, which it manages, in the same way the global runtime contains
549
+ a set of interpreters. It can have thread states for as many host
550
+ threads as it needs. It may even have multiple thread states for
551
+ the same host thread, though that isn't as common.
552
+
553
+ Each thread state, conceptually, has all the thread-specific runtime
554
+ data an interpreter needs to operate in one host thread. The thread
555
+ state includes the current raised exception and the thread's Python
556
+ call stack. It may include other thread-specific resources.
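The thread-specific nature of the current exception can be observed from Python code. The following is a sketch only, using hypothetical names (``results``, ``in_handler``, ``checked``, ``worker``). While a worker thread is still handling an exception, the initial thread checks its own exception state.

```python
import sys
import threading

results = {}
in_handler = threading.Event()
checked = threading.Event()

def worker():
    try:
        raise ValueError("only visible in the worker thread")
    except ValueError:
        # This thread is currently handling a ValueError.
        results["worker"] = sys.exc_info()[0]
        in_handler.set()
        # Stay inside the handler until the other thread has looked.
        checked.wait(timeout=5)

t = threading.Thread(target=worker)
t.start()
in_handler.wait(timeout=5)

# The worker is still handling its exception, yet this thread has
# no exception being handled: that state is per thread.
results["main"] = sys.exc_info()[0]
checked.set()
t.join()

print(results)
```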
557
+
558
+ .. note::
484
559
485
- Each Python thread is associated with a single OS thread, which is where
486
- it can run. In the opposite direction, a single OS thread can have many
487
- Python threads associated with it. However, only one of those Python
488
- threads is "active" in the OS thread at time. The runtime will operate
489
- in the OS thread relative to the active Python thread.
560
+ The term "Python thread" can sometimes refer to a thread state, but
561
+ normally it means a thread created using the :mod:`threading` module.
490
562
491
- For an interpreter to be used in an OS thread, it must have a
492
- corresponding active Python thread. Thus switching between interpreters
493
- means changing the active Python thread. An interpreter can have Python
494
- threads, active or inactive, for as many OS threads as it needs. It may
495
- even have multiple Python threads for the same OS thread, though at most
496
- one can be active at a time.
563
+ Each thread state, over its lifetime, is always tied to exactly one
564
+ interpreter and exactly one host thread. It will only ever be used in
565
+ that thread. In the other direction, a host thread may have many
566
+ Python thread states tied to it, for different interpreters.
497
567
498
568
Once a program is running, new Python threads can be created using the
499
569
:mod:`threading` module (on platforms and Python implementations that
500
570
support threads). Additional processes can be created using the
501
571
:mod:`os`, :mod:`subprocess`, and :mod:`multiprocessing` modules.
502
572
You can run coroutines (async) in the main thread using :mod:`asyncio`.
503
573
Interpreters can be created and used with the
504
- :mod:`concurrent.interpreters` module.
574
+ :mod:`~concurrent.interpreters` module.
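A brief sketch of three of those options follows, assuming a platform that supports both threads and child processes. The names ``results``, ``in_thread``, and ``coro`` are hypothetical.

```python
import asyncio
import subprocess
import sys
import threading

results = {}

# A new Python thread in the current process.
def in_thread():
    results["thread"] = threading.current_thread().name

t = threading.Thread(target=in_thread, name="worker")
t.start()
t.join()

# A separate process, isolated from this one, running its own
# copy of the Python runtime.
proc = subprocess.run(
    [sys.executable, "-c", "print('hello from a child process')"],
    capture_output=True,
    text=True,
    check=True,
)
results["process"] = proc.stdout.strip()

# A coroutine, run by an event loop in the current thread.
async def coro():
    await asyncio.sleep(0)
    return "hello from a coroutine"

results["coroutine"] = asyncio.run(coro())

print(results)
```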
575
+
576
+ Calls into Python
577
+ -----------------
578
+
579
+ A "call into Python" is an abstraction of "ask the Python runtime
580
+ to do something". It necessarily involves targeting a single runtime
581
+ context, whether global, interpreter, or thread. The layer depends
582
+ on the desired operation. Most operations require a thread state.
583
+
584
+ When a running host thread calls into Python, the actual mechanism
585
+ is implementation-specific. For example, CPython provides a C-API and
586
+ the thread will literally call into Python through a C-API function.
587
+
588
+ .. drop paragraph?
589
+
590
+ Some thread-specific operations must only target a new thread state,
591
+ while others may target any thread state, including one with a Python
592
+ call already on its stack or a current exception set.
593
+
594
+ A thread-specific call into Python can target only one thread state.
595
+ That means, when there are multiple Python thread states tied to the
596
+ current host thread, only one of them can be in use at a time. It
597
+ doesn't matter if the thread states belong to different interpreters
598
+ or the same interpreter.
599
+
600
+ Calls into Python can be nested. Even if a thread has already called
601
+ into Python, that operation could be interrupted by another call into
602
+ Python targeting a different runtime context. For example, the
603
+ implementation of the outer call might make the inner call directly.
604
+ Alternatively, the host or Python runtime might trigger some
605
+ asynchronous callback that calls into Python.
606
+
607
+ Regardless, at the point of the inner call, the target is swapped.
608
+ When the inner call finishes, the target is swapped back and the outer
609
+ call resumes.
505
610
506
611
507
612
.. rubric:: Footnotes