@@ -391,6 +391,192 @@ See also the description of the :keyword:`try` statement in section :ref:`try`
and :keyword:`raise` statement in section :ref:`raise`.


+ .. _execcomponents:
+
+ Runtime Components
+ ==================
+
+ General Computing Model
+ -----------------------
+
+ Python's execution model does not operate in a vacuum. It runs on
+ a host machine and through that host's runtime environment, including
+ its operating system (OS), if there is one. When a program runs,
+ the conceptual layers of how it runs on the host look something
+ like this:
+
+ | **host machine**
+ |   **process** (global resources)
+ |     **thread** (runs machine code)
+
+ Each process represents a program running on the host. Think of each
+ process itself as the data part of its program. Think of the process'
+ threads as the execution part of the program. This distinction will
+ be important for understanding the conceptual Python runtime.
+
+ The process, as the data part, is the execution context in which the
+ program runs. It mostly consists of the set of resources assigned to
+ the program by the host, including memory, signals, file handles,
+ sockets, and environment variables.
+
+ Processes are isolated and independent from one another. (The same
+ is true for hosts.) The host manages the process' access to its
+ assigned resources, in addition to coordinating between processes.
+
+ Each thread represents the actual execution of the program's machine
+ code, running relative to the resources assigned to the program's
+ process. It's strictly up to the host how and when that execution
+ takes place.
+
+ From the point of view of Python, a program always starts with exactly
+ one thread. However, the program may grow to run in multiple
+ simultaneous threads. Not all hosts support multiple threads per
+ process, but most do. Unlike processes, threads in a process are not
+ isolated and independent from one another. Specifically, all threads
+ in a process share all of the process' resources.
+
+ The fundamental point of threads is that each one does *run*
+ independently, at the same time as the others. That may be only
+ conceptually at the same time ("concurrently") or physically
+ ("in parallel"). Either way, the threads effectively run
+ at a non-synchronized rate.
+
+ .. note::
+
+    That non-synchronized rate means none of the process' memory is
+    guaranteed to stay consistent for the code running in any given
+    thread. Thus multi-threaded programs must take care to coordinate
+    access to intentionally shared resources. Likewise, they must be
+    diligent about not accessing any *other* resources in multiple
+    threads; otherwise two threads running at the same time might
+    accidentally interfere with each other's use of some shared data.
+    All this is true for both Python programs and the Python runtime.
+
+    The cost of this broad, unstructured requirement is the tradeoff for
+    the kind of raw concurrency that threads provide. Without that
+    discipline, a program generally ends up dealing with
+    non-deterministic bugs and data corruption.
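
As a minimal sketch of the coordination described in the note, the following guards an intentionally shared counter with a ``threading.Lock``. The names ``SharedCounter`` and ``count_in_threads`` are hypothetical and exist only for this illustration; a real program might coordinate access in other ways.

```python
import threading


class SharedCounter:
    """A counter that several threads are allowed to update."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # The read-modify-write below is not atomic, so only one
        # thread at a time may perform it.
        with self._lock:
            self.value += 1


def count_in_threads(num_threads=4, per_thread=10_000):
    """Increment one shared counter from several threads at once."""
    counter = SharedCounter()

    def worker():
        for _ in range(per_thread):
            counter.increment()

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter.value
```

With the lock in place, ``count_in_threads()`` returns ``num_threads * per_thread`` regardless of how the host schedules the threads.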
+
+ Python Runtime Model
+ --------------------
+
+ The same conceptual layers apply to each Python program, with some
+ extra data layers specific to Python:
+
+ | **host machine**
+ |   **process** (global resources)
+ |     Python global runtime (*state*)
+ |       Python interpreter (*state*)
+ |         **thread** (runs Python bytecode and "C-API")
+ |           Python thread *state*
+
+ At the conceptual level: when a Python program starts, it looks exactly
+ like that diagram, with one of each. The runtime may grow to include
+ multiple interpreters, and each interpreter may grow to include
+ multiple thread states.
+
+ .. note::
+
+    A Python implementation won't necessarily implement the runtime
+    layers distinctly or even concretely. The only exceptions are places
+    where distinct layers are directly specified or exposed to users,
+    like through the :mod:`threading` module.
+
+ .. note::
+
+    The initial interpreter is typically called the "main" interpreter.
+    Some Python implementations, like CPython, assign special roles
+    to the main interpreter.
+
+    Likewise, the host thread where the runtime was initialized is known
+    as the "main" thread. It may be different from the process' initial
+    thread, though they are often the same. In some cases "main thread"
+    may be even more specific and refer to the initial thread state.
+    A Python runtime might assign specific responsibilities
+    to the main thread, such as handling signals.
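
For illustration, the ``threading`` module exposes the main thread through ``threading.main_thread()``. The sketch below compares it against a worker thread; the helper name ``describe_threads`` is hypothetical, and the example makes no claim about which host thread initialized the runtime in any particular embedding.

```python
import threading


def describe_threads():
    """Report how the main thread and a worker thread identify themselves."""
    report = {"main_name": threading.main_thread().name}

    def worker():
        # A thread started through the threading module is never
        # the main thread.
        report["worker_is_main"] = (
            threading.current_thread() is threading.main_thread()
        )

    thread = threading.Thread(target=worker)
    thread.start()
    thread.join()
    return report
```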
+
+ As a whole, the Python runtime consists of the global runtime state,
+ interpreters, and thread states. The runtime ensures all that state
+ stays consistent over its lifetime, particularly when used with
+ multiple host threads.
+
+ The global runtime, at the conceptual level, is just a set of
+ interpreters. While those interpreters are otherwise isolated and
+ independent from one another, they may share some data or other
+ resources. The runtime is responsible for managing these global
+ resources safely. The actual nature and management of these resources
+ is implementation-specific. Ultimately, the external utility of the
+ global runtime is limited to managing interpreters.
+
+ In contrast, an "interpreter" is conceptually what we would normally
+ think of as the (full-featured) "Python runtime". When machine code
+ executing in a host thread interacts with the Python runtime, it calls
+ into Python in the context of a specific interpreter.
+
+ .. note::
+
+    The term "interpreter" here is not the same as the "bytecode
+    interpreter", which is what regularly runs in threads, executing
+    compiled Python code.
+
+    In an ideal world, "Python runtime" would refer to what we currently
+    call "interpreter". However, it's been called "interpreter" at least
+    since it was introduced in 1997 (`CPython:a027efa5b`_).
+
+    .. _CPython:a027efa5b: https://github.com/python/cpython/commit/a027efa5b
+
+ Each interpreter completely encapsulates all of the non-process-global,
+ non-thread-specific state needed for the Python runtime to work.
+ Notably, the interpreter's state persists between uses. It includes
+ fundamental data like :data:`sys.modules`. The runtime ensures
+ multiple threads using the same interpreter will safely
+ share it between them.
+
+ A Python implementation may support using multiple interpreters at the
+ same time in the same process. They are independent and isolated from
+ one another. For example, each interpreter has its own
+ :data:`sys.modules`.
+
+ For thread-specific runtime state, each interpreter has a set of thread
+ states, which it manages, in the same way the global runtime contains
+ a set of interpreters. It can have thread states for as many host
+ threads as it needs. It may even have multiple thread states for
+ the same host thread, though that isn't as common.
+
+ Each thread state, conceptually, has all the thread-specific runtime
+ data an interpreter needs to operate in one host thread. The thread
+ state includes the current raised exception and the thread's Python
+ call stack. It may include other thread-specific resources.
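
One way to observe that the exception being handled is thread-specific is to inspect ``sys.exc_info()`` from two threads at the same time. This is only a sketch: the function name ``exception_is_per_thread`` and the event names are hypothetical, and the events are there solely to keep both threads alive at the moment of the check.

```python
import sys
import threading


def exception_is_per_thread():
    """Record what each of two threads sees as the exception being handled."""
    result = {}
    handling = threading.Event()  # set while raiser is inside its handler
    checked = threading.Event()   # set once observer has looked

    def raiser():
        try:
            raise ValueError("only visible in this thread")
        except ValueError:
            result["raiser"] = sys.exc_info()[0]
            handling.set()
            # Stay inside the handler until the observer has checked.
            checked.wait(timeout=5)

    def observer():
        handling.wait(timeout=5)
        # This thread is not handling anything, even though the
        # other thread is in the middle of an except block.
        result["observer"] = sys.exc_info()[0]
        checked.set()

    threads = [
        threading.Thread(target=raiser),
        threading.Thread(target=observer),
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return result
```

The raising thread records ``ValueError`` while the observing thread records ``None``, since each thread tracks its own current exception.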
+
+ .. note::
+
+    The term "Python thread" can sometimes refer to a thread state, but
+    normally it means a thread created using the :mod:`threading` module.
+
+ Each thread state, over its lifetime, is always tied to exactly one
+ interpreter and exactly one host thread. It will only ever be used in
+ that thread and with that interpreter.
+
+ Multiple thread states may be tied to the same host thread, whether for
+ different interpreters or even the same interpreter. However, for any
+ given host thread, only one of the thread states tied to it can be used
+ by the thread at a time.
+
+ Thread states are isolated and independent from one another and don't
+ share any data, except for possibly sharing an interpreter and objects
+ or other resources belonging to that interpreter.
+
+ Once a program is running, new Python threads can be created using the
+ :mod:`threading` module (on platforms and Python implementations that
+ support threads). Additional processes can be created using the
+ :mod:`os`, :mod:`subprocess`, and :mod:`multiprocessing` modules.
+ Interpreters can be created and used with the
+ :mod:`~concurrent.interpreters` module. Coroutines (async) can
+ be run using :mod:`asyncio` in each interpreter, typically only
+ in a single thread (often the main thread).
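
The sketch below touches three of the mechanisms listed above: a thread via ``threading``, a child process via ``subprocess``, and a coroutine via ``asyncio``. All function names are hypothetical, and the child process assumes ``sys.executable`` points at a usable Python. Creating additional interpreters is left out, since that module is not available in every Python version.

```python
import asyncio
import subprocess
import sys
import threading


def run_in_thread():
    """Start a named thread and return the name it observed for itself."""
    names = []

    def worker():
        names.append(threading.current_thread().name)

    thread = threading.Thread(target=worker, name="worker-1")
    thread.start()
    thread.join()
    return names[0]


def run_in_process():
    """Run a tiny Python program in a separate process and return its output."""
    done = subprocess.run(
        [sys.executable, "-c", "print('child')"],
        capture_output=True,
        text=True,
        check=True,
    )
    return done.stdout.strip()


async def answer():
    # Yield to the event loop once before producing a result.
    await asyncio.sleep(0)
    return 42


def run_coroutine():
    """Drive the coroutine to completion on a fresh event loop."""
    return asyncio.run(answer())
```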
+
+
.. rubric:: Footnotes

.. [#] This limitation occurs because the code that is executed by these operations