@@ -1,16 +1,18 @@
 PEP: 744
 Title: JIT Compilation
 Author: Brandt Bucher <[email protected]>
+Discussions-To: https://discuss.python.org/t/pep-744-jit-compilation/50756
 Status: Draft
 Type: Informational
 Created: 11-Apr-2024
 Python-Version: 3.13
+Post-History: `11-Apr-2024 <https://discuss.python.org/t/pep-744-jit-compilation/50756>`__

 Abstract
 ========

 Earlier this year, an `experimental "just-in-time" compiler
-<https://github.com/python/cpython/pull/113465>`_ was merged into CPython's
+<https://github.com/python/cpython/pull/113465>`__ was merged into CPython's
 ``main`` development branch. While recent CPython releases have included other
 substantial internal changes, this addition represents a particularly
 significant departure from the way CPython has traditionally executed Python
@@ -27,22 +29,22 @@ introduction.
 Readers interested in learning more about the new JIT are encouraged to consult
 the following resources:

-- The `presentation <https://youtu.be/HxSHIpEQRjs>`_ which first introduced the
+- The `presentation <https://youtu.be/HxSHIpEQRjs>`__ which first introduced the
   JIT at the 2023 CPython Core Developer Sprint. It includes relevant
   background, a light technical introduction to the "copy-and-patch" technique
   used, and an open discussion of its design amongst the core developers
   present.

-- The `open access paper <https://dl.acm.org/doi/10.1145/3485513>`_ originally
+- The `open access paper <https://dl.acm.org/doi/10.1145/3485513>`__ originally
   describing copy-and-patch.

-- The `blog post <https://sillycross.github.io/2023/05/12/2023-05-12>`_ by the
+- The `blog post <https://sillycross.github.io/2023/05/12/2023-05-12>`__ by the
   paper's author detailing the implementation of a copy-and-patch JIT compiler
   for Lua. While this is a great low-level explanation of the approach, note
   that it also incorporates other techniques and makes implementation decisions
   that are not particularly relevant to CPython's JIT.

-- The `implementation <#reference-implementation>`_ itself.
+- The `implementation <#reference-implementation>`__ itself.

 Motivation
 ==========
@@ -53,7 +55,7 @@ direct translation of the source code: it is untyped, and largely unoptimized.

 Since the Python 3.11 release, CPython has used a "specializing adaptive
 interpreter" (:pep:`659`), which `rewrites these bytecode instructions in-place
-<https://youtu.be/shQtrn1v7sQ>`_ with type-specialized versions as they run.
+<https://youtu.be/shQtrn1v7sQ>`__ with type-specialized versions as they run.
 This new interpreter delivers significant performance improvements, despite the
 fact that its optimization potential is limited by the boundaries of individual
 bytecode instructions. It also collects a wealth of new profiling information:
@@ -63,7 +65,7 @@ what paths through the program are being executed the most. In other words,

 Since the Python 3.12 release, CPython has generated this interpreter from a
 `C-like domain-specific language
-<https://github.com/python/cpython/blob/main/Python/bytecodes.c>`_ (DSL). In
+<https://github.com/python/cpython/blob/main/Python/bytecodes.c>`__ (DSL). In
 addition to taming some of the complexity of the new adaptive interpreter, the
 DSL also allows CPython's maintainers to avoid hand-writing tedious boilerplate
 code in many parts of the interpreter, compiler, and standard library that must
@@ -98,7 +100,7 @@ Since much of this data varies even between identical runs of a program and the
 existing optimization pipeline makes heavy use of runtime profiling information,
 it doesn't make much sense to compile these traces ahead of time. As has been
 demonstrated for many other dynamic languages (`and even Python itself
-<https://www.pypy.org>`_), the most promising approach is to compile the
+<https://www.pypy.org>`__), the most promising approach is to compile the
 optimized micro-ops "just in time" for execution.

 Rationale
@@ -168,7 +170,7 @@ Support
 The JIT has been developed for all of :pep:`11`'s current tier one platforms,
 most of its tier two platforms, and one of its tier three platforms.
 Specifically, CPython's ``main`` branch has `CI
-<https://github.com/python/cpython/blob/main/.github/workflows/jit.yml>`_
+<https://github.com/python/cpython/blob/main/.github/workflows/jit.yml>`__
 building and testing the JIT for both release and debug builds on:

 - ``aarch64-apple-darwin/clang``
@@ -202,7 +204,7 @@ failures on tier one and tier two platforms should block releases. Though it's
 not necessary to update :pep:`11` to specify JIT support, it may be helpful to
 do so anyway. Otherwise, a list of supported platforms should be maintained in
 `the JIT's README
-<https://github.com/python/cpython/blob/main/Tools/jit/README.md>`_.
+<https://github.com/python/cpython/blob/main/Tools/jit/README.md>`__.

 Since it should always be possible to build CPython without the JIT, removing
 JIT support for a platform should *not* be considered a backwards-incompatible
@@ -253,7 +255,7 @@ This JIT, like any JIT, produces large amounts of executable data at runtime.
 This introduces a potential new attack surface to CPython, since a malicious
 actor capable of influencing the contents of this data is therefore capable of
 executing arbitrary code. This is a `well-known vulnerability
-<https://en.wikipedia.org/wiki/Just-in-time_compilation#Security>`_ of JIT
+<https://en.wikipedia.org/wiki/Just-in-time_compilation#Security>`__ of JIT
 compilers.

 In order to mitigate this risk, the JIT has been written with best practices in
@@ -282,7 +284,7 @@ Apple Silicon
 Though difficult to test without actually signing and packaging a macOS release,
 it *appears* that macOS releases should `enable the JIT Entitlement for the
 Hardened Runtime
-<https://developer.apple.com/documentation/apple-silicon/porting-just-in-time-compilers-to-apple-silicon#Enable-the-JIT-Entitlement-for-the-Hardened-Runtime>`_.
+<https://developer.apple.com/documentation/apple-silicon/porting-just-in-time-compilers-to-apple-silicon#Enable-the-JIT-Entitlement-for-the-Hardened-Runtime>`__.

 This shouldn't make *installing* Python any harder, but may add additional steps
 for release managers to perform.
@@ -322,7 +324,7 @@ Choose the sections that best describe you:

 - ...if you don't wish to build the JIT, you can simply ignore it. Otherwise,
   you will need to `install a compatible version of LLVM
-  <https://github.com/python/cpython/blob/main/Tools/jit/README.md>`_, and
+  <https://github.com/python/cpython/blob/main/Tools/jit/README.md>`__, and
   pass the appropriate flag to the build scripts. Your build may take up to a
   minute longer. Note that the JIT should *not* be distributed to end users or
   used in production while it is still in the experimental phase.
@@ -446,7 +448,7 @@ Support multiple compiler toolchains
 Clang is specifically needed because it's the only C compiler with support for
 guaranteed tail calls (|musttail|_), which are required by CPython's
 `continuation-passing-style
-<https://en.wikipedia.org/wiki/Continuation-passing_style#Tail_calls>`_ approach
+<https://en.wikipedia.org/wiki/Continuation-passing_style#Tail_calls>`__ approach
 to JIT compilation. Without it, the tail-recursive calls between templates could
 result in unbounded C stack growth (and eventual overflow).

@@ -481,7 +483,7 @@ Add GPU support

 The JIT is currently CPU-only. It does not, for example, offload NumPy array
 computations to CUDA GPUs, as JITs like `Numba
-<https://numba.pydata.org/numba-doc/latest/cuda/overview.html>`_ do.
+<https://numba.pydata.org/numba-doc/latest/cuda/overview.html>`__ do.

 There is already a rich ecosystem of tools for accelerating these sorts of
 specialized tasks, and CPython's JIT is not intended to replace them. Instead,
@@ -495,12 +497,12 @@ Speed
 -----

 Currently, the JIT is `about as fast as the existing specializing interpreter
-<https://github.com/faster-cpython/benchmarking-public/blob/main/configs.png>`_
+<https://github.com/faster-cpython/benchmarking-public/blob/main/configs.png>`__
 on most platforms. Improving this is obviously a top priority at this point,
 since providing a significant performance gain is the entire motivation for
 having a JIT at all. A number of proposed improvements are already underway, and
 this ongoing work is being tracked in `GH-115802
-<https://github.com/python/cpython/issues/115802>`_.
+<https://github.com/python/cpython/issues/115802>`__.

 Memory
 ------
@@ -509,7 +511,7 @@ Because it allocates additional memory for executable machine code, the JIT does
 use more memory than the existing interpreter at runtime. According to the
 official benchmarks, the JIT currently uses about `10-20% more memory than the
 base interpreter
-<https://github.com/faster-cpython/benchmarking-public/blob/main/memory_configs.png>`_.
+<https://github.com/faster-cpython/benchmarking-public/blob/main/memory_configs.png>`__.
 The upper end of this range is due to ``aarch64-apple-darwin``, which has larger
 page sizes (and thus, a larger minimum allocation granularity).

@@ -522,7 +524,7 @@ likely to be a real concern.
 Not much effort has been put into optimizing the JIT's memory usage yet, so
 these numbers likely represent a maximum that will be reduced over time.
 Improving this is a medium priority, and is being tracked in `GH-116017
-<https://github.com/python/cpython/issues/116017>`_.
+<https://github.com/python/cpython/issues/116017>`__.

 Earlier versions of the JIT had a more complicated memory allocation scheme
 which imposed a number of fragile limitations on the size and layout of the
@@ -547,7 +549,7 @@ since installing the required tools is not prohibitively difficult for most
 people building CPython, and the build step is not particularly time-consuming.

 Since some still remain interested in this possibility, discussion is being
-tracked in `GH-115869 <https://github.com/python/cpython/issues/115869>`_.
+tracked in `GH-115869 <https://github.com/python/cpython/issues/115869>`__.

 Footnotes
 =========