Commit 6566000

[GR-33179] Experimental bytecode interpreter
PullRequest: graalpython/2245
2 parents 9918000 + f760ab4 commit 6566000

905 files changed: +67442 -641 lines changed

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -53,6 +53,7 @@ language
 /request_cache/
 /org.eclipse.jdt.core.prefs
 Python3.g4.stamp
+python.gram.stamp
 *.orig
 /*.diff
 /testenv

ci.jsonnet

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-{ "overlay": "25178436615d9a3ee1cca03ec10f4d90fe66fe63" }
+{ "overlay": "a3f5acbe1104349cce9ec62e9b39f0c5c8403d92" }

docs/contributor/CONTRIBUTING.md

Lines changed: 6 additions & 2 deletions
@@ -199,10 +199,14 @@ print its path as the last output, if successful.
 If you made changes to the parser, you may have to regenerate the golden files
 like so:
 
-find graalpython -name *.scope -delete
-find graalpython -name *.tast -delete
+find graalpython -name '*.scope' -delete
+find graalpython -name '*.tast' -delete
 mx punittest com.oracle.graal.python.test.parser
 
+If you made changes to the bytecode compiler, you may have to regenerate its golden files:
+
+find graalpython -name '*.co' -delete
+mx punittest com.oracle.graal.python.test.compiler
 
 ### Benchmarking
 
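The quoting change in the hunk above matters because an unquoted `*.scope` is expanded by the shell whenever a matching file exists in the current directory, so `find` may receive a file name instead of the pattern. As an illustration only, the same cleanup can be sketched in Python, where no shell globbing is involved. The helper name `delete_golden_files` and its default suffix list are assumptions made for this sketch; they are not part of the repository.

```python
from pathlib import Path


def delete_golden_files(root, suffixes=(".scope", ".tast", ".co")):
    """Delete every file under root whose suffix is in suffixes.

    Hypothetical helper that mirrors `find <root> -name '*.<ext>' -delete`.
    Returns the sorted list of deleted paths.
    """
    deleted = []
    # Materialize the listing first so deletion does not disturb the walk.
    for path in list(Path(root).rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            path.unlink()
            deleted.append(path)
    return sorted(deleted)
```

After running such a cleanup, the golden files would still have to be regenerated with the `mx punittest` commands listed in the diff.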
@@ -0,0 +1,48 @@
+/*
+ * Copyright (c) 2020, 2022, Oracle and/or its affiliates. All rights reserved.
+ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
+ *
+ * The Universal Permissive License (UPL), Version 1.0
+ *
+ * Subject to the condition set forth below, permission is hereby granted to any
+ * person obtaining a copy of this software, associated documentation and/or
+ * data (collectively the "Software"), free of charge and under any and all
+ * copyright rights in the Software, and any and all patent rights owned or
+ * freely licensable by each licensor hereunder covering either (i) the
+ * unmodified Software as contributed to or provided by such licensor, or (ii)
+ * the Larger Works (as defined below), to deal in both
+ *
+ * (a) the Software, and
+ *
+ * (b) any piece of software and/or hardware listed in the lrgrwrks.txt file if
+ * one is included with the Software each a "Larger Work" to which the Software
+ * is contributed by such licensors),
+ *
+ * without restriction, including without limitation the rights to copy, create
+ * derivative works of, display, perform, and distribute the Software and make,
+ * use, sell, offer for sale, import, export, have made, and have sold the
+ * Software and the Larger Work(s), and to sublicense the foregoing rights on
+ * either these or other terms.
+ *
+ * This license is subject to the following condition:
+ *
+ * The above copyright notice and either this complete permission notice or at a
+ * minimum a reference to the UPL must be included in all copies or substantial
+ * portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+package com.oracle.graal.python.annotations;
+
+import java.lang.annotation.ElementType;
+import java.lang.annotation.Target;
+
+@Target(ElementType.TYPE)
+public @interface GenerateEnumConstants {
+}

graalpython/com.oracle.graal.python.benchmarks/python/harness.py

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
-# Copyright (c) 2018, 2021, Oracle and/or its affiliates. All rights reserved.
+# Copyright (c) 2018, 2022, Oracle and/or its affiliates. All rights reserved.
 # DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
 #
 # The Universal Permissive License (UPL), Version 1.0

graalpython/com.oracle.graal.python.benchmarks/python/meso/richards3.py

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 #!/usr/bin/env python
 # Copyright 2008-2010 Isaac Gouy
 # Copyright (c) 2013, 2014, Regents of the University of California
-# Copyright (c) 2017, 2021, Oracle and/or its affiliates.
+# Copyright (c) 2017, 2022, Oracle and/or its affiliates.
 # All rights reserved.
 #
 # Revised BSD license
Lines changed: 136 additions & 0 deletions
@@ -0,0 +1,136 @@
+# Copyright (c) 2021, 2022, Oracle and/or its affiliates. All rights reserved.
+# DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
+#
+# The Universal Permissive License (UPL), Version 1.0
+#
+# Subject to the condition set forth below, permission is hereby granted to any
+# person obtaining a copy of this software, associated documentation and/or
+# data (collectively the "Software"), free of charge and under any and all
+# copyright rights in the Software, and any and all patent rights owned or
+# freely licensable by each licensor hereunder covering either (i) the
+# unmodified Software as contributed to or provided by such licensor, or (ii)
+# the Larger Works (as defined below), to deal in both
+#
+# (a) the Software, and
+#
+# (b) any piece of software and/or hardware listed in the lrgrwrks.txt file if
+# one is included with the Software each a "Larger Work" to which the Software
+# is contributed by such licensors),
+#
+# without restriction, including without limitation the rights to copy, create
+# derivative works of, display, perform, and distribute the Software and make,
+# use, sell, offer for sale, import, export, have made, and have sold the
+# Software and the Larger Work(s), and to sublicense the foregoing rights on
+# either these or other terms.
+#
+# This license is subject to the following condition:
+#
+# The above copyright notice and either this complete permission notice or at a
+# minimum a reference to the UPL must be included in all copies or substantial
+# portions of the Software.
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+# SOFTWARE.
+import marshal
+import pydoc_data.topics
+
+# pollute the profile of the bytecode loop
+exec("""
+import sys
+def foo(): pass
+len(sys.__name__)
+print(sys, flush=False)
+pass
+""")
+
+CODESTR1 = "\n".join(["""
+# import sys
+def foo(): pass
+len(foo.__name__)
+"""] * 100)
+
+CODESTR2 = "\n".join(["""
+def bar(): pass
+x = None
+len([])
+y = x
+"""] * 100)
+
+with open(pydoc_data.topics.__file__, "r") as f:
+    CODESTR3 = f.read()
+
+JUST_PYC_1 = marshal.dumps(compile(CODESTR1, "1", "exec"))
+JUST_PYC_2 = marshal.dumps(compile(CODESTR2, "2", "exec"))
+JUST_PYC_3 = marshal.dumps(compile(CODESTR3, pydoc_data.topics.__file__, "exec"))
+
+CODEOBJECTS = []
+
+
+def __setup__(num):
+    __cleanup__(num)
+
+
+def __cleanup__(num):
+    CODEOBJECTS.clear()
+    for _ in range(0, num, 3):
+        CODEOBJECTS.append(marshal.loads(JUST_PYC_1))
+        CODEOBJECTS.append(marshal.loads(JUST_PYC_2))
+        CODEOBJECTS.append(marshal.loads(JUST_PYC_3))
+
+
+def measure(num):
+    for i in range(num):
+        exec(CODEOBJECTS[i])
+
+
+def __benchmark__(num=2000):
+    measure(num)
+
+# I've written the bytecode benchmark to simulate loading thousands of modules. I
+# always execute thousands of iterations, to make it somewhat like loading a big
+# project with many pyc files, but not so large that the compiler would have time
+# to compile many of the operations involved in loading code.
+#
+# If we compare unmarshalling a code object from a bytes that has bytecode vs one
+# that has SST, we see that unmarshalling 10_000 bytecode code objects is 2-3x
+# faster than unmarshalling the SST. This is for two bits of code with 300 lines
+# of statements (creating functions, imports, assignments, calling
+# functions). This is just loading, not running. CPython on the same two codes is
+# more than 10x faster still.
+#
+# If we compare executing 10_000 times (cycling through the code objects so each
+# call target is not executed more than 10 times), the AST interpreter is faster
+# by around 20%.
+#
+# Loading artificial bytecode data with just dummy content (~2000 NOPs, PUSH/POP,
+# LOAD/STORE instructions...), CPython is ~15x faster unmarshalling it.
+#
+# Just opening files and reading their contents of that artificial bytecode data
+# (like the importlib would do with pyc files), CPython is ~13x faster than we
+# are. For CPython, this operation is 10x slower than unmarshalling the already
+# loaded bytes. For us the loading is ~8x slower than unmarshalling.
+#
+# Opening the files *and* loading the bytecode data yields that CPython is ~14x
+# faster than us.
+#
+# Now, if we preload that artificial code before the benchmark and just execute
+# those code objects 10_000 times, CPython is a whopping 20-30x faster than we
+# are. But the numbers are so small, it's hard to say (CPython 0.006-0.008s,
+# Graal 0.16-0.19s). OTOH, it's hard to argue that we would ever load more
+# modules than this during some application startup. Since this models quite well
+# what might happen when we load a big Python application consisting of many pyc
+# files, that seems a problem.
+#
+# Now, combining those: Loading + executing prepared bytes: is ~25-30x slower for
+# us than on CPython. Combining all three operations, CPython is 13-16x faster
+# than us. The large amount of time spent loading files helps us a little to skew
+# it into that ratio.
+#
+# The ratios don't quite add up, because doing more work also gives libgraal more
+# time to optimize parts of the work. It's still interesting, since this one-shot
+# loading of many pyc files is crucial for our perceived startup performance.
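The closing comment of this benchmark compares unmarshalling and execution as separate costs, while the benchmark itself only times `exec`. A minimal sketch of how the two phases could be timed independently is shown below. It is not part of the commit: the names `SOURCE`, `PYC_BYTES`, `time_unmarshal`, and `time_exec` are hypothetical, and the timings it prints will not reproduce the ratios quoted in the comment.

```python
import marshal
import time

# Hypothetical stand-in for the benchmark's CODESTR1/CODESTR2 sources.
SOURCE = "\n".join(["def foo(): pass\nlen(foo.__name__)\n"] * 100)
PYC_BYTES = marshal.dumps(compile(SOURCE, "<sketch>", "exec"))


def time_unmarshal(count):
    """Return (seconds, code_objects) for `count` marshal.loads calls."""
    start = time.perf_counter()
    code_objects = [marshal.loads(PYC_BYTES) for _ in range(count)]
    return time.perf_counter() - start, code_objects


def time_exec(code_objects):
    """Return the seconds needed to execute each code object once."""
    start = time.perf_counter()
    for code in code_objects:
        # A fresh globals dict keeps the runs from sharing module state.
        exec(code, {})
    return time.perf_counter() - start


if __name__ == "__main__":
    load_seconds, objects = time_unmarshal(1000)
    exec_seconds = time_exec(objects)
    print(f"unmarshal: {load_seconds:.4f}s  exec: {exec_seconds:.4f}s")
```

Executing each code object only once, as here, keeps every call target cold, which is the situation the comment describes for one-shot loading of many pyc files.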
Lines changed: 30 additions & 0 deletions
@@ -0,0 +1,30 @@
+QUIETLY$(MX_VERBOSE) = @
+
+PARSER_PATH ?= ../com.oracle.graal.python.pegparser/src/com/oracle/graal/python/pegparser
+ifdef MX_PYTHON
+PYTHON_EXE ?= ${MX_PYTHON}
+else ifdef MX_PYTHON_VERSION
+PYTHON_EXE ?= python${MX_PYTHON_VERSION}
+else
+PYTHON_EXE ?= python3
+endif
+
+TARGET=${PARSER_PATH}/Parser.java
+
+GRAMMAR=${PARSER_PATH}/python.gram
+TOKENS=${PARSER_PATH}/Tokens
+
+PEGEN_FILES=$(shell find pegen pegjava -name '*.py')
+
+STAMP=${GRAMMAR}.stamp
+
+.PHONY: default clean
+default: ${STAMP}
+
+${STAMP}: ${GRAMMAR} ${TOKENS} ${PEGEN_FILES} main_parser_gen.py
+	$(QUIETLY) ${PYTHON_EXE} main_parser_gen.py ${GRAMMAR} ${TOKENS} ${TARGET}
+	$(QUIETLY) touch $@
+
+clean:
+	$(QUIETLY) rm -f ${TARGET}
+	$(QUIETLY) rm -f ${STAMP}
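The Makefile above regenerates the parser only when the stamp file is missing or older than one of its prerequisites, then touches the stamp. The freshness rule that `make` applies can be sketched in Python as follows. The function name `needs_rebuild` is hypothetical, and the sketch only approximates `make`'s timestamp comparison; it does not model phony targets or the parser generation itself.

```python
import os


def needs_rebuild(stamp, inputs):
    """Return True if stamp is missing or older than any input file.

    Hypothetical helper approximating make's rule for a stamp target:
    rebuild when the target does not exist or a prerequisite is newer.
    """
    if not os.path.exists(stamp):
        return True
    stamp_mtime = os.path.getmtime(stamp)
    return any(os.path.getmtime(path) > stamp_mtime for path in inputs)
```

With the variables from the Makefile, the inputs would correspond to `${GRAMMAR}`, `${TOKENS}`, `${PEGEN_FILES}`, and `main_parser_gen.py`, and the stamp to `${GRAMMAR}.stamp`.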
