Commit 2ab52b6

[GR-20332] Implement the asIndex message to mimic PyNumber_AsIndex in PythonObjectLibrary
PullRequest: graalpython/767
2 parents d57dca3 + 3d5558e commit 2ab52b6

25 files changed, +626 -459 lines changed

doc/IMPLEMENTATION_DETAILS.md

Lines changed: 117 additions & 0 deletions
@@ -0,0 +1,117 @@
1+
## Python's global thread state
2+
3+
In CPython, each stack frame is allocated on the heap, and there's a global
4+
thread state holding on to the chain of currently handled exceptions (e.g. if
5+
you're nested inside `except:` blocks) as well as the currently flying exception
6+
(e.g. we're just unwinding the stack).
7+
8+
In PyPy, this is done via their virtualizable frames and a global reference to
9+
the current top frame. Each frame also has a "virtual reference" to its parent
10+
frame, so code can just "force" these references to make the stack reachable if
11+
necessary.
12+
13+
Unfortunately, the elegant solution of "virtual references" doesn't work for us,
14+
mostly because we're not a tracing JIT: we want the reference to be "virtual"
15+
even when there are multiple compilation units. With PyPy's solution, this also
16+
isn't the case, but it only hurts them for nested loops when large stacks must
17+
be forced to the heap.
18+
19+
In Graal Python, the implementation is thus a bit more involved. Here's how it
20+
works.
21+
22+
#### The PFrame.Reference
23+
24+
A `PFrame.Reference` is created when entering a Python function. By default it
25+
only holds on to another reference, that of the Python caller. If there are
26+
non-Python frames between the newly entered frame and the last Python frame,
27+
those are ignored - our linked list only connects Python frames. The entry point
28+
into the interpreter has a `PFrame.Reference` with no caller.
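As a rough illustration of the linked list described above, here is a minimal plain-Java sketch. `FrameRef` and `FrameRefDemo` are hypothetical stand-ins, not the real `PFrame.Reference`; the `name` field exists only to make the chain visible.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for PFrame.Reference: it only points at the calling Python frame. */
final class FrameRef {
    final String name;      // illustrative label, not part of the real class
    final FrameRef caller;  // null for the interpreter entry point

    FrameRef(String name, FrameRef caller) {
        this.name = name;
        this.caller = caller;
    }

    /** Walks the caller links from this frame back to the entry point. */
    List<String> backtrace() {
        List<String> names = new ArrayList<>();
        for (FrameRef ref = this; ref != null; ref = ref.caller) {
            names.add(ref.name);
        }
        return names;
    }
}

final class FrameRefDemo {
    /** Builds entry -> f -> g, with a non-Python frame between f and g being skipped. */
    static List<String> run() {
        FrameRef entry = new FrameRef("<module>", null);
        FrameRef f = new FrameRef("f", entry);
        // A non-Python frame (say, a Java builtin) would sit here. It gets no
        // FrameRef of its own, so g links directly to f.
        FrameRef g = new FrameRef("g", f);
        return g.backtrace();
    }
}
```

Walking the chain from `g` visits only the Python frames, innermost first.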
29+
30+
###### ExecutionContext.CallContext and ExecutionContext.CalleeContext
31+
32+
If we're only calling between Python, we pass our `PFrame.Reference` as implicit
33+
argument to any callees. On entry, they will create their own `PFrame.Reference`
34+
as the next link in this backwards-connected linked list. As an optimization, we
35+
use assumptions on both the calling node and the callee root node to
36+
avoid passing the reference (in the caller) and linking it (on the callee
37+
side). This assumption is invalidated the first time the reference is actually
38+
needed. But even then, often the `PFrame.Reference` doesn't hold on to anything
39+
else, because it was only used for traversal, so this is pretty cheap even in
40+
the non-inlined case.
41+
42+
When an event forces the frame to materialize on the heap, the reference is
43+
filled. This is usually only the case when someone uses `sys._getframe` or
44+
accesses the traceback of an exception. If the stack is still live, we walk the
45+
stack and insert the "calling node" and create a "PyFrame" object that mirrors
46+
the locals in the Truffle frame. But we need to be able to do this also for
47+
frames that are no longer live, e.g. when an exception was a few frames up. To
48+
ensure this, we set a boolean flag on `PFrame.Reference` to mark it as "escaped"
49+
when it is attached to an exception (or anything else) but not yet
50+
accessed. Whenever a Python call returns and its `PFrame.Reference` was marked as such,
51+
the "PyFrame" is also filled in by copying from the VirtualFrame. This way, the
52+
stack is lazily forced to the heap as we return from functions. If we're lucky
53+
and it is never actually accessed *and* the calls are all inlined, those fill-in
54+
operations can be escape-analyzed away.
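The "escaped" flag and the fill-in on return can be sketched roughly as follows. This is a simplified model under stated assumptions: `EscapableRef`, `heapLocals`, and `LazyFrameDemo` are hypothetical names, and a `HashMap` stands in for the locals of the Truffle `VirtualFrame`.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for PFrame.Reference with the "escaped" flag. */
final class EscapableRef {
    boolean escaped;                 // set when the reference is attached to e.g. an exception
    Map<String, Object> heapLocals;  // stays null until the frame is materialized

    void markAsEscaped() {
        escaped = true;
    }
}

final class LazyFrameDemo {
    /**
     * Simulates one Python call. The local map plays the role of the VirtualFrame;
     * its contents are copied to the heap only if the reference escaped during the call.
     */
    static EscapableRef call(boolean attachToException) {
        EscapableRef ref = new EscapableRef();
        Map<String, Object> virtualLocals = new HashMap<>();
        virtualLocals.put("x", 42);
        try {
            if (attachToException) {
                ref.markAsEscaped();  // escaped, but nobody has read the locals yet
            }
            return ref;
        } finally {
            // On return: fill in the heap copy only for escaped references.
            if (ref.escaped) {
                ref.heapLocals = new HashMap<>(virtualLocals);
            }
        }
    }
}
```

A call whose reference never escapes leaves `heapLocals` unset, which is the case a compiler can hope to optimize away when everything is inlined.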
55+
56+
To implement all this, we use the ExecutionContext.CallContext and
57+
ExecutionContext.CalleeContext classes. These also use profiling information to
58+
eagerly fill in frame information if the callees actually access the stack, for
59+
example, so that no further stack walks need to take place.
60+
61+
###### ExecutionContext.IndirectCallContext and ExecutionContext.IndirectCalleeContext
62+
63+
If we're mixing Python frames with non-Python frames, or if we are making calls
64+
to methods and cannot pass the Truffle frame, we need to store the last
65+
`PFrame.Reference` on the context so that, if we ever return into a Python
66+
function, it can properly link to the last frame. However, this is potentially
67+
expensive, because it means storing a linked list of frames on the context. So
68+
instead, we do it only lazily. When an "indirect" Python callee needs its
69+
caller, it initially walks the stack to find it. But it will also tell the last
70+
Python node that made a call to a "foreign" callee that it will have to store
71+
its `PFrame.Reference` globally in the future for it to be available later.
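The lazy scheme for indirect calls might look roughly like the sketch below. All names are hypothetical, a `Deque` of strings stands in for the real stack, and the callee is handed its call site directly for brevity, whereas the text above says it is found by walking the stack.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class IndirectCallDemo {
    // Stand-in for the real stack; inspecting it models the slow stack walk.
    static final Deque<String> stack = new ArrayDeque<>();
    // Stand-in for the slot on the context holding the last PFrame.Reference.
    static String contextCallerRef;
    static int stackWalks;

    /** A Python call site whose callee is reached through a foreign frame. */
    static final class CallSite {
        boolean storeRefOnContext;  // flipped by a callee that needed its caller

        String callForeign(String callerName) {
            stack.push(callerName);
            String saved = contextCallerRef;
            if (storeRefOnContext) {
                contextCallerRef = callerName;  // only paid for once it was needed
            }
            try {
                return indirectCallee(this);
            } finally {
                contextCallerRef = saved;
                stack.pop();
            }
        }
    }

    /** An "indirect" Python callee that wants to know its Python caller. */
    static String indirectCallee(CallSite lastPythonCallSite) {
        if (contextCallerRef != null) {
            return contextCallerRef;  // fast path: the caller stored its reference
        }
        stackWalks++;  // slow path: walk the stack this one time
        lastPythonCallSite.storeRefOnContext = true;  // ask for the reference next time
        return stack.peek();
    }
}
```

The first call through a given call site pays for a stack walk; later calls find the reference on the context.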
72+
73+
#### The current PException
74+
75+
Now that we have a mechanism to lazily make available only as much frame state
76+
as needed, we use the same mechanism to also pass the currently handled
77+
exception. Unlike CPython, we do not use a stack of currently handled exceptions;
78+
instead we utilize the call stack of Java by always passing the current exception
79+
and holding on to the last (if any) in a local variable.
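A minimal sketch of that idea, assuming a much simplified model: the currently handled exception is passed as an explicit argument, and the previous one is kept in a Java local while an `except` block runs. `ExceptionStateDemo` and its methods are hypothetical and not part of the code base.

```java
final class ExceptionStateDemo {
    /** Innermost callee: reports the exception currently being handled. */
    static String currentlyHandled(RuntimeException caughtException) {
        return caughtException == null ? "none" : caughtException.getMessage();
    }

    /** Simulates a Python function with a try/except whose handler makes a call. */
    static String[] function(RuntimeException callerException) {
        String[] seen = new String[3];
        RuntimeException current = callerException;  // inherited from the caller
        seen[0] = currentlyHandled(current);
        try {
            throw new RuntimeException("inner");
        } catch (RuntimeException e) {
            RuntimeException saved = current;  // the last exception lives in a Java local
            current = e;
            seen[1] = currentlyHandled(current);
            current = saved;                   // leaving the except block restores it
        }
        seen[2] = currentlyHandled(current);
        return seen;
    }
}
```

Because the saved exception is an ordinary local, no separate exception stack has to be maintained; the Java call stack does that job.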
80+
81+
## Abstract operations on Python objects
82+
83+
Many generic operations on Python objects in CPython are defined in the
84+
files `abstract.c` and `abstract.h`. These operations are widely used and their
85+
interplay and intricacies are a common cause of conversion, error message, and
86+
control flow bugs when not mimicked correctly. Our current approach is to
87+
provide many of these abstract operations as part of the
88+
`PythonObjectLibrary`. Usually, this means there are at least two messages for
89+
each operation - one that takes a `ThreadState` argument, and one that
90+
doesn't. The intent is to allow passing of exception state and caller
91+
information similar to how we do it with the `PFrame` argument even across
92+
library messages, which cannot take a VirtualFrame.
93+
94+
All nodes that are used in message implementations must allow uncached
95+
usage. Often (e.g. in the case of the generic `CallNode`) they offer execute
96+
methods with and without frames. If a `ThreadState` was passed to the message, a
97+
frame to pass to the node can be reconstructed using
98+
`PArguments.frameForCall(threadState)`. Here's an example:
99+
100+
```java
101+
@ExportMessage
102+
long messageWithState(ThreadState state,
103+
@Cached CallNode callNode) {
104+
Object callable = ...
105+
106+
if (state != null) {
107+
return callNode.execute(PArguments.frameForCall(state), callable, arguments);
108+
} else {
109+
return callNode.execute(callable, arguments);
110+
}
111+
}
112+
```
113+
114+
*Note*: It is **always** preferable to call an `execute` method with a
115+
`VirtualFrame` when variants with and without one both exist! The reason is that this
116+
avoids materialization of the frame state in more cases, as described in the
117+
section on Python's global thread state above.

graalpython/com.oracle.graal.python/src/com/oracle/graal/python/builtins/modules/BuiltinFunctions.java

Lines changed: 68 additions & 105 deletions
@@ -1,5 +1,5 @@
11
/*
2-
* Copyright (c) 2017, 2019, Oracle and/or its affiliates.
2+
* Copyright (c) 2017, 2020, Oracle and/or its affiliates.
33
* Copyright (c) 2013, Regents of the University of California
44
*
55
* All rights reserved.
@@ -164,7 +164,6 @@
164164
import com.oracle.graal.python.nodes.object.IsBuiltinClassProfile;
165165
import com.oracle.graal.python.nodes.subscript.GetItemNode;
166166
import com.oracle.graal.python.nodes.truffle.PythonArithmeticTypes;
167-
import com.oracle.graal.python.nodes.util.CastToIntegerFromIndexNode;
168167
import com.oracle.graal.python.nodes.util.CastToJavaStringNode;
169168
import com.oracle.graal.python.nodes.util.CoerceToStringNode;
170169
import com.oracle.graal.python.nodes.util.CoerceToStringNodeGen;
@@ -260,160 +259,124 @@ public Object absObject(VirtualFrame frame, Object object,
260259
@TypeSystemReference(PythonArithmeticTypes.class)
261260
@GenerateNodeFactory
262261
public abstract static class BinNode extends PythonUnaryBuiltinNode {
263-
264-
public abstract String executeObject(VirtualFrame frame, Object x);
265-
266262
@TruffleBoundary
267-
private static String buildString(boolean isNegative, String number) {
263+
protected String buildString(boolean isNegative, String number) {
268264
StringBuilder sb = new StringBuilder();
269265
if (isNegative) {
270266
sb.append('-');
271267
}
272-
sb.append("0b");
268+
sb.append(prefix());
273269
sb.append(number);
274270
return sb.toString();
275271
}
276272

277-
@Specialization
278-
String doL(long x) {
279-
return buildString(x < 0, longToBinaryString(x));
273+
protected String prefix() {
274+
return "0b";
280275
}
281276

282277
@TruffleBoundary
283-
private static String longToBinaryString(long x) {
278+
protected String longToString(long x) {
284279
return Long.toBinaryString(Math.abs(x));
285280
}
286281

282+
@TruffleBoundary
283+
protected String bigToString(BigInteger x) {
284+
return x.toString(2);
285+
}
286+
287287
@Specialization
288-
String doD(double x) {
289-
throw raise(TypeError, "'%p' object cannot be interpreted as an integer", x);
288+
String doL(long x) {
289+
return buildString(x < 0, longToString(x));
290+
}
291+
292+
@Specialization
293+
String doD(double x,
294+
@Cached PRaiseNode raise) {
295+
throw raise.raiseIntegerInterpretationError(x);
290296
}
291297

292298
@Specialization
293299
@TruffleBoundary
294300
String doPI(PInt x) {
295301
BigInteger value = x.getValue();
296-
return buildString(value.compareTo(BigInteger.ZERO) < 0, value.abs().toString(2));
302+
return buildString(value.compareTo(BigInteger.ZERO) < 0, bigToString(value.abs()));
297303
}
298304

299-
@Specialization
305+
@Specialization(replaces = {"doL", "doD", "doPI"})
300306
String doO(VirtualFrame frame, Object x,
301-
@Cached("create()") CastToIntegerFromIndexNode toIntNode,
302-
@Cached("create()") BinNode recursiveNode) {
303-
Object value = toIntNode.execute(frame, x);
304-
return recursiveNode.executeObject(frame, value);
305-
}
306-
307-
protected static BinNode create() {
308-
return BuiltinFunctionsFactory.BinNodeFactory.create();
307+
@Cached IsSubtypeNode isSubtype,
308+
@CachedLibrary(limit = "getCallSiteInlineCacheMaxDepth()") PythonObjectLibrary lib,
309+
@Cached BranchProfile isInt,
310+
@Cached BranchProfile isLong,
311+
@Cached BranchProfile isPInt) {
312+
Object index = lib.asIndexWithState(x, PArguments.getThreadState(frame));
313+
if (isSubtype.execute(lib.getLazyPythonClass(index), PythonBuiltinClassType.PInt)) {
314+
if (index instanceof Boolean || index instanceof Integer) {
315+
isInt.enter();
316+
return doL(lib.asSize(index));
317+
} else if (index instanceof Long) {
318+
isLong.enter();
319+
return doL((long) index);
320+
} else if (index instanceof PInt) {
321+
isPInt.enter();
322+
return doPI((PInt) index);
323+
} else {
324+
CompilerDirectives.transferToInterpreter();
325+
throw raise(PythonBuiltinClassType.NotImplementedError, "bin/oct/hex with native integer subclasses");
326+
}
327+
}
328+
CompilerDirectives.transferToInterpreter();
329+
/*
330+
* It should not be possible to get here, as PyNumber_Index already has a check for the
331+
* same condition
332+
*/
333+
throw raise(PythonBuiltinClassType.ValueError, "PyNumber_ToBase: index not int");
309334
}
310335
}
311336

312337
// oct(object)
313338
@Builtin(name = OCT, minNumOfPositionalArgs = 1)
314339
@TypeSystemReference(PythonArithmeticTypes.class)
315340
@GenerateNodeFactory
316-
public abstract static class OctNode extends PythonUnaryBuiltinNode {
317-
318-
public abstract String executeObject(VirtualFrame frame, Object x);
319-
320-
@TruffleBoundary
321-
private static String buildString(boolean isNegative, String number) {
322-
StringBuilder sb = new StringBuilder();
323-
if (isNegative) {
324-
sb.append('-');
325-
}
326-
sb.append("0o");
327-
sb.append(number);
328-
return sb.toString();
329-
}
330-
331-
@Specialization
332-
public String doL(long x) {
333-
return buildString(x < 0, longToOctString(x));
334-
}
335-
341+
public abstract static class OctNode extends BinNode {
342+
@Override
336343
@TruffleBoundary
337-
private static String longToOctString(long x) {
338-
return Long.toOctalString(Math.abs(x));
339-
}
340-
341-
@Specialization
342-
public String doD(double x) {
343-
throw raise(TypeError, "'%p' object cannot be interpreted as an integer", x);
344+
protected String bigToString(BigInteger x) {
345+
return x.toString(8);
344346
}
345347

346-
@Specialization
348+
@Override
347349
@TruffleBoundary
348-
public String doPI(PInt x) {
349-
BigInteger value = x.getValue();
350-
return buildString(value.compareTo(BigInteger.ZERO) < 0, value.abs().toString(8));
351-
}
352-
353-
@Specialization
354-
String doO(VirtualFrame frame, Object x,
355-
@Cached("create()") CastToIntegerFromIndexNode toIntNode,
356-
@Cached("create()") OctNode recursiveNode) {
357-
Object value = toIntNode.execute(frame, x);
358-
return recursiveNode.executeObject(frame, value);
350+
protected String longToString(long x) {
351+
return Long.toOctalString(Math.abs(x));
359352
}
360353

361-
protected static OctNode create() {
362-
return BuiltinFunctionsFactory.OctNodeFactory.create();
354+
@Override
355+
protected String prefix() {
356+
return "0o";
363357
}
364358
}
365359

366360
// hex(object)
367361
@Builtin(name = HEX, minNumOfPositionalArgs = 1)
368362
@TypeSystemReference(PythonArithmeticTypes.class)
369363
@GenerateNodeFactory
370-
public abstract static class HexNode extends PythonUnaryBuiltinNode {
371-
372-
public abstract String executeObject(VirtualFrame frame, Object x);
373-
374-
@TruffleBoundary
375-
private static String buildString(boolean isNegative, String number) {
376-
StringBuilder sb = new StringBuilder();
377-
if (isNegative) {
378-
sb.append('-');
379-
}
380-
sb.append("0x");
381-
sb.append(number);
382-
return sb.toString();
383-
}
384-
385-
@Specialization
386-
String doL(long x) {
387-
return buildString(x < 0, longToHexString(x));
388-
}
389-
364+
public abstract static class HexNode extends BinNode {
365+
@Override
390366
@TruffleBoundary
391-
private static String longToHexString(long x) {
392-
return Long.toHexString(Math.abs(x));
393-
}
394-
395-
@Specialization
396-
String doD(double x) {
397-
throw raise(TypeError, "'%p' object cannot be interpreted as an integer", x);
367+
protected String bigToString(BigInteger x) {
368+
return x.toString(16);
398369
}
399370

400-
@Specialization
371+
@Override
401372
@TruffleBoundary
402-
String doPI(PInt x) {
403-
BigInteger value = x.getValue();
404-
return buildString(value.compareTo(BigInteger.ZERO) < 0, value.abs().toString(8));
373+
protected String longToString(long x) {
374+
return Long.toHexString(Math.abs(x));
405375
}
406376

407-
@Specialization
408-
String doO(VirtualFrame frame, Object x,
409-
@Cached("create()") CastToIntegerFromIndexNode toIntNode,
410-
@Cached("create()") HexNode recursiveNode) {
411-
Object value = toIntNode.execute(frame, x);
412-
return recursiveNode.executeObject(frame, value);
413-
}
414-
415-
protected static HexNode create() {
416-
return BuiltinFunctionsFactory.HexNodeFactory.create();
377+
@Override
378+
protected String prefix() {
379+
return "0x";
417380
}
418381
}
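The refactoring in this file gives `OctNode` and `HexNode` a shared base in `BinNode` and lets them override only `prefix()`, `longToString` and `bigToString`. The plain-Java sketch below illustrates that layout outside of Truffle. The `Base*Formatter` classes and the `format` methods are hypothetical, and the sketch takes the absolute value in every `longToString` override so that the sign is emitted only once, by `buildString`.

```java
import java.math.BigInteger;

/** Illustrative base class: formats integers in base 2 and defines the shared hooks. */
class BaseTwoFormatter {
    protected String prefix() {
        return "0b";
    }

    /** Digits of the magnitude of x; the sign is added separately in buildString. */
    protected String longToString(long x) {
        return Long.toBinaryString(Math.abs(x));
    }

    protected String bigToString(BigInteger magnitude) {
        return magnitude.toString(2);
    }

    private String buildString(boolean isNegative, String digits) {
        return (isNegative ? "-" : "") + prefix() + digits;
    }

    final String format(long x) {
        return buildString(x < 0, longToString(x));
    }

    final String format(BigInteger x) {
        return buildString(x.signum() < 0, bigToString(x.abs()));
    }
}

/** Only the three hooks change for base 8. */
class BaseEightFormatter extends BaseTwoFormatter {
    @Override protected String prefix() { return "0o"; }
    @Override protected String longToString(long x) { return Long.toOctalString(Math.abs(x)); }
    @Override protected String bigToString(BigInteger magnitude) { return magnitude.toString(8); }
}

/** Only the three hooks change for base 16. */
class BaseSixteenFormatter extends BaseTwoFormatter {
    @Override protected String prefix() { return "0x"; }
    @Override protected String longToString(long x) { return Long.toHexString(Math.abs(x)); }
    @Override protected String bigToString(BigInteger magnitude) { return magnitude.toString(16); }
}
```

Keeping the sign handling in one place means a subclass cannot accidentally produce a two's-complement digit string for a negative `long`.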
419382
