Open
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs), performance (Performance or resource usage), type-feature (A feature request or enhancement)
Description
Currently, to check whether a value is True or False, we need to compare it against the address of either True or False.
In both the JIT and the interpreter (although this is worse in the JIT), we need to load a full 8-byte value from the instruction stream in order to perform the comparison.
Here's the x86 stencil for _GUARD_IS_TRUE_POP (with TOS caching):
// 0: 48 b8 00 00 00 00 00 00 00 00 movabsq $0x0, %rax
// 0000000000000002: R_X86_64_64 _Py_TrueStruct+0x1
// a: 4c 39 f8 cmpq %r15, %rax
// d: 0f 85 00 00 00 00 jne exit
By putting True and False in an aligned array:
struct _booleans {
    PyLongObject False;
    PyLongObject True;
};
alignas(sizeof(struct _booleans)) struct _booleans _PyBooleans = {
    /* Data for False */
    /* Data for True */
};
we can use the alignment to check for True or False by testing a single bit: ref.bits & sizeof(PyLongObject), resulting in this very efficient stencil for _GUARD_IS_TRUE_POP:
// 0: 41 f6 c7 20 testb $0x20, %r15b
// 4: 0f 84 00 00 00 00 je exit
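To make the idea concrete, here is a minimal, self-contained sketch of the two checks side by side. It is not CPython code: fake_long, fake_booleans, fake_ref, and the helper functions are hypothetical stand-ins, and fake_long is assumed to be 32 bytes so that its size matches the 0x20 mask in the stencil above. The sketch also assumes the value being tested is already known to be one of the two booleans, since the single-bit test cannot tell a bool from any other object.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

/* Hypothetical stand-in for PyLongObject. It is 32 bytes so that
   sizeof(fake_long) == 0x20, the mask used by the testb stencil. */
typedef struct { unsigned char payload[32]; } fake_long;
static_assert(sizeof(fake_long) == 32, "size must be a power of two");

struct fake_booleans {
    fake_long False;
    fake_long True;
};

/* Aligning the pair to its own size (64 bytes) clears bit 5 of the
   address of False and sets bit 5 of the address of True. */
static alignas(sizeof(struct fake_booleans)) struct fake_booleans fake_bools;

/* Hypothetical stand-in for a stack reference holding pointer bits. */
typedef struct { uintptr_t bits; } fake_ref;

static fake_ref make_ref(const fake_long *obj) {
    fake_ref ref = { (uintptr_t)obj };
    return ref;
}

/* Current approach: compare against the full 8-byte address of True. */
static int is_true_by_compare(fake_ref ref) {
    return ref.bits == (uintptr_t)&fake_bools.True;
}

/* Proposed approach: test a single bit.
   Only meaningful when ref is already known to refer to a bool. */
static int is_true_by_bit(fake_ref ref) {
    return (ref.bits & sizeof(fake_long)) != 0;
}

/* Returns 1 if both checks give the same answer for both booleans. */
static int both_checks_agree(void) {
    fake_ref t = make_ref(&fake_bools.True);
    fake_ref f = make_ref(&fake_bools.False);
    return is_true_by_compare(t) == is_true_by_bit(t)
        && is_true_by_compare(f) == is_true_by_bit(f);
}
```

Because the mask is a small constant rather than an address, the compiler (or the JIT stencil) can encode it as an immediate in a single test instruction, with no relocation to patch.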