Open
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs), performance (Performance or resource usage), type-feature (A feature request or enhancement)
Description
Currently, to check whether a value is True or False, we need to compare it against either True or False.
In both the JIT and the interpreter (although this is worse in the JIT), we need to load a full 8-byte value from the instruction stream in order to perform the comparison.
Here's the x86 stencil for _GUARD_IS_TRUE_POP (with TOS caching):
// 0: 48 b8 00 00 00 00 00 00 00 00 movabsq $0x0, %rax
// 0000000000000002: R_X86_64_64 _Py_TrueStruct+0x1
// a: 4c 39 f8 cmpq %r15, %rax
// d: 0f 85 00 00 00 00 jne exit
By putting True and False in an aligned array:
struct _booleans {
    PyLongObject False;
    PyLongObject True;
};
alignas(sizeof(struct _booleans)) struct _booleans _PyBooleans = {
    /* Data for False */
    /* Data for True */
};
we can use the alignment to check for True or False by testing a single bit: ref.bits & sizeof(PyLongObject), resulting in this very efficient stencil for _GUARD_IS_TRUE_POP:
// 0: 41 f6 c7 20 testb $0x20, %r15b
// 4: 0f 84 00 00 00 00 je exit
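The bit test can be tried outside CPython with a minimal C11 sketch. Everything named here is an assumption for illustration: fake_long is a hypothetical 32-byte stand-in for PyLongObject, demo_booleans stands in for _PyBooleans, and the helpers take a plain pointer rather than a stack reference with a bits field. The sketch only shows why aligning the pair to its own size makes one address bit distinguish the two objects; it is not the proposed implementation.

```c
#include <assert.h>    /* static_assert (C11) */
#include <stdalign.h>  /* alignas (C11) */
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 32-byte stand-in for PyLongObject (not CPython's real layout). */
typedef struct {
    uint64_t words[4];
} fake_long;

/* False first, True second, mirroring the struct in the proposal. */
struct booleans {
    fake_long False_;
    fake_long True_;
};

/* The trick relies on the element size being a power of two and on the
 * pair being exactly twice that size (no padding). */
static_assert(sizeof(fake_long) == 32, "element must be 32 bytes");
static_assert(sizeof(struct booleans) == 2 * sizeof(fake_long),
              "pair must be exactly two elements");

/* Aligning the pair to its own size (64) clears bit 5 of the base address,
 * so bit 5 is 0 for False_ and 1 for True_. */
static alignas(sizeof(struct booleans)) struct booleans demo_booleans;

/* Single-bit check: only meaningful when p is already known to point at
 * one of the two booleans. */
bool is_true_by_bit(const void *p)
{
    return ((uintptr_t)p & sizeof(fake_long)) != 0;
}

/* Baseline: full pointer comparison, which needs the 8-byte address of True. */
bool is_true_by_compare(const void *p)
{
    return p == (const void *)&demo_booleans.True_;
}
```

In this sketch sizeof(fake_long) is 32 (0x20), the same immediate that appears in the testb instruction above. Calling is_true_by_bit(&demo_booleans.True_) and is_true_by_compare(&demo_booleans.True_) gives the same answer, but the bit test needs no 64-bit address constant. Note that the bit test only separates True from False; it does not by itself establish that an arbitrary pointer refers to a boolean.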