-
-
Notifications
You must be signed in to change notification settings - Fork 33.2k
gh-126868: Add freelist for compact int objects #126865
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 12 commits
0713034
be58ade
3f50b54
fa97302
d72486f
e07c218
328e0c1
e1dc2b3
9df776b
6b73046
db8247e
d1e4aa2
644e85a
9f86b6e
88274d6
1e548bd
13bc3e9
60220c0
593d621
efde111
928e912
437c24c
b034948
19f64f6
14681c1
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
Increase performance of int by adding a freelist for compact ints. | ||
eendebakpt marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
Original file line number | Diff line number | Diff line change | ||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
@@ -6,6 +6,7 @@ | |||||||||||||
#include "pycore_bitutils.h" // _Py_popcount32() | ||||||||||||||
#include "pycore_initconfig.h" // _PyStatus_OK() | ||||||||||||||
#include "pycore_call.h" // _PyObject_MakeTpCall | ||||||||||||||
#include "pycore_freelist.h" // _Py_FREELIST_FREE(), _Py_FREELIST_POP() | ||||||||||||||
|
||||||||||||||
#include "pycore_long.h" // _Py_SmallInts | ||||||||||||||
#include "pycore_object.h" // _PyObject_Init() | ||||||||||||||
#include "pycore_runtime.h" // _PY_NSMALLPOSINTS | ||||||||||||||
|
@@ -42,7 +43,7 @@ static inline void | |||||||||||||
_Py_DECREF_INT(PyLongObject *op) | ||||||||||||||
{ | ||||||||||||||
assert(PyLong_CheckExact(op)); | ||||||||||||||
_Py_DECREF_SPECIALIZED((PyObject *)op, (destructor)PyObject_Free); | ||||||||||||||
_Py_DECREF_SPECIALIZED((PyObject *)op, (destructor) _PyLong_ExactDealloc); | ||||||||||||||
|
_Py_DECREF_SPECIALIZED((PyObject *)op, (destructor) _PyLong_ExactDealloc); | |
_Py_DECREF_SPECIALIZED((PyObject *)op, (destructor)_PyLong_ExactDealloc); |
I'm also not sure whether you need the cast. If you need the cast, I'd suggest changing the signature of the drstructor itself.
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you go back to _Py_FREELIST_POP
and avoid reinitializing the object on line 237. We don't need to re-assign the ob_type
field in objects taken from the freelist. Something like:
PyLongObject *v = (PyLongObject *)_Py_FREELIST_POP(ints);
if (v == NULL) {
v = PyObject_Malloc(sizeof(PyLongObject));
if (v == NULL) {
PyErr_NoMemory();
return NULL;
}
_PyObject_Init((PyObject*)v, &PyLong_Type);
}
digit abs_x = x < 0 ? -x : x;
_PyLong_SetSignAndDigitCount(v, x<0?-1:1, 1);
v->long_value.ob_digit[0] = abs_x;
return (PyObject*)v;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just realised that we didnt check for if it's a smallint cached value here. It should be rejected there
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nevermind. It's not needed because (theoretically) the refcount of a smallint never hits 0 due to it being immortal, so this code will never be called on it. Can you add an assert to be safe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This also means my comment about the small ints was wrong. Could you remove that please?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment has been adapted. If the refcount can never become zero, then the code in long_dealloc
checking for small ints can be removed (perhaps also with an assert). Since this is something unrelated to the freelists I will make a separate PR for that and rebase this one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Fidget-Spinner I think your first remark was right after all and we do need to check for smallints (at least in the 32-bit stable ABI build). I will add the check once there is consensus on #127120 (which might very well be the status quo)
Fidget-Spinner marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* we accidentally decref small Ints out of existence. Instead, | |
* since small Ints are immortal, re-set the reference count. | |
*/ | |
* we accidentally decref small Ints out of existence. Instead, | |
* since small Ints are immortal, re-set the reference count. | |
*/ |
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Fidget-Spinner The _PyLong_IsCompact
check has already been done in this method, can we move this into the if (pylong && _PyLong_IsCompact(pylong))
part?
Also, can we remove the pylong &&
part? I think pylong
can never be NULL. (PyLong_CheckExact
assumes the pointer is not NULL I think)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok that sounds good
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we align the .tp_free comment? (maybe tabs and spaces are mixed, hence that's why I see them unaligned on monile). Also, is the cast necessary to avoid UBSan failures? If not, you can just use .tp_free = ...
to emphasize the semantic
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -26,6 +26,7 @@ | |
#include "pycore_pyerrors.h" // _PyErr_GetRaisedException() | ||
#include "pycore_pystate.h" // _PyInterpreterState_GET() | ||
#include "pycore_range.h" // _PyRangeIterObject | ||
#include "pycore_long.h" // void _PyLong_ExactDealloc(PyLongObject *op); | ||
|
||
#include "pycore_setobject.h" // _PySet_NextEntry() | ||
#include "pycore_sliceobject.h" // _PyBuildSlice_ConsumeRefs | ||
#include "pycore_tuple.h" // _PyTuple_ITEMS() | ||
|
@@ -514,8 +515,8 @@ dummy_func( | |
|
||
STAT_INC(BINARY_OP, hit); | ||
PyObject *res_o = _PyLong_Multiply((PyLongObject *)left_o, (PyLongObject *)right_o); | ||
PyStackRef_CLOSE_SPECIALIZED(right, (destructor)PyObject_Free); | ||
PyStackRef_CLOSE_SPECIALIZED(left, (destructor)PyObject_Free); | ||
PyStackRef_CLOSE_SPECIALIZED(right, (destructor)_PyLong_ExactDealloc); | ||
PyStackRef_CLOSE_SPECIALIZED(left, (destructor)_PyLong_ExactDealloc); | ||
INPUTS_DEAD(); | ||
ERROR_IF(res_o == NULL, error); | ||
res = PyStackRef_FromPyObjectSteal(res_o); | ||
|
@@ -527,8 +528,8 @@ dummy_func( | |
|
||
STAT_INC(BINARY_OP, hit); | ||
PyObject *res_o = _PyLong_Add((PyLongObject *)left_o, (PyLongObject *)right_o); | ||
PyStackRef_CLOSE_SPECIALIZED(right, (destructor)PyObject_Free); | ||
PyStackRef_CLOSE_SPECIALIZED(left, (destructor)PyObject_Free); | ||
PyStackRef_CLOSE_SPECIALIZED(right, (destructor)_PyLong_ExactDealloc); | ||
PyStackRef_CLOSE_SPECIALIZED(left, (destructor)_PyLong_ExactDealloc); | ||
INPUTS_DEAD(); | ||
ERROR_IF(res_o == NULL, error); | ||
res = PyStackRef_FromPyObjectSteal(res_o); | ||
|
@@ -540,8 +541,8 @@ dummy_func( | |
|
||
STAT_INC(BINARY_OP, hit); | ||
PyObject *res_o = _PyLong_Subtract((PyLongObject *)left_o, (PyLongObject *)right_o); | ||
PyStackRef_CLOSE_SPECIALIZED(right, (destructor)PyObject_Free); | ||
PyStackRef_CLOSE_SPECIALIZED(left, (destructor)PyObject_Free); | ||
PyStackRef_CLOSE_SPECIALIZED(right, (destructor)_PyLong_ExactDealloc); | ||
PyStackRef_CLOSE_SPECIALIZED(left, (destructor)_PyLong_ExactDealloc); | ||
INPUTS_DEAD(); | ||
ERROR_IF(res_o == NULL, error); | ||
res = PyStackRef_FromPyObjectSteal(res_o); | ||
|
@@ -797,7 +798,7 @@ dummy_func( | |
PyObject *res_o = PyList_GET_ITEM(list, index); | ||
assert(res_o != NULL); | ||
Py_INCREF(res_o); | ||
PyStackRef_CLOSE_SPECIALIZED(sub_st, (destructor)PyObject_Free); | ||
PyStackRef_CLOSE_SPECIALIZED(sub_st, (destructor)_PyLong_ExactDealloc); | ||
DEAD(sub_st); | ||
PyStackRef_CLOSE(list_st); | ||
res = PyStackRef_FromPyObjectSteal(res_o); | ||
|
@@ -817,7 +818,7 @@ dummy_func( | |
DEOPT_IF(Py_ARRAY_LENGTH(_Py_SINGLETON(strings).ascii) <= c); | ||
STAT_INC(BINARY_SUBSCR, hit); | ||
PyObject *res_o = (PyObject*)&_Py_SINGLETON(strings).ascii[c]; | ||
PyStackRef_CLOSE_SPECIALIZED(sub_st, (destructor)PyObject_Free); | ||
PyStackRef_CLOSE_SPECIALIZED(sub_st, (destructor)_PyLong_ExactDealloc); | ||
DEAD(sub_st); | ||
PyStackRef_CLOSE(str_st); | ||
res = PyStackRef_FromPyObjectSteal(res_o); | ||
|
@@ -838,7 +839,7 @@ dummy_func( | |
PyObject *res_o = PyTuple_GET_ITEM(tuple, index); | ||
assert(res_o != NULL); | ||
Py_INCREF(res_o); | ||
PyStackRef_CLOSE_SPECIALIZED(sub_st, (destructor)PyObject_Free); | ||
PyStackRef_CLOSE_SPECIALIZED(sub_st, (destructor)_PyLong_ExactDealloc); | ||
DEAD(sub_st); | ||
PyStackRef_CLOSE(tuple_st); | ||
res = PyStackRef_FromPyObjectSteal(res_o); | ||
|
@@ -950,7 +951,7 @@ dummy_func( | |
PyList_SET_ITEM(list, index, PyStackRef_AsPyObjectSteal(value)); | ||
assert(old_value != NULL); | ||
Py_DECREF(old_value); | ||
PyStackRef_CLOSE_SPECIALIZED(sub_st, (destructor)PyObject_Free); | ||
PyStackRef_CLOSE_SPECIALIZED(sub_st, (destructor)_PyLong_ExactDealloc); | ||
DEAD(sub_st); | ||
PyStackRef_CLOSE(list_st); | ||
} | ||
|
Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you need it to be exported like that or could you live with a simple
extern
? If not, can you explain which file needs this export?Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's needed by the JIT on Windows, as it's used in bytecodes.c. This is a pretty common pattern in the internal C API unfortunately
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For semantic purposes, would it be better to have a macro named differently? (it would be a simple alias but it could help semantics and reviewers)