This is in part motivated by #402.
It is also an attempt to avoid the inefficiencies in python/cpython#27738.
It also relates to #132.
It is also needed to implement python/cpython#98260 efficiently.
Almost all objects end up on a freelist when de-allocated, about half in an explicit freelist, and the other half in an ob_malloc freelist.
However, the amount of indirection and overhead to get from _Py_Dealloc to adding something to the freelist can be huge. To free an int the following happens:
- _Py_Dealloc calls PyLongType.tp_dealloc (via a function pointer, just to prevent the compiler doing its job)
- PyLongType.tp_dealloc calls PyObject_Free (again via function pointer)
- PyObject_Free calls _PyObject_Free (again via function pointer)
- _PyObject_Free calls pymalloc_free which:
  - Does a radix tree search to check that the object belongs to ob_malloc
  - Finds the pool to which the object belongs
  - Adds the object to the pool's freelist
  - Does some pool management if the pool is now empty, or was previously full.
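The chain above can be mimicked with a small toy model that counts how many calls go through a function pointer before the block reaches the pool. This is only an illustrative sketch: every name in it is hypothetical and none of it is CPython source.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: none of these names exist in CPython. */

static int pointer_hops = 0;   /* calls made through a function pointer */
static int blocks_freed = 0;   /* times the pool-level free was reached */

typedef struct toy_type {
    void (*dealloc)(void *);   /* plays the role of tp_dealloc */
    void (*free)(void *);      /* plays the role of tp_free */
} toy_type;

typedef struct toy_object {
    toy_type *type;
} toy_object;

/* Plays the role of pymalloc_free: the block finally joins a freelist here. */
static void toy_pool_free(void *p)
{
    (void)p;
    blocks_freed++;
}

/* Plays the role of _PyObject_Free. */
static void toy_raw_free(void *p)
{
    toy_pool_free(p);
}

/* Plays the role of the allocator table that PyObject_Free dispatches through. */
static void (*toy_allocator_free)(void *) = toy_raw_free;

/* Plays the role of PyObject_Free. */
static void toy_object_free(void *p)
{
    pointer_hops++;
    toy_allocator_free(p);
}

/* Plays the role of the int type's tp_dealloc. */
static void toy_int_dealloc(void *p)
{
    toy_object *op = p;
    pointer_hops++;
    op->type->free(p);
}

/* Plays the role of _Py_Dealloc. */
static void toy_dealloc(toy_object *op)
{
    assert(op != NULL && op->type != NULL);
    pointer_hops++;
    op->type->dealloc(op);
}

static toy_type toy_int_type = { toy_int_dealloc, toy_object_free };
```

Calling toy_dealloc on an object of toy_int_type takes three pointer hops before toy_pool_free runs, and none of them can be inlined by the compiler.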
We want to do two things to improve performance.
- Get from Py_DECREF() to PyObject_Free more efficiently
- Get from PyObject_Free to putting the memory on the freelist more efficiently.
Getting from Py_DECREF() to PyObject_Free more efficiently
Rather than every extension class writing its own dealloc and free functions, types should set flags to indicate whether they:
- Are just bits of memory and need no dealloc, e.g. ints, floats.
- Need deallocation of the objects and memory they contain, but do not need finalization
- Have explicitly separate deallocation and finalization functions.
- Legacy code, with a tp_dealloc function that can do anything.
We need two bits in tp_flags to express this.
For objects that are just lumps of memory we can set tp_dealloc to point to PyObject_Free, avoiding the extra indirection.
The other cases would get their own function pointers, but we can do some of the dispatching at class creation time, not at object deallocation time.
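As a rough sketch of how two flag bits could drive this, the toy code below picks the dealloc function once, when the type is set up. The flag names, bit positions, struct fields and helper functions are all hypothetical; CPython defines none of them. The sketch also ignores object resurrection during finalization.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag layout: CPython does not define these names or bits. */
#define TOY_DEALLOC_SHIFT 0
#define TOY_DEALLOC_MASK  (3UL << TOY_DEALLOC_SHIFT)

enum toy_dealloc_kind {
    TOY_DEALLOC_PLAIN    = 0,  /* just memory, e.g. ints, floats */
    TOY_DEALLOC_CLEAR    = 1,  /* releases contained objects, no finalizer */
    TOY_DEALLOC_FINALIZE = 2,  /* separate finalization and deallocation */
    TOY_DEALLOC_LEGACY   = 3   /* arbitrary user-supplied dealloc */
};

typedef struct toy_type {
    unsigned long flags;
    void (*clear)(void *);          /* releases contained objects */
    void (*finalize)(void *);       /* runs finalization */
    void (*legacy_dealloc)(void *); /* does everything itself */
    void (*dealloc)(void *);        /* chosen once by toy_type_ready */
} toy_type;

typedef struct toy_object {
    toy_type *type;
} toy_object;

static int frees = 0;
static int clears = 0;

/* Stand-in for PyObject_Free. */
static void toy_free(void *p) { (void)p; frees++; }

/* Example clear function that only counts its calls. */
static void toy_count_clear(void *p) { (void)p; clears++; }

static void toy_dealloc_clear(void *p)
{
    toy_object *op = p;
    op->type->clear(p);
    toy_free(p);
}

static void toy_dealloc_finalize(void *p)
{
    toy_object *op = p;
    op->type->finalize(p);
    if (op->type->clear != NULL) {
        op->type->clear(p);
    }
    toy_free(p);
}

/* Dispatch happens here, at type creation, not on every deallocation.
   Returns 0 on success, -1 if the type lacks the function its kind needs. */
static int toy_type_ready(toy_type *t)
{
    switch ((int)((t->flags & TOY_DEALLOC_MASK) >> TOY_DEALLOC_SHIFT)) {
    case TOY_DEALLOC_PLAIN:
        t->dealloc = toy_free;   /* no extra indirection */
        return 0;
    case TOY_DEALLOC_CLEAR:
        if (t->clear == NULL) return -1;
        t->dealloc = toy_dealloc_clear;
        return 0;
    case TOY_DEALLOC_FINALIZE:
        if (t->finalize == NULL) return -1;
        t->dealloc = toy_dealloc_finalize;
        return 0;
    case TOY_DEALLOC_LEGACY:
        if (t->legacy_dealloc == NULL) return -1;
        t->dealloc = t->legacy_dealloc;
        return 0;
    default:
        return -1;               /* unreachable with a two-bit field */
    }
}
```

In this model a plain type ends up with dealloc pointing straight at the free function, so deallocating one of its objects costs a single indirect call.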
Getting from PyObject_Free to putting the memory on the freelist more efficiently
See #132 for implementation details of freelists.
We need to compute the size of the object quickly to determine the freelist to use.
Any class that uses the standard allocator PyType_GenericAlloc can have its size computed reliably.
Other classes would need to use the current generic approach, possibly with a few customizations.
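To illustrate the size computation, here is a minimal sketch that derives an object's size from its type and maps it to a freelist index. It assumes 16-byte size classes and a 512-byte small-object limit, which are assumptions of this sketch and not stated in this issue. The struct and function names are hypothetical, and any extra padding or sentinel item that the real allocator may add is ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed parameters for this sketch, not taken from the issue. */
#define TOY_SIZE_CLASS_SHIFT 4     /* 16-byte size classes */
#define TOY_SMALL_LIMIT      512   /* larger requests bypass the freelists */

typedef struct toy_type {
    size_t basicsize;   /* plays the role of tp_basicsize */
    size_t itemsize;    /* plays the role of tp_itemsize; 0 if fixed size */
} toy_type;

/* Size in bytes of an object with nitems variable-sized items,
   rounded up to a multiple of the pointer size. */
static size_t toy_object_size(const toy_type *t, size_t nitems)
{
    size_t align = sizeof(void *);
    size_t raw;
    assert(t != NULL);
    raw = t->basicsize + nitems * t->itemsize;
    return (raw + align - 1) & ~(align - 1);
}

/* Index of the freelist serving nbytes, or -1 if the request is empty
   or too large for any freelist. */
static int toy_freelist_index(size_t nbytes)
{
    if (nbytes == 0 || nbytes > TOY_SMALL_LIMIT) {
        return -1;
    }
    return (int)((nbytes - 1) >> TOY_SIZE_CLASS_SHIFT);
}
```

For a fixed-size type the index depends only on the type, so in principle it could be computed once and cached; variable-sized types need the item count as well.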