Open
Labels: 3.15, interpreter-core, performance, type-feature
Description
Currently, we store an array of bytes at the end of the inline value array of objects to record the insertion order.
This is expensive as we need to compute not only the value to insert, but where to insert it. Here's the code:
PyDictValues *values = _PyObject_InlineValues(owner_o);
Py_ssize_t index = value_ptr - values->values;
_PyDictValues_AddToInsertionOrder(values, index);
_PyObject_InlineValues requires some lookup, as it depends on tp->tp_basicsize:
return (PyDictValues *)((char *)obj + tp->tp_basicsize);
but it is _PyDictValues_AddToInsertionOrder that is the slowest:
int size = values->size;
uint8_t *array = (uint8_t *)&values->values[values->capacity];
array[size] = (uint8_t)ix;
values->size = size+1;
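To make the cost concrete, here is a minimal, self-contained model of the scheme described above. It is only a sketch: ToyValues, toy_values_new, toy_order_array and toy_add_to_insertion_order are hypothetical names, and the struct is a simplified stand-in rather than the real PyDictValues layout. It shows that every store needs the capacity (to find the byte array), the current size (to find the slot), and a write of the index itself.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for PyDictValues; NOT the real CPython layout. */
typedef struct {
    uint8_t capacity;   /* number of value slots */
    uint8_t size;       /* number of insertions recorded so far */
    void *values[];     /* capacity value slots, followed by capacity order bytes */
} ToyValues;

/* Allocate zeroed storage for the header, the value slots and the order bytes. */
static ToyValues *toy_values_new(uint8_t capacity)
{
    size_t bytes = sizeof(ToyValues) + capacity * sizeof(void *) + capacity;
    ToyValues *v = calloc(1, bytes);
    if (v != NULL) {
        v->capacity = capacity;
    }
    return v;
}

/* The insertion-order bytes live immediately after the last value slot. */
static uint8_t *toy_order_array(ToyValues *v)
{
    return (uint8_t *)&v->values[v->capacity];
}

/* Record that slot ix was filled; mirrors the shape of the code in this issue. */
static void toy_add_to_insertion_order(ToyValues *v, int ix)
{
    assert(v->size < v->capacity);
    int size = v->size;
    toy_order_array(v)[size] = (uint8_t)ix;
    v->size = (uint8_t)(size + 1);
}
```

In this model, filling slots 0, 1, 2 in order leaves the order bytes as { 0, 1, 2 }, which is the common case the rest of this issue tries to make free.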
If, instead of storing just the index, we store the delta between the index and the insertion position, then for the majority of objects the insertion order array will go from { 0, 1, 2, 3, 4, ... }
to { 0, 0, 0, 0, 0, ... }
And if the delta is zero, we don't need to store anything.
PyDictValues *values = _PyObject_InlineValues(owner_o);
Py_ssize_t index = value_ptr - values->values;
Py_ssize_t delta = index - values->size;
if (delta != 0) {
    /* This is the expensive part */
    _PyDictValues_AddToInsertionOrder(values, delta);
}
values->size++;
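The delta idea can be checked with a small standalone model. This is a sketch under stated assumptions, not the proposed CPython change: DeltaOrder, delta_record_insert and delta_index_at are hypothetical names, the order array is assumed to start zero-filled (so a zero delta really needs no store), and deltas are assumed to fit in a signed byte.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_CAPACITY 8

/* Hypothetical model of a delta-encoded insertion-order array. */
typedef struct {
    int size;                     /* insertions recorded so far */
    int8_t delta[TOY_CAPACITY];   /* assumed zero-initialized */
    int delta_writes;             /* how many times the slow path ran */
} DeltaOrder;

/* Record that value slot `index` was filled at the next insertion position. */
static void delta_record_insert(DeltaOrder *o, int index)
{
    assert(o->size < TOY_CAPACITY);
    int delta = index - o->size;
    if (delta != 0) {
        /* Slow path: only taken for out-of-order stores. */
        o->delta[o->size] = (int8_t)delta;
        o->delta_writes++;
    }
    o->size++;
}

/* Recover the slot index that was filled at a given insertion position. */
static int delta_index_at(const DeltaOrder *o, int position)
{
    return position + o->delta[position];
}
```

With a zero-initialized DeltaOrder, inserting slots 0, 1, 2, 3 in order never touches the array, while inserting 1, 0, 2 writes two deltas (+1 and -1) and still decodes back to 1, 0, 2.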
In the JIT we can track the size of the inline values and know when delta will be zero, reducing the above code to:
values->size = KNOWN_SIZE;