gh-139772: Add PyDict_FromItems() function #139963
|
For consuming from PyO3 / Rust, I can see this function being obviously useful for cases of small dictionaries with statically known keys (think producing things that look like I think for arbitrary-sized collections, it's probably the case (in Rust) that either:
|
Do you mean producing an array of |
|
Adding this function would avoid having to make the private
PyObject *
_PyStack_AsDict(PyObject *const *values, PyObject *kwnames)
{
    Py_ssize_t nkwargs;
    assert(kwnames != NULL);
    nkwargs = PyTuple_GET_SIZE(kwnames);
    return _PyDict_FromItems(&PyTuple_GET_ITEM(kwnames, 0), 1,
                             values, 1, nkwargs);
} |
|
I was thinking more like 2-tuples, the type might be written in Rust as The 2-tuples are quite a natural structure for Rust producers of "items" (it's what they would expect when iterating a mapping type, for example). But maybe the more common case would be the second one I suggest: a Rust iterator producing item 2-tuples with a size hint. At the moment we just start from I could of course use the |
Or name the function in this PR Note that the current private The case of building small literal dicts could also use a
That's really not much different from |
I think it's ok, the individual tuple items are pointers and so will be aligned appropriately. AFAIK Rust is allowed to reorder tuples to improve packing but guarantees all elements are properly aligned for their type.
True, it's just that we try not to use private APIs at all in PyO3, so a public API for this would open up the possibility of using it in PyO3. I understand there's a question about what to do about the Unicode optimization with the "presized" API; I suggest we make it roughly match whatever a normal Python dict would do if created empty and then had items repeatedly added (with the exception that the storage is preallocated). |
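To make the "same behavior, but storage is preallocated" idea concrete, here is a minimal sketch in plain C. It is a toy growable array, not CPython's dict: the names `toy_table`, `toy_reserve`, `toy_push` and `toy_fill`, the initial capacity of 8 and the doubling policy are all assumptions made for illustration. It only shows that presizing changes how many reallocations happen, not what ends up stored.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical toy container; it stands in for "a table that grows on
   insert" and does NOT model CPython's real dict layout or growth policy. */
typedef struct {
    int *data;
    size_t len;
    size_t cap;
    size_t reallocs;   /* how many times the storage was (re)allocated */
} toy_table;

/* Ensure room for at least `cap` elements. Returns 0 on success, -1 on
   allocation failure. */
static int
toy_reserve(toy_table *t, size_t cap)
{
    if (cap <= t->cap) {
        return 0;
    }
    int *p = realloc(t->data, cap * sizeof *p);
    if (p == NULL) {
        return -1;
    }
    t->data = p;
    t->cap = cap;
    t->reallocs++;
    return 0;
}

/* Append one element, doubling the capacity when full (assumed policy). */
static int
toy_push(toy_table *t, int v)
{
    if (t->len == t->cap) {
        if (toy_reserve(t, t->cap ? t->cap * 2 : 8) < 0) {
            return -1;
        }
    }
    t->data[t->len++] = v;
    return 0;
}

/* Insert n elements, optionally presizing first. Returns the number of
   allocations performed, or (size_t)-1 on allocation failure. */
static size_t
toy_fill(size_t n, int presize)
{
    toy_table t = {0};
    assert(n > 0);
    if (presize && toy_reserve(&t, n) < 0) {
        return (size_t)-1;
    }
    for (size_t i = 0; i < n; i++) {
        if (toy_push(&t, (int)i) < 0) {
            free(t.data);
            return (size_t)-1;
        }
    }
    size_t result = t.reallocs;
    free(t.data);
    return result;
}
```

With this toy policy, filling 1000 elements one by one grows through capacities 8, 16, ..., 1024, while the presized variant allocates once; the inserted contents are identical either way.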
So you would prefer this API?
PyObject *
PyDict_FromItems(PyObject *const *keys, Py_ssize_t keys_offset,
                 PyObject *const *values, Py_ssize_t values_offset,
                 Py_ssize_t length) |
In short, you would prefer the #139773 API? |
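As a rough illustration of why the offset parameters matter, the sketch below walks keys and values with independent strides. It is plain C with no Python.h: `item_t` is a hypothetical stand-in for `PyObject *`, and `collect_items` and `demo_strides` are invented names. Only the `keys_offset` / `values_offset` parameter names come from the signature quoted above, and the assumption that the offset is a stride counted in elements is mine. Under that assumption, stride 1 covers two separate arrays, and stride 2 covers a single interleaved key/value array, which is what an array of pointer pairs looks like in memory when the layout is guaranteed (for example a C struct of two pointers).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for PyObject*, so the sketch builds without Python.h. */
typedef const char *item_t;

/* Visit `length` items. Consecutive keys are `keys_offset` elements apart and
   consecutive values are `values_offset` elements apart (assumed semantics).
   Copies what it visits into out_keys/out_values and returns the count. */
static ptrdiff_t
collect_items(const item_t *keys, ptrdiff_t keys_offset,
              const item_t *values, ptrdiff_t values_offset,
              ptrdiff_t length,
              item_t *out_keys, item_t *out_values)
{
    assert(length >= 0);
    for (ptrdiff_t i = 0; i < length; i++) {
        out_keys[i] = keys[i * keys_offset];
        out_values[i] = values[i * values_offset];
    }
    return length;
}

/* Usage: the same walker handles two separate arrays (stride 1) and one
   interleaved key/value array (stride 2). Returns 1 if both views agree. */
static int
demo_strides(void)
{
    item_t keys[] = {"a", "b", "c"};
    item_t values[] = {"1", "2", "3"};
    item_t flat[] = {keys[0], values[0], keys[1], values[1],
                     keys[2], values[2]};
    item_t k1[3], v1[3], k2[3], v2[3];

    collect_items(keys, 1, values, 1, 3, k1, v1);
    collect_items(flat, 2, flat + 1, 2, 3, k2, v2);

    for (int i = 0; i < 3; i++) {
        if (k1[i] != k2[i] || v1[i] != v2[i]) {
            return 0;
        }
    }
    return 1;
}
```

The design trade-off is that one entry point serves both layouts, at the cost of two extra integer parameters that most callers would pass as 1.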
|
I think so, yes (will comment on that thread). |
|
Benchmark comparing:
diff --git a/Modules/_testcapimodule.c b/Modules/_testcapimodule.c
index 4e73be20e1b..adae3fa2dc3 100644
--- a/Modules/_testcapimodule.c
+++ b/Modules/_testcapimodule.c
@@ -2562,6 +2562,90 @@ toggle_reftrace_printer(PyObject *ob, PyObject *arg)
     Py_RETURN_NONE;
 }
+
+static PyObject *
+bench_dict_new(PyObject *ob, PyObject *args)
+{
+    Py_ssize_t size, loops;
+    if (!PyArg_ParseTuple(args, "nn", &size, &loops)) {
+        return NULL;
+    }
+
+    PyTime_t t1, t2;
+    PyTime_PerfCounterRaw(&t1);
+    for (Py_ssize_t loop=0; loop < loops; loop++) {
+        PyObject *d = PyDict_New();
+        if (d == NULL) {
+            return NULL;
+        }
+
+        for (Py_ssize_t i=0; i < size; i++) {
+            PyObject *key = PyUnicode_FromFormat("%zi", i);
+            assert(key != NULL);
+
+            PyObject *value = PyLong_FromLong(i);
+            assert(value != NULL);
+
+            assert(PyDict_SetItem(d, key, value) == 0);
+        }
+
+        assert(PyDict_Size(d) == size);
+        Py_DECREF(d);
+    }
+    PyTime_PerfCounterRaw(&t2);
+
+    return PyFloat_FromDouble(PyTime_AsSecondsDouble(t2 - t1));
+}
+
+
+static PyObject *
+bench_dict_fromitems(PyObject *ob, PyObject *args)
+{
+    Py_ssize_t size, loops;
+    if (!PyArg_ParseTuple(args, "nn", &size, &loops)) {
+        return NULL;
+    }
+
+    PyTime_t t1, t2;
+    PyTime_PerfCounterRaw(&t1);
+    for (Py_ssize_t loop=0; loop < loops; loop++) {
+        PyObject **keys = (PyObject **)PyMem_Malloc(size * sizeof(PyObject*));
+        if (keys == NULL) {
+            return NULL;
+        }
+        PyObject **values = (PyObject **)PyMem_Malloc(size * sizeof(PyObject*));
+        if (values == NULL) {
+            return NULL;
+        }
+
+        for (Py_ssize_t i=0; i < size; i++) {
+            PyObject *key = PyUnicode_FromFormat("%zi", i);
+            assert(key != NULL);
+
+            PyObject *value = PyLong_FromLong(i);
+            assert(value != NULL);
+
+            keys[i] = key;
+            values[i] = value;
+        }
+
+        PyObject *d = PyDict_FromItems(keys, values, size);
+        assert(d != NULL);
+        Py_DECREF(d);
+
+        for (Py_ssize_t i=0; i < size; i++) {
+            Py_DECREF(keys[i]);
+            Py_DECREF(values[i]);
+        }
+        PyMem_Free(keys);
+        PyMem_Free(values);
+    }
+    PyTime_PerfCounterRaw(&t2);
+
+    return PyFloat_FromDouble(PyTime_AsSecondsDouble(t2 - t1));
+}
+
+
 static PyMethodDef TestMethods[] = {
     {"set_errno", set_errno, METH_VARARGS},
     {"test_config", test_config, METH_NOARGS},
@@ -2656,6 +2740,8 @@ static PyMethodDef TestMethods[] = {
     {"test_atexit", test_atexit, METH_NOARGS},
     {"code_offset_to_line", _PyCFunction_CAST(code_offset_to_line), METH_FASTCALL},
     {"toggle_reftrace_printer", toggle_reftrace_printer, METH_O},
+    {"bench_dict_new", bench_dict_new, METH_VARARGS},
+    {"bench_dict_fromitems", bench_dict_fromitems, METH_VARARGS},
     {NULL, NULL} /* sentinel */
 };
|
|
Regarding the benchmark numbers, internal loops are obviously faster than a large series of repeated API calls, but I doubt that a |
|
What about a |
|
Force-pushed: 65bf623 to 9d33600
|
I wrote #141682 to add |
📚 Documentation preview 📚: https://cpython-previews--139963.org.readthedocs.build/