[mypyc] feat: cache ids for fallback pythonic method lookups [1/1] #19870

BobTheBuidler · 2025-09-17T19:47:55Z

This PR micro-optimizes the usage of _Py_IDENTIFIER and _PyUnicode_FromId in various C files.

Changes:

For each identifier (e.g., setdefault, update, keys, values, items, clear, copy), a static cache variable (e.g., setdefault_id_unicode) is used to store the result of _PyUnicode_FromId.
The _Py_IDENTIFIER(name) macro is now declared only inside the conditional block that runs the first time a function is called (i.e., when the corresponding cache variable is NULL).
This ensures that the interned unicode object is initialized only once, and only if needed.
All subsequent calls reuse the cached unicode object, avoiding repeated calls to _PyUnicode_FromId and repeated static identifier declarations.

Example pattern after refactor:

static PyObject *setdefault_id_unicode = NULL;

PyObject *CPyDict_SetDefault(PyObject *dict, PyObject *key, PyObject *value) {
    if (PyDict_CheckExact(dict)) {
        PyObject* ret = PyDict_SetDefault(dict, key, value);
        Py_XINCREF(ret);
        return ret;
    }
    if (setdefault_id_unicode == NULL) {
        _Py_IDENTIFIER(setdefault);
        setdefault_id_unicode = _PyUnicode_FromId(&PyId_setdefault); /* borrowed */
        if (setdefault_id_unicode == NULL) {
            return NULL;
        }
    }
    return PyObject_CallMethodObjArgs(dict, setdefault_id_unicode, key, value, NULL);
}

JukkaL · 2025-10-14T10:47:43Z

mypyc/lib-rt/bytes_ops.c

-            return NULL;
+        if (join_id_unicode == NULL) {
+            _Py_IDENTIFIER(join);
+            join_id_unicode = _PyUnicode_FromId(&PyId_join); /* borrowed */


This is not thread-safe on free-threaded builds. I'm not sure what the best way to work around this is, though. Using a relaxed memory order read could be sufficient. If we intern the string, this seems like a thread-safe approach (as long as we don't use multiple subinterpreters, which are currently not supported).

Also the _Py_IDENTIFIER API is no longer part of the public API: python/cpython#108593. It would be good to have a replacement that uses public API as much as feasible if we are going to change these.

Below I give some ideas.

Here's how the atomic read/store might work (didn't check, based on LLM output):

#include <stdatomic.h>

static _Atomic(PyObject *) join_id_unicode = ATOMIC_VAR_INIT(NULL);
...
if (atomic_load_explicit(&join_id_unicode, memory_order_relaxed) == NULL) {
    ...
    atomic_store_explicit(...)
    ...
}
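To make the atomic idea concrete, here is a minimal, self-contained sketch of lazy initialization through an atomic pointer. It is an illustration under assumptions, not the PR's code: `name_obj`, `make_name`, `get_join_name` and `creations` are hypothetical stand-ins (a plain struct replaces `PyObject *` so the snippet compiles without Python headers), and it uses acquire/release ordering with a compare-exchange, which is more conservative than the relaxed load mentioned above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the cached unicode object (not a mypyc type). */
typedef struct { char text[16]; } name_obj;

/* Shared cache slot; NULL means "not created yet". */
static _Atomic(name_obj *) join_name_cache = NULL;

/* Counts how many objects were created, for demonstration only. */
static atomic_int creations = 0;

/* Hypothetical creation function standing in for the expensive lookup. */
static name_obj *make_name(const char *s) {
    assert(s != NULL);
    name_obj *obj = malloc(sizeof *obj);
    if (obj == NULL) {
        return NULL;
    }
    strncpy(obj->text, s, sizeof obj->text - 1);
    obj->text[sizeof obj->text - 1] = '\0';
    atomic_fetch_add(&creations, 1);
    return obj;
}

name_obj *get_join_name(void) {
    /* Fast path: the object has already been published. */
    name_obj *cached =
        atomic_load_explicit(&join_name_cache, memory_order_acquire);
    if (cached != NULL) {
        return cached;
    }
    /* Slow path: create a candidate, then try to publish it. */
    name_obj *fresh = make_name("join");
    if (fresh == NULL) {
        return NULL;
    }
    name_obj *expected = NULL;
    if (atomic_compare_exchange_strong_explicit(
            &join_name_cache, &expected, fresh,
            memory_order_acq_rel, memory_order_acquire)) {
        return fresh;  /* this thread won the race */
    }
    /* Another thread published first: drop our copy and use theirs. */
    free(fresh);
    return expected;
}
```

With a plain `atomic_store_explicit` instead of the compare-exchange, two racing threads could each store their own object, which only matters if the objects differ or one of them must be released. Whether relaxed ordering alone is enough for the real `PyObject *` case is left open here, as in the comment above.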

Use PyUnicode_InternFromString to create a unicode object (once), instead of _PyUnicode_FromId.

Only update one use case first, and once we've agreed on a good approach, create a follow-up PR that migrates remaining use cases. This minimizes extra effort required to iteratively update based on review feedback.

BobTheBuidler and others added 5 commits September 17, 2025 19:43

[mypyc] feat: cache ids for fallback pythonic method lookups

f77f297

fix declared in 2 places

1f3d409

fix linker

fd66205

fix segfault

d574135

Merge branch 'master' into cache-ids

3267aea

BobTheBuidler changed the title ~~[mypyc] feat: cache ids for fallback pythonic method lookups~~ [mypyc] feat: cache ids for fallback pythonic method lookups [1/1] Oct 1, 2025

BobTheBuidler added 3 commits October 2, 2025 03:14

Merge branch 'master' into cache-ids

83e35de

Merge branch 'master' into cache-ids

22af2bc

Merge branch 'master' into cache-ids

da9b52c

JukkaL reviewed Oct 14, 2025


Merge branch 'master' into cache-ids

f6cabe0
