-
-
Notifications
You must be signed in to change notification settings - Fork 33.4k
GH-137759:Limit _PyObject_HashFast to dicts keys, rename it, and mark it as Py_ALWAYS_INLINE #137828
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…d mark it as Py_ALWAYS_INLINE
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
I'm not totally convinced that this is true. Detecting the fast-path for strings is just one pointer comparison, which really won't have a noticeable performance impact. I agree that we want |
I'm fine with adding a fast-path for strings and integers (but then do we even need Footnotes
|
Yeah, because even if there's no speedup there, it's just simpler maintenance (in the future, we might want to make other assumptions about a dictionary key's hash that improve performance). |
Playing devil's advocate: if we don't detect it, we're a a function pointer dereference and call away from getting the hash anyway, right? The hash is not recomputed without that fast-path since it's stored (for strings, but for integers it's just an AND operation if I remember correctly)? |
|
A function pointer call has a lot more to do than just some reads, but maybe the overhead isn't noticeable? |
The best thing to answer this without guessing is probably to compile both (or all three, if we also include a fast path for integers) and run pyperformance. |


Context:
setobject.cused a mix of_PyObject_HashFastandPyObject_Hash. The consensus1 was that_PyObject_HashFastshould only really be used for dict keys (string-only sets are not so omnipresent). This PR uses the standardPyObject_Hashwithout the fast path (makingset[Any]faster andset[str]slower2), renames_PyObject_HashFastsince its name was misleading and marked it asPy_ALWAYS_INLINEto force inlining.Pretty sure this doesn't require a NEWS entry?
set_intersectionuse_PyObject_HashFast()? #137759Footnotes
There were discussions in the same issue around removing
_PyObject_HashFastcompletely and adding the fast path for strings (and even potentially integers) toPyObject_Hashbut there was no consensus so I'm not including it for now. ↩In this comment , I run
pyperformanceon two versions ofsetobject.c, one using HashFast everywhere and one using the standard Hash everywhere. None of these correspond to the current implementation ofsetobject.csince the current implementation uses a mix of both. Obviously in all cases, dictionaries use HashFast. ↩