|
1 | 1 | Reimplement base UUID type, uuid4(), and uuid7() in C |
2 | 2 |
|
3 | | -The C implementation considerably boosts the performance of the key UUID |
4 | | -operations: |
5 | | - |
6 | | ------------------------------------- |
7 | | -Operation                 Speedup |
8 | | ------------------------------------- |
9 | | -uuid4() generation        15.01x |
10 | | -uuid7() generation        29.64x |
11 | | -UUID from string          6.76x |
12 | | -UUID from bytes           5.16x |
13 | | -str(uuid) conversion      6.66x |
14 | | ------------------------------------- |
15 | | - |
16 | | -Summary of changes: |
17 | | - |
18 | | -* The UUID type is reimplemented in C in its entirety. |
19 | | - |
20 | | -* The pure-Python implementation is kept around and is used if the C |
21 | | - implementation isn't available for some reason. |
22 | | - |
23 | | -* Both implementations are tested extensively; additional tests are |
24 | | - added to ensure that the C implementation of the type follows the pure |
25 | | - Python implementation fully. |
26 | | - |
27 | | -* The Python implementation stores UUID values as int objects. The C |
28 | | - implementation stores them as a ``uint8_t[16]`` array. |
29 | | - |
30 | | -* The C implementation has a faster hash() implementation and also caches |
31 | | - the computed hash value to speed up cases where UUIDs are used as |
32 | | - set/dict keys. |
33 | | - |
34 | | -* The C implementation has a freelist to make new UUID object |
35 | | - instantiation as fast as possible. |
36 | | - |
37 | | -* uuid4() and uuid7() are now implemented in C. Most of the performance |
38 | | - boost (10x) comes from overfetching entropy to decrease the number of |
39 | | - _PyOS_URandom() calls. On its own it's a safe optimization, with the |
40 | | - edge case that Unix fork needs to be handled explicitly. We do that by |
41 | | - comparing the current PID to the PID recorded when the random buffer |
42 | | - was populated. |
43 | | - |
44 | | -* Portions of the code come from my implementation of a faster UUID |
45 | | - in gel-python [1]. I did use AI during development, but basically |
46 | | - had to rewrite the code it generated to be more idiomatic and |
47 | | - efficient. |
48 | | - |
49 | | -* The benchmark can be found here [2]. |
50 | | - |
51 | | -* This PR makes Python UUID operations as fast as they are in NodeJS and |
52 | | - Bun runtimes. |
53 | | - |
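The entropy-overfetching idea with the fork check can be illustrated in Python. This is only a minimal sketch of the technique described in the summary, not the C code from this change: `EntropyPool`, `fast_uuid4`, and the 4096-byte capacity are hypothetical names and values chosen for illustration, and `os.urandom()` stands in for `_PyOS_URandom()`.

```python
import os
import uuid


class EntropyPool:
    """Hypothetical helper: hands out random bytes from an overfetched buffer."""

    def __init__(self, capacity=4096):
        self._capacity = capacity  # assumed size; the real buffer size may differ
        self._buf = b""
        self._pos = 0
        self._pid = None  # PID recorded when the buffer was last filled

    def take(self, n):
        pid = os.getpid()
        # Refill when the buffer is exhausted, or when the PID changed.
        # A changed PID means we are in a forked child that inherited the
        # parent's buffer and must not hand out the same bytes again.
        if pid != self._pid or self._pos + n > len(self._buf):
            self._buf = os.urandom(max(self._capacity, n))
            self._pos = 0
            self._pid = pid
        chunk = self._buf[self._pos:self._pos + n]
        self._pos += n
        return chunk


_pool = EntropyPool()


def fast_uuid4():
    """Build a version 4 UUID from 16 pooled random bytes."""
    raw = bytearray(_pool.take(16))
    raw[6] = (raw[6] & 0x0F) | 0x40  # set the version field to 4
    raw[8] = (raw[8] & 0x3F) | 0x80  # set the RFC 4122 variant bits
    return uuid.UUID(bytes=bytes(raw))
```

With a 4096-byte buffer, one `os.urandom()` call serves 256 UUIDs, which is where the saving in system calls would come from in this sketch.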
54 | | -[1] |
55 | | -https://github.com/MagicStack/py-pgproto/blob/b8109fb311a59f30f9947567a13508da9a776564/uuid.pyx |
56 | | - |
57 | | -[2] https://gist.github.com/1st1/f03e816f34a61e4d46c78ff98baf4818 |
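The storage and hash-caching points from the summary can also be sketched in Python. This is an illustrative toy, assuming a hypothetical `BytesUUID` class: it keeps the value as 16 raw bytes (mirroring the ``uint8_t[16]`` layout) and caches the hash on first use. The built-in `hash()` of the bytes is used here as a stand-in, since the actual C hash function is not described above.

```python
import uuid


class BytesUUID:
    """Hypothetical toy type: 16 raw bytes plus a lazily cached hash."""

    __slots__ = ("_bytes", "_hash")

    def __init__(self, raw):
        if len(raw) != 16:
            raise ValueError("expected exactly 16 bytes")
        self._bytes = bytes(raw)
        self._hash = None  # filled in by the first hash() call

    @property
    def int(self):
        # The integer form is derived on demand (big-endian byte order).
        return int.from_bytes(self._bytes, "big")

    def __eq__(self, other):
        if isinstance(other, BytesUUID):
            return self._bytes == other._bytes
        return NotImplemented

    def __hash__(self):
        # Compute once, then reuse for repeated set/dict lookups.
        if self._hash is None:
            self._hash = hash(self._bytes)
        return self._hash

    def __str__(self):
        h = self._bytes.hex()
        return f"{h[:8]}-{h[8:12]}-{h[12:16]}-{h[16:20]}-{h[20:]}"
```

Used as a dict key, such an object would compute its hash only on the first lookup; later lookups reuse the cached value.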