gh-89083: add support for UUID version 7 (RFC 9562) #121119
Changes from 62 commits
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -919,8 +919,9 @@ urllib | |
| uuid | ||
| ---- | ||
|
|
||
| * Add support for UUID versions 6 and 8 via :func:`uuid.uuid6` and | ||
| :func:`uuid.uuid8` respectively, as specified in :rfc:`9562`. | ||
| * Add support for UUID versions 6, 7, and 8 via :func:`uuid.uuid6`, | ||
|
Not to be pedantic, but is "via" correct? It seems to suggest that the functions are the only things that support these versions, but the support is added in the UUID class and the functions are there too as a convenience.
Yes, but in general, we don't really want people to directly use the UUID class. Strictly speaking, we're only adding the support for the version value but we don't check how it's been generated. I prefer users to actually use the factories. Otherwise, I can say that the UUID class now accepts
I slightly disagree but won’t argue 🙂 |
||
| :func:`uuid.uuid7`, and :func:`uuid.uuid8` respectively, as specified | ||
| in :rfc:`9562`. | ||
| (Contributed by Bénédikt Tran in :gh:`89083`.) | ||
|
|
||
| * :const:`uuid.NIL` and :const:`uuid.MAX` are now available to represent the | ||
|
|
||
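For readers trying the new factory from this entry, here is a minimal usage sketch. It assumes nothing beyond the names in the entry itself; the fallback to `uuid.uuid4` is an assumption made only so the snippet also runs on interpreters that predate this change, not something the PR does.

```python
import uuid

# uuid.uuid7 only exists on Python versions that include this change,
# so look it up defensively. Falling back to uuid4 is an assumption of
# this sketch, chosen only to keep it runnable everywhere.
make_id = getattr(uuid, "uuid7", uuid.uuid4)

new_id = make_id()
print(new_id, "version:", new_id.version)
```

Going through the factory rather than constructing `uuid.UUID` directly matches the preference voiced in the review thread above.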
| Original file line number | Diff line number | Diff line change | ||||||
|---|---|---|---|---|---|---|---|---|
| @@ -1,8 +1,8 @@ | ||||||||
| r"""UUID objects (universally unique identifiers) according to RFC 4122/9562. | ||||||||
|
|
||||||||
| This module provides immutable UUID objects (class UUID) and the functions | ||||||||
| uuid1(), uuid3(), uuid4(), uuid5(), uuid6(), and uuid8() for generating | ||||||||
| version 1, 3, 4, 5, 6, and 8 UUIDs as specified in RFC 4122/9562. | ||||||||
| uuid{N}() for generating UUIDs version N as specified in RFC 4122/9562 for | ||||||||
| N = 1, 3, 4, 5, 6, 7, and 8. | ||||||||
|
||||||||
|
|
||||||||
| If all you want is a unique ID, you should probably call uuid1() or uuid4(). | ||||||||
| Note that uuid1() may compromise privacy since it creates a UUID containing | ||||||||
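As a small illustration of the privacy note in this docstring, the sketch below compares the two recommended generators. It uses only public attributes of `uuid.UUID`; what the node field actually contains depends on the platform, so no particular value is assumed.

```python
import uuid

u1 = uuid.uuid1()  # time-based: carries a 48-bit node field (often a MAC address)
u4 = uuid.uuid4()  # random apart from the version and variant bits

# The node identifier can be read straight back out of a version 1 UUID,
# which is why the docstring warns about privacy.
print(f"uuid1 node field: {u1.node:012x}")
print(f"uuid4 version:    {u4.version}")
```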
|
|
@@ -54,6 +54,7 @@ | |||||||
|
|
||||||||
| import os | ||||||||
| import sys | ||||||||
| import time | ||||||||
|
|
||||||||
| from enum import Enum, _simple_enum | ||||||||
|
|
||||||||
|
|
@@ -102,6 +103,7 @@ class SafeUUID: | |||||||
| _RFC_4122_VERSION_4_FLAGS = ((4 << 76) | (0x8000 << 48)) | ||||||||
| _RFC_4122_VERSION_5_FLAGS = ((5 << 76) | (0x8000 << 48)) | ||||||||
| _RFC_4122_VERSION_6_FLAGS = ((6 << 76) | (0x8000 << 48)) | ||||||||
| _RFC_4122_VERSION_7_FLAGS = ((7 << 76) | (0x8000 << 48)) | ||||||||
| _RFC_4122_VERSION_8_FLAGS = ((8 << 76) | (0x8000 << 48)) | ||||||||
|
|
||||||||
|
|
||||||||
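To make the new flags constant concrete, the following sketch shows what OR-ing it into a 128-bit integer does. `VERSION_7_FLAGS` and `CLEAR_MASK` are local names chosen for this example (the module's own constants are private), and the input value is arbitrary.

```python
import uuid

# Same expression as _RFC_4122_VERSION_7_FLAGS in the diff:
# version 7 in bits 76-79, variant "10" in bits 62-63.
VERSION_7_FLAGS = (7 << 76) | (0x8000 << 48)

# Hypothetical mask for this sketch: clears the version and variant bits.
CLEAR_MASK = ~((0xF << 76) | (0x3 << 62))

arbitrary_bits = 0x0123_4567_89AB_CDEF_0123_4567_89AB_CDEF
value = (arbitrary_bits & CLEAR_MASK) | VERSION_7_FLAGS

u = uuid.UUID(int=value)
print(u, u.version, u.variant)
```

The version and variant bits have to be zero before the OR, which is why `uuid7()` below masks each field to its exact width.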
|
|
@@ -720,7 +722,6 @@ def uuid1(node=None, clock_seq=None): | |||||||
| return UUID(bytes=uuid_time, is_safe=is_safe) | ||||||||
|
|
||||||||
| global _last_timestamp | ||||||||
| import time | ||||||||
| nanoseconds = time.time_ns() | ||||||||
| # 0x01b21dd213814000 is the number of 100-ns intervals between the | ||||||||
| # UUID epoch 1582-10-15 00:00:00 and the Unix epoch 1970-01-01 00:00:00. | ||||||||
|
|
@@ -808,6 +809,80 @@ def uuid6(node=None, clock_seq=None): | |||||||
| int_uuid_6 |= _RFC_4122_VERSION_6_FLAGS | ||||||||
| return UUID._from_int(int_uuid_6) | ||||||||
|
|
||||||||
| _last_timestamp_v7 = None | ||||||||
|
Suggested change
I wanted to apply a PEP-8 change in a separate PR because the module has inconsistencies. It seems a bit weird to only PEP-8ify this part of the code while the rest is not really PEP-8ified. See #121119 (comment).
python-dev doesn’t have a practice of doing reformatting-only PRs. Instead, follow good conventions in code that is added or already changed.
Well... if a core dev endorses the change, I think it's fine. I don't mind endorsing it. I didn't do it for uuid6() nor for uuid8() when I wrote the functions, as there were more single-blank-line separations than two-blank-line separations. But if you insist on adding 2 blank lines, I'll also add them around the other functions because I prefer being consistent in this case (honestly, having 2 blank lines around only UUIDv7 makes it harder to read IMO).
I would say PEP-8 tells me that we can also ignore the PEP if the surrounding code already breaks it. But I will make a commit to just add blank lines around the functions I've added (uuid6 to uuid8).
I don't think that it's worth it to reformat the whole uuid.py file to PEP 8, but respecting PEP 8 for new code (or code near changed code) is a good practice.
Also, adding a few blank lines is innocuous (it does not change git blame, or risk changing the meaning of code), so it’s fine to do in existing code in this PR. Generally, people saying they want to «apply PEP 8» are thinking of bigger changes. [note: marking this convo as unresolved just to help Victor or Hugo see it, not because there’s something left to do for the PR author]
This is about, for example, methods using camelCase in unittest or logging, not spaces! |
||||||||
| _last_counter_v7 = 0 # 42-bit counter | ||||||||
|
|
||||||||
| def _uuid7_get_counter_and_tail(): | ||||||||
| rand = int.from_bytes(os.urandom(10)) | ||||||||
| # 42-bit counter with MSB set to 0 | ||||||||
| counter = (rand >> 32) & 0x1ff_ffff_ffff | ||||||||
| # 32-bit random data | ||||||||
| tail = rand & 0xffff_ffff | ||||||||
| return counter, tail | ||||||||
|
|
||||||||
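The helper above seeds the counter with only 41 random bits even though the field is 42 bits wide. A standalone sketch of the same split shows the effect: at least 2**41 increments fit before the 42-bit field overflows. `split_random_bits` is a hypothetical name for this example, and the explicit `"big"` byte order is added so it also runs on Python versions where `int.from_bytes` requires that argument.

```python
import os


def split_random_bits():
    """Split 80 random bits into a 41-bit counter seed and a 32-bit tail."""
    rand = int.from_bytes(os.urandom(10), "big")  # 80 random bits
    counter = (rand >> 32) & 0x1FF_FFFF_FFFF      # 41 bits, so bit 41 (the MSB) is 0
    tail = rand & 0xFFFF_FFFF                     # low 32 bits
    return counter, tail


counter, tail = split_random_bits()

# Number of increments available before exceeding the 42-bit maximum.
headroom = 0x3FF_FFFF_FFFF - counter
print(f"counter={counter:#x} tail={tail:#x} headroom={headroom}")
```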
| def uuid7(): | ||||||||
| """Generate a UUID from a Unix timestamp in milliseconds and random bits. | ||||||||
|
|
||||||||
| UUIDv7 objects feature monotonicity within a millisecond. | ||||||||
| """ | ||||||||
| # --- 48 --- -- 4 -- --- 12 --- -- 2 -- --- 30 --- - 32 - | ||||||||
| # unix_ts_ms | version | counter_hi | variant | counter_lo | random | ||||||||
| # | ||||||||
| # 'counter = counter_hi | counter_lo' is a 42-bit counter constructed | ||||||||
| # with Method 1 of RFC 9562, §6.2, and its MSB is set to 0. | ||||||||
| # | ||||||||
| # 'random' is a 32-bit random value regenerated for every new UUID. | ||||||||
| # | ||||||||
| # If multiple UUIDs are generated within the same millisecond, the LSB | ||||||||
| # of 'counter' is incremented by 1. When overflowing, the timestamp is | ||||||||
| # advanced and the counter is reset to a random 42-bit integer with MSB | ||||||||
| # set to 0. | ||||||||
|
|
||||||||
| global _last_timestamp_v7 | ||||||||
| global _last_counter_v7 | ||||||||
|
|
||||||||
| nanoseconds = time.time_ns() | ||||||||
| timestamp_ms = nanoseconds // 1_000_000 | ||||||||
|
|
||||||||
| if _last_timestamp_v7 is None or timestamp_ms > _last_timestamp_v7: | ||||||||
| counter, tail = _uuid7_get_counter_and_tail() | ||||||||
| else: | ||||||||
| if timestamp_ms < _last_timestamp_v7: | ||||||||
| timestamp_ms = _last_timestamp_v7 + 1 | ||||||||
|
||||||||
| # advance the 42-bit counter | ||||||||
| counter = _last_counter_v7 + 1 | ||||||||
| if counter > 0x3ff_ffff_ffff: | ||||||||
| # advance the 48-bit timestamp | ||||||||
| timestamp_ms += 1 | ||||||||
| counter, tail = _uuid7_get_counter_and_tail() | ||||||||
| else: | ||||||||
| # 32-bit random data | ||||||||
| tail = int.from_bytes(os.urandom(4)) | ||||||||
|
||||||||
|
|
||||||||
| unix_ts_ms = timestamp_ms & 0xffff_ffff_ffff | ||||||||
| counter_msbs = counter >> 30 | ||||||||
| # keep 12 counter's MSBs and clear version bits | ||||||||
| counter_hi = counter_msbs & 0x0fff | ||||||||
| # keep 30 counter's LSBs and clear variant bits | ||||||||
| counter_lo = counter & 0x3fff_ffff | ||||||||
| # ensure that the tail is always a 32-bit integer (by construction, | ||||||||
| # it is already the case, but future interfaces may allow the user | ||||||||
| # to specify the random tail) | ||||||||
| tail &= 0xffff_ffff | ||||||||
|
|
||||||||
| int_uuid_7 = unix_ts_ms << 80 | ||||||||
| int_uuid_7 |= counter_hi << 64 | ||||||||
| int_uuid_7 |= counter_lo << 32 | ||||||||
| int_uuid_7 |= tail | ||||||||
| # by construction, the variant and version bits are already cleared | ||||||||
| int_uuid_7 |= _RFC_4122_VERSION_7_FLAGS | ||||||||
| res = UUID._from_int(int_uuid_7) | ||||||||
|
||||||||
|
|
||||||||
| # defer global update until all computations are done | ||||||||
| _last_timestamp_v7 = timestamp_ms | ||||||||
| _last_counter_v7 = counter | ||||||||
| return res | ||||||||
|
|
||||||||
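The branching in `uuid7()` can be easier to follow as a pure function with the clock and the random seed passed in. The sketch below mirrors the logic in the diff under that assumption; `next_state`, `pack_v7`, `MAX_COUNTER`, and the `fresh_counter` parameter are hypothetical names for this example and are not part of the module.

```python
import uuid

MAX_COUNTER = 0x3FF_FFFF_FFFF                 # largest 42-bit value
VERSION_7_FLAGS = (7 << 76) | (0x8000 << 48)  # same value as in the diff


def next_state(last_ts, last_counter, now_ms, fresh_counter):
    """Return (timestamp_ms, counter) for the next UUID.

    fresh_counter stands in for the random 41-bit seed that the real
    code draws from os.urandom().
    """
    if last_ts is None or now_ms > last_ts:
        return now_ms, fresh_counter          # new millisecond: reseed
    # Same millisecond, or the clock moved backwards.
    ts = last_ts + 1 if now_ms < last_ts else now_ms
    counter = last_counter + 1
    if counter > MAX_COUNTER:
        return ts + 1, fresh_counter          # overflow: advance time, reseed
    return ts, counter


def pack_v7(ts, counter, tail):
    """Assemble the 128-bit layout: ts | ver | counter_hi | var | counter_lo | tail."""
    value = (ts & 0xFFFF_FFFF_FFFF) << 80
    value |= ((counter >> 30) & 0x0FFF) << 64
    value |= (counter & 0x3FFF_FFFF) << 32
    value |= tail & 0xFFFF_FFFF
    return uuid.UUID(int=value | VERSION_7_FLAGS)


ts1, c1 = next_state(None, 0, now_ms=1_000, fresh_counter=5)
ts2, c2 = next_state(ts1, c1, now_ms=1_000, fresh_counter=9)  # same millisecond
ts3, c3 = next_state(ts2, c2, now_ms=990, fresh_counter=9)    # clock went backwards

first = pack_v7(ts1, c1, 0xFFFF_FFFF)
second = pack_v7(ts2, c2, 0)
third = pack_v7(ts3, c3, 0)
print(first, second, third, sep="\n")
```

Because the counter sits above the random tail in the layout, `second` sorts after `first` even though its tail is smaller, which is the monotonicity the docstring refers to.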
|
||||||||
| def uuid8(a=None, b=None, c=None): | ||||||||
|
||||||||
| """Generate a UUID from three custom blocks. | ||||||||
|
|
||||||||
|
|
@@ -841,6 +916,7 @@ def main(): | |||||||
| "uuid4": uuid4, | ||||||||
| "uuid5": uuid5, | ||||||||
| "uuid6": uuid6, | ||||||||
| "uuid7": uuid7, | ||||||||
| "uuid8": uuid8, | ||||||||
| } | ||||||||
| uuid_namespace_funcs = ("uuid3", "uuid5") | ||||||||
|
|
||||||||