Closed as duplicate of #129392
Labels: type-feature (a feature request or enhancement)
Bug report
Bug description:
The C++23 standard introduces a new delimited syntax for universal character names, \u{...}, which codecs.decode does not recognize. I ran this on Python 3.13; the same occurs with earlier Python versions.
Python 3.13.1 (tags/v3.13.1:0671451, Dec 3 2024, 19:06:28) [MSC v.1942 64 bit (AMD64)] on win32
>>> import codecs
>>> codecs.decode('\u41',encoding='unicode-escape')
'A'
>>> codecs.decode('\u{41}',encoding='unicode-escape')
File "<python-input-3>", line 1
codecs.decode('\u{41}',encoding='unicode-escape')
^^^^^^^^^^
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 0-1: truncated \uXXXX escape
The result should be 'A'.
For reference, this is quoted from the C++ 23 Standard, Appendix A.3:
universal-character-name:
    ...
    \u{ simple-hexadecimal-digit-sequence }

simple-hexadecimal-digit-sequence:
    hexadecimal-digit
    simple-hexadecimal-digit-sequence hexadecimal-digit
Please update codecs in Python 3.13 and in all earlier Python versions that still receive bug fixes.
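Until something like this is supported, the requested behavior could be approximated in user code. The following is only a sketch under stated assumptions: the helper name decode_with_delimited_ucn and the regular expression are hypothetical and not part of codecs. It rewrites each \u{...} sequence into the fixed-width \UXXXXXXXX form that the unicode-escape codec already accepts, then delegates to codecs.decode. The input is passed as a raw string so that the backslash reaches the codec instead of being interpreted by the Python parser.

```python
import codecs
import re

# Matches the C++23 delimited form \u{H...} with one or more hex digits.
# Assumption: the backslash is not itself escaped (e.g. "\\u{41}" in the
# input text is not handled specially by this sketch).
_DELIMITED_UCN = re.compile(r"\\u\{([0-9A-Fa-f]+)\}")


def decode_with_delimited_ucn(text: str) -> str:
    """Hypothetical helper: expand \\u{...}, then apply unicode-escape."""

    def to_fixed_width(match: re.Match) -> str:
        code_point = int(match.group(1), 16)
        if code_point > 0x10FFFF:
            raise ValueError(f"code point out of range: {match.group(0)}")
        # Emit the eight-digit \UXXXXXXXX escape the codec understands.
        return f"\\U{code_point:08x}"

    rewritten = _DELIMITED_UCN.sub(to_fixed_width, text)
    return codecs.decode(rewritten, "unicode-escape")


if __name__ == "__main__":
    print(decode_with_delimited_ucn(r"\u{41}"))
```

This handles only ASCII input text reliably, since unicode-escape treats the bytes of a str argument as Latin-1; it is meant to illustrate the expected result, not to propose an implementation.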
CPython versions tested on:
3.13
Operating systems tested on:
Windows