Commit 9af396d

pythongh-114667: Support hexadecimal floating point literals
This adds hexadecimal floating point literals (IEEE 754-2008 §5.12.3) and
supports construction of floats from hexadecimal strings. Note that the
syntax is more permissive: it accepts everything that ``float.fromhex()``
currently accepts, but with a mandatory base specifier; it also allows
grouping digits with underscores.

Examples:
```pycon
>>> 0x1.1p-1
0.53125
>>> float('0x1.1')
1.0625
>>> 0x1.1
1.0625
>>> 0x1.1_1_1
1.066650390625
```

Added compatibility code so that access to existing int attributes does not
break. E.g. 0x1.bit_length() will not require parentheses around the
hexadecimal integer literal (unlike a decimal int, where 1.bit_length() is a
syntax error and (1).bit_length() is needed).

Minor changes: the Py_ISDIGIT/Py_ISXDIGIT macros were turned into functions.
1 parent 046a4e3 commit 9af396d

15 files changed, +277 -84 lines changed

Doc/library/functions.rst

Lines changed: 11 additions & 3 deletions
@@ -770,7 +770,8 @@ are always available. They are listed here in alphabetical order.
       >>> float('-Infinity')
       -inf

-   If the argument is a string, it should contain a decimal number, optionally
+   If the argument is a string, it should contain a decimal number
+   or a hexadecimal number, optionally
    preceded by a sign, and optionally embedded in whitespace. The optional
    sign may be ``'+'`` or ``'-'``; a ``'+'`` sign has no effect on the value
    produced. The argument may also be a string representing a NaN
@@ -787,12 +788,16 @@ are always available. They are listed here in alphabetical order.
       digitpart: `digit` (["_"] `digit`)*
       number: [`digitpart`] "." `digitpart` | `digitpart` ["."]
       exponent: ("e" | "E") [`sign`] `digitpart`
-      floatnumber: `number` [`exponent`]
+      floatnumber: (`number` [`exponent`]) | `hexfloatnumber`
       absfloatvalue: `floatnumber` | `infinity` | `nan`
       floatvalue: [`sign`] `absfloatvalue`
+      hexfloatnumber: `~python-grammar:hexinteger` | `~python-grammar:hexfraction` | `~python-grammar:hexfloat`

    Case is not significant, so, for example, "inf", "Inf", "INFINITY", and
-   "iNfINity" are all acceptable spellings for positive infinity.
+   "iNfINity" are all acceptable spellings for positive infinity. Note also
+   that the exponent of a hexadecimal floating point number is written in
+   decimal, and that it gives the power of 2 by which to multiply the
+   coefficient.

    Otherwise, if the argument is an integer or a floating-point number, a
    floating-point number with the same value (within Python's floating-point
@@ -818,6 +823,9 @@ are always available. They are listed here in alphabetical order.
    .. versionchanged:: 3.8
       Falls back to :meth:`~object.__index__` if :meth:`~object.__float__` is not defined.

+   .. versionchanged:: next
+      Added support for hexadecimal floating-point numbers.
+

 .. index::
    single: __format__
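
The note about the exponent (written in decimal, giving a power of 2) can be checked on an interpreter without this patch, because `float.fromhex()` already parses the same notation. A minimal sketch, assuming a hypothetical helper `hex_value` that is not part of the commit and only handles a bare coefficient plus an integer exponent:

```python
def hex_value(coefficient: str, exponent: int) -> float:
    """Hypothetical helper: evaluate a hexadecimal coefficient such as
    '3.1' and scale it by 2**exponent (the exponent is a decimal integer)."""
    integer_part, _, fraction_part = coefficient.partition(".")
    value = float(int(integer_part, 16)) if integer_part else 0.0
    # Each fractional hex digit is worth digit / 16**position.
    for position, digit in enumerate(fraction_part, start=1):
        value += int(digit, 16) / 16 ** position
    return value * 2.0 ** exponent


# The number after "p" is a power of 2, not of 16 or 10.
assert hex_value("3.1", 0) == float.fromhex("0x3.1") == 3.0625
assert hex_value("1.1", -1) == float.fromhex("0x1.1p-1") == 0.53125
assert hex_value("1", 10) == float.fromhex("0x1p10") == 1024.0
```

The helper does no validation and ignores signs and underscores; it only illustrates the arithmetic behind the documented rule.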

Doc/reference/lexical_analysis.rst

Lines changed: 12 additions & 0 deletions
@@ -1265,6 +1265,9 @@ The ``e`` or ``E`` represents "times ten raised to the power of"::
    1.166e-5 # (represents 1.166×10⁻⁵, or 0.00001166)
    6.02214076e+23 # (represents 6.02214076×10²³, or 602214076000000000000000.)

+The exponent of a hexadecimal floating point literal is written in decimal, and
+it gives the power of 2 by which to multiply the coefficient.
+
 In floats with only integer and exponent parts, the decimal point may be
 omitted::

@@ -1281,12 +1284,21 @@ lexical definitions:
       | `digitpart` "." [`digitpart`] [`exponent`]
       | "." `digitpart` [`exponent`]
       | `digitpart` `exponent`
+      | `hexfloat`
    digitpart: `digit` (["_"] `digit`)*
    exponent: ("e" | "E") ["+" | "-"] `digitpart`
+   hexfloat: ("0x" | "0X") ["_"] (`hexdigitpart` | `hexpointfloat`) [`binexponent`]
+   hexpointfloat: [`hexdigitpart`] `hexfraction` | `hexdigitpart` "."
+   hexfraction: "." `hexdigitpart`
+   hexdigitpart: `hexdigit` (["_"] `hexdigit`)*
+   binexponent: ("p" | "P") ["+" | "-"] `digitpart`

 .. versionchanged:: 3.6
    Underscores are now allowed for grouping purposes in literals.

+.. versionchanged:: next
+   Added support for hexadecimal floating-point literals.
+

 .. index::
    single: j; in numeric literal
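
The `hexfloat` productions above can be transcribed into a regular expression to see which spellings they admit. This is only a sketch of the documented grammar, not the tokenizer's implementation. It assumes the optional integer part before the fraction is a full `hexdigitpart`, which is what test inputs such as `0x001.1p2` suggest. The names `HEXFLOAT` and `is_hexfloat` are made up for the illustration.

```python
import re

HEXDIGITPART = r"[0-9a-fA-F](?:_?[0-9a-fA-F])*"
DIGITPART = r"[0-9](?:_?[0-9])*"
# hexpointfloat: [hexdigitpart] "." hexdigitpart | hexdigitpart "."
HEXPOINTFLOAT = rf"(?:(?:{HEXDIGITPART})?\.{HEXDIGITPART}|{HEXDIGITPART}\.)"
# binexponent: ("p" | "P") ["+" | "-"] digitpart
BINEXPONENT = rf"[pP][+-]?{DIGITPART}"
# hexfloat: ("0x" | "0X") ["_"] (hexdigitpart | hexpointfloat) [binexponent]
HEXFLOAT = re.compile(
    rf"0[xX]_?(?:{HEXPOINTFLOAT}|{HEXDIGITPART})(?:{BINEXPONENT})?"
)


def is_hexfloat(text: str) -> bool:
    """Return True if the whole string matches the sketched grammar."""
    return HEXFLOAT.fullmatch(text) is not None


# Spellings listed as valid in Lib/test/support/numbers.py:
assert all(is_hexfloat(s) for s in ["0x_.1p1", "0x1_1.p1", "0x1.p1_1", "0x.a_cp1"])
# Spellings listed as invalid there (misplaced underscores):
assert not any(is_hexfloat(s) for s in ["0x1p1_", "0x1__1.1p1", "0x1_.p1", "0x1.1p+_1"])
```

As written, the grammar also matches a plain hexadecimal integer such as `0x1f`, because both the point and the exponent are optional.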

Doc/tutorial/floatingpoint.rst

Lines changed: 1 addition & 1 deletion
@@ -210,7 +210,7 @@ the float value exactly:

 .. doctest::

-   >>> x == float.fromhex('0x1.921f9f01b866ep+1')
+   >>> x == 0x1.921f9f01b866ep+1
    True

 Since the representation is exact, it is useful for reliably porting values
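
The exactness claim in this tutorial passage can be exercised today with `float.hex()` and `float.fromhex()`, without the new literal syntax. A small sketch; the sample values are arbitrary choices for illustration:

```python
x = 3.14159
text = x.hex()                   # hexadecimal string form of x
assert float.fromhex(text) == x  # parsing it back gives the same float

# The round trip is exact across magnitudes, including a subnormal value.
for value in (0.1, -2.5, 1e-320, 1.7976931348623157e308):
    assert float.fromhex(value.hex()) == value

# A short decimal rendering, by contrast, generally loses information.
assert float(f"{0.1 + 0.2:.3f}") != 0.1 + 0.2
```

With the patch applied, the string produced by `x.hex()` could presumably be pasted into source code as a literal, which is what the updated doctest does.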

Include/cpython/pyctype.h

Lines changed: 8 additions & 2 deletions
@@ -21,11 +21,17 @@ PyAPI_DATA(const unsigned int) _Py_ctype_table[256];
 #define Py_ISLOWER(c) (_Py_ctype_table[Py_CHARMASK(c)] & PY_CTF_LOWER)
 #define Py_ISUPPER(c) (_Py_ctype_table[Py_CHARMASK(c)] & PY_CTF_UPPER)
 #define Py_ISALPHA(c) (_Py_ctype_table[Py_CHARMASK(c)] & PY_CTF_ALPHA)
-#define Py_ISDIGIT(c) (_Py_ctype_table[Py_CHARMASK(c)] & PY_CTF_DIGIT)
-#define Py_ISXDIGIT(c) (_Py_ctype_table[Py_CHARMASK(c)] & PY_CTF_XDIGIT)
 #define Py_ISALNUM(c) (_Py_ctype_table[Py_CHARMASK(c)] & PY_CTF_ALNUM)
 #define Py_ISSPACE(c) (_Py_ctype_table[Py_CHARMASK(c)] & PY_CTF_SPACE)

+static inline int Py_ISDIGIT(char c) {
+    return _Py_ctype_table[Py_CHARMASK(c)] & PY_CTF_DIGIT;
+}
+
+static inline int Py_ISXDIGIT(char c) {
+    return _Py_ctype_table[Py_CHARMASK(c)] & PY_CTF_XDIGIT;
+}
+
 PyAPI_DATA(const unsigned char) _Py_ctype_tolower[256];
 PyAPI_DATA(const unsigned char) _Py_ctype_toupper[256];

Include/internal/pycore_floatobject.h

Lines changed: 1 addition & 0 deletions
@@ -42,6 +42,7 @@ extern double _Py_parse_inf_or_nan(const char *p, char **endptr);

 extern int _Py_convert_int_to_double(PyObject **v, double *dbl);

+extern double _Py_dg_strtod_hex(const char *str, char **ptr);

 #ifdef __cplusplus
 }

Lib/test/support/numbers.py

Lines changed: 24 additions & 1 deletion
@@ -24,6 +24,16 @@
     '.1_4j',
     '(1_2.5+3_3j)',
     '(.5_6j)',
+    '0x_.1p1',
+    '0X_.1p1',
+    '0x1_1.p1',
+    '0x_1_1.p1',
+    '0x1.1_1p1',
+    '0x1.p1_1',
+    '0xa.p1',
+    '0x.ap1',
+    '0xa_c.p1',
+    '0x.a_cp1',
 ]
 INVALID_UNDERSCORE_LITERALS = [
     # Trailing underscores:
@@ -35,6 +45,8 @@
     '0xf_',
     '0o5_',
     '0 if 1_Else 1',
+    '0x1p1_',
+    '0x1.1p1_',
     # Underscores in the base selector:
     '0_b0',
     '0_xf',
@@ -52,28 +64,39 @@
     '0o5__77',
     '1e1__0',
     '1e1__0j',
+    '0x1__1.1p1',
     # Underscore right before a dot:
     '1_.4',
     '1_.4j',
+    '0x1_.p1',
+    '0xa_.p1',
     # Underscore right after a dot:
     '1._4',
     '1._4j',
     '._5',
     '._5j',
+    '0x1._p1',
+    '0xa._p1',
     # Underscore right after a sign:
     '1.0e+_1',
     '1.0e+_1j',
+    '0x1.1p+_1',
     # Underscore right before j:
     '1.4_j',
     '1.4e5_j',
     # Underscore right before e:
     '1_e1',
     '1.4_e1',
     '1.4_e1j',
-    # Underscore right after e:
+    '0x1.1p1_j',
+    # Underscore right after e or p:
     '1e_1',
     '1.4e_1',
     '1.4e_1j',
+    '0x1_p1',
+    '0x1_P1',
+    '0x1.1_p1',
+    '0x1.1_P1',
     # Complex cases with parens:
     '(1+1.5_j_)',
     '(1+1.5_j)',

Lib/test/test_float.py

Lines changed: 9 additions & 9 deletions
@@ -63,9 +63,9 @@ def test_float(self):
         self.assertEqual(float(3.14), 3.14)
         self.assertEqual(float(314), 314.0)
         self.assertEqual(float(" 3.14 "), 3.14)
-        self.assertRaises(ValueError, float, " 0x3.1 ")
-        self.assertRaises(ValueError, float, " -0x3.p-1 ")
-        self.assertRaises(ValueError, float, " +0x3.p-1 ")
+        self.assertEqual(float(" 0x3.1 "), 3.0625)
+        self.assertEqual(float(" -0x3.p-1 "), -1.5)
+        self.assertEqual(float(" +0x3.p-1 "), 1.5)
         self.assertRaises(ValueError, float, "++3.14")
         self.assertRaises(ValueError, float, "+-3.14")
         self.assertRaises(ValueError, float, "-+3.14")
@@ -95,13 +95,13 @@ def test_noargs(self):

     def test_underscores(self):
         for lit in VALID_UNDERSCORE_LITERALS:
-            if not any(ch in lit for ch in 'jJxXoObB'):
+            if not any(ch in lit for ch in 'jJoObB'):
                 self.assertEqual(float(lit), eval(lit))
                 self.assertEqual(float(lit), float(lit.replace('_', '')))
         for lit in INVALID_UNDERSCORE_LITERALS:
             if lit in ('0_7', '09_99'): # octals are not recognized here
                 continue
-            if not any(ch in lit for ch in 'jJxXoObB'):
+            if not any(ch in lit for ch in 'jJoObB'):
                 self.assertRaises(ValueError, float, lit)
         # Additional test cases; nan and inf are never valid as literals,
         # only in the float() constructor, but we don't allow underscores
@@ -198,9 +198,9 @@ def test_float_with_comma(self):
         self.assertRaises(ValueError, float, " 3,14 ")
         self.assertRaises(ValueError, float, " +3,14 ")
         self.assertRaises(ValueError, float, " -3,14 ")
-        self.assertRaises(ValueError, float, " 0x3.1 ")
-        self.assertRaises(ValueError, float, " -0x3.p-1 ")
-        self.assertRaises(ValueError, float, " +0x3.p-1 ")
+        self.assertEqual(float(" 0x3.1 "), 3.0625)
+        self.assertEqual(float(" -0x3.p-1 "), -1.5)
+        self.assertEqual(float(" +0x3.p-1 "), 1.5)
         self.assertEqual(float(" 25.e-1 "), 2.5)
         self.assertAlmostEqual(float(" .25e-1 "), .025)

@@ -1559,7 +1559,7 @@ def roundtrip(x):
             except OverflowError:
                 pass
             else:
-                self.identical(x, fromHex(toHex(x)))
+                self.identical(x, roundtrip(x))

     def test_subclass(self):
         class F(float):

Lib/test/test_grammar.py

Lines changed: 34 additions & 1 deletion
@@ -74,6 +74,15 @@ def test_plain_integers(self):
         else:
             self.fail('Weird maxsize value %r' % maxsize)

+    def test_attrs_on_hexintegers(self):
+        good_meth = [m for m in dir(int) if not m.startswith('_')]
+        for m in good_meth:
+            self.assertEqual(eval('0x1.' + m), eval('(0x1).' + m))
+        self.check_syntax_error('0x1.spam', "invalid hexadecimal literal",
+                                lineno=1, offset=4)
+        self.check_syntax_error('0x1.foo', "invalid hexadecimal literal",
+                                lineno=1, offset=5)
+
     def test_long_integers(self):
         x = 0
         x = 0xffffffffffffffff
@@ -97,6 +106,23 @@ def test_floats(self):
         x = 3.e14
         x = .3e14
         x = 3.1e4
+        x = 0x1.2p1
+        x = 0x1.2p+1
+        x = 0x1.p1
+        x = 0x1.p-1
+        x = 0x1p0
+        x = 0x1ap1
+        x = 0x1P1
+        x = 0x1cp2
+        x = 0x1.p1
+        x = 0x1.P1
+        x = 0x001.1p2
+        x = 0X1p1
+        x = 0x1.1_1p1
+        x = 0x1.1p1_1
+        x = 0x1.
+        x = 0x1.1
+        x = 0x.1

     def test_float_exponent_tokenization(self):
         # See issue 21642.
@@ -134,7 +160,14 @@ def test_bad_numerical_literals(self):
               "use an 0o prefix for octal integers")
         check("1.2_", "invalid decimal literal")
         check("1e2_", "invalid decimal literal")
-        check("1e+", "invalid decimal literal")
+        check("1e+", "invalid float literal")
+        check("0x.p", "invalid float literal")
+        check("0x_.p", "invalid float literal")
+        check("0x1.1p", "invalid float literal")
+        check("0x1.1_p", "invalid float literal")
+        check("0x1.1p_", "invalid float literal")
+        check("0xp", "invalid hexadecimal literal")
+        check("0xP", "invalid hexadecimal literal")

     def test_end_of_numerical_literals(self):
         def check(test, error=False):
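
`test_attrs_on_hexintegers` guards behaviour that already exists on unpatched interpreters: attribute access directly on a hexadecimal integer literal. The sketch below runs without this commit and shows what the compatibility code has to preserve. The list `ambiguous` is an illustration only; it picks out names whose first letter is also a hexadecimal digit, which is presumably where a hex-float tokenizer could misread the dot.

```python
public_names = [name for name in dir(int) if not name.startswith("_")]

# Today "0x1.<name>" tokenizes as a hex integer, a dot, and a name.
for name in public_names:
    assert eval("0x1." + name) == eval("(0x1)." + name)

# Names that start with a hex digit (a-f), e.g. bit_length or denominator.
ambiguous = [name for name in public_names if name[0] in "abcdef"]
assert "bit_length" in ambiguous and "denominator" in ambiguous

# Decimal integers have never allowed this spelling.
try:
    compile("1.bit_length()", "<sketch>", "eval")
except SyntaxError:
    decimal_needs_parens = True
else:
    decimal_needs_parens = False
assert decimal_needs_parens
```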

Lib/test/test_tokenize.py

Lines changed: 10 additions & 0 deletions
@@ -273,6 +273,16 @@ def test_float(self):
     NAME       'x'           (1, 0) (1, 1)
     OP         '='           (1, 2) (1, 3)
     NUMBER     '3.14e159'    (1, 4) (1, 12)
+    """)
+        self.check_tokenize("x = 0x1p1", """\
+    NAME       'x'           (1, 0) (1, 1)
+    OP         '='           (1, 2) (1, 3)
+    NUMBER     '0x1p1'       (1, 4) (1, 9)
+    """)
+        self.check_tokenize("x = 0x.1p1", """\
+    NAME       'x'           (1, 0) (1, 1)
+    OP         '='           (1, 2) (1, 3)
+    NUMBER     '0x.1p1'      (1, 4) (1, 10)
     """)

     def test_underscore_literals(self):

Lib/tokenize.py

Lines changed: 4 additions & 1 deletion
@@ -77,7 +77,10 @@ def maybe(*choices): return group(*choices) + '?'
 Pointfloat = group(r'[0-9](?:_?[0-9])*\.(?:[0-9](?:_?[0-9])*)?',
                    r'\.[0-9](?:_?[0-9])*') + maybe(Exponent)
 Expfloat = r'[0-9](?:_?[0-9])*' + Exponent
-Floatnumber = group(Pointfloat, Expfloat)
+HexExponent = r'[pP][-+]?[0-9](?:_?[0-9])*'
+Hexfloat = group(r'0[xX]_?[0-9a-f](?:_?[0-9a-f])*\.(?:[0-9a-f](?:_?[0-9a-f])*)?',
+                 r'0[xX]_?\.[0-9a-f](?:_?[0-9a-f])*') + HexExponent
+Floatnumber = group(Pointfloat, Expfloat, Hexfloat)
 Imagnumber = group(r'[0-9](?:_?[0-9])*[jJ]', Floatnumber + r'[jJ]')
 Number = group(Imagnumber, Floatnumber, Intnumber)
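
To see what the new `Hexfloat` pattern accepts, it can be run on its own against a few spellings from the tests. The two pattern strings below are copied from the hunk above. `group` is re-declared locally with the same shape as the helper in `Lib/tokenize.py`, and `matches` is a made-up name for the illustration.

```python
import re


def group(*choices):
    # Same shape as the helper of that name in Lib/tokenize.py.
    return "(" + "|".join(choices) + ")"


HexExponent = r'[pP][-+]?[0-9](?:_?[0-9])*'
Hexfloat = group(r'0[xX]_?[0-9a-f](?:_?[0-9a-f])*\.(?:[0-9a-f](?:_?[0-9a-f])*)?',
                 r'0[xX]_?\.[0-9a-f](?:_?[0-9a-f])*') + HexExponent


def matches(text):
    """Return True if the whole string matches the Hexfloat pattern."""
    return re.fullmatch(Hexfloat, text) is not None


assert matches("0x1.1p1")
assert matches("0x.1p1")
assert not matches("0x1p1")     # no dot: neither alternative applies
assert not matches("0x1.1")     # exponent is mandatory in this pattern
assert not matches("0x1.Ap1")   # character classes are lowercase only
```

Taken alone, the pattern is narrower than the grammar documented in `lexical_analysis.rst`. It needs both a dot and an exponent, and it rejects uppercase hexadecimal digits. Whether that matters depends on how much tokenization still goes through these regular expressions, which this page does not show.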
