From d4381aa9bfc80fe0f3d9530bc32aba8df47caa07 Mon Sep 17 00:00:00 2001 From: Petr Viktorin Date: Wed, 4 Jun 2025 18:01:25 +0200 Subject: [PATCH 01/17] WIP: String literals Co-authored-by: Blaise Pabon --- Doc/reference/expressions.rst | 50 +++++- Doc/reference/lexical_analysis.rst | 255 ++++++++++++++++++----------- 2 files changed, 207 insertions(+), 98 deletions(-) diff --git a/Doc/reference/expressions.rst b/Doc/reference/expressions.rst index 17f39aaf5f57cd..743d43b1c9c1b1 100644 --- a/Doc/reference/expressions.rst +++ b/Doc/reference/expressions.rst @@ -133,13 +133,18 @@ Literals Python supports string and bytes literals and various numeric literals: -.. productionlist:: python-grammar - literal: `stringliteral` | `bytesliteral` | `NUMBER` +.. grammar-snippet:: + :group: python-grammar + + literal: `strings` | `NUMBER` Evaluation of a literal yields an object of the given type (string, bytes, integer, floating-point number, complex number) with the given value. The value may be approximated in the case of floating-point and imaginary (complex) -literals. See section :ref:`literals` for details. +literals. +See section :ref:`literals` for details. +See section :ref:`string-concatenation` for details on ``strings``. + .. index:: triple: immutable; data; type @@ -152,6 +157,45 @@ occurrence) may obtain the same object or a different object with the same value. +.. _string-concatenation: + +String literal concatenation +............................ + +Multiple adjacent string or bytes literals (delimited by whitespace), possibly +using different quoting conventions, are allowed, and their meaning is the same +as their concatenation. Thus, ``"hello" 'world'`` is equivalent to +``"helloworld"``. + +Formally: + +.. grammar-snippet:: + :group: python-grammar + + strings: ( `STRING` | `fstring` | `tstring`)+ + +Note that this feature is defined at the syntactical level, so it only works +with literals.
+To concatenate string expressions at run time, the '+' operator may be used:: + + greeting = "Hello" + space = " " + name = "Blaise" + print(greeting + space + name) # not: print(greeting space name) + +Also note that literal concatenation can freely mix raw strings, +triple-quoted strings, and formatted or template string literals. +However, bytes literals may not be combined with string literals of any kind. + +This feature can be used to reduce the number of backslashes +needed, to split long strings conveniently across long lines, or even to add +comments to parts of strings, for example:: + + re.compile("[A-Za-z_]" # letter or underscore + "[A-Za-z0-9_]*" # letter, digit or underscore + ) + + .. _parenthesized: Parenthesized forms diff --git a/Doc/reference/lexical_analysis.rst b/Doc/reference/lexical_analysis.rst index 567c70111c20ec..58c8b15cfe5499 100644 --- a/Doc/reference/lexical_analysis.rst +++ b/Doc/reference/lexical_analysis.rst @@ -106,6 +106,16 @@ If an encoding is declared, the encoding name must be recognized by Python encoding is used for all lexical analysis, including string literals, comments and identifiers. +All lexical analysis, including string literals, comments +and identifiers, works on Unicode text decoded using the source encoding. +Any Unicode code point, except the NUL control character, can appear in +Python source. + +.. grammar-snippet:: + :group: python-grammar + + source_character: + .. _explicit-joining: @@ -478,66 +488,104 @@ Literals are notations for constant values of some built-in types. .. index:: string literal, bytes literal, ASCII single: ' (single quote); string literal single: " (double quote); string literal - single: u'; string literal - single: u"; string literal .. _strings: String and Bytes literals ------------------------- -String literals are described by the following lexical definitions: +String literals are text enclosed in single quotes (``'``) or double +quotes (``"``). For example: -.. 
productionlist:: python-grammar - stringliteral: [`stringprefix`](`shortstring` | `longstring`) - stringprefix: "r" | "u" | "R" | "U" | "f" | "F" | "t" | "T" - : | "fr" | "Fr" | "fR" | "FR" | "rf" | "rF" | "Rf" | "RF" - : | "tr" | "Tr" | "tR" | "TR" | "rt" | "rT" | "Rt" | "RT" - shortstring: "'" `shortstringitem`* "'" | '"' `shortstringitem`* '"' - longstring: "'''" `longstringitem`* "'''" | '"""' `longstringitem`* '"""' - shortstringitem: `shortstringchar` | `stringescapeseq` - longstringitem: `longstringchar` | `stringescapeseq` - shortstringchar: - longstringchar: - stringescapeseq: "\" +.. code-block:: plain -.. productionlist:: python-grammar - bytesliteral: `bytesprefix`(`shortbytes` | `longbytes`) - bytesprefix: "b" | "B" | "br" | "Br" | "bR" | "BR" | "rb" | "rB" | "Rb" | "RB" - shortbytes: "'" `shortbytesitem`* "'" | '"' `shortbytesitem`* '"' - longbytes: "'''" `longbytesitem`* "'''" | '"""' `longbytesitem`* '"""' - shortbytesitem: `shortbyteschar` | `bytesescapeseq` - longbytesitem: `longbyteschar` | `bytesescapeseq` - shortbyteschar: - longbyteschar: - bytesescapeseq: "\" + "spam" + 'eggs' + +The quote used to start the literal also terminates it, so a string literal +can only contain the other quote (except with escape sequences, see below). +For example: + +.. code-block:: plain -One syntactic restriction not indicated by these productions is that whitespace -is not allowed between the :token:`~python-grammar:stringprefix` or -:token:`~python-grammar:bytesprefix` and the rest of the literal. The source -character set is defined by the encoding declaration; it is UTF-8 if no encoding -declaration is given in the source file; see section :ref:`encodings`. + 'Say "Hello", please.' + "Don't do that!" -.. index:: triple-quoted string, Unicode Consortium, raw string +Except for this limitation, the choice of quote character (``'`` or ``"``) +does not affect how the literal is parsed. + +.. 
index:: triple-quoted string single: """; string literal single: '''; string literal -In plain English: Both types of literals can be enclosed in matching single quotes -(``'``) or double quotes (``"``). They can also be enclosed in matching groups -of three single or double quotes (these are generally referred to as -*triple-quoted strings*). The backslash (``\``) character is used to give special -meaning to otherwise ordinary characters like ``n``, which means 'newline' when -escaped (``\n``). It can also be used to escape characters that otherwise have a -special meaning, such as newline, backslash itself, or the quote character. -See :ref:`escape sequences ` below for examples. +Triple-quoted strings +--------------------- + +Strings can also be enclosed in matching groups of three single or double +quotes. +These are generally referred to as :dfn:`triple-quoted strings`. + +In triple-quoted literals, unescaped newlines and quotes are allowed (and are +retained), except that three unescaped quotes in a row terminate the literal. +(Here, a *quote* is the character used to open the literal, that is, +either ``'`` or ``"``.) + +For example: + +.. code-block:: plain + + """This is a triple-quoted string with "quotes" inside.""" + + '''Another triple-quoted string. This one continues + on the next line.''' + +Escape sequences +---------------- + +Inside a string literal, the backslash (``\``) character introduces an +:dfn:`escape sequence`, which has special meaning depending on the character +after the backslash. +For example, ``\n`` denotes the 'newline' character, rather the two characters +``\`` and ``n``. +See :ref:`escape sequences ` below for a full list of such +sequences, and more details. + + +.. index:: + single: u'; string literal + single: u"; string literal + +String prefixes +--------------- + +String literals can have an optional :dfn:`prefix` that influences how the literal +is parsed, for example: + +.. 
code-block:: plain + + b"data" + f'{result=}' + +* ``r``: Raw string +* ``f``: "F-string" +* ``t``: "T-string" +* ``b``: Byte literal +* ``u``: No effect (allowed for backwards compatibility) + +Prefixes are case-insensitive (for example, ``B`` works the same as ``b``). +The ``r`` prefix can be combined with ``f``, ``t`` or ``b``, so ``fr``, +``rf``, ``tr``, ``rt``, ``br`` and ``rb`` are also valid prefixes. + .. index:: single: b'; bytes literal single: b"; bytes literal -Bytes literals are always prefixed with ``'b'`` or ``'B'``; they produce an -instance of the :class:`bytes` type instead of the :class:`str` type. They -may only contain ASCII characters; bytes with a numeric value of 128 or greater -must be expressed with escapes. +:dfn:`Bytes literals` are always prefixed with ``'b'`` or ``'B'``; they produce an +instance of the :class:`bytes` type instead of the :class:`str` type. +They may only contain ASCII characters; bytes with a numeric value of 128 +or greater must be expressed with escape sequences. +Similarly, a zero byte must be expressed using an escape sequence. + .. index:: single: r'; raw string literal @@ -546,9 +594,33 @@ must be expressed with escapes. Both string and bytes literals may optionally be prefixed with a letter ``'r'`` or ``'R'``; such constructs are called :dfn:`raw string literals` and :dfn:`raw bytes literals` respectively and treat backslashes as -literal characters. As a result, in raw string literals, ``'\U'`` and ``'\u'`` +literal characters. +As a result, in raw string literals, :ref:`escape sequences ` escapes are not treated specially. +Even in a raw literal, quotes can be escaped with a backslash, but the +backslash remains in the result; for example, ``r"\""`` is a valid string +literal consisting of two characters: a backslash and a double quote; ``r"\"`` +is not a valid string literal (even a raw string cannot end in an odd number of +backslashes). 
Specifically, *a raw literal cannot end in a single backslash* +(since the backslash would escape the following quote character). Note also +that a single backslash followed by a newline is interpreted as those two +characters as part of the literal, *not* as a line continuation. + + +.. index:: + single: f'; formatted string literal + single: f"; formatted string literal + +A string literal with ``'f'`` or ``'F'`` in its prefix is a +:dfn:`formatted string literal`; see :ref:`f-strings`. +Similarly, string literal with ``'t'`` or ``'T'`` in its prefix is a +:dfn:`template string literal`; see :ref:`t-strings`. + +The ``'f'`` or ``t`` may be combined with ``'r'`` to create a +:dfn:`raw formatted string` or :dfn:`raw template string`. +They may not be combined with ``'b'``, ``'u'``, or each other. + .. versionadded:: 3.3 The ``'rb'`` prefix of raw bytes literals has been added as a synonym of ``'br'``. @@ -557,18 +629,46 @@ escapes are not treated specially. to simplify the maintenance of dual Python 2.x and 3.x codebases. See :pep:`414` for more information. -.. index:: - single: f'; formatted string literal - single: f"; formatted string literal -A string literal with ``'f'`` or ``'F'`` in its prefix is a -:dfn:`formatted string literal`; see :ref:`f-strings`. The ``'f'`` may be -combined with ``'r'``, but not with ``'b'`` or ``'u'``, therefore raw -formatted strings are possible, but formatted bytes literals are not. +String literals, except "F-strings" and "T-strings", are described by the +following lexical definitions: + +.. 
grammar-snippet:: + :group: python-grammar + + STRING: stringliteral | bytesliteral | fstring | tstring + + stringliteral: [`stringprefix`](`stringcontent`) + stringprefix: <("r" | "u"), case-insensitive> + stringcontent: `quote` `stringitem`* + quote: "'" | '"' | "'''" | '"""' + stringitem: `stringchar` | `stringescapeseq` + stringchar: + stringescapeseq: "\" + +``stringchar`` can not include: + +- the backslash, ``\``; +- in triple-quoted strings (quoted by ``'''`` or ``"""``), the newline; +- the quote character. + + +.. grammar-snippet:: + :group: python-grammar + + bytesliteral: `bytesprefix`(`shortbytes` | `longbytes`) + bytesprefix: <("b" | "br" | "rb" ), case-insensitive> + shortbytes: "'" `shortbytesitem`* "'" | '"' `shortbytesitem`* '"' + longbytes: "'''" `longbytesitem`* "'''" | '"""' `longbytesitem`* '"""' + shortbytesitem: `shortbyteschar` | `bytesescapeseq` + longbytesitem: `longbyteschar` | `bytesescapeseq` + shortbyteschar: + longbyteschar: + bytesescapeseq: "\" + +Note that as in all lexical definitions, whitespace is significant. +The prefix, if any, must be followed immediately by the quoted string content. -In triple-quoted literals, unescaped newlines and quotes are allowed (and are -retained), except that three unescaped quotes in a row terminate the literal. (A -"quote" is the character used to open the literal, i.e. either ``'`` or ``"``.) .. index:: physical line, escape sequence, Standard C, C single: \ (backslash); escape sequence @@ -587,7 +687,6 @@ retained), except that three unescaped quotes in a row terminate the literal. ( .. _escape-sequences: - Escape sequences ^^^^^^^^^^^^^^^^ @@ -655,14 +754,14 @@ Notes: (2) - As in Standard C, up to three octal digits are accepted. + As in Standard C, up to three octal digits (0 through 7) are accepted. .. versionchanged:: 3.11 - Octal escapes with value larger than ``0o377`` produce a + Octal escapes with value larger than ``0o377`` (255) produce a :exc:`DeprecationWarning`. .. 
versionchanged:: 3.12 - Octal escapes with value larger than ``0o377`` produce a + Octal escapes with value larger than ``0o377`` (255) produce a :exc:`SyntaxWarning`. In a future Python version they will be eventually a :exc:`SyntaxError`. @@ -689,11 +788,9 @@ Notes: .. index:: unrecognized escape sequence Unlike Standard C, all unrecognized escape sequences are left in the string -unchanged, i.e., *the backslash is left in the result*. (This behavior is -useful when debugging: if an escape sequence is mistyped, the resulting output -is more easily recognized as broken.) It is also important to note that the -escape sequences only recognized in string literals fall into the category of -unrecognized escapes for bytes literals. +unchanged, i.e., *the backslash is left in the result*. +Note that for bytes literals, the escape sequences only recognized in string +literals fall into the category of unrecognized escapes. .. versionchanged:: 3.6 Unrecognized escape sequences produce a :exc:`DeprecationWarning`. @@ -702,38 +799,6 @@ unrecognized escapes for bytes literals. Unrecognized escape sequences produce a :exc:`SyntaxWarning`. In a future Python version they will be eventually a :exc:`SyntaxError`. -Even in a raw literal, quotes can be escaped with a backslash, but the -backslash remains in the result; for example, ``r"\""`` is a valid string -literal consisting of two characters: a backslash and a double quote; ``r"\"`` -is not a valid string literal (even a raw string cannot end in an odd number of -backslashes). Specifically, *a raw literal cannot end in a single backslash* -(since the backslash would escape the following quote character). Note also -that a single backslash followed by a newline is interpreted as those two -characters as part of the literal, *not* as a line continuation. - - -.. 
_string-concatenation: - -String literal concatenation ----------------------------- - -Multiple adjacent string or bytes literals (delimited by whitespace), possibly -using different quoting conventions, are allowed, and their meaning is the same -as their concatenation. Thus, ``"hello" 'world'`` is equivalent to -``"helloworld"``. This feature can be used to reduce the number of backslashes -needed, to split long strings conveniently across long lines, or even to add -comments to parts of strings, for example:: - - re.compile("[A-Za-z_]" # letter or underscore - "[A-Za-z0-9_]*" # letter, digit or underscore - ) - -Note that this feature is defined at the syntactical level, but implemented at -compile time. The '+' operator must be used to concatenate string expressions -at run time. Also note that literal concatenation can use different quoting -styles for each component (even mixing raw strings and triple quoted strings), -and formatted string literals may be concatenated with plain string literals. - .. index:: single: formatted string literal From 80ad85cc286f04a4ac19d03c5f99a9158d15231b Mon Sep 17 00:00:00 2001 From: Petr Viktorin Date: Wed, 11 Jun 2025 16:22:08 +0200 Subject: [PATCH 02/17] Use correct Pygments lexer for plain text --- Doc/reference/lexical_analysis.rst | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/Doc/reference/lexical_analysis.rst b/Doc/reference/lexical_analysis.rst index 58c8b15cfe5499..6f3d90f89b98d3 100644 --- a/Doc/reference/lexical_analysis.rst +++ b/Doc/reference/lexical_analysis.rst @@ -496,7 +496,11 @@ String and Bytes literals String literals are text enclosed in single quotes (``'``) or double quotes (``"``). For example: -.. code-block:: plain +.. This is Python code, but we turn off highlighting because as of this + writing, highlighted strings don't look good when there's no code + surrounding them. + +.. 
code-block:: text "spam" 'eggs' @@ -505,7 +509,7 @@ The quote used to start the literal also terminates it, so a string literal can only contain the other quote (except with escape sequences, see below). For example: -.. code-block:: plain +.. code-block:: text 'Say "Hello", please.' "Don't do that!" @@ -531,7 +535,7 @@ either ``'`` or ``"``.) For example: -.. code-block:: plain +.. code-block:: text """This is a triple-quoted string with "quotes" inside.""" @@ -560,7 +564,7 @@ String prefixes String literals can have an optional :dfn:`prefix` that influences how the literal is parsed, for example: -.. code-block:: plain +.. code-block:: python b"data" f'{result=}' From e44fa66cf2da63763a3ed37f7d59da28e95c785c Mon Sep 17 00:00:00 2001 From: Petr Viktorin Date: Wed, 11 Jun 2025 17:59:01 +0200 Subject: [PATCH 03/17] WIP --- Doc/reference/grammar.rst | 5 +- Doc/reference/introduction.rst | 16 +++-- Doc/reference/lexical_analysis.rst | 110 +++++++++++++++++------------ 3 files changed, 76 insertions(+), 55 deletions(-) diff --git a/Doc/reference/grammar.rst b/Doc/reference/grammar.rst index 55c148801d8559..1037feb691f6bc 100644 --- a/Doc/reference/grammar.rst +++ b/Doc/reference/grammar.rst @@ -10,11 +10,8 @@ error recovery. 
The notation used here is the same as in the preceding docs, and is described in the :ref:`notation ` section, -except for a few extra complications: +except for an extra complication: -* ``&e``: a positive lookahead (that is, ``e`` is required to match but - not consumed) -* ``!e``: a negative lookahead (that is, ``e`` is required *not* to match) * ``~`` ("cut"): commit to the current alternative and fail the rule even if this fails to parse diff --git a/Doc/reference/introduction.rst b/Doc/reference/introduction.rst index 444acac374a690..c62240b18cfe55 100644 --- a/Doc/reference/introduction.rst +++ b/Doc/reference/introduction.rst @@ -145,15 +145,23 @@ The definition to the right of the colon uses the following syntax elements: * ``e?``: A question mark has exactly the same meaning as square brackets: the preceding item is optional. * ``(e)``: Parentheses are used for grouping. + +The following notation is only used in +:ref:`lexical definitions `. + * ``"a"..."z"``: Two literal characters separated by three dots mean a choice of any single character in the given (inclusive) range of ASCII characters. - This notation is only used in - :ref:`lexical definitions `. * ``<...>``: A phrase between angular brackets gives an informal description of the matched symbol (for example, ````), or an abbreviation that is defined in nearby text (for example, ````). - This notation is only used in - :ref:`lexical definitions `. + +.. _lexical-lookaheads: + +Some definitions also use *lookaheads*, which indicate that an element +must (or must not) match at a given position, but without consuming any input: + +* ``&e``: a positive lookahead (that is, ``e`` is required to match) +* ``!e``: a negative lookahead (that is, ``e`` is required *not* to match) The unary operators (``*``, ``+``, ``?``) bind as tightly as possible; the vertical bar (``|``) binds most loosely. 
diff --git a/Doc/reference/lexical_analysis.rst b/Doc/reference/lexical_analysis.rst index 6f3d90f89b98d3..67cc9bd8fc7bac 100644 --- a/Doc/reference/lexical_analysis.rst +++ b/Doc/reference/lexical_analysis.rst @@ -39,7 +39,8 @@ The end of a logical line is represented by the token :data:`~token.NEWLINE`. Statements cannot cross logical line boundaries except where :data:`!NEWLINE` is allowed by the syntax (e.g., between statements in compound statements). A logical line is constructed from one or more *physical lines* by following -the explicit or implicit *line joining* rules. +the :ref:`explicit ` or :ref:`implicit ` +*line joining* rules. .. _physical-lines: @@ -47,17 +48,28 @@ the explicit or implicit *line joining* rules. Physical lines -------------- -A physical line is a sequence of characters terminated by an end-of-line -sequence. In source files and strings, any of the standard platform line -termination sequences can be used - the Unix form using ASCII LF (linefeed), -the Windows form using the ASCII sequence CR LF (return followed by linefeed), -or the old Macintosh form using the ASCII CR (return) character. All of these -forms can be used equally, regardless of platform. The end of input also serves -as an implicit terminator for the final physical line. +A physical line is a sequence of characters terminated by one of the following +end-of-line sequences: -When embedding Python, source code strings should be passed to Python APIs using -the standard C conventions for newline characters (the ``\n`` character, -representing ASCII LF, is the line terminator). +* the Unix form using ASCII LF (linefeed), +* the Windows form using the ASCII sequence CR LF (return followed by linefeed), +* the old Macintosh form using the ASCII CR (return) character. + +Regardless of platform, each of these sequences is replaced by a single +ASCII LF (linefeed) character. +(This is done even inside :ref:`string literals `.)
+Each line can use any of the sequences; they do not need to be consistent +within a file. + +The end of input also serves as an implicit terminator for the final +physical line. + +Formally: + +.. grammar-snippet:: + :group: python-grammar + + newline: | | .. _comments: @@ -484,6 +496,13 @@ Literals Literals are notations for constant values of some built-in types. +In terms of lexical analysis, Python has :ref:`string, bytes ` +and :ref:`numeric ` literals. + +Other “literals” are lexically denoted using :ref:`keywords ` +(``None``, ``True``, ``False``) and the special +:ref:`ellipsis token ` (``...``): + .. index:: string literal, bytes literal, ASCII single: ' (single quote); string literal @@ -491,7 +510,7 @@ Literals are notations for constant values of some built-in types. .. _strings: String and Bytes literals -------------------------- +========================= String literals are text enclosed in single quotes (``'``) or double quotes (``"``). For example: @@ -635,41 +654,26 @@ They may not be combined with ``'b'``, ``'u'``, or each other. String literals, except "F-strings" and "T-strings", are described by the -following lexical definitions: +following lexical definitions. + +These definitions use :ref:`negative lookaheads ` (``!``) +to indicate that an ending quote ends the literal. .. 
grammar-snippet:: :group: python-grammar - STRING: stringliteral | bytesliteral | fstring | tstring - - stringliteral: [`stringprefix`](`stringcontent`) - stringprefix: <("r" | "u"), case-insensitive> - stringcontent: `quote` `stringitem`* - quote: "'" | '"' | "'''" | '"""' + STRING: [`stringprefix`] (`stringcontent`) + stringprefix: <("r" | "u" | "b" | "br" | "rb"), case-insensitive> + stringcontent: + | "'" ( !"'" `stringitem`)* "'" + | '"' ( !'"' `stringitem`)* '"' + | "'''" ( !"'''" `longstringitem`)* "'''" + | '"""' ( !'"""' `longstringitem`)* '"""' stringitem: `stringchar` | `stringescapeseq` - stringchar: + stringchar: + longstringitem: `stringitem` | newline stringescapeseq: "\" -``stringchar`` can not include: - -- the backslash, ``\``; -- in triple-quoted strings (quoted by ``'''`` or ``"""``), the newline; -- the quote character. - - -.. grammar-snippet:: - :group: python-grammar - - bytesliteral: `bytesprefix`(`shortbytes` | `longbytes`) - bytesprefix: <("b" | "br" | "rb" ), case-insensitive> - shortbytes: "'" `shortbytesitem`* "'" | '"' `shortbytesitem`* '"' - longbytes: "'''" `longbytesitem`* "'''" | '"""' `longbytesitem`* '"""' - shortbytesitem: `shortbyteschar` | `bytesescapeseq` - longbytesitem: `longbyteschar` | `bytesescapeseq` - shortbyteschar: - longbyteschar: - bytesescapeseq: "\" - Note that as in all lexical definitions, whitespace is significant. The prefix, if any, must be followed immediately by the quoted string content. @@ -692,7 +696,7 @@ The prefix, if any, must be followed immediately by the quoted string content. .. _escape-sequences: Escape sequences -^^^^^^^^^^^^^^^^ +---------------- Unless an ``'r'`` or ``'R'`` prefix is present, escape sequences in string and bytes literals are interpreted according to rules similar to those used by @@ -985,7 +989,7 @@ and :meth:`str.format`, which uses a related format string mechanism. .. _numbers: Numeric literals ----------------- +================ .. 
index:: number, numeric literal, integer literal floating-point literal, hexadecimal literal @@ -1241,14 +1245,26 @@ The following tokens serve as delimiters in the grammar: ( ) [ ] { } , : ! . ; @ = + +The period can also occur in floating-point and imaginary literals. + +.. _lexical-ellipsis: + +A sequence of three periods has a special meaning as an +:py:data:`Ellipsis` literal: + +.. code-block:: none + + ... + +The following *augmented assignment operators* serve +lexically as delimiters, but also perform an operation: + +.. code-block:: none + -> += -= *= /= //= %= @= &= |= ^= >>= <<= **= -The period can also occur in floating-point and imaginary literals. A sequence -of three periods has a special meaning as an ellipsis literal. The second half -of the list, the augmented assignment operators, serve lexically as delimiters, -but also perform an operation. - The following printing ASCII characters have special meaning as part of other tokens or are otherwise significant to the lexical analyzer: From 86bf94b0f4cc9f9eaa63728610d7bb71fc4f3107 Mon Sep 17 00:00:00 2001 From: Petr Viktorin Date: Wed, 18 Jun 2025 18:05:31 +0200 Subject: [PATCH 04/17] More WIP --- Doc/reference/lexical_analysis.rst | 424 +++++++++++++++++------------ 1 file changed, 251 insertions(+), 173 deletions(-) diff --git a/Doc/reference/lexical_analysis.rst b/Doc/reference/lexical_analysis.rst index 67cc9bd8fc7bac..36abfa31c093c9 100644 --- a/Doc/reference/lexical_analysis.rst +++ b/Doc/reference/lexical_analysis.rst @@ -501,7 +501,7 @@ and :ref:`numeric ` literals. Other “literals” are lexically denoted using :ref:`keywords ` (``None``, ``True``, ``False``) and the special -:ref:`ellipsis token ` (``...``): +:ref:`ellipsis token ` (``...``). .. index:: string literal, bytes literal, ASCII @@ -519,7 +519,7 @@ quotes (``"``). For example: writing, highlighted strings don't look good when there's no code surrounding them. -.. code-block:: text +.. 
code-block:: python "spam" 'eggs' @@ -528,7 +528,7 @@ The quote used to start the literal also terminates it, so a string literal can only contain the other quote (except with escape sequences, see below). For example: -.. code-block:: text +.. code-block:: python 'Say "Hello", please.' "Don't do that!" @@ -536,6 +536,21 @@ For example: Except for this limitation, the choice of quote character (``'`` or ``"``) does not affect how the literal is parsed. +Inside a string literal, the backslash (``\``) character introduces an +:dfn:`escape sequence`, which has special meaning depending on the character +after the backslash. +For example, ``\"`` denotes the double quote character, and does *not* end +the string: + +.. code-block:: python + + >>> print("Say \"Hello\" to everyone!") + Say "Hello" to everyone! + +See :ref:`escape sequences ` below for a full list of such +sequences, and more details. + + .. index:: triple-quoted string single: """; string literal single: '''; string literal @@ -545,32 +560,20 @@ Triple-quoted strings Strings can also be enclosed in matching groups of three single or double quotes. -These are generally referred to as :dfn:`triple-quoted strings`. +These are generally referred to as :dfn:`triple-quoted strings`:: -In triple-quoted literals, unescaped newlines and quotes are allowed (and are -retained), except that three unescaped quotes in a row terminate the literal. -(Here, a *quote* is the character used to open the literal, that is, -either ``'`` or ``"``.) + """This is a triple-quoted string.""" -For example: +In triple-quoted literals, unescaped quotes are allowed (and are +retained), except that three unescaped quotes in a row terminate the literal, +if they are of the same kind (``'`` or ``"``) used at the start:: -.. code-block:: text + """This string has "quotes" inside.""" - """This is a triple-quoted string with "quotes" inside.""" +Unescaped newlines are also allowed and retained:: - '''Another triple-quoted string. 
This one continues - on the next line.''' - -Escape sequences ----------------- - -Inside a string literal, the backslash (``\``) character introduces an -:dfn:`escape sequence`, which has special meaning depending on the character -after the backslash. -For example, ``\n`` denotes the 'newline' character, rather the two characters -``\`` and ``n``. -See :ref:`escape sequences ` below for a full list of such -sequences, and more details. + '''This triple-quoted string + continues on the next line.''' .. index:: @@ -580,70 +583,28 @@ sequences, and more details. String prefixes --------------- -String literals can have an optional :dfn:`prefix` that influences how the literal -is parsed, for example: +String literals can have an optional :dfn:`prefix` that influences how the +content of the literal is parsed, for example: .. code-block:: python b"data" f'{result=}' -* ``r``: Raw string -* ``f``: "F-string" -* ``t``: "T-string" -* ``b``: Byte literal +The allowed prefixes are: + +* ``b``: :ref:`Bytes literal ` +* ``r``: :ref:`Raw string ` +* ``f``: :ref:`Formatted string literal ` ("f-string") +* ``t``: :ref:`Template string literal ` ("t-string") * ``u``: No effect (allowed for backwards compatibility) +See the linked sections for details on each type. + Prefixes are case-insensitive (for example, ``B`` works the same as ``b``). The ``r`` prefix can be combined with ``f``, ``t`` or ``b``, so ``fr``, ``rf``, ``tr``, ``rt``, ``br`` and ``rb`` are also valid prefixes. - -.. index:: - single: b'; bytes literal - single: b"; bytes literal - -:dfn:`Bytes literals` are always prefixed with ``'b'`` or ``'B'``; they produce an -instance of the :class:`bytes` type instead of the :class:`str` type. -They may only contain ASCII characters; bytes with a numeric value of 128 -or greater must be expressed with escape sequences. -Similarly, a zero byte must be expressed using an escape sequence. - - -.. 
index:: - single: r'; raw string literal - single: r"; raw string literal - -Both string and bytes literals may optionally be prefixed with a letter ``'r'`` -or ``'R'``; such constructs are called :dfn:`raw string literals` -and :dfn:`raw bytes literals` respectively and treat backslashes as -literal characters. -As a result, in raw string literals, :ref:`escape sequences ` -escapes are not treated specially. - -Even in a raw literal, quotes can be escaped with a backslash, but the -backslash remains in the result; for example, ``r"\""`` is a valid string -literal consisting of two characters: a backslash and a double quote; ``r"\"`` -is not a valid string literal (even a raw string cannot end in an odd number of -backslashes). Specifically, *a raw literal cannot end in a single backslash* -(since the backslash would escape the following quote character). Note also -that a single backslash followed by a newline is interpreted as those two -characters as part of the literal, *not* as a line continuation. - - -.. index:: - single: f'; formatted string literal - single: f"; formatted string literal - -A string literal with ``'f'`` or ``'F'`` in its prefix is a -:dfn:`formatted string literal`; see :ref:`f-strings`. -Similarly, string literal with ``'t'`` or ``'T'`` in its prefix is a -:dfn:`template string literal`; see :ref:`t-strings`. - -The ``'f'`` or ``t`` may be combined with ``'r'`` to create a -:dfn:`raw formatted string` or :dfn:`raw template string`. -They may not be combined with ``'b'``, ``'u'``, or each other. - .. versionadded:: 3.3 The ``'rb'`` prefix of raw bytes literals has been added as a synonym of ``'br'``. @@ -653,7 +614,11 @@ They may not be combined with ``'b'``, ``'u'``, or each other. See :pep:`414` for more information. 
-String literals, except "F-strings" and "T-strings", are described by the +Formal grammar +-------------- + +String literals, except :ref:`"F-strings" <f-strings>` and +:ref:`"T-strings" <t-strings>`, are described by the following lexical definitions. These definitions use :ref:`negative lookaheads ` (``!``) @@ -675,23 +640,8 @@ to indicate that an ending quote ends the literal. stringescapeseq: "\" Note that as in all lexical definitions, whitespace is significant. -The prefix, if any, must be followed immediately by the quoted string content. - - -.. index:: physical line, escape sequence, Standard C, C - single: \ (backslash); escape sequence - single: \\; escape sequence - single: \a; escape sequence - single: \b; escape sequence - single: \f; escape sequence - single: \n; escape sequence - single: \r; escape sequence - single: \t; escape sequence - single: \v; escape sequence - single: \x; escape sequence - single: \N; escape sequence - single: \u; escape sequence - single: \U; escape sequence +In particular, the prefix (if any) must be immediately followed by the starting +quote. .. _escape-sequences: @@ -702,55 +652,50 @@ Unless an ``'r'`` or ``'R'`` prefix is present, escape sequences in string and bytes literals are interpreted according to rules similar to those used by Standard C. 
The recognized escape sequences are: -+-------------------------+---------------------------------+-------+ -| Escape Sequence | Meaning | Notes | -+=========================+=================================+=======+ -| ``\``\ | Backslash and newline ignored | \(1) | -+-------------------------+---------------------------------+-------+ -| ``\\`` | Backslash (``\``) | | -+-------------------------+---------------------------------+-------+ -| ``\'`` | Single quote (``'``) | | -+-------------------------+---------------------------------+-------+ -| ``\"`` | Double quote (``"``) | | -+-------------------------+---------------------------------+-------+ -| ``\a`` | ASCII Bell (BEL) | | -+-------------------------+---------------------------------+-------+ -| ``\b`` | ASCII Backspace (BS) | | -+-------------------------+---------------------------------+-------+ -| ``\f`` | ASCII Formfeed (FF) | | -+-------------------------+---------------------------------+-------+ -| ``\n`` | ASCII Linefeed (LF) | | -+-------------------------+---------------------------------+-------+ -| ``\r`` | ASCII Carriage Return (CR) | | -+-------------------------+---------------------------------+-------+ -| ``\t`` | ASCII Horizontal Tab (TAB) | | -+-------------------------+---------------------------------+-------+ -| ``\v`` | ASCII Vertical Tab (VT) | | -+-------------------------+---------------------------------+-------+ -| :samp:`\\\\{ooo}` | Character with octal value | (2,4) | -| | *ooo* | | -+-------------------------+---------------------------------+-------+ -| :samp:`\\x{hh}` | Character with hex value *hh* | (3,4) | -+-------------------------+---------------------------------+-------+ - -Escape sequences only recognized in string literals are: - -+-------------------------+---------------------------------+-------+ -| Escape Sequence | Meaning | Notes | -+=========================+=================================+=======+ -| :samp:`\\N\\{{name}\\}` | Character named *name* 
in the | \(5) | -| | Unicode database | | -+-------------------------+---------------------------------+-------+ -| :samp:`\\u{xxxx}` | Character with 16-bit hex value | \(6) | -| | *xxxx* | | -+-------------------------+---------------------------------+-------+ -| :samp:`\\U{xxxxxxxx}` | Character with 32-bit hex value | \(7) | -| | *xxxxxxxx* | | -+-------------------------+---------------------------------+-------+ - -Notes: - -(1) +.. list-table:: + :widths: auto + :header-rows: 1 + + * * Escape Sequence + * Meaning + * * ``\``\ + * :ref:`string-escape-ignore` + * * ``\\`` + * :ref:`Backslash <string-escape-escaped-char>` + * * ``\'`` + * :ref:`Single quote <string-escape-escaped-char>` + * * ``\"`` + * :ref:`Double quote <string-escape-escaped-char>` + * * ``\a`` + * ASCII Bell (BEL) + * * ``\b`` + * ASCII Backspace (BS) + * * ``\f`` + * ASCII Formfeed (FF) + * * ``\n`` + * ASCII Linefeed (LF) + * * ``\r`` + * ASCII Carriage Return (CR) + * * ``\t`` + * ASCII Horizontal Tab (TAB) + * * ``\v`` + * ASCII Vertical Tab (VT) + * * :samp:`\\\\{ooo}` + * :ref:`string-escape-oct` + * * :samp:`\\x{hh}` + * :ref:`string-escape-hex` + * * :samp:`\\N\\{{name}\\}` + * :ref:`string-escape-named` + * * :samp:`\\u{xxxx}` + * :ref:`Hexadecimal Unicode character <string-escape-long-hex>` + * * :samp:`\\U{xxxxxxxx}` + * :ref:`Hexadecimal Unicode character <string-escape-long-hex>` + +.. _string-escape-ignore: + +Ignored end of line +^^^^^^^^^^^^^^^^^^^ + A backslash can be added at the end of a line to ignore the newline:: >>> 'This string will not include \ @@ -760,9 +705,39 @@ Notes: The same result can be achieved using :ref:`triple-quoted strings `, or parentheses and :ref:`string literal concatenation <string-concatenation>`. +.. _string-escape-escaped-char: + +Escaped characters +^^^^^^^^^^^^^^^^^^ -(2) - As in Standard C, up to three octal digits (0 through 7) are accepted. + To include a backslash in a non-:ref:`raw <raw-strings>` Python string + literal, it must be doubled. 
The ``\\`` escape sequence denotes a single + backslash character:: + + >>> print('C:\\Program Files') + C:\Program Files + + Similarly, the ``\'`` and ``\"`` sequences denote the single and double + quote character, respectively:: + + >>> print('\' and \"') + ' and " + +.. _string-escape-oct: + +Octal character +^^^^^^^^^^^^^^^ + + The sequence :samp:`\\\\{ooo}` denotes a *character* with the octal (base 8) + value *ooo*:: + + >>> '\120' + 'P' + + Up to three octal digits (0 through 7) are accepted. + + In a bytes literal, *character* means a *byte* with the given value. + In a string literal, it means a Unicode character with the given value. .. versionchanged:: 3.11 Octal escapes with value larger than ``0o377`` (255) produce a @@ -770,42 +745,147 @@ Notes: .. versionchanged:: 3.12 Octal escapes with value larger than ``0o377`` (255) produce a - :exc:`SyntaxWarning`. In a future Python version they will be eventually - a :exc:`SyntaxError`. + :exc:`SyntaxWarning`. + In a future Python version they will raise a :exc:`SyntaxError`. + +.. _string-escape-hex: + +Hexadecimal character +^^^^^^^^^^^^^^^^^^^^^ + + The sequence :samp:`\\x{hh}` denotes a *character* with the hex (base 16) + value *hh*:: + + >>> '\x50' + 'P' + + Unlike in Standard C, exactly two hex digits are required. + + In a bytes literal, *character* means a *byte* with the given value. + In a string literal, it means a Unicode character with the given value. + +.. _string-escape-named: + +Named Unicode character +^^^^^^^^^^^^^^^^^^^^^^^ + + The sequence :samp:`\\N\\{{name}\\}` denotes a Unicode character + with the given *name*:: + + >>> '\N{LATIN CAPITAL LETTER P}' + 'P' + >>> '\N{SNAKE}' + '🐍' + + This sequence cannot appear in :ref:`bytes literals `. + + .. versionchanged:: 3.3 + Support for `name aliases `__ + has been added. -(3) - Unlike in Standard C, exactly two hex digits are required. +.. 
_string-escape-long-hex: -(4) - In a bytes literal, hexadecimal and octal escapes denote the byte with the - given value. In a string literal, these escapes denote a Unicode character - with the given value. +Hexadecimal Unicode characters +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -(5) - .. versionchanged:: 3.3 - Support for name aliases [#]_ has been added. + These sequences :samp:`\\u{xxxx}` and :samp:`\\U{xxxxxxxx}` denote the + Unicode character with the given hex (base 16) value. + Exactly four digits are required for ``\u``; exactly eight digits are + required for ``\U``. + The latter can encode any Unicode character. -(6) - Exactly four hex digits are required. + .. code-block:: python -(7) - Any Unicode character can be encoded this way. Exactly eight hex digits - are required. + >>> '\u1234' + 'ሴ' + >>> '\U0001f40d' + '🐍' + + These sequences cannot appear in :ref:`bytes literals `. .. index:: unrecognized escape sequence -Unlike Standard C, all unrecognized escape sequences are left in the string -unchanged, i.e., *the backslash is left in the result*. +Unrecognized escape sequences +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Unlike in Standard C, all unrecognized escape sequences are left in the string +unchanged, that is, *the backslash is left in the result*:: + + >>> print('\q') + \q + >>> list('\q') + ['\\', 'q'] + Note that for bytes literals, the escape sequences only recognized in string -literals fall into the category of unrecognized escapes. +literals (``\N...``, ``\u...``, ``\U...``) fall into the category of +unrecognized escapes. .. versionchanged:: 3.6 Unrecognized escape sequences produce a :exc:`DeprecationWarning`. .. versionchanged:: 3.12 - Unrecognized escape sequences produce a :exc:`SyntaxWarning`. In a future - Python version they will be eventually a :exc:`SyntaxError`. + Unrecognized escape sequences produce a :exc:`SyntaxWarning`. + In a future Python version they will raise a :exc:`SyntaxError`. + + +.. 
index:: + single: b'; bytes literal + single: b"; bytes literal + + +.. _bytes-literal: + +Bytes literals +-------------- + +:dfn:`Bytes literals` are always prefixed with ``'b'`` or ``'B'``; they produce an +instance of the :class:`bytes` type instead of the :class:`str` type. +They may only contain ASCII characters; bytes with a numeric value of 128 +or greater must be expressed with escape sequences. +Similarly, a zero byte must be expressed using an escape sequence. + + +.. index:: + single: r'; raw string literal + single: r"; raw string literal + +.. _raw-strings: + +Raw string literals +------------------- + +Both string and bytes literals may optionally be prefixed with a letter ``'r'`` +or ``'R'``; such constructs are called :dfn:`raw string literals` +and :dfn:`raw bytes literals` respectively and treat backslashes as +literal characters. +As a result, in raw string literals, :ref:`escape sequences ` +escapes are not treated specially. + +Even in a raw literal, quotes can be escaped with a backslash, but the +backslash remains in the result; for example, ``r"\""`` is a valid string +literal consisting of two characters: a backslash and a double quote; ``r"\"`` +is not a valid string literal (even a raw string cannot end in an odd number of +backslashes). Specifically, *a raw literal cannot end in a single backslash* +(since the backslash would escape the following quote character). Note also +that a single backslash followed by a newline is interpreted as those two +characters as part of the literal, *not* as a line continuation. + + +.. 
index:: physical line, escape sequence, Standard C, C + single: \ (backslash); escape sequence + single: \\; escape sequence + single: \a; escape sequence + single: \b; escape sequence + single: \f; escape sequence + single: \n; escape sequence + single: \r; escape sequence + single: \t; escape sequence + single: \v; escape sequence + single: \x; escape sequence + single: \N; escape sequence + single: \u; escape sequence + single: \U; escape sequence .. index:: @@ -815,6 +895,8 @@ literals fall into the category of unrecognized escapes. single: string; interpolated literal single: f-string single: fstring + single: f'; formatted string literal + single: f"; formatted string literal single: {} (curly brackets); in formatted string literal single: ! (exclamation); in formatted string literal single: : (colon); in formatted string literal @@ -1022,7 +1104,7 @@ actually an expression composed of the unary operator '``-``' and the literal .. _integers: Integer literals -^^^^^^^^^^^^^^^^ +---------------- Integer literals denote whole numbers. For example:: @@ -1095,7 +1177,7 @@ Formally, integer literals are described by the following lexical definitions: .. _floating: Floating-point literals -^^^^^^^^^^^^^^^^^^^^^^^ +----------------------- Floating-point (float) literals, such as ``3.14`` or ``1.5``, denote :ref:`approximations of real numbers `. @@ -1157,7 +1239,7 @@ lexical definitions: .. _imaginary: Imaginary literals -^^^^^^^^^^^^^^^^^^ +------------------ Python has :ref:`complex number ` objects, but no complex literals. @@ -1279,7 +1361,3 @@ occurrence outside string literals and comments is an unconditional error: $ ? ` - -.. rubric:: Footnotes - -.. 
[#] https://www.unicode.org/Public/16.0.0/ucd/NameAliases.txt From faf05a192ed7ec80ab26e803544ce9585b59d583 Mon Sep 17 00:00:00 2001 From: Petr Viktorin Date: Wed, 25 Jun 2025 16:26:37 +0200 Subject: [PATCH 05/17] Byte strings, raw strings; f-string stub --- Doc/reference/lexical_analysis.rst | 65 +++++++++++++++++++++--------- 1 file changed, 46 insertions(+), 19 deletions(-) diff --git a/Doc/reference/lexical_analysis.rst b/Doc/reference/lexical_analysis.rst index 36abfa31c093c9..2c6ae9a16d0d08 100644 --- a/Doc/reference/lexical_analysis.rst +++ b/Doc/reference/lexical_analysis.rst @@ -643,6 +643,21 @@ Note that as in all lexical definitions, whitespace is significant. In particular, the prefix (if any) must be immediately followed by the starting quote. +.. index:: physical line, escape sequence, Standard C, C + single: \ (backslash); escape sequence + single: \\; escape sequence + single: \a; escape sequence + single: \b; escape sequence + single: \f; escape sequence + single: \n; escape sequence + single: \r; escape sequence + single: \t; escape sequence + single: \v; escape sequence + single: \x; escape sequence + single: \N; escape sequence + single: \u; escape sequence + single: \U; escape sequence + .. _escape-sequences: Escape sequences @@ -842,8 +857,18 @@ Bytes literals :dfn:`Bytes literals` are always prefixed with ``'b'`` or ``'B'``; they produce an instance of the :class:`bytes` type instead of the :class:`str` type. They may only contain ASCII characters; bytes with a numeric value of 128 -or greater must be expressed with escape sequences. -Similarly, a zero byte must be expressed using an escape sequence. +or greater must be expressed with escape sequences (typically +:ref:`string-escape-hex` or :ref:`string-escape-oct`): + +.. 
code-block:: python + + >>> b'\x89PNG\r\n\x1a\n' + b'\x89PNG\r\n\x1a\n' + >>> list(b'\x89PNG\r\n\x1a\n') + [137, 80, 78, 71, 13, 10, 26, 10] + +Similarly, a zero byte must be expressed using an escape sequence (typically +``\0`` or ``\x00``). .. index:: @@ -860,7 +885,12 @@ or ``'R'``; such constructs are called :dfn:`raw string literals` and :dfn:`raw bytes literals` respectively and treat backslashes as literal characters. As a result, in raw string literals, :ref:`escape sequences ` -escapes are not treated specially. +are not treated specially: + +.. code-block:: python + + >>> r'\d{4}-\d{2}-\d{2}' + '\\d{4}-\\d{2}-\\d{2}' Even in a raw literal, quotes can be escaped with a backslash, but the backslash remains in the result; for example, ``r"\""`` is a valid string @@ -872,22 +902,6 @@ that a single backslash followed by a newline is interpreted as those two characters as part of the literal, *not* as a line continuation. -.. index:: physical line, escape sequence, Standard C, C - single: \ (backslash); escape sequence - single: \\; escape sequence - single: \a; escape sequence - single: \b; escape sequence - single: \f; escape sequence - single: \n; escape sequence - single: \r; escape sequence - single: \t; escape sequence - single: \v; escape sequence - single: \x; escape sequence - single: \N; escape sequence - single: \u; escape sequence - single: \U; escape sequence - - .. index:: single: formatted string literal single: interpolated string literal @@ -1067,6 +1081,19 @@ include expressions. See also :pep:`498` for the proposal that added formatted string literals, and :meth:`str.format`, which uses a related format string mechanism. +.. _t-strings: +.. _template-string-literals: + +t-strings +--------- + +A :dfn:`template string literal` or :dfn:`t-string` is a string literal that +is prefixed with ``'t'`` or ``'T'``. +These strings have internal structure similar to :ref:`f-strings`, +but are evaluated as Template objects instead of strings. + +.. 
versionadded:: 3.14 + .. _numbers: From 687fe5830318ca89a5541703bae3e62b3c8a7b5e Mon Sep 17 00:00:00 2001 From: Petr Viktorin Date: Wed, 25 Jun 2025 16:38:09 +0200 Subject: [PATCH 06/17] Remove outdated comment --- Doc/reference/lexical_analysis.rst | 4 ---- 1 file changed, 4 deletions(-) diff --git a/Doc/reference/lexical_analysis.rst b/Doc/reference/lexical_analysis.rst index 2c6ae9a16d0d08..e3d0bab8942ced 100644 --- a/Doc/reference/lexical_analysis.rst +++ b/Doc/reference/lexical_analysis.rst @@ -515,10 +515,6 @@ String and Bytes literals String literals are text enclosed in single quotes (``'``) or double quotes (``"``). For example: -.. This is Python code, but we turn off highlighting because as of this - writing, highlighted strings don't look good when there's no code - surrounding them. - .. code-block:: python "spam" From 9f9d29ccab8a5c25aa9433a90bd03d2a5521c36b Mon Sep 17 00:00:00 2001 From: Petr Viktorin Date: Wed, 25 Jun 2025 16:50:18 +0200 Subject: [PATCH 07/17] Fix ReST errors --- Doc/reference/expressions.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/Doc/reference/expressions.rst b/Doc/reference/expressions.rst index 743d43b1c9c1b1..c1f046388c3d1b 100644 --- a/Doc/reference/expressions.rst +++ b/Doc/reference/expressions.rst @@ -160,7 +160,7 @@ value. .. _string-concatenation: String literal concatenation -............................ +^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Multiple adjacent string or bytes literals (delimited by whitespace), possibly using different quoting conventions, are allowed, and their meaning is the same @@ -172,7 +172,7 @@ Formally: .. grammar-snippet:: :group: python-grammar - strings: ( `STRING` | `fstring` | `tstring`)+ + strings: ( `STRING` | fstring | tstring)+ Note that this feature is defined at the syntactical level, so it only works with literals. 
From 1e0c84a0357207348d66e16dc51590cf6169dcd9 Mon Sep 17 00:00:00 2001 From: Petr Viktorin Date: Wed, 25 Jun 2025 18:04:08 +0200 Subject: [PATCH 08/17] TMP --- Doc/reference/lexical_analysis.rst | 72 ++++++++++++++++++++++++------ 1 file changed, 58 insertions(+), 14 deletions(-) diff --git a/Doc/reference/lexical_analysis.rst b/Doc/reference/lexical_analysis.rst index e3d0bab8942ced..a9aeee965ad257 100644 --- a/Doc/reference/lexical_analysis.rst +++ b/Doc/reference/lexical_analysis.rst @@ -277,7 +277,15 @@ Whitespace between tokens Except at the beginning of a logical line or in string literals, the whitespace characters space, tab and formfeed can be used interchangeably to separate -tokens. Whitespace is needed between two tokens only if their concatenation +tokens: + +.. grammar-snippet:: + :group: python-grammar + + whitespace: ' ' | tab | formfeed + + +Whitespace is needed between two tokens only if their concatenation could otherwise be interpreted as a different token. For example, ``ab`` is one token, but ``a b`` is two tokens. However, ``+a`` and ``+ a`` both produce two tokens, ``+`` and ``a``, as ``+a`` is not a valid token. @@ -921,24 +929,60 @@ f-strings .. versionadded:: 3.6 A :dfn:`formatted string literal` or :dfn:`f-string` is a string literal -that is prefixed with ``'f'`` or ``'F'``. These strings may contain -replacement fields, which are expressions delimited by curly braces ``{}``. -While other string literals always have a constant value, formatted strings -are really expressions evaluated at run time. +that is prefixed with ``'f'`` or ``'F'``. +Unlike other string literals, f-strings do not have a constant value. +They may contain *replacement fields*, which are expressions delimited by +curly braces ``{}``, which are evaluated at run time. +For example:: + + >>> f'One plus one is {1 + 1}.' + 'One plus one is 2.' + Escape sequences are decoded like in ordinary string literals (except when a literal is also marked as a raw string). 
After decoding, the grammar for the contents of the string is: -.. productionlist:: python-grammar - f_string: (`literal_char` | "{{" | "}}" | `replacement_field`)* - replacement_field: "{" `f_expression` ["="] ["!" `conversion`] [":" `format_spec`] "}" - f_expression: (`conditional_expression` | "*" `or_expr`) - : ("," `conditional_expression` | "," "*" `or_expr`)* [","] - : | `yield_expression` - conversion: "s" | "r" | "a" - format_spec: (`literal_char` | `replacement_field`)* - literal_char: +.. grammar-snippet:: python-grammar + :group: python-grammar + + FSTRING_START: `fstringprefix` ("'" | '"' | "'''" | '"""') + FSTRING_MIDDLE: + | + | `stringescapeseq` + | "{{" + | "}}" + | + FSTRING_END: ("'" | '"' | "'''" | '"""') + fstringprefix: <("f" | "fr" | "rf"), case-insensitive> + f_debug_specifier: whitespace* '=' whitespace* + +.. grammar-snippet:: python-grammar + :group: python-grammar + + fstring: `FSTRING_START` `fstring_middle`* `FSTRING_END` + fstring_middle: + | `fstring_replacement_field` + | `FSTRING_MIDDLE` + fstring_replacement_field: + | '{' `f_expression` [`f_debug_specifier`] [`fstring_conversion`] + [`fstring_full_format_spec`] '}' + fstring_conversion: + | "!" 
("s" | "r" | "a") + fstring_full_format_spec: + | ':' `fstring_format_spec`* + fstring_format_spec: + | `FSTRING_MIDDLE` + | `fstring_replacement_field` + f_expression: + | ','.(`conditional_expression` | "*" `or_expr`)+ [","] + | `yield_expression` + + +--------------- + + + The parts of the string outside curly braces are treated literally, except that any doubled curly braces ``'{{'`` or ``'}}'`` are replaced From e7b57b582b8296b12b5e8a86f008b58b5a00590d Mon Sep 17 00:00:00 2001 From: Petr Viktorin Date: Wed, 2 Jul 2025 16:56:27 +0200 Subject: [PATCH 09/17] Continue with f-strings --- Doc/library/stdtypes.rst | 2 + Doc/reference/lexical_analysis.rst | 90 ++++++++++++++++++------------ 2 files changed, 56 insertions(+), 36 deletions(-) diff --git a/Doc/library/stdtypes.rst b/Doc/library/stdtypes.rst index 394c302fd354b9..6976838eceb03e 100644 --- a/Doc/library/stdtypes.rst +++ b/Doc/library/stdtypes.rst @@ -2526,6 +2526,8 @@ expression support in the :mod:`re` module). single: : (colon); in formatted string literal single: = (equals); for help in debugging using string literals +.. _stdtypes-fstrings: + Formatted String Literals (f-strings) ------------------------------------- diff --git a/Doc/reference/lexical_analysis.rst b/Doc/reference/lexical_analysis.rst index a9aeee965ad257..82b0f711afd071 100644 --- a/Doc/reference/lexical_analysis.rst +++ b/Doc/reference/lexical_analysis.rst @@ -931,17 +931,62 @@ f-strings A :dfn:`formatted string literal` or :dfn:`f-string` is a string literal that is prefixed with ``'f'`` or ``'F'``. Unlike other string literals, f-strings do not have a constant value. -They may contain *replacement fields*, which are expressions delimited by -curly braces ``{}``, which are evaluated at run time. +They may contain *replacement fields* delimited by curly braces ``{}``. +Replacement fields contain expressions which are evaluated at run time. For example:: >>> f'One plus one is {1 + 1}.' 'One plus one is 2.' 
+The parts of the string outside curly braces are treated literally, +except that any doubled curly braces ``'{{'`` or ``'}}'`` are replaced +with the corresponding single curly brace:: + + >>> print(f'{{...}}') + {...} Escape sequences are decoded like in ordinary string literals (except when -a literal is also marked as a raw string). After decoding, the grammar -for the contents of the string is: +a literal is also marked as a raw string):: + + >>> name = 'Galahad' + >>> favorite_color = 'blue' + >>> print(f'{name}:\t{favorite_color}') + Galahad: blue + >>> print(rf'C:\Users\{name}') + C:\Users\Galahad + +In addition to the expression, replacement fields may contain: + +* a *debug specifier* -- an equal sign (``=``); +* a *conversion specifier* -- ``!s``, ``!r`` or ``!a``; and/or +* a *format specifier* prefixed with a colon (``:``). + +See :ref:`stdtypes-fstrings` for how these specifiers are interpreted. + +Note that whitespace on both sides of a debug specifier (``=``) is +significant --- it is retained in the result:: + + >>> print(f'{name=}') + name='Galahad' + >>> print(f'{name = }') + name = 'Galahad' + +Expressions in formatted string literals are treated like regular +Python expressions surrounded by parentheses, with a few exceptions. +An empty expression is not allowed, and both :keyword:`lambda` and +assignment expressions ``:=`` must be surrounded by explicit parentheses. +Each expression is evaluated in the context where the formatted string literal +appears, in order from left to right. Replacement expressions can contain +newlines in both single-quoted and triple-quoted f-strings and they can contain +comments. Everything that comes after a ``#`` inside a replacement field +is a comment (even closing braces and quotes). In that case, replacement fields +must be closed in a different line. + +.. code-block:: text + + >>> f"abc{a # This is a comment }" + ... + 3}" + 'abc5' .. 
grammar-snippet:: python-grammar :group: python-grammar @@ -979,38 +1024,6 @@ for the contents of the string is: | `yield_expression` ---------------- - - - - -The parts of the string outside curly braces are treated literally, -except that any doubled curly braces ``'{{'`` or ``'}}'`` are replaced -with the corresponding single curly brace. A single opening curly -bracket ``'{'`` marks a replacement field, which starts with a -Python expression. To display both the expression text and its value after -evaluation, (useful in debugging), an equal sign ``'='`` may be added after the -expression. A conversion field, introduced by an exclamation point ``'!'`` may -follow. A format specifier may also be appended, introduced by a colon ``':'``. -A replacement field ends with a closing curly bracket ``'}'``. - -Expressions in formatted string literals are treated like regular -Python expressions surrounded by parentheses, with a few exceptions. -An empty expression is not allowed, and both :keyword:`lambda` and -assignment expressions ``:=`` must be surrounded by explicit parentheses. -Each expression is evaluated in the context where the formatted string literal -appears, in order from left to right. Replacement expressions can contain -newlines in both single-quoted and triple-quoted f-strings and they can contain -comments. Everything that comes after a ``#`` inside a replacement field -is a comment (even closing braces and quotes). In that case, replacement fields -must be closed in a different line. - -.. code-block:: text - - >>> f"abc{a # This is a comment }" - ... + 3}" - 'abc5' - .. versionchanged:: 3.7 Prior to Python 3.7, an :keyword:`await` expression and comprehensions containing an :keyword:`async for` clause were illegal in the expressions @@ -1020,6 +1033,11 @@ must be closed in a different line. Prior to Python 3.12, comments were not allowed inside f-string replacement fields. 
+--------------- + + + + When the equal sign ``'='`` is provided, the output will have the expression text, the ``'='`` and the evaluated value. Spaces after the opening brace ``'{'``, within the expression and after the ``'='`` are all retained in the From d593940fb290105481b5b8dfc01170c08d455e82 Mon Sep 17 00:00:00 2001 From: Petr Viktorin Date: Wed, 9 Jul 2025 17:38:34 +0200 Subject: [PATCH 10/17] Work on the f-string semantics --- Doc/reference/lexical_analysis.rst | 132 +++++++++++++++++++++++------ 1 file changed, 106 insertions(+), 26 deletions(-) diff --git a/Doc/reference/lexical_analysis.rst b/Doc/reference/lexical_analysis.rst index 82b0f711afd071..954ebbed9ba1be 100644 --- a/Doc/reference/lexical_analysis.rst +++ b/Doc/reference/lexical_analysis.rst @@ -935,58 +935,138 @@ They may contain *replacement fields* delimited by curly braces ``{}``. Replacement fields contain expressions which are evaluated at run time. For example:: - >>> f'One plus one is {1 + 1}.' - 'One plus one is 2.' + >>> who = 'nobody' + >>> nationality = 'Spanish' + >>> f'{who.title()} expects the {nationality} Inquisition!' + 'Nobody expects the Spanish Inquisition!' -The parts of the string outside curly braces are treated literally, -except that any doubled curly braces ``'{{'`` or ``'}}'`` are replaced -with the corresponding single curly brace:: +Any doubled curly braces (``{{`` or ``}}``) outside replacement fields +are replaced with the corresponding single curly brace:: >>> print(f'{{...}}') {...} -Escape sequences are decoded like in ordinary string literals (except when -a literal is also marked as a raw string):: +Other characters outside replacement fields are treated like in ordinary +string literals. 
+This means that escape sequences are decoded (except when a literal is +also marked as a raw string), and newlines are possible in triple-quoted +f-strings:: >>> name = 'Galahad' >>> favorite_color = 'blue' >>> print(f'{name}:\t{favorite_color}') Galahad: blue - >>> print(rf'C:\Users\{name}') + >>> print(rf"C:\Users\{name}") C:\Users\Galahad + >>> print(f'''Three shall be the number of the counting + ... and the number of the counting shall be three.''') + Three shall be the number of the counting + and the number of the counting shall be three. -In addition to the expression, replacement fields may contain: +Expressions in formatted string literals are treated like regular +Python expressions. +Each expression is evaluated in the context where the formatted string literal +appears, in order from left to right. +An empty expression is not allowed, and both :keyword:`lambda` and +assignment expressions ``:=`` must be surrounded by explicit parentheses:: + + >>> f'{(half := 1/2)}, {half * 42}' + '0.5, 21.0' + +Replacement expressions can contain newlines in both single-quoted and +triple-quoted f-strings and they can contain comments. +Everything that comes after a ``#`` inside a replacement field +is a comment (even closing braces and quotes). +This means that replacement fields with comments must be closed in a +different line: + +.. code-block:: text + + >>> a = 2 + >>> f"abc{a # This comment }" continues until the end of the line + ... + 3}" + 'abc5' + +After the expression, replacement fields may optionally contain: * a *debug specifier* -- an equal sign (``=``); * a *conversion specifier* -- ``!s``, ``!r`` or ``!a``; and/or * a *format specifier* prefixed with a colon (``:``). -See :ref:`stdtypes-fstrings` for how these specifiers are interpreted. 
+Debug specifier +^^^^^^^^^^^^^^^ -Note that whitespace on both sides of a debug specifier (``=``) is -significant --- it is retained in the result:: +If a debug specifier -- an equal sign (``=``) -- appears after the replacement +field expression, the resulting f-string will contain the expression's source, +the equal sign, and the value of the expression. +This is often useful for debugging:: >>> print(f'{name=}') name='Galahad' + +Whitespace on both sides of the equal sign is significant --- it is retained +in the result:: + >>> print(f'{name = }') name = 'Galahad' -Expressions in formatted string literals are treated like regular -Python expressions surrounded by parentheses, with a few exceptions. -An empty expression is not allowed, and both :keyword:`lambda` and -assignment expressions ``:=`` must be surrounded by explicit parentheses. -Each expression is evaluated in the context where the formatted string literal -appears, in order from left to right. Replacement expressions can contain -newlines in both single-quoted and triple-quoted f-strings and they can contain -comments. Everything that comes after a ``#`` inside a replacement field -is a comment (even closing braces and quotes). In that case, replacement fields -must be closed in a different line. -.. code-block:: text +Conversion specifier +^^^^^^^^^^^^^^^^^^^^ - >>> f"abc{a # This is a comment }" - ... 
+ 3}" - 'abc5' +By default, the value of a replacement field expression is converted to a +string using :func:`str`:: + + >>> from fractions import Fraction + >>> one_third = Fraction(1, 3) + >>> f'{one_third}' + '1/3' + +When a debug specifier but no format specifier is used, the default conversion +instead uses :func:`repr`:: + + >>> f'{one_third = }' + 'one_third = Fraction(1, 3)' + +The conversion can be specified explicitly using one of these specifiers: + +* ``!s`` for :func:`str` +* ``!r`` for :func:`repr` +* ``!a`` for :func:`ascii` + +For example:: + + >>> f'{one_third!r} is {one_third!s}' + 'Fraction(1, 3) is 1/3' + + >>> string = "¡kočka 😸!" + >>> f'{string = !a}' + "string = '\\xa1ko\\u010dka \\U0001f638!'" + + +Format specifier +^^^^^^^^^^^^^^^^ + +After the expression has been evaluated, and possibly converted using an +explicit conversion specifier, it is formatted using the :func:`format` function. +If the replacement field includes a *format specifier*, an arbitrary string +introduced by a colon (``:``), the specifier is passed to :func:`!format` +as the second argument. +The result of :func:`!format` is then used as the final value for the +replacement field. For example:: + + >>> f'{one_third:.6f}' + '0.333333' + >>> f'{one_third:_^+10}' + '___+1/3___' + >>> f'{one_third!r:_^20}' + '___Fraction(1, 3)___' + >>> f'{one_third = :~>10}~' + 'one_third = ~~~~~~~1/3~' + + +Formal grammar +^^^^^^^^^^^^^^ .. 
grammar-snippet:: python-grammar :group: python-grammar From 0d8a91789283892d6b8248034cf9e2f394f3b1a8 Mon Sep 17 00:00:00 2001 From: Petr Viktorin Date: Wed, 9 Jul 2025 18:08:57 +0200 Subject: [PATCH 11/17] Work on f-strings --- Doc/library/stdtypes.rst | 149 ++++++++++++----------------- Doc/reference/lexical_analysis.rst | 132 ++++++++----------------- 2 files changed, 101 insertions(+), 180 deletions(-) diff --git a/Doc/library/stdtypes.rst b/Doc/library/stdtypes.rst index 6976838eceb03e..59fbb07ccf512c 100644 --- a/Doc/library/stdtypes.rst +++ b/Doc/library/stdtypes.rst @@ -2536,123 +2536,98 @@ Formatted String Literals (f-strings) The :keyword:`await` and :keyword:`async for` can be used in expressions within f-strings. .. versionchanged:: 3.8 - Added the debugging operator (``=``) + Added the debug specifier (``=``) .. versionchanged:: 3.12 Many restrictions on expressions within f-strings have been removed. Notably, nested strings, comments, and backslashes are now permitted. An :dfn:`f-string` (formally a :dfn:`formatted string literal`) is a string literal that is prefixed with ``f`` or ``F``. -This type of string literal allows embedding arbitrary Python expressions -within *replacement fields*, which are delimited by curly brackets (``{}``). -These expressions are evaluated at runtime, similarly to :meth:`str.format`, -and are converted into regular :class:`str` objects. -For example: - -.. doctest:: - - >>> who = 'nobody' - >>> nationality = 'Spanish' - >>> f'{who.title()} expects the {nationality} Inquisition!' - 'Nobody expects the Spanish Inquisition!' - -It is also possible to use a multi line f-string: - -.. doctest:: - - >>> f'''This is a string - ... on two lines''' - 'This is a string\non two lines' +This type of string literal allows embedding the results of arbitrary Python +expressions within *replacement fields*, which are delimited by curly +brackets (``{}``). 
+Each replacement field must contain an expression, optionally followed by: -A single opening curly bracket, ``'{'``, marks a *replacement field* that -can contain any Python expression: +* a *debug specifier* -- an equal sign (``=``); +* a *conversion specifier* -- ``!s``, ``!r`` or ``!a``; and/or +* a *format specifier* prefixed with a colon (``:``). -.. doctest:: +See the :ref:`Lexical Analysis section on f-strings ` for details +on the syntax of these fields. - >>> nationality = 'Spanish' - >>> f'The {nationality} Inquisition!' - 'The Spanish Inquisition!' +Debug specifier +^^^^^^^^^^^^^^^ -To include a literal ``{`` or ``}``, use a double bracket: +.. versionadded:: 3.8 -.. doctest:: +If a debug specifier -- an equal sign (``=``) -- appears after the replacement +field expression, the resulting f-string will contain the expression's source, +the equal sign, and the value of the expression. +This is often useful for debugging:: - >>> x = 42 - >>> f'{{x}} is {x}' - '{x} is 42' + >>> print(f'{name=}') + name='Galahad' -Functions can also be used, and :ref:`format specifiers `: +Whitespace on both sides of the equal sign is significant --- it is retained +in the result:: -.. doctest:: + >>> print(f'{name = }') + name = 'Galahad' - >>> from math import sqrt - >>> f'√2 \N{ALMOST EQUAL TO} {sqrt(2):.5f}' - '√2 ≈ 1.41421' -Any non-string expression is converted using :func:`str`, by default: +Conversion specifier +^^^^^^^^^^^^^^^^^^^^ -.. 
doctest:: +By default, the value of a replacement field expression is converted to +string using :func:`str`:: >>> from fractions import Fraction - >>> f'{Fraction(1, 3)}' + >>> one_third = Fraction(1, 3) + >>> f'{one_third}' '1/3' -To use an explicit conversion, use the ``!`` (exclamation mark) operator, -followed by any of the valid formats, which are: +When a debug specifier but no format specifier is used, the default conversion +instead uses :func:`repr`:: -========== ============== -Conversion Meaning -========== ============== -``!a`` :func:`ascii` -``!r`` :func:`repr` -``!s`` :func:`str` -========== ============== + >>> f'{one_third = }' + 'one_third = Fraction(1, 3)' -For example: +The conversion can be specified explicitly using one of these specifiers: -.. doctest:: +* ``!s`` for :func:`str` +* ``!r`` for :func:`repr` +* ``!a`` for :func:`ascii` - >>> from fractions import Fraction - >>> f'{Fraction(1, 3)!s}' - '1/3' - >>> f'{Fraction(1, 3)!r}' - 'Fraction(1, 3)' - >>> question = '¿Dónde está el Presidente?' - >>> print(f'{question!a}') - '\xbfD\xf3nde est\xe1 el Presidente?' - -While debugging it may be helpful to see both the expression and its value, -by using the equals sign (``=``) after the expression. -This preserves spaces within the brackets, and can be used with a converter. -By default, the debugging operator uses the :func:`repr` (``!r``) conversion. -For example: +For example:: -.. doctest:: + >>> f'{one_third!r} is {one_third!s}' + 'Fraction(1, 3) is 1/3' - >>> from fractions import Fraction - >>> calculation = Fraction(1, 3) - >>> f'{calculation=}' - 'calculation=Fraction(1, 3)' - >>> f'{calculation = }' - 'calculation = Fraction(1, 3)' - >>> f'{calculation = !s}' - 'calculation = 1/3' - -Once the output has been evaluated, it can be formatted using a -:ref:`format specifier ` following a colon (``':'``). 
-After the expression has been evaluated, and possibly converted to a string, -the :meth:`!__format__` method of the result is called with the format specifier, -or the empty string if no format specifier is given. -The formatted result is then used as the final value for the replacement field. -For example: + >>> string = "¡kočka 😸!" + >>> f'{string = !a}' + "string = '\\xa1ko\\u010dka \\U0001f638!'" -.. doctest:: + +Format specifier +^^^^^^^^^^^^^^^^ + +After the expression has been evaluated, and possibly converted using an +explicit conversion specifier, it is formatted using the :func:`format` function. +If the replacement field includes a *format specifier* introduced by a colon +(``:``), the specifier is passed to :func:`!format` as the second argument. +The result of :func:`!format` is then used as the final value for the +replacement field. For example:: >>> from fractions import Fraction - >>> f'{Fraction(1, 7):.6f}' - '0.142857' - >>> f'{Fraction(1, 7):_^+10}' - '___+1/7___' + >>> one_third = Fraction(1, 3) + >>> f'{one_third:.6f}' + '0.333333' + >>> f'{one_third:_^+10}' + '___+1/3___' + >>> f'{one_third!r:_^20}' + '___Fraction(1, 3)___' + >>> f'{one_third = :~>10}~' + 'one_third = ~~~~~~~1/3~' .. _old-string-formatting: diff --git a/Doc/reference/lexical_analysis.rst b/Doc/reference/lexical_analysis.rst index 954ebbed9ba1be..1d50eeca0b92e8 100644 --- a/Doc/reference/lexical_analysis.rst +++ b/Doc/reference/lexical_analysis.rst @@ -927,6 +927,14 @@ f-strings --------- .. versionadded:: 3.6 +.. versionchanged:: 3.7 + The :keyword:`await` and :keyword:`async for` can be used in expressions + within f-strings. +.. versionchanged:: 3.8 + Added the debug specifier (``=``) +.. versionchanged:: 3.12 + Many restrictions on expressions within f-strings have been removed. + Notably, nested strings, comments, and backslashes are now permitted. A :dfn:`formatted string literal` or :dfn:`f-string` is a string literal that is prefixed with ``'f'`` or ``'F'``. 
@@ -989,80 +997,49 @@ different line: After the expression, replacement fields may optionally contain: -* a *debug specifier* -- an equal sign (``=``); +* a *debug specifier* -- an equal sign (``=``), optionally surrounded by + whitespace on one or both sides; * a *conversion specifier* -- ``!s``, ``!r`` or ``!a``; and/or * a *format specifier* prefixed with a colon (``:``). -Debug specifier -^^^^^^^^^^^^^^^ - -If a debug specifier -- an equal sign (``=``) -- appears after the replacement -field expression, the resulting f-string will contain the expression's source, -the equal sign, and the value of the expression. -This is often useful for debugging:: - - >>> print(f'{name=}') - name='Galahad' - -Whitespace on both sides of the equal sign is significant --- it is retained -in the result:: - - >>> print(f'{name = }') - name = 'Galahad' +See the :ref:`Standard Library section on f-strings ` +for details on how these fields are evaluated. +As that section explains, *format specifiers* are passed as the second argument +to the :func:`format` function to format a replacement field value. 
+For example, they can be used to specify a field width and padding characters +using the :ref:`Format Specification Mini-Language `:: -Conversion specifier -^^^^^^^^^^^^^^^^^^^^ + >>> color = 'blue' + >>> f'{color:-^20s}' + '--------blue--------' -By default, the value of a replacement field expression is converted to -string using :func:`str`:: +Top-level format specifiers may include nested replacement fields:: - >>> from fractions import Fraction - >>> one_third = Fraction(1, 3) - >>> f'{one_third}' - '1/3' - -When a debug specifier but no format specifier is used, the default conversion -instead uses :func:`repr`:: - - >>> f'{one_third = }' - 'one_third = Fraction(1, 3)' - -The conversion can be specified explicitly using one of these specifiers: - -* ``!s`` for :func:`str` -* ``!r`` for :func:`repr` -* ``!a`` for :func:`ascii` - -For example:: + >>> field_size = 20 + >>> f'{color:-^{field_size}s}' + '--------blue--------' - >>> f'{one_third!r} is {one_third!s}' - 'Fraction(1, 3) is 1/3' +These nested fields may include their own conversion fields and +:ref:`format specifiers `:: - >>> string = "¡kočka 😸!" - >>> f'{string = !a}' - "string = '\\xa1ko\\u010dka \\U0001f638!'" + >>> number = 3 + >>> f'{number:{field_size}}' + ' 3' + >>> f'{number:{field_size:05}}' + '00000000000000000003' +However, these nested fields may not include more deeply nested replacement +fields. -Format specifier -^^^^^^^^^^^^^^^^ +Formatted string literals may be concatenated, but replacement fields +cannot be split across literals. +For example, the following is a single f-string:: -After the expression has been evaluated, and possibly converted using an -explicit conversion specifier, it is formatted using the :func:`format` function. -If the replacement field includes a *format specifier*, an arbitrary string -introduced by a colon (``:``), the specifier is passed to :func:`!format` -as the second argument. 
-The result of :func:`!format` is then used as the final value for the -replacement field. For example:: + >>> f'{' '}' + ' ' - >>> f'{one_third:.6f}' - '0.333333' - >>> f'{one_third:_^+10}' - '___+1/3___' - >>> >>> f'{one_third!r:_^20}' - '___Fraction(1, 3)___' - >>> f'{one_third = :~>10}~' - 'one_third = ~~~~~~~1/3~' +It is equivalent to ``f'{" "}'``, rather than ``f'{' "}"``. Formal grammar @@ -1116,38 +1093,6 @@ Formal grammar --------------- - - -When the equal sign ``'='`` is provided, the output will have the expression -text, the ``'='`` and the evaluated value. Spaces after the opening brace -``'{'``, within the expression and after the ``'='`` are all retained in the -output. By default, the ``'='`` causes the :func:`repr` of the expression to be -provided, unless there is a format specified. When a format is specified it -defaults to the :func:`str` of the expression unless a conversion ``'!r'`` is -declared. - -.. versionadded:: 3.8 - The equal sign ``'='``. - -If a conversion is specified, the result of evaluating the expression -is converted before formatting. Conversion ``'!s'`` calls :func:`str` on -the result, ``'!r'`` calls :func:`repr`, and ``'!a'`` calls :func:`ascii`. - -The result is then formatted using the :func:`format` protocol. The -format specifier is passed to the :meth:`~object.__format__` method of the -expression or conversion result. An empty string is passed when the -format specifier is omitted. The formatted result is then included in -the final value of the whole string. - -Top-level format specifiers may include nested replacement fields. These nested -fields may include their own conversion fields and :ref:`format specifiers -`, but may not include more deeply nested replacement fields. The -:ref:`format specifier mini-language ` is the same as that used by -the :meth:`str.format` method. - -Formatted string literals may be concatenated, but replacement fields -cannot be split across literals. 
- Some examples of formatted string literals:: >>> name = "Fred" @@ -1219,6 +1164,7 @@ include expressions. See also :pep:`498` for the proposal that added formatted string literals, and :meth:`str.format`, which uses a related format string mechanism. + .. _t-strings: .. _template-string-literals: From 5fdb129c28ef1236bb0424748dfac6796d76a4e1 Mon Sep 17 00:00:00 2001 From: Petr Viktorin Date: Wed, 23 Jul 2025 17:22:40 +0200 Subject: [PATCH 12/17] Details & start on the formal grammar Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com> Co-authored-by: Blaise Pabon --- Doc/library/stdtypes.rst | 25 ++-- Doc/reference/lexical_analysis.rst | 189 ++++++++++++----------------- 2 files changed, 97 insertions(+), 117 deletions(-) diff --git a/Doc/library/stdtypes.rst b/Doc/library/stdtypes.rst index 63526d0165fe33..dc601dce294243 100644 --- a/Doc/library/stdtypes.rst +++ b/Doc/library/stdtypes.rst @@ -2567,14 +2567,15 @@ field expression, the resulting f-string will contain the expression's source, the equal sign, and the value of the expression. This is often useful for debugging:: - >>> print(f'{name=}') - name='Galahad' + >>> number = 14.3 + >>> 'number=14.3' + number=14.3 -Whitespace on both sides of the equal sign is significant --- it is retained -in the result:: +Whitespace before, inside and after the expression, as well as whitespace +after the equal sign, is significant --- it is retained in the result:: - >>> print(f'{name = }') - name = 'Galahad' + >>> f'{ number - 4 = }' + ' number - 4 = 10.3' Conversion specifier @@ -2602,10 +2603,18 @@ The conversion can be specified explicitly using one of these specifiers: For example:: - >>> f'{one_third!r} is {one_third!s}' - 'Fraction(1, 3) is 1/3' + >>> str(one_third) + '1/3' + >>> repr(one_third) + 'Fraction(1, 3)' + + >>> f'{one_third!s} is {one_third!r}' + '1/3 is Fraction(1, 3)' >>> string = "¡kočka 😸!" 
+ >>> ascii(string) + "'\\xa1ko\\u010dka \\U0001f638!'" + >>> f'{string = !a}' "string = '\\xa1ko\\u010dka \\U0001f638!'" diff --git a/Doc/reference/lexical_analysis.rst b/Doc/reference/lexical_analysis.rst index c4dc34f30d3cdf..88b5295a590ab8 100644 --- a/Doc/reference/lexical_analysis.rst +++ b/Doc/reference/lexical_analysis.rst @@ -981,6 +981,35 @@ assignment expressions ``:=`` must be surrounded by explicit parentheses:: >>> f'{(half := 1/2)}, {half * 42}' '0.5, 21.0' +Reusing the outer f-string quoting type inside a replacement field is +permitted:: + + >>> a = dict(x=2) + >>> f"abc {a["x"]} def" + 'abc 2 def' + +Backslashes are also allowed in replacement fields and are evaluated the same +way as in any other context:: + + >>> a = ["a", "b", "c"] + >>> print(f"List a contains:\n{"\n".join(a)}") + List a contains: + a + b + c + +It is possible to nest f-strings:: + + >>> name = 'world' + >>> f'Repeated:{f' hello {name}' * 3}' + 'Repeated: hello world hello world hello world' + +Portable Python programs should not use more than 5 levels of nesting. + +.. impl-detail:: + + CPython does not limit nesting of f-strings. + Replacement expressions can contain newlines in both single-quoted and triple-quoted f-strings and they can contain comments. Everything that comes after a ``#`` inside a replacement field @@ -1010,15 +1039,16 @@ to the :func:`format` function to format a replacement field value. 
For example, they can be used to specify a field width and padding characters using the :ref:`Format Specification Mini-Language `:: - >>> color = 'blue' - >>> f'{color:-^20s}' - '--------blue--------' + >>> number = 14.3 + >>> f'{number:20.7f}' + ' 14.3000000' Top-level format specifiers may include nested replacement fields:: >>> field_size = 20 - >>> f'{color:-^{field_size}s}' - '--------blue--------' + >>> precision = 7 + >>> f'{number:{field_size}.{precision}f}' + ' 14.3000000' These nested fields may include their own conversion fields and :ref:`format specifiers `:: @@ -1032,40 +1062,65 @@ These nested fields may include their own conversion fields and However, these nested fields may not include more deeply nested replacement fields. -Formatted string literals may be concatenated, but replacement fields -cannot be split across literals. -For example, the following is a single f-string:: +Formatted string literals cannot be used as :term:`docstrings `, +even if they do not include expressions:: - >>> f'{' '}' - ' ' + >>> def foo(): + ... f"Not a docstring" + ... + >>> print(foo.__doc__) + None -It is equivalent to ``f'{" "}'``, rather than ``f'{' "}"``. +.. seealso:: + * :pep:`498` -- Literal String Interpolation + * :pep:`701` -- Syntactic formalization of f-strings + * :meth:`str.format`, which uses a related format string mechanism. -Formal grammar -^^^^^^^^^^^^^^ -.. 
grammar-snippet:: python-grammar - :group: python-grammar +Formal grammar for f-strings +^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - FSTRING_START: `fstringprefix` ("'" | '"' | "'''" | '"""') - FSTRING_MIDDLE: - | - | `stringescapeseq` - | "{{" - | "}}" - | - FSTRING_END: ("'" | '"' | "'''" | '"""') - fstringprefix: <("f" | "fr" | "rf"), case-insensitive> - f_debug_specifier: whitespace* '=' whitespace* +F-strings are handled partly by the :term:`lexical analyzer`, which produces the +tokens :py:data:`~token.FSTRING_START`, :py:data:`~token.FSTRING_MIDDLE` +and :py:data:`~token.FSTRING_END`, and the parser, which handles expressions +in the replacement field. +The exact way the work is split is a CPython implementation detail. + +Correspondingly, the f-string grammar is a mix of +:ref:`lexical and syntactic definitions `. + +Whitespace is significant in these situations: + +* There may be no whitespace in :py:data:`~token.FSTRING_START` (between + the prefix and quote). +* Whitespace in :py:data:`~token.FSTRING_MIDDLE` is part of the literal + string contents. +* In ``fstring_replacement_field``, if ``f_debug_specifier`` is present, + all whitespace after the opening brace up to the ``!`` of + ``fstring_conversion``, ``:`` of ``fstring_full_format_spec``, + or the closing brace, is retained as part of the expression. .. 
grammar-snippet:: python-grammar :group: python-grammar fstring: `FSTRING_START` `fstring_middle`* `FSTRING_END` + + FSTRING_START: `fstringprefix` ("'" | '"' | "'''" | '"""') + FSTRING_END: `f_quote` + fstringprefix: <("f" | "fr" | "rf"), case-insensitive> + f_debug_specifier: '=' + f_quote: + fstring_middle: | `fstring_replacement_field` | `FSTRING_MIDDLE` + FSTRING_MIDDLE: + | (!"\" !`newline` !'{' !'}' !`f_quote`) `source_character` + | `stringescapeseq` + | "{{" + | "}}" + | fstring_replacement_field: | '{' `f_expression` [`f_debug_specifier`] [`fstring_conversion`] [`fstring_full_format_spec`] '}' @@ -1081,90 +1136,6 @@ Formal grammar | `yield_expression` -.. versionchanged:: 3.7 - Prior to Python 3.7, an :keyword:`await` expression and comprehensions - containing an :keyword:`async for` clause were illegal in the expressions - in formatted string literals due to a problem with the implementation. - -.. versionchanged:: 3.12 - Prior to Python 3.12, comments were not allowed inside f-string replacement - fields. - ---------------- - - -Some examples of formatted string literals:: - - >>> name = "Fred" - >>> f"He said his name is {name!r}." - "He said his name is 'Fred'." - >>> f"He said his name is {repr(name)}." # repr() is equivalent to !r - "He said his name is 'Fred'." 
- >>> width = 10 - >>> precision = 4 - >>> value = decimal.Decimal("12.34567") - >>> f"result: {value:{width}.{precision}}" # nested fields - 'result: 12.35' - >>> today = datetime(year=2017, month=1, day=27) - >>> f"{today:%B %d, %Y}" # using date format specifier - 'January 27, 2017' - >>> f"{today=:%B %d, %Y}" # using date format specifier and debugging - 'today=January 27, 2017' - >>> number = 1024 - >>> f"{number:#0x}" # using integer format specifier - '0x400' - >>> foo = "bar" - >>> f"{ foo = }" # preserves whitespace - " foo = 'bar'" - >>> line = "The mill's closed" - >>> f"{line = }" - 'line = "The mill\'s closed"' - >>> f"{line = :20}" - "line = The mill's closed " - >>> f"{line = !r:20}" - 'line = "The mill\'s closed" ' - - -Reusing the outer f-string quoting type inside a replacement field is -permitted:: - - >>> a = dict(x=2) - >>> f"abc {a["x"]} def" - 'abc 2 def' - -.. versionchanged:: 3.12 - Prior to Python 3.12, reuse of the same quoting type of the outer f-string - inside a replacement field was not possible. - -Backslashes are also allowed in replacement fields and are evaluated the same -way as in any other context:: - - >>> a = ["a", "b", "c"] - >>> print(f"List a contains:\n{"\n".join(a)}") - List a contains: - a - b - c - -.. versionchanged:: 3.12 - Prior to Python 3.12, backslashes were not permitted inside an f-string - replacement field. - -Formatted string literals cannot be used as docstrings, even if they do not -include expressions. - -:: - - >>> def foo(): - ... f"Not a docstring" - ... - >>> foo.__doc__ is None - True - -See also :pep:`498` for the proposal that added formatted string literals, -and :meth:`str.format`, which uses a related format string mechanism. - - .. _t-strings: .. 
_template-string-literals: From e88843c23054caa7c22231d8ac485facb7d942c5 Mon Sep 17 00:00:00 2001 From: Petr Viktorin Date: Wed, 23 Jul 2025 18:05:51 +0200 Subject: [PATCH 13/17] Improve text on whitespace in f-string debug expressions --- Doc/reference/lexical_analysis.rst | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/Doc/reference/lexical_analysis.rst b/Doc/reference/lexical_analysis.rst index 88b5295a590ab8..bd508399e6fcb4 100644 --- a/Doc/reference/lexical_analysis.rst +++ b/Doc/reference/lexical_analysis.rst @@ -1097,9 +1097,15 @@ Whitespace is significant in these situations: * Whitespace in :py:data:`~token.FSTRING_MIDDLE` is part of the literal string contents. * In ``fstring_replacement_field``, if ``f_debug_specifier`` is present, - all whitespace after the opening brace up to the ``!`` of - ``fstring_conversion``, ``:`` of ``fstring_full_format_spec``, - or the closing brace, is retained as part of the expression. + all whitespace after the opening brace until the ``f_debug_specifier``, + as well as whitespace immediately following ``f_debug_specifier``, + is retained as part of the expression. + + .. impl-detail:: + + The expression is not handled in the tokenization phase; it is + retrieved from the source code using locations of the ``{`` token + and the token after ``=``. .. 
grammar-snippet:: python-grammar :group: python-grammar From f2db8f9b660454700d32bf6aa516755d099c9855 Mon Sep 17 00:00:00 2001 From: Petr Viktorin Date: Wed, 6 Aug 2025 16:51:03 +0200 Subject: [PATCH 14/17] Comment on the funkiness of the t-string grammar --- Doc/reference/lexical_analysis.rst | 19 +++++++++++++++++-- 1 file changed, 17 insertions(+), 2 deletions(-) diff --git a/Doc/reference/lexical_analysis.rst b/Doc/reference/lexical_analysis.rst index b97b7bbc492712..a86a3521bedfe0 100644 --- a/Doc/reference/lexical_analysis.rst +++ b/Doc/reference/lexical_analysis.rst @@ -1085,8 +1085,8 @@ Formal grammar for f-strings F-strings are handled partly by the :term:`lexical analyzer`, which produces the tokens :py:data:`~token.FSTRING_START`, :py:data:`~token.FSTRING_MIDDLE` -and :py:data:`~token.FSTRING_END`, and the parser, which handles expressions -in the replacement field. +and :py:data:`~token.FSTRING_END`, and partly by the parser, which handles +expressions in the replacement field. The exact way the work is split is a CPython implementation detail. Correspondingly, the f-string grammar is a mix of @@ -1109,6 +1109,12 @@ Whitespace is significant in these situations: retrieved from the source code using locations of the ``{`` token and the token after ``=``. + +The ``FSTRING_MIDDLE`` definition uses +:ref:`negative lookaheads ` (``!``) +to indicate special characters (backslash, newline, ``{``, ``}``) and +sequences (``f_quote``). + .. grammar-snippet:: python-grammar :group: python-grammar @@ -1143,6 +1149,15 @@ Whitespace is significant in these situations: | ','.(`conditional_expression` | "*" `or_expr`)+ [","] | `yield_expression` +.. note:: + + In the above grammar snippet, the ``f_quote`` and ``FSTRING_MIDDLE`` rules + are context-sensitive -- they depend on the contents of ``FSTRING_START`` + of the nearest enclosing ``fstring``. + + Constructing a more traditional formal grammar from this template is left + as an exercise for the reader. + .. 
_t-strings: .. _template-string-literals: From 6468a97fb2d8ee497760e9ebc22e52a96d9aab7c Mon Sep 17 00:00:00 2001 From: Petr Viktorin Date: Wed, 6 Aug 2025 17:15:37 +0200 Subject: [PATCH 15/17] Adjust t-string docs: move evaluation rules out, add a note on grammar --- Doc/library/stdtypes.rst | 40 +++++++++++++++++++ Doc/reference/expressions.rst | 2 +- Doc/reference/lexical_analysis.rst | 64 +++++++++++------------------- 3 files changed, 65 insertions(+), 41 deletions(-) diff --git a/Doc/library/stdtypes.rst b/Doc/library/stdtypes.rst index 281592e2508f92..e9eda86a18adcd 100644 --- a/Doc/library/stdtypes.rst +++ b/Doc/library/stdtypes.rst @@ -2640,6 +2640,46 @@ replacement field. For example:: >>> f'{one_third = :~>10}~' 'one_third = ~~~~~~~1/3~' +.. _stdtypes-tstrings: + +Template String Literals (t-strings) +------------------------------------ + +A :dfn:`t-string` (formally a :dfn:`template string literal`) is +a string literal that is prefixed with ``t`` or ``T``. + +These strings follow the same syntax and evaluation rules as +:ref:`formatted string literals `, +with the following differences: + +* Rather than evaluating to a ``str`` object, template string literals evaluate + to a :class:`string.templatelib.Template` object. + +* The :func:`format` protocol is not used. + Instead, the format specifier and conversions (if any) are passed to + a new :class:`~string.templatelib.Interpolation` object that is created + for each evaluated expression. + It is up to code that processes the resulting :class:`~string.templatelib.Template` + object to decide how to handle format specifiers and conversions. + +* Format specifiers containing nested replacement fields are evaluated eagerly, + prior to being passed to the :class:`~string.templatelib.Interpolation` object. + For instance, an interpolation of the form ``{amount:.{precision}f}`` will + evaluate the inner expression ``{precision}`` to determine the value of the + ``format_spec`` attribute. 
+ If ``precision`` were to be ``2``, the resulting format specifier + would be ``'.2f'``. + +* When the equals sign ``'='`` is provided in an interpolation expression, + the text of the expression is appended to the literal string that precedes + the relevant interpolation. + This includes the equals sign and any surrounding whitespace. + The :class:`!Interpolation` instance for the expression will be created as + normal, except that :attr:`~string.templatelib.Interpolation.conversion` will + be set to '``r``' (:func:`repr`) by default. + If an explicit conversion or format specifier is provided, + this will override the default behaviour. + .. _old-string-formatting: diff --git a/Doc/reference/expressions.rst b/Doc/reference/expressions.rst index 9aca25e3214a16..20100e6617f10d 100644 --- a/Doc/reference/expressions.rst +++ b/Doc/reference/expressions.rst @@ -174,7 +174,7 @@ Formally: .. grammar-snippet:: :group: python-grammar - strings: ( `STRING` | fstring)+ | tstring+ + strings: ( `STRING` | `fstring`)+ | `tstring`+ This feature is defined at the syntactical level, so it only works with literals. To concatenate string expressions at run time, the '+' operator may be used:: diff --git a/Doc/reference/lexical_analysis.rst b/Doc/reference/lexical_analysis.rst index a86a3521bedfe0..f9615299bf5e42 100644 --- a/Doc/reference/lexical_analysis.rst +++ b/Doc/reference/lexical_analysis.rst @@ -1080,8 +1080,24 @@ even if they do not include expressions:: * :meth:`str.format`, which uses a related format string mechanism. +.. _t-strings: +.. _template-string-literals: + +t-strings +--------- + +.. versionadded:: 3.14 + +A :dfn:`template string literal` or :dfn:`t-string` is a string literal +that is prefixed with '``t``' or '``T``'. +These strings follow the same syntax rules as +:ref:`formatted string literals `. 
+For differences in evaluation rules, see the +:ref:`Standard Library section on t-strings ` + + Formal grammar for f-strings -^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +---------------------------- F-strings are handled partly by the :term:`lexical analyzer`, which produces the tokens :py:data:`~token.FSTRING_START`, :py:data:`~token.FSTRING_MIDDLE` @@ -1115,7 +1131,7 @@ The ``FSTRING_MIDDLE`` definition uses to indicate special characters (backslash, newline, ``{``, ``}``) and sequences (``f_quote``). -.. grammar-snippet:: python-grammar +.. grammar-snippet:: :group: python-grammar fstring: `FSTRING_START` `fstring_middle`* `FSTRING_END` @@ -1158,47 +1174,15 @@ sequences (``f_quote``). Constructing a more traditional formal grammar from this template is left as an exercise for the reader. +The grammar for t-strings is identical to the one for f-strings, with *t* +instead of *f* at the beginning of rule and token names and in the prefix. -.. _t-strings: -.. _template-string-literals: - -t-strings ---------- +.. grammar-snippet:: + :group: python-grammar -.. versionadded:: 3.14 + tstring: `TSTRING_START` `tstring_middle`* `TSTRING_END` -A :dfn:`template string literal` or :dfn:`t-string` is a string literal -that is prefixed with '``t``' or '``T``'. -These strings follow the same syntax and evaluation rules as -:ref:`formatted string literals `, with the following differences: - -* Rather than evaluating to a ``str`` object, template string literals evaluate - to a :class:`string.templatelib.Template` object. - -* The :func:`format` protocol is not used. - Instead, the format specifier and conversions (if any) are passed to - a new :class:`~string.templatelib.Interpolation` object that is created - for each evaluated expression. - It is up to code that processes the resulting :class:`~string.templatelib.Template` - object to decide how to handle format specifiers and conversions. 
- -* Format specifiers containing nested replacement fields are evaluated eagerly, - prior to being passed to the :class:`~string.templatelib.Interpolation` object. - For instance, an interpolation of the form ``{amount:.{precision}f}`` will - evaluate the inner expression ``{precision}`` to determine the value of the - ``format_spec`` attribute. - If ``precision`` were to be ``2``, the resulting format specifier - would be ``'.2f'``. - -* When the equals sign ``'='`` is provided in an interpolation expression, - the text of the expression is appended to the literal string that precedes - the relevant interpolation. - This includes the equals sign and any surrounding whitespace. - The :class:`!Interpolation` instance for the expression will be created as - normal, except that :attr:`~string.templatelib.Interpolation.conversion` will - be set to '``r``' (:func:`repr`) by default. - If an explicit conversion or format specifier are provided, - this will override the default behaviour. + .. _numbers: From 681112d9ecf00c8f1b81dc10f93b87dd49a67281 Mon Sep 17 00:00:00 2001 From: Petr Viktorin Date: Wed, 13 Aug 2025 16:10:39 +0200 Subject: [PATCH 16/17] Apply suggestions from code review Co-authored-by: Stan Ulbrych <89152624+StanFromIreland@users.noreply.github.com> --- Doc/library/stdtypes.rst | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/Doc/library/stdtypes.rst b/Doc/library/stdtypes.rst index e9eda86a18adcd..9df41d380b5e79 100644 --- a/Doc/library/stdtypes.rst +++ b/Doc/library/stdtypes.rst @@ -2568,8 +2568,8 @@ the equal sign, and the value of the expression. 
This is often useful for debugging:: >>> number = 14.3 - >>> 'number=14.3' - number=14.3 + >>> f'{number=}' + 'number=14.3' Whitespace before, inside and after the expression, as well as whitespace after the equal sign, is significant --- it is retained in the result:: @@ -2582,7 +2582,7 @@ Conversion specifier ^^^^^^^^^^^^^^^^^^^^ By default, the value of a replacement field expression is converted to -string using :func:`str`:: +a string using :func:`str`:: >>> from fractions import Fraction >>> one_third = Fraction(1, 3) From 9e1290fa18533a819942e49730317dd0be9f3f67 Mon Sep 17 00:00:00 2001 From: Petr Viktorin Date: Wed, 13 Aug 2025 16:12:26 +0200 Subject: [PATCH 17/17] Don't link TSTRING_START &c. since we don't define them --- Doc/reference/lexical_analysis.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Doc/reference/lexical_analysis.rst b/Doc/reference/lexical_analysis.rst index f9615299bf5e42..082b770ede749c 100644 --- a/Doc/reference/lexical_analysis.rst +++ b/Doc/reference/lexical_analysis.rst @@ -1180,7 +1180,7 @@ instead of *f* at the beginning of rule and token names and in the prefix. .. grammar-snippet:: :group: python-grammar - tstring: `TSTRING_START` `tstring_middle`* `TSTRING_END` + tstring: TSTRING_START tstring_middle* TSTRING_END