Commit 8b2e1d5

methane and hugovk authored
PEP 822: Dedented Multiline String (d-string) (#4768)
Co-authored-by: Hugo van Kemenade <[email protected]>
1 parent 67384c4 commit 8b2e1d5

1 file changed: +319 -0 lines changed

peps/pep-0822.rst

Lines changed: 319 additions & 0 deletions
@@ -0,0 +1,319 @@
PEP: 822
Title: Dedented Multiline String (d-string)
Author: Inada Naoki <[email protected]>
Discussions-To: https://discuss.python.org/t/105519
Status: Draft
Type: Standards Track
Created: 05-Jan-2026
Python-Version: 3.15
Post-History: `05-Jan-2026 <https://discuss.python.org/t/105519>`__,


Abstract
========

This PEP proposes to add a feature that automatically removes indentation from
multiline string literals.

Dedented multiline strings use a new prefix "d" (shorthand for "dedent") before
the opening quote of a multiline string literal.

Example (spaces are visualized as ``_``):

.. code-block:: python

    def hello_paragraph() -> str:
    ____return d"""
    ________<p>
    __________Hello, World!
    ________</p>
    ____"""

The closing triple quotes control how much indentation would be removed.
In the above example, the returned string will contain three lines:

* ``"____<p>\n"`` (four leading spaces)
* ``"______Hello, World!\n"`` (six leading spaces)
* ``"____</p>\n"`` (four leading spaces)


Motivation
==========

When writing multiline string literals within deeply indented Python code,
users are faced with the following choices:

* Accept that the content of the string literal will be left-aligned.
* Use multiple single-line string literals concatenated together instead of
  a multiline string literal.
* Use ``textwrap.dedent()`` to remove indentation.

All of these options have drawbacks in terms of code readability and
maintainability.

* Left-aligned multiline strings look awkward and tend to be avoided.
  In practice, many places including Python's own test code choose other
  methods.
* Concatenated single-line string literals are more verbose and harder to
  maintain.
* ``textwrap.dedent()`` is implemented in Python, so it adds runtime
  overhead on every call.
  This makes it a poor fit for hot paths where performance is critical.
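
The three workarounds listed above can be sketched in current Python as
follows. This is only an illustration: the function names and the HTML
snippet are made up for this example, and all three functions are written to
return the same string.

.. code-block:: python

    import textwrap

    EXPECTED = "<p>\n  Hello, World!\n</p>\n"


    def left_aligned() -> str:
        # Choice 1: the content starts at column 0, ignoring the code's indentation.
        return """\
    <p>
      Hello, World!
    </p>
    """


    def concatenated() -> str:
        # Choice 2: adjacent single-line literals; each line needs quotes and "\n".
        return (
            "<p>\n"
            "  Hello, World!\n"
            "</p>\n"
        )


    def dedented_at_runtime() -> str:
        # Choice 3: textwrap.dedent() strips the common margin on every call.
        return textwrap.dedent("""\
            <p>
              Hello, World!
            </p>
        """)


    assert left_aligned() == concatenated() == dedented_at_runtime() == EXPECTED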

This PEP aims to provide a built-in syntax for dedented multiline strings that
is both easy to read and write, while also being efficient at runtime.


Rationale
=========

The main alternative to this idea is to implement ``textwrap.dedent()`` in C
and provide it as a ``str.dedent()`` method.
This idea reduces the runtime overhead of ``textwrap.dedent()``.
By making it a built-in method, it also allows for compile-time dedentation
when called directly on string literals.

However, this approach has several drawbacks:

* To support cases where users want to include some indentation in the string,
  the ``dedent()`` method would need to accept an argument specifying
  the amount of indentation to remove.
  This would be cumbersome and error-prone for users.
* Continuation lines (lines that follow a line ending with a backslash)
  cannot be dedented.
* f-strings may interpolate a multiline string that has no indentation.
  In that case, f-string + ``str.dedent()`` cannot dedent the whole string.
* t-strings do not create ``str`` objects, so they cannot use the
  ``str.dedent()`` method.
  While adding a ``dedent()`` method to ``string.templatelib.Template`` is an
  option, it would lead to inconsistency since t-strings and f-strings are very
  similar but would have different behaviors regarding dedentation.
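
The f-string and continuation-line drawbacks can be reproduced today with
``textwrap.dedent()``, used here as a stand-in for the hypothetical
``str.dedent()`` method (an assumption of this sketch; the variable names
are illustrative only).

.. code-block:: python

    import textwrap

    # A multiline value that carries no indentation of its own.
    items = "<li>a</li>\n<li>b</li>"

    # After interpolation, "<li>b</li>" starts at column 0, so the common
    # margin is empty and dedent() leaves the other lines indented.
    html = textwrap.dedent(f"""\
        <ul>
        {items}
        </ul>
    """)
    assert html == "    <ul>\n    <li>a</li>\n<li>b</li>\n    </ul>\n"

    # The backslash joins the two source lines before dedent() runs, so the
    # indentation of the continued line ends up inside the string.
    continued = textwrap.dedent("""\
        Hello \
        World!
    """)
    assert continued == "Hello" + " " * 5 + "World!\n"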

The ``str.dedent()`` method can still be useful for non-literal strings,
so this PEP does not preclude that idea.
However, for ease of use with multiline string literals, providing dedicated
syntax is superior.


Specification
=============

Add a new string literal prefix "d" for dedented multiline strings.
This prefix can be combined with "f", "t", and "r" prefixes.

This prefix is only for multiline string literals.
So it can only be used with triple quotes (``"""`` or ``'''``).
Using it with single or double quotes (``"`` or ``'``) is a syntax error.

The opening triple quotes must be followed by a newline character.
This newline is not included in the resulting string.

The amount of indentation to be removed is determined by the whitespace
(``' '`` or ``'\t'``) preceding the closing triple quotes.
Mixing spaces and tabs in indentation raises a ``TabError``, similar to
Python's own indentation rules.

The dedentation process removes the determined amount of leading whitespace
from every line in the string.
Lines that are shorter than the determined indentation become just an empty
line (e.g. ``"\n"``).
Otherwise, if the line does not start with the determined indentation,
Python raises an ``IndentationError``.

Unless combined with the "r" prefix, backslash escapes are processed after
removing indentation.
So you cannot use an escape such as ``\t`` to create indentation.
You can, however, use line continuation (a backslash at the end of a line),
and indentation is removed from the continued line as well.
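
The rules above can be approximated at runtime with a small helper. This is
a rough sketch, not the reference implementation: ``dedent_like_dstring`` is
a hypothetical name, it works on the raw text between the triple quotes, it
performs no escape processing (so it is closest to a ``dr`` string), and it
assumes that only whitespace-only lines shorter than the indentation collapse
to empty lines.

.. code-block:: python

    def dedent_like_dstring(body: str) -> str:
        """Emulate the d-string dedent rules on the raw text between the quotes."""
        if not body.startswith("\n"):
            raise SyntaxError("d-string must start with a newline")
        # Drop the leading newline; the last line is what precedes the closing quotes.
        *lines, indent = body[1:].split("\n")
        if indent.strip(" \t"):
            raise SyntaxError("d-string must end with an indent-only line")
        if " " in indent and "\t" in indent:
            raise TabError("mixing spaces and tabs in indentation")

        dedented = []
        for line in lines:
            if line.startswith(indent):
                dedented.append(line[len(indent):])
            elif len(line) < len(indent) and not line.strip(" \t"):
                # Assumption: short whitespace-only lines become empty lines.
                dedented.append("")
            else:
                raise IndentationError("missing valid indentation")
        return "".join(line + "\n" for line in dedented)


    # Two spaces before the closing quotes: two spaces are removed per line.
    assert dedent_like_dstring("\n  Hello\n  World!\n  ") == "Hello\nWorld!\n"
    # Closing quotes at column 0: nothing is removed.
    assert dedent_like_dstring("\n  Hello\n  World!\n") == "  Hello\n  World!\n"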

Examples:

.. code-block:: python

    # Whitespace is shown as _ and tab is shown as ---> for clarity.
    # Error messages are just for explanation. Actual messages may differ.

    s = d""  # SyntaxError: d-string must be a multiline string
    s = d"""Hello"""  # SyntaxError: d-string must be a multiline string
    s = d"""Hello
    __World!
    """  # SyntaxError: d-string must start with a newline

    s = d"""
    __Hello
    __World!"""  # SyntaxError: d-string must end with an indent-only line

    s = d"""
    __Hello
    __World!
    """  # Zero indentation is removed because closing quotes are not indented.
    print(repr(s))  # '__Hello\n__World!\n'

    s = d"""
    __Hello
    __World!
    _"""  # One space of indentation is removed.
    print(repr(s))  # '_Hello\n_World!\n'

    s = d"""
    __Hello
    __World!
    __"""  # Two spaces of indentation are removed.
    print(repr(s))  # 'Hello\nWorld!\n'

    s = d"""
    __Hello
    __World!
    ___"""  # IndentationError: missing valid indentation

    s = d"""
    --->Hello
    __World!
    __"""  # IndentationError: missing valid indentation

    s = d"""
    --->--->__Hello
    --->--->__World!
    --->--->"""  # Tab is allowed as indentation.
                 # Spaces are just in the string, not indentation to be removed.
    print(repr(s))  # '__Hello\n__World!\n'

    s = d"""
    --->____Hello
    --->____World!
    --->__"""  # TabError: mixing spaces and tabs in indentation

    s = d"""
    __Hello \
    __World!\
    __"""  # line continuation works as usual
    print(repr(s))  # 'Hello_World!'

    s = d"""\
    __Hello
    __World
    __"""  # SyntaxError: d-string must start with a newline

    s = dr"""
    __Hello\
    __World!\
    __"""  # d-string can be combined with r-string.
    print(repr(s))  # 'Hello\\\nWorld!\\\n'

    s = df"""
    ____Hello, {"world".title()}!
    ____"""  # d-string can be combined with f-string and t-string too.
    print(repr(s))  # 'Hello, World!\n'

    s = dt"""
    ____Hello, {"world".title()}!
    ____"""
    print(type(s))  # <class 'string.templatelib.Template'>
    print(s.strings)  # ('Hello, ', '!\n')
    print(s.values)  # ('World',)
    print(s.interpolations)
    # (Interpolation('World', '"world".title()', None, ''),)


How to Teach This
=================

In the tutorial, d-strings can be introduced alongside triple-quoted string
literals.
Additionally, we can add a note in the ``textwrap.dedent()`` documentation,
providing a link to the d-string section in the language reference or
the relevant part of the tutorial.


Other Languages having Similar Features
========================================

Java 15 introduced a feature called `text blocks <https://openjdk.org/jeps/378>`__.
Since Java had not used triple quotes before, they introduced triple quotes for
multiline string literals with automatic indent removal.

C# 11 also introduced a similar feature called
`raw string literals <https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/proposals/csharp-11.0/raw-string-literal>`__.

`Julia <https://docs.julialang.org/en/v1/manual/strings/#Triple-Quoted-String-Literals>`__ and
`Swift <https://docs.swift.org/swift-book/documentation/the-swift-programming-language/stringsandcharacters/#Multiline-String-Literals>`__
also support triple-quoted string literals that automatically remove indentation.

PHP 7.3 introduced `Flexible Heredoc and Nowdoc Syntaxes <https://wiki.php.net/rfc/flexible_heredoc_nowdoc_syntaxes>`__.
Although it uses a closing marker (e.g. ``<<<END ... END``) instead of
triple quotes, it also removes indentation from the text.

Java and Julia use the least-indented line to determine the amount of
indentation to be removed.
Swift, C#, and PHP use the indentation of the closing triple quotes or
closing marker.

This PEP chose the Swift and C# approach because it is simpler and easier to
explain.
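
The practical difference between the two rules can be sketched in Python.
Both helpers below are hypothetical and simplified: the least-indented rule
is approximated with ``textwrap.dedent()``, the closing-line rule does no
validation, and neither claims to match the exact behavior of the languages
named above.

.. code-block:: python

    import textwrap

    # Text between the triple quotes: content at 8 spaces, closing quotes at 4.
    body = "\n        <p>\n        </p>\n    "


    def dedent_by_least_indented(text: str) -> str:
        # The least-indented non-blank line sets the margin.
        return textwrap.dedent(text.removeprefix("\n"))


    def dedent_by_closing_line(text: str) -> str:
        # The whitespace before the closing quotes sets the margin.
        *lines, indent = text.removeprefix("\n").split("\n")
        return "".join(line.removeprefix(indent) + "\n" for line in lines)


    # Only the closing-line rule can keep indentation shared by every line.
    assert dedent_by_least_indented(body) == "<p>\n</p>\n"
    assert dedent_by_closing_line(body) == "    <p>\n    </p>\n"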


Reference Implementation
========================

A CPython implementation of PEP 822 is available at
`methane/cpython#108 <https://github.com/methane/cpython/pull/108>`__.


Rejected Ideas
==============

``str.dedent()`` method
-----------------------

As mentioned in the Rationale section, this PEP doesn't reject the idea of a
``str.dedent()`` method.
A faster version of ``textwrap.dedent()`` implemented in C would be useful for
runtime dedentation.

However, d-string is more suitable for multiline string literals because:

* It works well with f/t-strings.
* It allows specifying the amount of indentation to be removed more easily.
* It can dedent continuation lines.


Triple-backtick
---------------

`Using triple backticks <https://discuss.python.org/t/40679>`__
for dedented multiline strings was considered as an alternative syntax.
This notation is familiar to us from Markdown. While there were past concerns
about certain keyboard layouts,
nowadays many people are accustomed to typing this notation.

However, this notation conflicts when embedding Python code within Markdown or
vice versa.
Given this drawback, increasing the variety of quote characters is not seen
as a better idea than adding a prefix to string literals.


``__future__`` import
---------------------

Instead of adding a prefix to string literals, the idea of using a
``__future__`` import to change the default behavior of multiline
string literals was also considered.
This could help simplify Python's grammar in the future.

But rewriting all existing complex codebases to the new notation may not be
straightforward.
Until all multiline strings in that source code are rewritten to
the new notation, automatic dedentation cannot be utilized.

Until all users can rewrite existing codebases to the new notation,
two types of Python syntax will coexist indefinitely.
Therefore, `many people preferred the new string prefix <https://discuss.python.org/t/90988/54>`__
over the ``__future__`` import.


Copyright
=========

This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.
