Commit 27c044a

Possible t-string addition to tutorial/inputoutput.rst
1 parent 0b4e13c commit 27c044a

2 files changed

+193
-0
lines changed

Doc/library/string.templatelib.rst

Lines changed: 1 addition & 0 deletions
@@ -10,6 +10,7 @@

.. seealso::

   * :ref:`T-strings tutorial <tut-t-strings>`
   * :ref:`Format strings <f-strings>`
   * :ref:`T-string literal syntax <t-strings>`

Doc/tutorial/inputoutput.rst

Lines changed: 192 additions & 0 deletions
@@ -34,6 +34,22 @@ printing space-separated values. There are several ways to format output.
     >>> f'Results of the {year} {event}'
     'Results of the 2016 Referendum'

* When greater control is needed, :ref:`template string literals <tut-t-strings>`
  can be useful. T-strings -- which begin with ``t`` or ``T`` -- share the
  same syntax as f-strings but, unlike f-strings, produce a
  :class:`~string.templatelib.Template` instance rather than a simple ``str``.
  Templates give you access to the static and interpolated (in curly braces)
  parts of the string *before* they are combined into a final string.

  ::

     >>> name = "World"
     >>> template = t"Hello {name}!"
     >>> template.strings
     ('Hello ', '!')
     >>> template.values
     ('World',)

* The :meth:`str.format` method of strings requires more manual
  effort. You'll still use ``{`` and ``}`` to mark where a variable
  will be substituted and can provide detailed formatting directives,
@@ -161,6 +177,182 @@ See :ref:`self-documenting expressions <bpo-36817-whatsnew>` for more informatio
on the ``=`` specifier. For a reference on these format specifications, see
the reference guide for the :ref:`formatspec`.

.. _tut-t-strings:

Template String Literals
-------------------------

:ref:`Template string literals <t-strings>` (also called t-strings for short)
are an extension of :ref:`f-strings <tut-f-strings>` that let you access the
static and interpolated parts of a string *before* they are combined into a
final string. This provides for greater control over how the string is
formatted.

The most common way to create a :class:`~string.templatelib.Template` instance
is to use the :ref:`t-string literal syntax <t-strings>`. This syntax is
identical to that of :ref:`f-strings` except that it uses a ``t`` instead of
an ``f``:

   >>> name = "World"
   >>> template = t"Hello {name}!"
   >>> template.strings
   ('Hello ', '!')
   >>> template.values
   ('World',)

:class:`!Template` instances are iterable, yielding each string and
:class:`~string.templatelib.Interpolation` in order:

.. testsetup::

   name = "World"
   template = t"Hello {name}!"

.. doctest::

   >>> list(template)
   ['Hello ', Interpolation('World', 'name', None, ''), '!']

Interpolations represent expressions inside a t-string. They contain the
evaluated value of the expression (``'World'`` in this example), the text of
the original expression (``'name'``), and optional conversion and format
specification attributes.
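
As one way to see what the conversion and format specification attributes are
for, the following sketch renders a sequence of parts while honoring both. It
is an illustration under stated assumptions, not part of this change:
``render()`` and ``_CONVERTERS`` are hypothetical names, the parts are built by
hand rather than from a t-string literal, and the fallback ``Interpolation``
dataclass is a stand-in used only so the example also runs on interpreters
without ``string.templatelib``.

```python
from dataclasses import dataclass
from typing import Any, Iterable, Optional

try:
    # Available on interpreters that support t-strings.
    from string.templatelib import Interpolation
except ImportError:
    # Hypothetical stand-in exposing the same four attributes.
    @dataclass
    class Interpolation:
        value: Any
        expression: str = ""
        conversion: Optional[str] = None
        format_spec: str = ""

# Map the conversion characters (!r, !s, !a) to the built-ins they stand for.
_CONVERTERS = {"r": repr, "s": str, "a": ascii}


def render(parts: Iterable[Any]) -> str:
    """Join parts, applying each interpolation's conversion and format spec."""
    pieces = []
    for part in parts:
        if isinstance(part, str):
            pieces.append(part)  # static text is passed through unchanged
            continue
        value = part.value
        if part.conversion is not None:
            value = _CONVERTERS[part.conversion](value)
        pieces.append(format(value, part.format_spec))
    return "".join(pieces)


parts = [
    "pi is about ",
    Interpolation(3.14159, "pi", None, ".2f"),
    " and name is ",
    Interpolation("World", "name", "r", ""),
]
result = render(parts)
print(result)  # pi is about 3.14 and name is 'World'
```

Because a template only records the conversion and format specification
without applying them, processing code like this decides whether and how to
use them.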

Templates can be processed in a variety of ways. For instance, here's code that
converts static strings to lowercase and interpolated values to uppercase:

   >>> from string.templatelib import Template
   >>>
   >>> def lower_upper(template: Template) -> str:
   ...     return ''.join(
   ...         part.lower() if isinstance(part, str) else part.value.upper()
   ...         for part in template
   ...     )
   ...
   >>> name = "World"
   >>> template = t"Hello {name}!"
   >>> lower_upper(template)
   'hello WORLD!'

Template strings are particularly useful for sanitizing user input. Imagine
we're building a web application that has user profile pages. Perhaps the
``User`` class is defined like this:

   >>> from dataclasses import dataclass
   >>>
   >>> @dataclass
   ... class User:
   ...     name: str
   ...

Imagine using f-strings to generate HTML for the ``User``:

.. testsetup::

   class User:
       name: str
       def __init__(self, name: str):
           self.name = name

.. doctest::

   >>> # Warning: this is dangerous code. Don't do this!
   >>> def user_html(user: User) -> str:
   ...     return f"<div><h1>{user.name}</h1></div>"
   ...

This code is dangerous because our website lets users type in their own names.
If a user types in a name like ``"<script>alert('evil');</script>"``, the
browser will execute that script when someone else visits their profile page.
This is called a *cross-site scripting (XSS) vulnerability*, and it is a form
of *injection vulnerability*. Injection vulnerabilities occur when user input
is included in a program without proper sanitization, allowing malicious code
to be executed. The same sorts of vulnerabilities can occur when user input is
included in SQL queries, command lines, or other contexts where the input is
interpreted as code.
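
To make the SQL case concrete, here is a minimal sketch using the standard
:mod:`sqlite3` module. The ``users`` table and the ``malicious`` input are
made up for illustration. It contrasts interpolating input into the query text
with passing it as a bound parameter.

```python
import sqlite3

# In-memory database with a hypothetical "users" table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.executemany("INSERT INTO users VALUES (?)", [("alice",), ("bob",)])

# Input crafted so that, once pasted into the query, it changes the query logic.
malicious = "nobody' OR '1'='1"

# Dangerous: the input becomes part of the SQL text itself.
unsafe_rows = conn.execute(
    f"SELECT name FROM users WHERE name = '{malicious}'"
).fetchall()

# Safer: the input is passed separately and is only ever treated as data.
safe_rows = conn.execute(
    "SELECT name FROM users WHERE name = ?", (malicious,)
).fetchall()

print(len(unsafe_rows))  # 2: every row matches because of the injected OR clause
print(safe_rows)         # []: no user is literally named "nobody' OR '1'='1"
conn.close()
```

The interpolated query returns every row, while the parameterized query
returns none.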

To prevent this, instead of using f-strings, we can use t-strings. Let's
update our ``user_html()`` function to return a :class:`~string.templatelib.Template`:

   >>> from string.templatelib import Template
   >>>
   >>> def user_html(user: User) -> Template:
   ...     return t"<div><h1>{user.name}</h1></div>"
   ...

Now let's implement a function that sanitizes *any* HTML :class:`!Template`:

   >>> from html import escape
   >>> from string.templatelib import Template
   >>>
   >>> def sanitize_html_template(template: Template) -> str:
   ...     return ''.join(
   ...         part if isinstance(part, str) else escape(part.value)
   ...         for part in template
   ...     )
   ...

This function iterates over the parts of the :class:`!Template`, escaping any
interpolated values using the :func:`html.escape` function, which converts
special characters like ``<``, ``>``, and ``&`` into their HTML-safe
equivalents.
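
For reference, this short sketch shows what :func:`html.escape` does to a
sample string, with and without quote escaping. The ``escape_value()`` helper
is a hypothetical addition, not part of the tutorial text. It converts
non-string values with ``str()`` first, since ``escape()`` expects a string.

```python
from html import escape

raw = '<b>Tom & "Jerry"</b>'

# By default, quotes are escaped along with <, > and &.
escaped = escape(raw)
print(escaped)  # &lt;b&gt;Tom &amp; &quot;Jerry&quot;&lt;/b&gt;

# With quote=False, only <, > and & are replaced.
unquoted = escape(raw, quote=False)
print(unquoted)  # &lt;b&gt;Tom &amp; "Jerry"&lt;/b&gt;


def escape_value(value: object) -> str:
    """Hypothetical helper: stringify a value, then escape it."""
    return escape(str(value))


print(escape_value(42))  # 42
```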

Now we can tie it all together:

.. testsetup::

   from dataclasses import dataclass
   from string.templatelib import Template
   from html import escape
   @dataclass
   class User:
       name: str
   def sanitize_html_template(template: Template) -> str:
       return ''.join(
           part if isinstance(part, str) else escape(part.value)
           for part in template
       )
   def user_html(user: User) -> Template:
       return t"<div><h1>{user.name}</h1></div>"

.. doctest::

   >>> evil_user = User(name="<script>alert('evil');</script>")
   >>> template = user_html(evil_user)
   >>> safe = sanitize_html_template(template)
   >>> print(safe)
   <div><h1>&lt;script&gt;alert(&#x27;evil&#x27;);&lt;/script&gt;</h1></div>

We are no longer vulnerable to XSS attacks because we are escaping the
interpolated values before they are included in the rendered HTML.

Of course, there's no need for code that processes :class:`!Template` instances
to be limited to returning a simple string. For instance, we could imagine
defining a more complex ``html()`` function that returns a structured
representation of the HTML:

   >>> from dataclasses import dataclass
   >>> from string.templatelib import Template
   >>> from html.parser import HTMLParser
   >>>
   >>> @dataclass
   ... class Element:
   ...     tag: str
   ...     attributes: dict[str, str]
   ...     children: list[str | Element]
   ...
   >>> def parse_html(template: Template) -> Element:
   ...     """
   ...     Uses Python's built-in HTMLParser to parse the template,
   ...     handle any interpolated values, and return a tree of
   ...     Element instances.
   ...     """
   ...     ...
   ...

A full implementation of this function would be quite complex and is not
provided here. That said, the fact that it is possible to implement a function
like ``parse_html()`` showcases the flexibility and power of t-strings.
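
To give a rough feel for the tree-building half of such a function, here is a
deliberately limited sketch. It parses only static HTML text and does not
handle interpolated values, which is where most of the real complexity would
lie. ``_TreeBuilder``, ``parse_static_html()`` and the synthetic ``#root``
element are hypothetical names chosen for this example, and the ``Element``
fields are given defaults for convenience.

```python
from dataclasses import dataclass, field
from html.parser import HTMLParser
from typing import Dict, List, Union


@dataclass
class Element:
    tag: str
    attributes: Dict[str, str] = field(default_factory=dict)
    children: List[Union[str, "Element"]] = field(default_factory=list)


class _TreeBuilder(HTMLParser):
    """Builds a tree of Element objects from static HTML text."""

    def __init__(self) -> None:
        super().__init__()
        self.root = Element("#root")  # synthetic container for top-level nodes
        self._open_elements = [self.root]

    def handle_starttag(self, tag, attrs):
        # Attributes without a value (e.g. "disabled") arrive as None.
        element = Element(tag, {name: value or "" for name, value in attrs})
        self._open_elements[-1].children.append(element)
        self._open_elements.append(element)

    def handle_endtag(self, tag):
        # Naive: assumes well-nested markup and no unclosed void tags like <br>.
        if len(self._open_elements) > 1:
            self._open_elements.pop()

    def handle_data(self, data):
        self._open_elements[-1].children.append(data)


def parse_static_html(text: str) -> Element:
    builder = _TreeBuilder()
    builder.feed(text)
    builder.close()
    return builder.root


tree = parse_static_html('<div class="card"><h1>Hello</h1></div>')
div = tree.children[0]
print(div.tag, div.attributes, div.children[0].children)
```

A version that accepts a :class:`!Template` would additionally need to decide
how each interpolated value is treated depending on where it appears, for
example as text content versus as an attribute value.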

.. _tut-string-format:

The String format() Method
