@@ -15,12 +15,10 @@ XML Processing Modules
1515
1616Python's interfaces for processing XML are grouped in the ``xml`` package.
1717
18- .. warning::
18+ .. note::
1919
20- The XML modules are not secure against erroneous or maliciously
21- constructed data. If you need to parse untrusted or
22- unauthenticated data see the :ref:`xml-vulnerabilities` and
23- :ref:`defusedxml-package` sections.
20+ If you need to parse untrusted or unauthenticated data, see
21+ :ref:`xml-security`.
2422
2623It is important to note that modules in the :mod:`xml` package require that
2624there be at least one SAX-compliant XML parser available. The Expat parser is
@@ -47,46 +45,22 @@ The XML handling submodules are:
4745* :mod:`xml.parsers.expat`: the Expat parser binding
4846
4947
48+ .. _xml-security:
5049.. _xml-vulnerabilities:
5150
52- XML vulnerabilities
53- -------------------
51+ XML security
52+ ------------
5453
55- The XML processing modules are not secure against maliciously constructed data.
5654An attacker can abuse XML features to carry out denial of service attacks,
5755access local files, generate network connections to other machines, or
5856circumvent firewalls.
5957
60- The following table gives an overview of the known attacks and whether
61- the various modules are vulnerable to them.
62-
63- ========================= ================== ================== ================== ================== ==================
64- kind sax etree minidom pulldom xmlrpc
65- ========================= ================== ================== ================== ================== ==================
66- billion laughs **Vulnerable** (1) **Vulnerable** (1) **Vulnerable** (1) **Vulnerable** (1) **Vulnerable** (1)
67- quadratic blowup **Vulnerable** (1) **Vulnerable** (1) **Vulnerable** (1) **Vulnerable** (1) **Vulnerable** (1)
68- external entity expansion Safe (5) Safe (2) Safe (3) Safe (5) Safe (4)
69- `DTD`_ retrieval Safe (5) Safe Safe Safe (5) Safe
70- decompression bomb Safe Safe Safe Safe **Vulnerable**
71- large tokens **Vulnerable** (6) **Vulnerable** (6) **Vulnerable** (6) **Vulnerable** (6) **Vulnerable** (6)
72- ========================= ================== ================== ================== ================== ==================
73-
74- 1. Expat 2.4.1 and newer is not vulnerable to the "billion laughs" and
75- "quadratic blowup" vulnerabilities. Items still listed as vulnerable due to
76- potential reliance on system-provided libraries. Check
77- :const:`!pyexpat.EXPAT_VERSION`.
78- 2. :mod:`xml.etree.ElementTree` doesn't expand external entities and raises a
79- :exc:`~xml.etree.ElementTree.ParseError` when an entity occurs.
80- 3. :mod:`xml.dom.minidom` doesn't expand external entities and simply returns
81- the unexpanded entity verbatim.
82- 4. :mod:`xmlrpc.client` doesn't expand external entities and omits them.
83- 5. Since Python 3.7.1, external general entities are no longer processed by
84- default.
85- 6. Expat 2.6.0 and newer is not vulnerable to denial of service
86- through quadratic runtime caused by parsing large tokens.
87- Items still listed as vulnerable due to
88- potential reliance on system-provided libraries. Check
89- :const:`!pyexpat.EXPAT_VERSION`.
58+ Expat versions lower than 2.6.0 may be vulnerable to "billion laughs",
59+ "quadratic blowup" and "large tokens". Python may be vulnerable if it uses such
60+ older versions of Expat as a system-provided library.
61+ Check :const:`!pyexpat.EXPAT_VERSION`.
62+
63+ :mod:`xmlrpc` is **vulnerable** to the "decompression bomb" attack.
9064
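The version check mentioned in the new text can be sketched in a few lines. This is only an illustration, not part of the patch: the helper names ``parse_expat_version`` and ``expat_is_recent_enough`` are hypothetical, and the code assumes the version string has the usual ``expat_X.Y.Z`` shape. The 2.6.0 threshold is taken from the paragraph above.

```python
import pyexpat

# Threshold taken from the text above: Expat older than 2.6.0 may be vulnerable.
MIN_SAFE_EXPAT = (2, 6, 0)


def parse_expat_version(version_string):
    """Turn a string such as 'expat_2.6.0' into the tuple (2, 6, 0).

    Assumes a plain dotted numeric version after the last underscore.
    """
    number = version_string.rpartition("_")[2]
    return tuple(int(part) for part in number.split("."))


def expat_is_recent_enough(version_string=pyexpat.EXPAT_VERSION):
    """Return True if the given Expat version is at least MIN_SAFE_EXPAT."""
    return parse_expat_version(version_string) >= MIN_SAFE_EXPAT


if __name__ == "__main__":
    # Report the Expat version this interpreter is actually linked against.
    print(pyexpat.EXPAT_VERSION, expat_is_recent_enough())
```

Comparing tuples rather than strings avoids mis-ordering versions such as 2.10.0 and 2.6.0.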
9165
9266billion laughs / exponential entity expansion
@@ -103,16 +77,6 @@ quadratic blowup entity expansion
10377 efficient as the exponential case but it avoids triggering parser countermeasures
10478 that forbid deeply nested entities.
10579
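To make the entity expansion attacks concrete, here is a deliberately tiny and harmless sketch, not taken from the patch. It assumes the default :mod:`xml.etree.ElementTree` parser, which expands internal entities. Each entity level doubles the previous one, which is the same mechanism a real "billion laughs" payload uses with many more levels.

```python
import xml.etree.ElementTree as ET

# A tiny, harmless document: each entity refers twice to the previous one,
# so the expanded text doubles in size per level.
DOC = """<!DOCTYPE demo [
  <!ENTITY a "ha">
  <!ENTITY b "&a;&a;">
  <!ENTITY c "&b;&b;">
]>
<demo>&c;</demo>"""

root = ET.fromstring(DOC)
expanded = root.text  # internal entities have been expanded by the parser

# With three levels the growth is trivial; a malicious document would use
# enough levels to exhaust memory on a parser without countermeasures.
print(len(expanded), repr(expanded))
```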
106- external entity expansion
107- Entity declarations can contain more than just text for replacement. They can
108- also point to external resources or local files. The XML
109- parser accesses the resource and embeds the content into the XML document.
110-
111- `DTD`_ retrieval
112- Some XML libraries like Python's :mod:`xml.dom.pulldom` retrieve document type
113- definitions from remote or local locations. The feature has similar
114- implications as the external entity expansion issue.
115-
11680decompression bomb
11781 Decompression bombs (aka `ZIP bomb`_) apply to all XML libraries
11882 that can parse compressed XML streams such as gzipped HTTP streams or
@@ -126,21 +90,5 @@ large tokens
12690 be used to cause denial of service in the application parsing XML.
12791 The issue is known as :cve:`2023-52425`.
12892
129- The documentation for :pypi:`defusedxml` on PyPI has further information about
130- all known attack vectors with examples and references.
131-
132- .. _defusedxml-package:
133-
134- The :mod:`!defusedxml` Package
135- ------------------------------
136-
137- :pypi:`defusedxml` is a pure Python package with modified subclasses of all stdlib
138- XML parsers that prevent any potentially malicious operation. Use of this
139- package is recommended for any server code that parses untrusted XML data. The
140- package also ships with example exploits and extended documentation on more
141- XML exploits such as XPath injection.
142-
143-
14493.. _Billion Laughs: https://en.wikipedia.org/wiki/Billion_laughs
14594.. _ZIP bomb: https://en.wikipedia.org/wiki/Zip_bomb
146- .. _DTD: https://en.wikipedia.org/wiki/Document_type_definition