Commit bba2678

committed
expose Expat mitigation API to prevent exponential expansions
1 parent 8288f36 commit bba2678

8 files changed

+397
-35
lines changed

Doc/library/pyexpat.rst

Lines changed: 53 additions & 2 deletions
@@ -241,6 +241,55 @@ XMLParser Objects
241241
:class:`!xmlparser` objects have the following methods to mitigate some
242242
common XML vulnerabilities.
243243

244+
.. method:: xmlparser.SetBillionLaughsAttackProtectionActivationThreshold(threshold, /)
245+
246+
Sets the number of output bytes needed to activate protection against
247+
`billion laughs`_ attacks.
248+
249+
The number of output bytes includes amplification from entity expansion
250+
and reading DTD files.
251+
252+
By default, parser objects have a protection activation threshold of 8 MiB,
253+
or equivalently 8,388,608 bytes.
254+
255+
An :exc:`ExpatError` is raised if this method is called on a
256+
|xml-non-root-parser| parser.
257+
The corresponding :attr:`~ExpatError.lineno` and :attr:`~ExpatError.offset`
258+
should not be used as they may have no special meaning.
259+
260+
.. versionadded:: next
261+
262+
.. method:: xmlparser.SetBillionLaughsAttackProtectionMaximumAmplification(max_factor, /)
263+
264+
Sets the maximum tolerated amplification factor for protection against
265+
`billion laughs`_ attacks.
266+
267+
The amplification factor is calculated as ``(direct + indirect) / direct``
268+
while parsing, where ``direct`` is the number of bytes read from
269+
the primary document in parsing and ``indirect`` is the number of
270+
bytes added by expanding entities and reading of external DTD files.
271+
272+
The *max_factor* value must be a non-NaN :class:`float` value greater than
273+
or equal to 1.0. Peak amplifications of factor 15,000 for the entire payload
274+
and of factor 30,000 in the middle of parsing have been observed with small
275+
benign files in practice. In particular, the activation threshold should be
276+
carefully chosen to avoid false positives.
277+
278+
By default, parser objects have a maximum amplification factor of 100.0.
279+
280+
An :exc:`ExpatError` is raised if this method is called on a
281+
|xml-non-root-parser| parser or if *max_factor* is outside the valid range.
282+
The corresponding :attr:`~ExpatError.lineno` and :attr:`~ExpatError.offset`
283+
should not be used as they may have no special meaning.
284+
285+
.. note::
286+
287+
The maximum amplification factor is only considered if the threshold
288+
that can be adjusted by :meth:`.SetBillionLaughsAttackProtectionActivationThreshold`
289+
is exceeded.
290+
291+
.. versionadded:: next
292+
244293
.. method:: xmlparser.SetAllocTrackerActivationThreshold(threshold, /)
245294

246295
Sets the number of allocated bytes of dynamic memory needed to activate
@@ -281,8 +330,8 @@ common XML vulnerabilities.
281330
.. note::
282331

283332
The maximum amplification factor is only considered if the threshold
284-
that can be adjusted :meth:`.SetAllocTrackerActivationThreshold` is
285-
exceeded.
333+
that can be adjusted by :meth:`.SetAllocTrackerActivationThreshold`
334+
is exceeded.
286335

287336
.. versionadded:: next
288337

@@ -1010,4 +1059,6 @@ The ``errors`` module has the following attributes:
10101059
not. See https://www.w3.org/TR/2006/REC-xml11-20060816/#NT-EncodingDecl
10111060
and https://www.iana.org/assignments/character-sets/character-sets.xhtml.
10121061
1062+
1063+
.. _billion laughs: https://en.wikipedia.org/wiki/Billion_laughs_attack
10131064
.. |xml-non-root-parser| replace:: :ref:`non-root <xmlparser-non-root>`
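
The interaction between the two documented settings can be illustrated with a small model. This is only a sketch of the rule as the documentation above states it, not Expat's implementation: the function names are hypothetical, the exact boundary comparisons (`<` versus `<=`) are assumptions, and rejecting `direct == 0` is a choice made for this sketch.

```python
def amplification_factor(direct: int, indirect: int) -> float:
    """Return (direct + indirect) / direct, the formula given in the docs."""
    if direct <= 0:
        # Choice made for this sketch; the docs do not cover this case.
        raise ValueError("direct must be a positive number of bytes")
    return (direct + indirect) / direct


def would_be_rejected(direct, indirect, *,
                      threshold=8 * 1024 * 1024, max_factor=100.0):
    """Model the documented rule with the documented defaults."""
    # The factor is only considered once the output byte count reaches the
    # activation threshold (boundary comparison is an assumption).
    if direct + indirect < threshold:
        return False
    return amplification_factor(direct, indirect) > max_factor


# 1,000 direct bytes expanding to 201,000 output bytes is a factor of 201,
# but it stays below the default 8 MiB threshold, so it is tolerated.
print(would_be_rejected(1_000, 200_000))
# With the threshold lowered to 0, the same input exceeds the factor of 100.
print(would_be_rejected(1_000, 200_000, threshold=0))
```

This is why the documentation warns that the activation threshold should be chosen carefully: a low threshold combined with a low factor can reject small benign documents.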

Doc/whatsnew/3.15.rst

Lines changed: 10 additions & 2 deletions
@@ -556,8 +556,16 @@ unittest
556556
xml.parsers.expat
557557
-----------------
558558

559-
* Add :func:`~xml.parsers.expat.xmlparser.SetAllocTrackerActivationThreshold`
560-
and :func:`~xml.parsers.expat.xmlparser.SetAllocTrackerMaximumAmplification`
559+
* Add :meth:`~xml.parsers.expat.xmlparser.SetBillionLaughsAttackProtectionActivationThreshold`
560+
and :meth:`~xml.parsers.expat.xmlparser.SetBillionLaughsAttackProtectionMaximumAmplification`
561+
to :ref:`xmlparser <xmlparser-objects>` objects to mitigate `billion laughs`_
562+
attacks.
563+
(Contributed by Bénédikt Tran in :gh:`90949`.)
564+
565+
.. _billion laughs: https://en.wikipedia.org/wiki/Billion_laughs_attack
566+
567+
* Add :meth:`~xml.parsers.expat.xmlparser.SetAllocTrackerActivationThreshold`
568+
and :meth:`~xml.parsers.expat.xmlparser.SetAllocTrackerMaximumAmplification`
561569
to :ref:`xmlparser <xmlparser-objects>` objects to prevent use of
562570
disproportional amounts of dynamic memory from within an Expat parser.
563571
(Contributed by Bénédikt Tran in :gh:`90949`.)
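
A minimal usage sketch for the new methods follows. It assumes the methods may be absent, either on Python versions before this change or when the underlying Expat is older than 2.4.0, and so looks them up with `getattr`. The helper name and the chosen limits (1 MiB, factor 50.0) are illustrative, not recommendations.

```python
from xml.parsers import expat


def make_hardened_parser(threshold_bytes=1024 * 1024, max_factor=50.0):
    """Create a root parser and tighten billion laughs limits if supported."""
    parser = expat.ParserCreate()
    set_threshold = getattr(
        parser, "SetBillionLaughsAttackProtectionActivationThreshold", None)
    set_factor = getattr(
        parser, "SetBillionLaughsAttackProtectionMaximumAmplification", None)
    applied = False
    if set_threshold is not None and set_factor is not None:
        # Both setters must be called on a root parser; max_factor must be
        # a non-NaN float greater than or equal to 1.0.
        set_threshold(threshold_bytes)
        set_factor(max_factor)
        applied = True
    return parser, applied


parser, applied = make_hardened_parser()
parser.Parse(b"<root>hello</root>", True)
print("custom limits applied:", applied)
```

When `applied` is false, the parser keeps whatever protection the linked Expat provides by default.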

Include/pyexpat.h

Lines changed: 5 additions & 0 deletions
@@ -57,6 +57,11 @@ struct PyExpat_CAPI
5757
XML_Parser parser, unsigned long long activationThresholdBytes);
5858
XML_Bool (*SetAllocTrackerMaximumAmplification)(
5959
XML_Parser parser, float maxAmplificationFactor);
60+
/* might be NULL for expat < 2.4.0 */
61+
XML_Bool (*SetBillionLaughsAttackProtectionActivationThreshold)(
62+
XML_Parser parser, unsigned long long activationThresholdBytes);
63+
XML_Bool (*SetBillionLaughsAttackProtectionMaximumAmplification)(
64+
XML_Parser parser, float maxAmplificationFactor);
6065
/* always add new stuff to the end! */
6166
};
6267

Lib/test/test_pyexpat.py

Lines changed: 58 additions & 0 deletions
@@ -958,6 +958,64 @@ def test_set_maximum_amplification__fail_for_subparser(self):
958958
self.assert_root_parser_failure(setter, 123.45)
959959

960960

961+
@unittest.skipIf(expat.version_info < (2, 4, 0), "requires Expat >= 2.4.0")
962+
class ExpansionProtectionTest(AttackProtectionTestBase, unittest.TestCase):
963+
964+
def assert_rejected(self, func, /, *args, **kwargs):
965+
"""Check that func(*args, **kwargs) hits the input amplification limit."""
966+
msg = (
967+
r"limit on input amplification factor \(from DTD and entities\) "
968+
r"breached: line \d+, column \d+"
969+
)
970+
self.assertRaisesRegex(expat.ExpatError, msg, func, *args, **kwargs)
971+
972+
def set_activation_threshold(self, parser, threshold):
973+
return parser.SetBillionLaughsAttackProtectionActivationThreshold(threshold)
974+
975+
def set_maximum_amplification(self, parser, max_factor):
976+
return parser.SetBillionLaughsAttackProtectionMaximumAmplification(max_factor)
977+
978+
def test_set_activation_threshold__threshold_reached(self):
979+
parser = expat.ParserCreate()
980+
# Choose a threshold expected to be always reached.
981+
self.set_activation_threshold(parser, 3)
982+
# Check that the threshold is reached by choosing a small factor
983+
# and a payload whose peak amplification factor exceeds it.
984+
self.assertIsNone(self.set_maximum_amplification(parser, 1.0))
985+
payload = self.exponential_expansion_payload(ncols=10, nrows=4)
986+
self.assert_rejected(parser.Parse, payload, True)
987+
988+
def test_set_activation_threshold__threshold_not_reached(self):
989+
parser = expat.ParserCreate()
990+
# Choose a threshold expected to be never reached.
991+
self.set_activation_threshold(parser, pow(10, 5))
992+
# Check that the threshold is not reached by choosing a small factor
993+
# and a payload whose peak amplification factor would otherwise exceed it.
994+
self.assertIsNone(self.set_maximum_amplification(parser, 1.0))
995+
payload = self.exponential_expansion_payload(ncols=10, nrows=4)
996+
self.assertIsNotNone(parser.Parse(payload, True))
997+
998+
def test_set_maximum_amplification__amplification_exceeded(self):
999+
parser = expat.ParserCreate()
1000+
# Unconditionally enable the maximum amplification factor check.
1001+
self.set_activation_threshold(parser, 0)
1002+
# Choose a max amplification factor expected to always be exceeded.
1003+
self.assertIsNone(self.set_maximum_amplification(parser, 1.0))
1004+
# Craft a payload for which the peak amplification factor is > 1.0.
1005+
payload = self.exponential_expansion_payload(ncols=1, nrows=2)
1006+
self.assert_rejected(parser.Parse, payload, True)
1007+
1008+
def test_set_maximum_amplification__amplification_not_exceeded(self):
1009+
parser = expat.ParserCreate()
1010+
# Unconditionally enable the maximum amplification factor check.
1011+
self.set_activation_threshold(parser, 0)
1012+
# Choose a max amplification factor expected to never be exceeded.
1013+
self.assertIsNone(self.set_maximum_amplification(parser, 1e4))
1014+
# Craft a payload for which the peak amplification factor is < 1e4.
1015+
payload = self.exponential_expansion_payload(ncols=1, nrows=2)
1016+
self.assertIsNotNone(parser.Parse(payload, True))
1017+
1018+
9611019
@unittest.skipIf(expat.version_info < (2, 7, 2), "requires Expat >= 2.7.2")
9621020
class MemoryProtectionTest(AttackProtectionTestBase, unittest.TestCase):
9631021
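
The tests above rely on `exponential_expansion_payload`, which is defined in `AttackProtectionTestBase` and not shown in this diff. The following is a hypothetical stand-in, assuming that `ncols` is the number of references per entity and `nrows` is the nesting depth; the real helper may differ.

```python
from xml.parsers import expat


def exponential_expansion_payload(ncols: int, nrows: int) -> bytes:
    """Build a small billion-laughs-style document (hypothetical helper).

    Entity row0 is a single character and each row{k} references
    row{k-1} ncols times, so &row{nrows}; expands to ncols ** nrows
    characters.
    """
    lines = ['<?xml version="1.0"?>', "<!DOCTYPE doc [", '<!ENTITY row0 "x">']
    for k in range(1, nrows + 1):
        refs = f"&row{k - 1};" * ncols
        lines.append(f'<!ENTITY row{k} "{refs}">')
    lines.append("]>")
    lines.append(f"<doc>&row{nrows};</doc>")
    return "\n".join(lines).encode("utf-8")


def expanded_length(payload: bytes) -> int:
    """Parse payload with default limits and count the character data."""
    chunks = []
    parser = expat.ParserCreate()
    # Character data may arrive in several chunks, so collect all of them.
    parser.CharacterDataHandler = chunks.append
    parser.Parse(payload, True)
    return sum(len(chunk) for chunk in chunks)


print(expanded_length(exponential_expansion_payload(ncols=2, nrows=3)))
```

The expanded size grows as `ncols ** nrows` while the document itself grows only linearly in `ncols * nrows`, which is the amplification that the new limits are meant to bound.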

Lines changed: 2 additions & 2 deletions
@@ -1,5 +1,5 @@
1-
Add :func:`~xml.parsers.expat.xmlparser.SetAllocTrackerActivationThreshold`
2-
and :func:`~xml.parsers.expat.xmlparser.SetAllocTrackerMaximumAmplification`
1+
Add :meth:`~xml.parsers.expat.xmlparser.SetAllocTrackerActivationThreshold`
2+
and :meth:`~xml.parsers.expat.xmlparser.SetAllocTrackerMaximumAmplification`
33
to :ref:`xmlparser <xmlparser-objects>` objects to prevent use of
44
disproportional amounts of dynamic memory from within an Expat parser.
55
Patch by Bénédikt Tran.
Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
1+
Add
2+
:meth:`~xml.parsers.expat.xmlparser.SetBillionLaughsAttackProtectionActivationThreshold`
3+
and
4+
:meth:`~xml.parsers.expat.xmlparser.SetBillionLaughsAttackProtectionMaximumAmplification`
5+
to :ref:`xmlparser <xmlparser-objects>` objects to mitigate `billion laughs
6+
<https://en.wikipedia.org/wiki/Billion_laughs_attack>`_ attacks. Patch by
7+
Bénédikt Tran.

Modules/clinic/pyexpat.c.h

Lines changed: 135 additions & 1 deletion
Some generated files are not rendered by default.
