Merged
55 changes: 53 additions & 2 deletions Doc/library/pyexpat.rst
@@ -241,6 +241,55 @@ XMLParser Objects
:class:`!xmlparser` objects have the following methods to mitigate some
common XML vulnerabilities.

.. method:: xmlparser.SetBillionLaughsAttackProtectionActivationThreshold(threshold, /)

Sets the number of output bytes needed to activate protection against
`billion laughs`_ attacks.

The number of output bytes includes amplification from entity expansion
and reading DTD files.

By default, parser objects have a protection activation threshold of 8 MiB,
or equivalently 8,388,608 bytes.

An :exc:`ExpatError` is raised if this method is called on a
|xml-non-root-parser| parser.
The corresponding :attr:`~ExpatError.lineno` and :attr:`~ExpatError.offset`
should not be used as they may have no special meaning.

.. versionadded:: next
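
A minimal sketch of how a caller might lower the activation threshold, assuming the method is simply absent on interpreters or Expat builds that predate it (hence the `getattr` guard). The helper name `lower_activation_threshold` and the 1 MiB value are illustrative choices, not part of this patch.

```python
from xml.parsers import expat


def lower_activation_threshold(parser, threshold_bytes):
    """Try to set the billion laughs activation threshold.

    Returns True if the setter exists and was called, False otherwise.
    """
    setter = getattr(
        parser, "SetBillionLaughsAttackProtectionActivationThreshold", None
    )
    if setter is None:
        # Assumption: the method is missing on older Python versions
        # or when the underlying Expat is older than 2.4.0.
        return False
    setter(threshold_bytes)
    return True


parser = expat.ParserCreate()
# Hypothetical value: 1 MiB instead of the documented 8 MiB default.
applied = lower_activation_threshold(parser, 1024 * 1024)
```

Since the setter raises :exc:`ExpatError` on non-root parsers, a helper like this is only meant for parsers obtained from `ParserCreate()`.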

.. method:: xmlparser.SetBillionLaughsAttackProtectionMaximumAmplification(max_factor, /)

Sets the maximum tolerated amplification factor for protection against
`billion laughs`_ attacks.

The amplification factor is calculated as ``(direct + indirect) / direct``
while parsing, where ``direct`` is the number of bytes read from
the primary document during parsing and ``indirect`` is the number of
bytes added by expanding entities and reading external DTD files.

The *max_factor* value must be a non-NaN :class:`float` value greater than
or equal to 1.0. Peak amplifications of factor 15,000 for the entire payload
and of factor 30,000 in the middle of parsing have been observed with small
benign files in practice. In particular, the activation threshold should be
carefully chosen to avoid false positives.

By default, parser objects have a maximum amplification factor of 100.0.

An :exc:`ExpatError` is raised if this method is called on a
|xml-non-root-parser| parser or if *max_factor* is outside the valid range.
The corresponding :attr:`~ExpatError.lineno` and :attr:`~ExpatError.offset`
should not be used as they may have no special meaning.

.. note::

The maximum amplification factor is only considered if the threshold
that can be adjusted by :meth:`.SetBillionLaughsAttackProtectionActivationThreshold`
is exceeded.

.. versionadded:: next
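
The interplay between the activation threshold and the maximum amplification factor can be illustrated with a small pure-Python model. This is only a sketch of the rule as documented above, not Expat's actual accounting: the function names are hypothetical, and the exact behaviour at the boundaries (strict versus non-strict comparison) is an assumption.

```python
def amplification_factor(direct, indirect):
    """Return (direct + indirect) / direct for byte counts seen so far."""
    if direct <= 0:
        raise ValueError("direct must be a positive number of bytes")
    return (direct + indirect) / direct


def is_tolerated(direct, indirect, threshold=8 * 1024 * 1024, max_factor=100.0):
    """Simplified model: the factor only matters once the threshold is reached."""
    output_bytes = direct + indirect
    if output_bytes < threshold:
        # Protection not activated yet, whatever the factor is.
        return True
    # Assumption: a factor equal to max_factor is still accepted.
    return amplification_factor(direct, indirect) <= max_factor
```

Under this model, a 1,000-byte document that expands to about 1 MB is accepted with the default settings despite a factor above 1,000, because the 8 MiB threshold is never reached; the same document expanding to 10 MB would be rejected.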

.. method:: xmlparser.SetAllocTrackerActivationThreshold(threshold, /)

Sets the number of allocated bytes of dynamic memory needed to activate
@@ -281,8 +330,8 @@ common XML vulnerabilities.
.. note::

The maximum amplification factor is only considered if the threshold
-that can be adjusted :meth:`.SetAllocTrackerActivationThreshold` is
-exceeded.
+that can be adjusted by :meth:`.SetAllocTrackerActivationThreshold`
+is exceeded.

.. versionadded:: next

@@ -1010,4 +1059,6 @@ The ``errors`` module has the following attributes:
not. See https://www.w3.org/TR/2006/REC-xml11-20060816/#NT-EncodingDecl
and https://www.iana.org/assignments/character-sets/character-sets.xhtml.


.. _billion laughs: https://en.wikipedia.org/wiki/Billion_laughs_attack
.. |xml-non-root-parser| replace:: :ref:`non-root <xmlparser-non-root>`
12 changes: 10 additions & 2 deletions Doc/whatsnew/3.15.rst
@@ -556,8 +556,16 @@ unittest
xml.parsers.expat
-----------------

-* Add :func:`~xml.parsers.expat.xmlparser.SetAllocTrackerActivationThreshold`
-and :func:`~xml.parsers.expat.xmlparser.SetAllocTrackerMaximumAmplification`
+* Add :meth:`~xml.parsers.expat.xmlparser.SetBillionLaughsAttackProtectionActivationThreshold`
+and :meth:`~xml.parsers.expat.xmlparser.SetBillionLaughsAttackProtectionMaximumAmplification`
+to :ref:`xmlparser <xmlparser-objects>` objects to mitigate `billion laughs`_
+attacks.
+(Contributed by Bénédikt Tran in :gh:`90949`.)
+
+.. _billion laughs: https://en.wikipedia.org/wiki/Billion_laughs_attack
+
+* Add :meth:`~xml.parsers.expat.xmlparser.SetAllocTrackerActivationThreshold`
+and :meth:`~xml.parsers.expat.xmlparser.SetAllocTrackerMaximumAmplification`
to :ref:`xmlparser <xmlparser-objects>` objects to prevent use of
disproportional amounts of dynamic memory from within an Expat parser.
(Contributed by Bénédikt Tran in :gh:`90949`.)
5 changes: 5 additions & 0 deletions Include/pyexpat.h
@@ -57,6 +57,11 @@ struct PyExpat_CAPI
XML_Parser parser, unsigned long long activationThresholdBytes);
XML_Bool (*SetAllocTrackerMaximumAmplification)(
XML_Parser parser, float maxAmplificationFactor);
/* might be NULL for expat < 2.4.0 */
XML_Bool (*SetBillionLaughsAttackProtectionActivationThreshold)(
XML_Parser parser, unsigned long long activationThresholdBytes);
XML_Bool (*SetBillionLaughsAttackProtectionMaximumAmplification)(
XML_Parser parser, float maxAmplificationFactor);
/* always add new stuff to the end! */
};

58 changes: 58 additions & 0 deletions Lib/test/test_pyexpat.py
@@ -958,6 +958,64 @@ def test_set_maximum_amplification__fail_for_subparser(self):
self.assert_root_parser_failure(setter, 123.45)


@unittest.skipIf(expat.version_info < (2, 4, 0), "requires Expat >= 2.4.0")
class ExpansionProtectionTest(AttackProtectionTestBase, unittest.TestCase):

    def assert_rejected(self, func, /, *args, **kwargs):
        """Check that func(*args, **kwargs) hits the amplification limit."""
        msg = (
            r"limit on input amplification factor \(from DTD and entities\) "
            r"breached: line \d+, column \d+"
        )
        self.assertRaisesRegex(expat.ExpatError, msg, func, *args, **kwargs)

    def set_activation_threshold(self, parser, threshold):
        return parser.SetBillionLaughsAttackProtectionActivationThreshold(threshold)

    def set_maximum_amplification(self, parser, max_factor):
        return parser.SetBillionLaughsAttackProtectionMaximumAmplification(max_factor)

    def test_set_activation_threshold__threshold_reached(self):
        parser = expat.ParserCreate()
        # Choose a threshold expected to be always reached.
        self.set_activation_threshold(parser, 3)
        # Check that the threshold is reached by choosing a small factor
        # and a payload whose peak amplification factor exceeds it.
        self.assertIsNone(self.set_maximum_amplification(parser, 1.0))
        payload = self.exponential_expansion_payload(ncols=10, nrows=4)
        self.assert_rejected(parser.Parse, payload, True)

    def test_set_activation_threshold__threshold_not_reached(self):
        parser = expat.ParserCreate()
        # Choose a threshold expected to be never reached.
        self.set_activation_threshold(parser, pow(10, 5))
        # Check that the threshold is not reached: even with a small factor
        # and a payload whose peak amplification exceeds it, parsing succeeds.
        self.assertIsNone(self.set_maximum_amplification(parser, 1.0))
        payload = self.exponential_expansion_payload(ncols=10, nrows=4)
        self.assertIsNotNone(parser.Parse(payload, True))

    def test_set_maximum_amplification__amplification_exceeded(self):
        parser = expat.ParserCreate()
        # Unconditionally enable the maximum amplification factor check.
        self.set_activation_threshold(parser, 0)
        # Choose a max amplification factor expected to always be exceeded.
        self.assertIsNone(self.set_maximum_amplification(parser, 1.0))
        # Craft a payload for which the peak amplification factor is > 1.0.
        payload = self.exponential_expansion_payload(ncols=1, nrows=2)
        self.assert_rejected(parser.Parse, payload, True)

    def test_set_maximum_amplification__amplification_not_exceeded(self):
        parser = expat.ParserCreate()
        # Unconditionally enable the maximum amplification factor check.
        self.set_activation_threshold(parser, 0)
        # Choose a max amplification factor expected to never be exceeded.
        self.assertIsNone(self.set_maximum_amplification(parser, 1e4))
        # Craft a payload for which the peak amplification factor is < 1e4.
        payload = self.exponential_expansion_payload(ncols=1, nrows=2)
        self.assertIsNotNone(parser.Parse(payload, True))
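
The `exponential_expansion_payload` helper used by these tests is not shown in this diff. As a rough illustration of the kind of document involved, the sketch below builds a small nested-entity payload and measures how much character data it expands to. Both `nested_entity_payload` and `expanded_length` are hypothetical names, and the layout of the payload is an assumption rather than a copy of the test helper.

```python
from xml.parsers import expat


def nested_entity_payload(fanout, depth):
    """Build a document whose entity e<depth> expands to fanout**depth chars."""
    lines = ['<?xml version="1.0"?>', "<!DOCTYPE doc [", '<!ENTITY e0 "x">']
    for level in range(1, depth + 1):
        # Each level references the previous one `fanout` times.
        refs = f"&e{level - 1};" * fanout
        lines.append(f'<!ENTITY e{level} "{refs}">')
    lines.append("]>")
    lines.append(f"<doc>&e{depth};</doc>")
    return "\n".join(lines).encode("ascii")


def expanded_length(payload):
    """Parse the payload and return the total length of its character data."""
    parser = expat.ParserCreate()
    chunks = []
    parser.CharacterDataHandler = chunks.append
    parser.Parse(payload, True)
    return sum(len(chunk) for chunk in chunks)


payload = nested_entity_payload(fanout=3, depth=3)
total = expanded_length(payload)
```

The expanded size grows as `fanout ** depth` while the document itself grows only linearly in `depth`, which is the behaviour the amplification limit is meant to bound. The small values here stay far below the default 8 MiB activation threshold.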


@unittest.skipIf(expat.version_info < (2, 7, 2), "requires Expat >= 2.7.2")
class MemoryProtectionTest(AttackProtectionTestBase, unittest.TestCase):

@@ -1,5 +1,5 @@
-Add :func:`~xml.parsers.expat.xmlparser.SetAllocTrackerActivationThreshold`
-and :func:`~xml.parsers.expat.xmlparser.SetAllocTrackerMaximumAmplification`
+Add :meth:`~xml.parsers.expat.xmlparser.SetAllocTrackerActivationThreshold`
+and :meth:`~xml.parsers.expat.xmlparser.SetAllocTrackerMaximumAmplification`
to :ref:`xmlparser <xmlparser-objects>` objects to prevent use of
disproportional amounts of dynamic memory from within an Expat parser.
Patch by Bénédikt Tran.
@@ -0,0 +1,7 @@
Add
:meth:`~xml.parsers.expat.xmlparser.SetBillionLaughsAttackProtectionActivationThreshold`
and
:meth:`~xml.parsers.expat.xmlparser.SetBillionLaughsAttackProtectionMaximumAmplification`
to :ref:`xmlparser <xmlparser-objects>` objects to mitigate `billion laughs
<https://en.wikipedia.org/wiki/Billion_laughs_attack>`_ attacks. Patch by
Bénédikt Tran.
136 changes: 135 additions & 1 deletion Modules/clinic/pyexpat.c.h
