@@ -72,6 +72,13 @@ The :mod:`xml.parsers.expat` module contains two functions:
    *encoding* [1]_ is given it will override the implicit or explicit encoding of the
    document.
 
+   .. _xmlparser-non-root:
+
+   Parsers created through :func:`!ParserCreate` are called "root" parsers,
+   in the sense that they do not have any parent parser attached. Non-root
+   parsers are created by :meth:`parser.ExternalEntityParserCreate
+   <xmlparser.ExternalEntityParserCreate>`.
+
    Expat can optionally do XML namespace processing for you, enabled by providing a
    value for *namespace_separator*. The value must be a one-character string; a
    :exc:`ValueError` will be raised if the string has an illegal length (``None``
@@ -231,6 +238,111 @@ XMLParser Objects
    .. versionadded:: 3.13
 
 
+:class:`!xmlparser` objects have the following methods to tune protections
+against some common XML vulnerabilities.
+
+.. method:: xmlparser.SetBillionLaughsAttackProtectionActivationThreshold(threshold, /)
+
+   Sets the number of output bytes needed to activate protection against
+   `billion laughs`_ attacks.
+
+   The number of output bytes includes amplification from entity expansion
+   and reading DTD files.
+
+   Parser objects usually have a protection activation threshold of 8 MiB,
+   but the actual default value depends on the underlying Expat library.
+
+   An :exc:`ExpatError` is raised if this method is called on a
+   |xml-non-root-parser| parser.
+   The corresponding :attr:`~ExpatError.lineno` and :attr:`~ExpatError.offset`
+   should not be used as they may have no special meaning.
+
+   .. note::
+
+      Activation thresholds below 4 MiB are known to break support for DITA 1.3
+      payload and are hence not recommended.
+
+   .. versionadded:: next
+
+.. method:: xmlparser.SetBillionLaughsAttackProtectionMaximumAmplification(max_factor, /)
+
+   Sets the maximum tolerated amplification factor for protection against
+   `billion laughs`_ attacks.
+
+   The amplification factor is calculated as ``(direct + indirect) / direct``
+   while parsing, where ``direct`` is the number of bytes read from
+   the primary document in parsing and ``indirect`` is the number of
+   bytes added by expanding entities and reading of external DTD files.
+
+   The *max_factor* value must be a non-NaN :class:`float` value greater than
+   or equal to 1.0. Peak amplifications of factor 15,000 for the entire payload
+   and of factor 30,000 in the middle of parsing have been observed with small
+   benign files in practice. In particular, the activation threshold should be
+   carefully chosen to avoid false positives.
+
+   Parser objects usually have a maximum amplification factor of 100,
+   but the actual default value depends on the underlying Expat library.
+
+   An :exc:`ExpatError` is raised if this method is called on a
+   |xml-non-root-parser| parser or if *max_factor* is outside the valid range.
+   The corresponding :attr:`~ExpatError.lineno` and :attr:`~ExpatError.offset`
+   should not be used as they may have no special meaning.
+
+   .. note::
+
+      The maximum amplification factor is only considered if the threshold
+      that can be adjusted by :meth:`.SetBillionLaughsAttackProtectionActivationThreshold`
+      is exceeded.
+
+   .. versionadded:: next
+
+.. method:: xmlparser.SetAllocTrackerActivationThreshold(threshold, /)
+
+   Sets the number of allocated bytes of dynamic memory needed to activate
+   protection against disproportionate use of RAM.
+
+   Parser objects usually have an allocation activation threshold of 64 MiB,
+   but the actual default value depends on the underlying Expat library.
+
+   An :exc:`ExpatError` is raised if this method is called on a
+   |xml-non-root-parser| parser.
+   The corresponding :attr:`~ExpatError.lineno` and :attr:`~ExpatError.offset`
+   should not be used as they may have no special meaning.
+
+   .. versionadded:: next
+
+.. method:: xmlparser.SetAllocTrackerMaximumAmplification(max_factor, /)
+
+   Sets the maximum amplification factor between direct input and bytes
+   of dynamic memory allocated.
+
+   The amplification factor is calculated as ``allocated / direct``
+   while parsing, where ``direct`` is the number of bytes read from
+   the primary document in parsing and ``allocated`` is the number
+   of bytes of dynamic memory allocated in the parser hierarchy.
+
+   The *max_factor* value must be a non-NaN :class:`float` value greater than
+   or equal to 1.0. Amplification factors greater than 100.0 can be observed
+   near the start of parsing even with benign files in practice. In particular,
+   the activation threshold should be carefully chosen to avoid false positives.
+
+   Parser objects usually have a maximum amplification factor of 100,
+   but the actual default value depends on the underlying Expat library.
+
+   An :exc:`ExpatError` is raised if this method is called on a
+   |xml-non-root-parser| parser or if *max_factor* is outside the valid range.
+   The corresponding :attr:`~ExpatError.lineno` and :attr:`~ExpatError.offset`
+   should not be used as they may have no special meaning.
+
+   .. note::
+
+      The maximum amplification factor is only considered if the threshold
+      that can be adjusted by :meth:`.SetAllocTrackerActivationThreshold`
+      is exceeded.
+
+   .. versionadded:: next
+
+
 :class:`xmlparser` objects have the following attributes:
 
 
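Review note on the hunk above: the interplay between the activation threshold and the maximum amplification factor can be illustrated with plain arithmetic. This is only a sketch of the documented formula ``(direct + indirect) / direct``, not Expat's implementation. The helper names ``amplification`` and ``would_trip``, the byte counts, the reading of "output bytes" as ``direct + indirect``, and the strict ``>`` comparisons are all assumptions made for illustration.

```python
MIB = 1024 * 1024


def amplification(direct: int, indirect: int) -> float:
    """Documented billion-laughs factor: (direct + indirect) / direct.

    Assumes direct > 0, i.e. at least one byte of the primary document was read.
    """
    return (direct + indirect) / direct


def would_trip(direct: int, indirect: int,
               threshold: int = 8 * MIB, max_factor: float = 100.0) -> bool:
    """Hypothetical model: the factor only matters once the threshold is exceeded."""
    output_bytes = direct + indirect  # assumption: "output bytes" = direct + indirect
    return output_bytes > threshold and amplification(direct, indirect) > max_factor


# High factor, but total output stays below the 8 MiB threshold.
small = would_trip(direct=1_000, indirect=2_000_000)

# High factor and total output above the threshold.
large = would_trip(direct=10_000, indirect=16 * MIB)
```

In this model a small document with a very high factor is still accepted, which matches the note that the maximum amplification factor is only considered once the activation threshold is exceeded.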
@@ -954,3 +1066,6 @@ The ``errors`` module has the following attributes:
    not. See https://www.w3.org/TR/2006/REC-xml11-20060816/#NT-EncodingDecl
    and https://www.iana.org/assignments/character-sets/character-sets.xhtml.
 
+
+.. _billion laughs: https://en.wikipedia.org/wiki/Billion_laughs_attack
+.. |xml-non-root-parser| replace:: :ref:`non-root <xmlparser-non-root>`
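Review note: a minimal sketch of how the two billion-laughs setters documented in this diff might be called on a root parser. The method names come from the patch. The helper ``tune_parser`` and its default values are hypothetical, and the ``getattr`` guard assumes that Python builds without this change simply do not expose the methods.

```python
import xml.parsers.expat as expat

MIB = 1024 * 1024


def tune_parser(parser, threshold_bytes=8 * MIB, max_factor=100.0):
    """Apply billion-laughs limits where available; return the setters applied.

    Hypothetical helper for illustration, not part of the patch.
    """
    settings = (
        ("SetBillionLaughsAttackProtectionActivationThreshold", threshold_bytes),
        ("SetBillionLaughsAttackProtectionMaximumAmplification", max_factor),
    )
    applied = []
    for name, value in settings:
        setter = getattr(parser, name, None)
        if setter is None:
            # Assumption: this build predates the new methods.
            continue
        try:
            setter(value)
        except expat.ExpatError:
            # Documented failure modes: non-root parser, or max_factor
            # outside the valid range.
            continue
        applied.append(name)
    return applied


# ParserCreate() returns a "root" parser, the only kind the setters accept.
parser = expat.ParserCreate()
applied = tune_parser(parser)

seen = []
parser.StartElementHandler = lambda name, attrs: seen.append(name)
parser.Parse(b"<root>ok</root>", True)
```

Per the documentation in the patch, parsers obtained from ``ExternalEntityParserCreate`` are non-root, so the same calls on them are expected to raise ``ExpatError``.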