-
-
Notifications
You must be signed in to change notification settings - Fork 33k
gh-139210: Fix use-after-free in _elementtree_XMLParser__setevents_impl #139211
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Does anyone know how to add a test for this? I don't see how as the normal interpreter doesn't crash with this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The core change looks good. I would just add a test case using something like the original repro. We shouldn't need anything from gettext
or fancy Unicode characters to reproduce this, as long as the iterable to events
has a reference count of 1 and the string inside it is not interned/immortal.
I think this only crashes with the JIT because the JIT avoids reference count increases, right? On the normal build, the interpreter is seemingly holding an extra reference to the event name on its stack, which keeps it alive just long enough to be used in the exception. That step is probably optimized out when the JIT is enabled, so the Py_DECREF(events_seq)
will actually deallocate the string.
@@ -0,0 +1 @@ | |||
Fix use after free when handling unicode characters in ``xml.etree.ElementTree.iterparse``. Patch by Ken Jin. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some nitpicks:
Fix use after free when handling unicode characters in ``xml.etree.ElementTree.iterparse``. Patch by Ken Jin. | |
Fix use-after-free when handling unicode characters in :func:`xml.etree.ElementTree.iterparse`. Patch by Ken Jin. |
This also isn't really specific to Unicode characters, but I don't mind keeping the mention.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe "reporting unknown event"?
No ASAN triggers on the normal build as well it seems. |
Yeah, that makes sense. The memory isn't freed on the default build (at least in the repro), because something else has an extra reference (likely the interpreter stack). It'd probably be possible to get a repro for the default build by constructing some iterator in C that the interpreter can't touch. |
The modified code path is already tested by these test_xml_etree_c tests:
I don't know how to trigger a crash on these tests with the old (current) code. I don't think that it's worth it to write more tests. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please add a test?
As I wrote, there are already two tests on the modified code. It's just that it doesn't crash for subtle reasons (see previous comments). |
Something like this: diff --git a/Lib/test/test_xml_etree.py b/Lib/test/test_xml_etree.py
index bf6d5074fde..f65baa0cfae 100644
--- a/Lib/test/test_xml_etree.py
+++ b/Lib/test/test_xml_etree.py
@@ -1749,6 +1749,8 @@ def __next__(self):
def test_unknown_event(self):
with self.assertRaises(ValueError):
ET.XMLPullParser(events=('start', 'end', 'bogus'))
+ with self.assertRaisesRegex(ValueError, "unknown event 'bogus'"):
+ ET.XMLPullParser(events=(x.decode() for x in (b'start', b'end', b'bogus')))
@unittest.skipIf(pyexpat.version_info < (2, 6, 0),
f'Expat {pyexpat.version_info} does not ' |
Co-Authored-By: Serhiy Storchaka <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, LGTM as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. 👍
Thanks @Fidget-Spinner for the PR, and @serhiy-storchaka for merging it 🌮🎉.. I'm working now to backport this PR to: 3.14. |
…e() (pythonGH-139211) (cherry picked from commit c86eb4d) Co-authored-by: Ken Jin <[email protected]>
Thanks @Fidget-Spinner for the PR, and @serhiy-storchaka for merging it 🌮🎉.. I'm working now to backport this PR to: 3.13. |
…e() (pythonGH-139211) (cherry picked from commit c86eb4d) Co-authored-by: Ken Jin <[email protected]>
GH-139455 is a backport of this pull request to the 3.14 branch. |
GH-139456 is a backport of this pull request to the 3.13 branch. |
…se() (GH-139211) (GH-139456) (cherry picked from commit c86eb4d) Co-authored-by: Ken Jin <[email protected]>
I just noticed that the NEWS entry was added in the wrong section. This is not a builtin and not a part of the interpreter core. @Fidget-Spinner, could you please create a PR for moving it in the "Library" section? |
@serhiy-storchaka sorry I didnt realize either. Will fix it tomorrow |
…e() (pythonGH-139211) (cherry picked from commit c86eb4d) Co-authored-by: Ken Jin <[email protected]>
_elementtree_XMLParser__setevents_impl
#139210