-
-
Notifications
You must be signed in to change notification settings - Fork 33k
gh-139400: Make sure that parent parsers outlive their subparsers in pyexpat
#139403
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 15 commits
e1c0470
3b0afda
8f70991
a63bd34
22417ec
20caacd
f8c58a3
f1e3e36
1b2937d
8c5a4e8
a6f5520
4470cea
3759ac7
30598a9
6d3c424
66b77e0
455787f
a669454
a91ed6b
c056cfe
12b8989
0541d6f
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change | ||||||||
---|---|---|---|---|---|---|---|---|---|---|
|
@@ -771,6 +771,48 @@ def resolve_entity(context, base, system_id, public_id): | |||||||||
self.assertEqual(handler_call_args, [("bar", "baz")]) | ||||||||||
|
||||||||||
|
||||||||||
class ParentParserLifetimeTest(unittest.TestCase): | ||||||||||
""" | ||||||||||
Subparsers make use of their parent XML_Parser inside of Expat. | ||||||||||
As a result, parent parsers need to outlive subparsers. | ||||||||||
|
||||||||||
See https://github.com/python/cpython/issues/139400. | ||||||||||
""" | ||||||||||
|
||||||||||
def test_parent_parser_outlives_its_subparsers__single(self): | ||||||||||
parser = expat.ParserCreate() | ||||||||||
subparser = parser.ExternalEntityParserCreate(None) | ||||||||||
|
||||||||||
# Now try to cause garbage collection of the parent parser | ||||||||||
# while it's still being referenced by a related subparser | ||||||||||
del parser | ||||||||||
picnixz marked this conversation as resolved.
Show resolved
Hide resolved
hartwork marked this conversation as resolved.
Show resolved
Hide resolved
|
||||||||||
|
||||||||||
def test_parent_parser_outlives_its_subparsers__multiple(self): | ||||||||||
parser = expat.ParserCreate() | ||||||||||
subparser_one = parser.ExternalEntityParserCreate(None) | ||||||||||
subparser_two = parser.ExternalEntityParserCreate(None) | ||||||||||
|
||||||||||
# Now try to cause garbage collection of the parent parser | ||||||||||
# while it's still being referenced by a related subparser | ||||||||||
hartwork marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||||||||||
del parser | ||||||||||
|
||||||||||
def test_parent_parser_outlives_its_subparsers__chain(self): | ||||||||||
parser = expat.ParserCreate() | ||||||||||
subparser = parser.ExternalEntityParserCreate(None) | ||||||||||
subsubparser = subparser.ExternalEntityParserCreate(None) | ||||||||||
|
||||||||||
# Now try to cause garbage collection of the parent parsers | ||||||||||
# while they are still being referenced by a related subparser | ||||||||||
hartwork marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||||||||||
del parser | ||||||||||
del subparser | ||||||||||
|
||||||||||
def test_cycle(self): | ||||||||||
parser = expat.ParserCreate() | ||||||||||
subparser = parser.ExternalEntityParserCreate(None) | ||||||||||
parser.StartElementHandler = lambda _1, _2: subparser | ||||||||||
|
parser.StartElementHandler = lambda _1, _2: subparser | |
handler = lambda _1, _2: None | |
handler.subparser = subparser | |
parser.StartElementHandler = handler |
I don't know if it's sufficient for the handler to have a strong reference to the subparser if it's only returned.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@picnixz I have an idea how to maybe find out. Will try in the next hour.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you want me to wait for you to find out or can I merge the PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@picnixz depends on what the goal is: to merge with being sure that the test creates a cycle or being okay to potentially improve that test in a follow-up pull request (by you or me)? The two ideas I had were:
- your pointer to
-X showrefcount
gc.get_referrers
andgc.get_referents
I'm not getting anywhere with the latter yet, current impression is that the test does not yet actually create a cycle. How would you like to move foward?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@picnixz update: I extended the test_cycle
test now, please let me know what you think 🍻
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My intent was to just create a test where we would have possible of cycles. The simplest way to create cycles is to have something that references itself. Since it's not possible to mutate the type itself, I suggested that the handler creates the cycle itself. The handlers are callables and when we call Py_VISIT
, the default visitor function would visit their attributes, and one of these attributes would be the parser that we stored.
But I'm not entirely sure that my reasoning here is accurate. At least, that's how I always dealt with reference cycles and how I detected crashes, mostly by adding attributes that have references to the object or its type I'm trying to free.
Possibility one: use the following snippet:
import pyexpat as expat
p = expat.ParserCreate()
s = p.ExternalEntityParserCreate(None)
func = lambda *a, **kw: None
func.s = s
func.p = p
func.t = type(p)
p.StartElementHandler = func
p.Parse("<xml></xml>", True)
I think this should be enough to create cycles. But since I don't know how to check this (using gc in tests is not advised in general), the second possibility is to remove this test and investigate more later (I would appreciate if we had a debug option to see the cyclic isolates that the GC finds).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@picnixz I believe we want to limit ourselves to code we are certain about and also merge this pull request today. Let's drop the test then until we really know. Pushing…
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
:mod:`xml.parsers.expat`: Make sure that parent Expat parsers are only | ||
garbage-collected once they are no longer referenced by subparsers created | ||
by :meth:`~xml.parsers.expat.xmlparser.ExternalEntityParserCreate`. | ||
Patch by Sebastian Pipping. |
Uh oh!
There was an error while loading. Please reload this page.