-
-
Notifications
You must be signed in to change notification settings - Fork 33.3k
gh-137894: Fix segmentation fault in deeply nested filter() iterator chains #137896
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This comment was marked as resolved.
This comment was marked as resolved.
|
Please do not add |
|
Please add tests :) |
Done! |
Misc/NEWS.d/next/Core_and_Builtins/2025-08-18-08-14-52.gh-issue-137894.SrkIA_.rst
Outdated
Show resolved
Hide resolved
| del i | ||
| gc.collect() | ||
|
|
||
| @support.requires_resource('cpu') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it really need a CPU resource? (or maybe it's because of i = filter(bool, i)? how long does the test take? (seconds or minutes?)
Lib/test/test_builtin.py
Outdated
| del i | ||
| gc.collect() | ||
|
|
||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please don't simply remove things without answering my question. How long does the test take? does it take minutes, seconds or milliseconds? it it's more than 10 seconds or if the CPU is really strained, then we should keep the CPU resource. Otherwise we shouldn't.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to my performance test, the setup phase of this test (creating 100,000 nested filters) is very fast (approximately 0.02-0.03 seconds), but the execution phase (triggering recursion by calling list()) consumes a lot of time and CPU resources, and it may take several minutes to complete or trigger a RecursionError. Therefore, I think the@support.requires_resource('cpu')decorator should be retained, because although the setup of this test is fast, its actual execution will become a resource-intensive stress test.
|
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't confirm that this fixes the bug. With this PR applied, the test case takes several minutes and is not interruptible through CTRL^C, and it seemingly doesn't raise a RecursionError either. All of the CI jobs are crashing on your test.
|
@ZeroIntensity, see issue thread |
…update the cleanup function of filter_clear
|
Apparently, the issue was considered a non-issue as C stack exhaustion is hard to predict. Just adding a |
|
I closed the parent issue and so will close this PR as it doesn't solve the issue as observed by |
Summary
Fixed a critical segmentation fault bug in Python 3.13 where creating deeply nested chains of
filter()iterators would crash the interpreter instead of raising a proper Python exception. The fix adds atp_clearmethod to the filter object to prevent stack overflow during garbage collection of cyclic references.Problem
When creating deeply nested
filter()iterator chains (e.g., applyingfilter()thousands of times in a loop), Python 3.13 would crash with a segmentation fault instead of raising aRecursionError. This violates Python's fundamental principle that user code should never be able to crash the interpreter.Example problematic code:
Expected behavior: Should raise
RecursionErroror similar Python exceptionActual behavior: Segmentation fault (interpreter crash)
Root Cause
The
filterobject type inPython/bltinmodule.cwas missing atp_clearmethod implementation. During garbage collection of deeply nested filter iterator chains, the garbage collector would attempt to deallocate objects through recursive calls tofilter_dealloc(), eventually exceeding the C stack limit and causing a segmentation fault.Solution
Added a
tp_clearmethod to the filter object type:Implemented
filter_clear()function:Updated
PyFilter_Typedefinition:The
tp_clearmethod allows the garbage collector to break reference cycles safely without relying on deep recursive deallocation, preventing stack overflow crashes.