-
-
Notifications
You must be signed in to change notification settings - Fork 33.1k
Description
Crash report
What happened?
heap overflow by corrupted PySetObject internal data(used
)
Version
Python 3.13.7 (main, Sep 14 2025, 14:06:10) [GCC 13.3.0]
Root Cause
-
CPython PySetObject entry states: Python sets maintain three types of entries:
- active: successfully inserted entries
- dummy: entries that were inserted but subsequently deleted
- unused(null): uninitialized/empty entries
https://github.com/python/cpython/blob/3.13/Include/cpython/setobject.h#L7-L9
-
Hash collision exploitation: The hash value is used for both table indexing in set_add_entry and entry comparison. Setting hash to 0 causes deterministic traversal starting from index 0, creating intentional hash collisions between entries.
https://github.com/python/cpython/blob/3.13/Objects/setobject.c#L144
-
Dummy entry setup: Two dummy objects that return hash 0 are inserted into the set, then deleted, leaving behind dummy entries.
-
Recursive eq calls and exception handling: The eq method is the callback executed when set entries have identical hashes to resolve hash collisions. Adding a corruption object triggers recursive calls that eventually hit Python's maximum recursion depth, causing an error return and raising an exception.
https://github.com/python/cpython/blob/3.13/Objects/setobject.c#L165
-
Freeslot handling logic: When Python set encounters a dummy entry, it stores the location in freeslot. When it subsequently encounters a null entry, it moves to the dummy_or_freeslot location.
https://github.com/python/cpython/blob/3.13/Objects/setobject.c#L175-L177
https://github.com/python/cpython/blob/3.13/Objects/setobject.c#L152-L153
-
Entry overwrite at dummy_or_freeslot: If an entry's hash matches a dummy's hash, Python writes the key and hash to that dummy entry at the dummy_or_freeslot location.
https://github.com/python/cpython/blob/3.13/Objects/setobject.c#L185-L191
-
Exception handling and recursion unwind corruption: Even though the exception is caught and handled with try-except, as the recursion stack unwinds, each recursive call attempts to complete its add_entry operation. Since freeslot was determined deterministically, all recursive calls reference the same memory address. During the unwind process, each level overwrites the same freeslot location and incorrectly increments used += 1, as the function determines the insertion was successful.
recusively overwrite same freeslot location
Breakpoint 3, set_add_entry (so=so@entry=0x5110001dcd20, key=key@entry=<weakref.ReferenceType at remote 0x50b00015e640>, hash=0) at Objects/setobject.c:191
191 return 0;
1: freeslot = (setentry *) 0x5110001dcd60 // same freeslot location
2: x/2xg freeslot
0x5110001dcd60: 0x000050b00015e640 0x0000000000000000
(gdb) py-bt
Traceback (most recent call first):
File "/home/ubuntu/Python-3.13.7/Lib/_weakrefset.py", line 88, in add
self.data.add(ref(item, self._remove))
File "/home/ubuntu/Python-3.13.7/test.py", line 11, in __eq__
_asyncio._register_task(self)
File "/home/ubuntu/Python-3.13.7/Lib/_weakrefset.py", line 88, in add
self.data.add(ref(item, self._remove))
File "/home/ubuntu/Python-3.13.7/test.py", line 11, in __eq__
_asyncio._register_task(self)
File "/home/ubuntu/Python-3.13.7/Lib/_weakrefset.py", line 88, in add
self.data.add(ref(item, self._remove))
File "/home/ubuntu/Python-3.13.7/test.py", line 11, in __eq__
_asyncio._register_task(self)
File "/home/ubuntu/Python-3.13.7/Lib/_weakrefset.py", line 88, in add
self.data.add(ref(item, self._remove))
<built-in method _register_task of module object at remote 0x5080000de5c0>
File "/home/ubuntu/Python-3.13.7/test.py", line 11, in __eq__
_asyncio._register_task(self)
File "/home/ubuntu/Python-3.13.7/Lib/_weakrefset.py", line 88, in add
self.data.add(ref(item, self._remove))
<built-in method _register_task of module object at remote 0x5080000de5c0>
File "/home/ubuntu/Python-3.13.7/test.py", line 23, in <module>
_asyncio._register_task(a[1])
(gdb) c
Continuing.
Breakpoint 3, set_add_entry (so=so@entry=0x5110001dcd20, key=key@entry=<weakref.ReferenceType at remote 0x50b00015e590>, hash=0) at Objects/setobject.c:191
191 return 0;
1: freeslot = (setentry *) 0x5110001dcd60 // same freeslot location
2: x/2xg freeslot
0x5110001dcd60: 0x000050b00015e590 0x0000000000000000
(gdb) py-bt // different python stack
Traceback (most recent call first):
File "/home/ubuntu/Python-3.13.7/Lib/_weakrefset.py", line 88, in add
self.data.add(ref(item, self._remove))
File "/home/ubuntu/Python-3.13.7/test.py", line 11, in __eq__
_asyncio._register_task(self)
File "/home/ubuntu/Python-3.13.7/Lib/_weakrefset.py", line 88, in add
self.data.add(ref(item, self._remove))
File "/home/ubuntu/Python-3.13.7/test.py", line 11, in __eq__
_asyncio._register_task(self)
File "/home/ubuntu/Python-3.13.7/Lib/_weakrefset.py", line 88, in add
self.data.add(ref(item, self._remove))
<built-in method _register_task of module object at remote 0x5080000de5c0>
File "/home/ubuntu/Python-3.13.7/test.py", line 11, in __eq__
_asyncio._register_task(self)
File "/home/ubuntu/Python-3.13.7/Lib/_weakrefset.py", line 88, in add
self.data.add(ref(item, self._remove))
<built-in method _register_task of module object at remote 0x5080000de5c0>
File "/home/ubuntu/Python-3.13.7/test.py", line 23, in <module>
_asyncio._register_task(a[1])
(gdb) c
Continuing.
Breakpoint 3, set_add_entry (so=so@entry=0x5110001dcd20, key=key@entry=<weakref.ReferenceType at remote 0x50b00015e4e0>, hash=0) at Objects/setobject.c:191
191 return 0;
1: freeslot = (setentry *) 0x5110001dcd60 // same freeslot location
2: x/2xg freeslot
0x5110001dcd60: 0x000050b00015e4e0 0x0000000000000000
(gdb) py-bt // different python stack
Traceback (most recent call first):
File "/home/ubuntu/Python-3.13.7/Lib/_weakrefset.py", line 88, in add
self.data.add(ref(item, self._remove))
File "/home/ubuntu/Python-3.13.7/test.py", line 11, in __eq__
_asyncio._register_task(self)
File "/home/ubuntu/Python-3.13.7/Lib/_weakrefset.py", line 88, in add
self.data.add(ref(item, self._remove))
<built-in method _register_task of module object at remote 0x5080000de5c0>
File "/home/ubuntu/Python-3.13.7/test.py", line 11, in __eq__
_asyncio._register_task(self)
File "/home/ubuntu/Python-3.13.7/Lib/_weakrefset.py", line 88, in add
self.data.add(ref(item, self._remove))
<built-in method _register_task of module object at remote 0x5080000de5c0>
File "/home/ubuntu/Python-3.13.7/test.py", line 23, in <module>
_asyncio._register_task(a[1])
(gdb) c
Continuing.
Breakpoint 3, set_add_entry (so=so@entry=0x5110001dcd20, key=key@entry=<weakref.ReferenceType at remote 0x50b00015e430>, hash=0) at Objects/setobject.c:191
191 return 0;
1: freeslot = (setentry *) 0x5110001dcd60 // same freeslot location
2: x/2xg freeslot
0x5110001dcd60: 0x000050b00015e430 0x0000000000000000
(gdb) py-bt // different python stack
Traceback (most recent call first):
File "/home/ubuntu/Python-3.13.7/Lib/_weakrefset.py", line 88, in add
self.data.add(ref(item, self._remove))
<built-in method _register_task of module object at remote 0x5080000de5c0>
File "/home/ubuntu/Python-3.13.7/test.py", line 11, in __eq__
_asyncio._register_task(self)
File "/home/ubuntu/Python-3.13.7/Lib/_weakrefset.py", line 88, in add
self.data.add(ref(item, self._remove))
<built-in method _register_task of module object at remote 0x5080000de5c0>
File "/home/ubuntu/Python-3.13.7/test.py", line 23, in <module>
_asyncio._register_task(a[1])
(gdb) c
Continuing.
Breakpoint 3, set_add_entry (so=so@entry=0x5110001dcd20, key=key@entry=<weakref.ReferenceType at remote 0x50b00015e380>, hash=0) at Objects/setobject.c:191
191 return 0;
1: freeslot = (setentry *) 0x5110001dcd60 // same freeslot location
2: x/2xg freeslot
0x5110001dcd60: 0x000050b00015e380 0x0000000000000000
(gdb) py-bt // different python stack
Traceback (most recent call first):
File "/home/ubuntu/Python-3.13.7/Lib/_weakrefset.py", line 88, in add
self.data.add(ref(item, self._remove))
<built-in method _register_task of module object at remote 0x5080000de5c0>
File "/home/ubuntu/Python-3.13.7/test.py", line 23, in <module>
_asyncio._register_task(a[1])
(gdb) c
- Entry counter bug: This results in the set's used counter being incremented multiple times for what is essentially a single memory location, leading to incorrect set size tracking and potential memory corruption.
entry counter(used : 70 ) bug
Breakpoint 4, set_dealloc (so=0x5110001dcd20) at Objects/setobject.c:495
495 {
1: *(PySetObject *)so = {ob_base = {{ob_refcnt = 0, ob_refcnt_split = {0, 0}}, ob_type = 0x555556228520 <PySet_Type>}, fill = 2, used = 70 //<--
, mask = 7, table = 0x5110001dcd60, hash = -1, finger = 0, smalltable = {{key = 0x555556228f40 <_dummy_struct>, hash = -1}, {key = 0x555556228f40 <_dummy_struct>,
hash = -1}, {key = 0x0, hash = 0}, {key = 0x0, hash = 0}, {key = 0x0, hash = 0}, {key = 0x0, hash = 0}, {key = 0x0, hash = 0}, {key = 0x0,
hash = 0}}, weakreflist = 0x0}
(gdb) x/8gx 0x5110001dcd60 <-- PySetObject internal table address
0x5110001dcd60: 0x0000555556228f40 0xffffffffffffffff
0x5110001dcd70: 0x0000555556228f40 0xffffffffffffffff
0x5110001dcd80: 0x0000000000000000 0x0000000000000000
0x5110001dcd90: 0x0000000000000000 0x0000000000000000
0x5110001dcda0: 0x0000000000000000 0x0000000000000000
0x5110001dcdb0: 0x0000000000000000 0x0000000000000000
0x5110001dcdc0: 0x0000000000000000 0x0000000000000000
0x5110001dcdd0: 0x0000000000000000 0x0000000000000000
-
Memory corruption during deallocation: In set_dealloc, the deallocation routine iterates through the table by incrementing addresses sequentially. When it encounters non-dummy entries, it performs cleanup and decrements used--. However, since used is now greater than the actual (fill - dummy) count due to the previous corruption, the deallocation process accesses invalid memory addresses beyond the allocated table bounds, leading to memory corruption or segmentation faults.
https://github.com/python/cpython/blob/3.13/Objects/setobject.c#L494-L514
POC
import _asyncio
class dummy(object):
def __hash__(self):
return 0
class CorrupTrigger():
def __hash__(self):
return 0
def __eq__(self,other):
try:
_asyncio._register_task(self)
except:
pass
return False
a = [dummy(),dummy()]
_asyncio._register_task(a[0])
_asyncio._register_task(a[1])
del a # make dummy entries in set
print (_asyncio._scheduled_tasks)
a = [CorrupTrigger() for i in range(2)]
_asyncio._register_task(a[0])
_asyncio._register_task(a[1]) # make recursive comparision
ASAN
asan
==4613==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x5110001dcdf0 at pc 0x555555a458a5 bp 0x7fffffffdb20 sp 0x7fffffffdb10
READ of size 8 at 0x5110001dcdf0 thread T0
#0 0x555555a458a4 in set_dealloc Objects/setobject.c:505
#1 0x555555a0c1dc in _Py_Dealloc Objects/object.c:2939
#2 0x5555559d8d5a in Py_DECREF Include/object.h:934
#3 0x5555559d9787 in set_dict_inline_values Objects/dictobject.c:7151
#4 0x5555559e55ad in _PyObject_SetManagedDict Objects/dictobject.c:7180
#5 0x5555559e5667 in PyObject_ClearManagedDict Objects/dictobject.c:7222
#6 0x555555a746da in subtype_dealloc Objects/typeobject.c:2359
#7 0x555555a0c1dc in _Py_Dealloc Objects/object.c:2939
#8 0x7ffff6ff7dd8 in Py_DECREF Include/object.h:934
#9 0x7ffff6ff83a7 in module_clear Modules/_asynciomodule.c:3584
#10 0x555555a05030 in module_clear Objects/moduleobject.c:1103
#11 0x555555c3a017 in delete_garbage Python/gc.c:1015
#12 0x555555c3b4f9 in gc_collect_main Python/gc.c:1421
#13 0x555555c3c05f in _PyGC_CollectNoFail Python/gc.c:1657
#14 0x555555ca8fdb in finalize_modules Python/pylifecycle.c:1797
#15 0x555555cb2079 in _Py_Finalize Python/pylifecycle.c:2134
#16 0x555555cb21ad in Py_FinalizeEx Python/pylifecycle.c:2261
#17 0x555555d15c05 in Py_RunMain Modules/main.c:777
#18 0x555555d15de7 in pymain_main Modules/main.c:805
#19 0x555555d1616c in Py_BytesMain Modules/main.c:829
#20 0x5555557de445 in main Programs/python.c:15
#21 0x7ffff742a1c9 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
#22 0x7ffff742a28a in __libc_start_main_impl ../csu/libc-start.c:360
#23 0x5555557de374 in _start (/home/ubuntu/Python-3.13.7/python+0x28a374) (BuildId: c1408b7ecd69be204d23a50df1288230bd9ca6bd)
0x5110001dcdf0 is located 0 bytes after 240-byte region [0x5110001dcd00,0x5110001dcdf0)
allocated by thread T0 here:
#0 0x7ffff78fd9c7 in malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:69
#1 0x555555a15097 in _PyMem_RawMalloc Objects/obmalloc.c:62
#2 0x555555a14468 in _PyMem_DebugRawAlloc Objects/obmalloc.c:2792
#3 0x555555a144d0 in _PyMem_DebugRawMalloc Objects/obmalloc.c:2825
#4 0x555555a16a0a in _PyMem_DebugMalloc Objects/obmalloc.c:2982
#5 0x555555a3e147 in PyObject_Malloc Objects/obmalloc.c:1400
#6 0x555555a6ea6f in _PyObject_MallocWithType Include/internal/pycore_object_alloc.h:46
#7 0x555555a6ea6f in _PyType_AllocNoTrack Objects/typeobject.c:2041
#8 0x555555a6ebf8 in PyType_GenericAlloc Objects/typeobject.c:2070
#9 0x555555a47924 in make_new_set Objects/setobject.c:1086
#10 0x555555a4964c in set_vectorcall Objects/setobject.c:2379
#11 0x555555bb4e2c in _PyEval_EvalFrameDefault Python/generated_cases.c.h:1123
#12 0x555555bdc0ae in _PyEval_EvalFrame Include/internal/pycore_ceval.h:119
#13 0x555555bdc28b in _PyEval_Vector Python/ceval.c:1816
#14 0x55555595a39b in _PyFunction_Vectorcall Objects/call.c:413
#15 0x55555595d4eb in _PyObject_VectorcallDictTstate Objects/call.c:135
#16 0x55555595d910 in _PyObject_Call_Prepend Objects/call.c:504
#17 0x555555a7debc in slot_tp_init Objects/typeobject.c:9816
#18 0x555555a719e0 in type_call Objects/typeobject.c:1997
#19 0x55555595a659 in _PyObject_MakeTpCall Objects/call.c:242
#20 0x55555595a93a in _PyObject_VectorcallTstate Include/internal/pycore_call.h:166
#21 0x55555595a96a in PyObject_CallNoArgs Objects/call.c:106
#22 0x7ffff7004bcc in module_init Modules/_asynciomodule.c:3665
#23 0x7ffff7004fb8 in module_exec Modules/_asynciomodule.c:3734
#24 0x555555a06fa6 in PyModule_ExecDef Objects/moduleobject.c:491
#25 0x555555c5d45a in exec_builtin_or_dynamic Python/import.c:815
#26 0x555555c5d47d in _imp_exec_dynamic_impl Python/import.c:4770
#27 0x555555c5de55 in _imp_exec_dynamic Python/clinic/import.c.h:513
#28 0x555555a02aeb in cfunction_vectorcall_O Objects/methodobject.c:511
#29 0x55555595db69 in _PyVectorcall_Call Objects/call.c:273
#30 0x55555595e1a8 in _PyObject_Call Objects/call.c:348
SUMMARY: AddressSanitizer: heap-buffer-overflow Objects/setobject.c:505 in set_dealloc
Shadow bytes around the buggy address:
0x5110001dcb00: fd fd fd fd fd fd fd fd fd fd fd fd fd fa fa fa
0x5110001dcb80: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
0x5110001dcc00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x5110001dcc80: 00 00 00 00 00 00 fa fa fa fa fa fa fa fa fa fa
0x5110001dcd00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x5110001dcd80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00[fa]fa
0x5110001dce00: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
0x5110001dce80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x5110001dcf00: 00 00 00 00 00 00 fa fa fa fa fa fa fa fa fa fa
0x5110001dcf80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x5110001dd000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
==4613==ABORTING
CPython versions tested on:
CPython main branch
Operating systems tested on:
Linux
Output from running 'python -VV' on the command line:
Python 3.13.7 (main, Sep 14 2025, 14:06:10) [GCC 13.3.0]