Skip to content

heap overflow in set_dealloc (setobject.c) #139071

@kcatss

Description

@kcatss

Crash report

What happened?

heap overflow by corrupted PySetObject internal data(used)

Version

Python 3.13.7 (main, Sep 14 2025, 14:06:10) [GCC 13.3.0]

Root Cause

  1. CPython PySetObject entry states: Python sets maintain three types of entries:

    • active: successfully inserted entries
    • dummy: entries that were inserted but subsequently deleted
    • unused(null): uninitialized/empty entries

    https://github.com/python/cpython/blob/3.13/Include/cpython/setobject.h#L7-L9

  2. Hash collision exploitation: The hash value is used for both table indexing in set_add_entry and entry comparison. Setting hash to 0 causes deterministic traversal starting from index 0, creating intentional hash collisions between entries.

    https://github.com/python/cpython/blob/3.13/Objects/setobject.c#L144

  3. Dummy entry setup: Two dummy objects that return hash 0 are inserted into the set, then deleted, leaving behind dummy entries.

  4. Recursive eq calls and exception handling: The eq method is the callback executed when set entries have identical hashes to resolve hash collisions. Adding a corruption object triggers recursive calls that eventually hit Python's maximum recursion depth, causing an error return and raising an exception.

    https://github.com/python/cpython/blob/3.13/Objects/setobject.c#L165

  5. Freeslot handling logic: When Python set encounters a dummy entry, it stores the location in freeslot. When it subsequently encounters a null entry, it moves to the dummy_or_freeslot location.

    https://github.com/python/cpython/blob/3.13/Objects/setobject.c#L175-L177

    https://github.com/python/cpython/blob/3.13/Objects/setobject.c#L152-L153

  6. Entry overwrite at dummy_or_freeslot: If an entry's hash matches a dummy's hash, Python writes the key and hash to that dummy entry at the dummy_or_freeslot location.

    https://github.com/python/cpython/blob/3.13/Objects/setobject.c#L185-L191

  7. Exception handling and recursion unwind corruption: Even though the exception is caught and handled with try-except, as the recursion stack unwinds, each recursive call attempts to complete its add_entry operation. Since freeslot was determined deterministically, all recursive calls reference the same memory address. During the unwind process, each level overwrites the same freeslot location and incorrectly increments used += 1, as the function determines the insertion was successful.

recusively overwrite same freeslot location
Breakpoint 3, set_add_entry (so=so@entry=0x5110001dcd20, key=key@entry=<weakref.ReferenceType at remote 0x50b00015e640>, hash=0) at Objects/setobject.c:191
191         return 0;
1: freeslot = (setentry *) 0x5110001dcd60  // same freeslot location
2: x/2xg freeslot
0x5110001dcd60: 0x000050b00015e640      0x0000000000000000
(gdb) py-bt
Traceback (most recent call first):
  File "/home/ubuntu/Python-3.13.7/Lib/_weakrefset.py", line 88, in add
    self.data.add(ref(item, self._remove))
  File "/home/ubuntu/Python-3.13.7/test.py", line 11, in __eq__
    _asyncio._register_task(self)
  File "/home/ubuntu/Python-3.13.7/Lib/_weakrefset.py", line 88, in add
    self.data.add(ref(item, self._remove))
  File "/home/ubuntu/Python-3.13.7/test.py", line 11, in __eq__
    _asyncio._register_task(self)
  File "/home/ubuntu/Python-3.13.7/Lib/_weakrefset.py", line 88, in add
    self.data.add(ref(item, self._remove))
  File "/home/ubuntu/Python-3.13.7/test.py", line 11, in __eq__
    _asyncio._register_task(self)
  File "/home/ubuntu/Python-3.13.7/Lib/_weakrefset.py", line 88, in add
    self.data.add(ref(item, self._remove))
  <built-in method _register_task of module object at remote 0x5080000de5c0>
  File "/home/ubuntu/Python-3.13.7/test.py", line 11, in __eq__
    _asyncio._register_task(self)
  File "/home/ubuntu/Python-3.13.7/Lib/_weakrefset.py", line 88, in add
    self.data.add(ref(item, self._remove))
  <built-in method _register_task of module object at remote 0x5080000de5c0>
  File "/home/ubuntu/Python-3.13.7/test.py", line 23, in <module>
    _asyncio._register_task(a[1])
(gdb) c
Continuing.

Breakpoint 3, set_add_entry (so=so@entry=0x5110001dcd20, key=key@entry=<weakref.ReferenceType at remote 0x50b00015e590>, hash=0) at Objects/setobject.c:191
191         return 0;
1: freeslot = (setentry *) 0x5110001dcd60    // same freeslot location
2: x/2xg freeslot  
0x5110001dcd60: 0x000050b00015e590      0x0000000000000000
(gdb) py-bt // different python stack
Traceback (most recent call first):
  File "/home/ubuntu/Python-3.13.7/Lib/_weakrefset.py", line 88, in add
    self.data.add(ref(item, self._remove))
  File "/home/ubuntu/Python-3.13.7/test.py", line 11, in __eq__
    _asyncio._register_task(self)
  File "/home/ubuntu/Python-3.13.7/Lib/_weakrefset.py", line 88, in add
    self.data.add(ref(item, self._remove))
  File "/home/ubuntu/Python-3.13.7/test.py", line 11, in __eq__
    _asyncio._register_task(self)
  File "/home/ubuntu/Python-3.13.7/Lib/_weakrefset.py", line 88, in add
    self.data.add(ref(item, self._remove))
  <built-in method _register_task of module object at remote 0x5080000de5c0>
  File "/home/ubuntu/Python-3.13.7/test.py", line 11, in __eq__
    _asyncio._register_task(self)
  File "/home/ubuntu/Python-3.13.7/Lib/_weakrefset.py", line 88, in add
    self.data.add(ref(item, self._remove))
  <built-in method _register_task of module object at remote 0x5080000de5c0>
  File "/home/ubuntu/Python-3.13.7/test.py", line 23, in <module>
    _asyncio._register_task(a[1])
(gdb) c
Continuing.

Breakpoint 3, set_add_entry (so=so@entry=0x5110001dcd20, key=key@entry=<weakref.ReferenceType at remote 0x50b00015e4e0>, hash=0) at Objects/setobject.c:191
191         return 0;
1: freeslot = (setentry *) 0x5110001dcd60  // same freeslot location
2: x/2xg freeslot
0x5110001dcd60: 0x000050b00015e4e0      0x0000000000000000
(gdb) py-bt // different python stack
Traceback (most recent call first):
  File "/home/ubuntu/Python-3.13.7/Lib/_weakrefset.py", line 88, in add
    self.data.add(ref(item, self._remove))
  File "/home/ubuntu/Python-3.13.7/test.py", line 11, in __eq__
    _asyncio._register_task(self)
  File "/home/ubuntu/Python-3.13.7/Lib/_weakrefset.py", line 88, in add
    self.data.add(ref(item, self._remove))
  <built-in method _register_task of module object at remote 0x5080000de5c0>
  File "/home/ubuntu/Python-3.13.7/test.py", line 11, in __eq__
    _asyncio._register_task(self)
  File "/home/ubuntu/Python-3.13.7/Lib/_weakrefset.py", line 88, in add
    self.data.add(ref(item, self._remove))
  <built-in method _register_task of module object at remote 0x5080000de5c0>
  File "/home/ubuntu/Python-3.13.7/test.py", line 23, in <module>
    _asyncio._register_task(a[1])
(gdb) c
Continuing.

Breakpoint 3, set_add_entry (so=so@entry=0x5110001dcd20, key=key@entry=<weakref.ReferenceType at remote 0x50b00015e430>, hash=0) at Objects/setobject.c:191
191         return 0;
1: freeslot = (setentry *) 0x5110001dcd60  // same freeslot location
2: x/2xg freeslot
0x5110001dcd60: 0x000050b00015e430      0x0000000000000000
(gdb) py-bt // different python stack
Traceback (most recent call first):
  File "/home/ubuntu/Python-3.13.7/Lib/_weakrefset.py", line 88, in add
    self.data.add(ref(item, self._remove))
  <built-in method _register_task of module object at remote 0x5080000de5c0>
  File "/home/ubuntu/Python-3.13.7/test.py", line 11, in __eq__
    _asyncio._register_task(self)
  File "/home/ubuntu/Python-3.13.7/Lib/_weakrefset.py", line 88, in add
    self.data.add(ref(item, self._remove))
  <built-in method _register_task of module object at remote 0x5080000de5c0>
  File "/home/ubuntu/Python-3.13.7/test.py", line 23, in <module>
    _asyncio._register_task(a[1])
(gdb) c
Continuing.

Breakpoint 3, set_add_entry (so=so@entry=0x5110001dcd20, key=key@entry=<weakref.ReferenceType at remote 0x50b00015e380>, hash=0) at Objects/setobject.c:191
191         return 0;
1: freeslot = (setentry *) 0x5110001dcd60  // same freeslot location
2: x/2xg freeslot
0x5110001dcd60: 0x000050b00015e380      0x0000000000000000
(gdb) py-bt // different python stack
Traceback (most recent call first):
  File "/home/ubuntu/Python-3.13.7/Lib/_weakrefset.py", line 88, in add
    self.data.add(ref(item, self._remove))
  <built-in method _register_task of module object at remote 0x5080000de5c0>
  File "/home/ubuntu/Python-3.13.7/test.py", line 23, in <module>
    _asyncio._register_task(a[1])
(gdb) c

  1. Entry counter bug: This results in the set's used counter being incremented multiple times for what is essentially a single memory location, leading to incorrect set size tracking and potential memory corruption.
entry counter(used : 70 ) bug
Breakpoint 4, set_dealloc (so=0x5110001dcd20) at Objects/setobject.c:495
495     {
1: *(PySetObject *)so = {ob_base = {{ob_refcnt = 0, ob_refcnt_split = {0, 0}}, ob_type = 0x555556228520 <PySet_Type>}, fill = 2, used = 70 //<-- 
, mask = 7,   table = 0x5110001dcd60, hash = -1, finger = 0, smalltable = {{key = 0x555556228f40 <_dummy_struct>, hash = -1}, {key = 0x555556228f40 <_dummy_struct>,
      hash = -1}, {key = 0x0, hash = 0}, {key = 0x0, hash = 0}, {key = 0x0, hash = 0}, {key = 0x0, hash = 0}, {key = 0x0, hash = 0}, {key = 0x0,
      hash = 0}}, weakreflist = 0x0}

(gdb) x/8gx 0x5110001dcd60 <-- PySetObject internal table address

0x5110001dcd60: 0x0000555556228f40      0xffffffffffffffff
0x5110001dcd70: 0x0000555556228f40      0xffffffffffffffff
0x5110001dcd80: 0x0000000000000000      0x0000000000000000
0x5110001dcd90: 0x0000000000000000      0x0000000000000000
0x5110001dcda0: 0x0000000000000000      0x0000000000000000
0x5110001dcdb0: 0x0000000000000000      0x0000000000000000
0x5110001dcdc0: 0x0000000000000000      0x0000000000000000
0x5110001dcdd0: 0x0000000000000000      0x0000000000000000

  1. Memory corruption during deallocation: In set_dealloc, the deallocation routine iterates through the table by incrementing addresses sequentially. When it encounters non-dummy entries, it performs cleanup and decrements used--. However, since used is now greater than the actual (fill - dummy) count due to the previous corruption, the deallocation process accesses invalid memory addresses beyond the allocated table bounds, leading to memory corruption or segmentation faults.

    https://github.com/python/cpython/blob/3.13/Objects/setobject.c#L494-L514

POC

import _asyncio

class dummy(object):
    def __hash__(self):
        return 0
class CorrupTrigger():
    def __hash__(self):
        return 0
    def __eq__(self,other):
        try:
            _asyncio._register_task(self)
        except:
            pass
        return False
a = [dummy(),dummy()]
_asyncio._register_task(a[0])
_asyncio._register_task(a[1])
del a # make dummy entries in set 
print (_asyncio._scheduled_tasks)
a = [CorrupTrigger() for i in range(2)]
_asyncio._register_task(a[0])
_asyncio._register_task(a[1]) # make recursive comparision

ASAN

asan
==4613==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x5110001dcdf0 at pc 0x555555a458a5 bp 0x7fffffffdb20 sp 0x7fffffffdb10
READ of size 8 at 0x5110001dcdf0 thread T0
    #0 0x555555a458a4 in set_dealloc Objects/setobject.c:505
    #1 0x555555a0c1dc in _Py_Dealloc Objects/object.c:2939
    #2 0x5555559d8d5a in Py_DECREF Include/object.h:934
    #3 0x5555559d9787 in set_dict_inline_values Objects/dictobject.c:7151
    #4 0x5555559e55ad in _PyObject_SetManagedDict Objects/dictobject.c:7180
    #5 0x5555559e5667 in PyObject_ClearManagedDict Objects/dictobject.c:7222
    #6 0x555555a746da in subtype_dealloc Objects/typeobject.c:2359
    #7 0x555555a0c1dc in _Py_Dealloc Objects/object.c:2939
    #8 0x7ffff6ff7dd8 in Py_DECREF Include/object.h:934
    #9 0x7ffff6ff83a7 in module_clear Modules/_asynciomodule.c:3584
    #10 0x555555a05030 in module_clear Objects/moduleobject.c:1103
    #11 0x555555c3a017 in delete_garbage Python/gc.c:1015
    #12 0x555555c3b4f9 in gc_collect_main Python/gc.c:1421
    #13 0x555555c3c05f in _PyGC_CollectNoFail Python/gc.c:1657
    #14 0x555555ca8fdb in finalize_modules Python/pylifecycle.c:1797
    #15 0x555555cb2079 in _Py_Finalize Python/pylifecycle.c:2134
    #16 0x555555cb21ad in Py_FinalizeEx Python/pylifecycle.c:2261
    #17 0x555555d15c05 in Py_RunMain Modules/main.c:777
    #18 0x555555d15de7 in pymain_main Modules/main.c:805
    #19 0x555555d1616c in Py_BytesMain Modules/main.c:829
    #20 0x5555557de445 in main Programs/python.c:15
    #21 0x7ffff742a1c9 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
    #22 0x7ffff742a28a in __libc_start_main_impl ../csu/libc-start.c:360
    #23 0x5555557de374 in _start (/home/ubuntu/Python-3.13.7/python+0x28a374) (BuildId: c1408b7ecd69be204d23a50df1288230bd9ca6bd)

0x5110001dcdf0 is located 0 bytes after 240-byte region [0x5110001dcd00,0x5110001dcdf0)
allocated by thread T0 here:
    #0 0x7ffff78fd9c7 in malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:69
    #1 0x555555a15097 in _PyMem_RawMalloc Objects/obmalloc.c:62
    #2 0x555555a14468 in _PyMem_DebugRawAlloc Objects/obmalloc.c:2792
    #3 0x555555a144d0 in _PyMem_DebugRawMalloc Objects/obmalloc.c:2825
    #4 0x555555a16a0a in _PyMem_DebugMalloc Objects/obmalloc.c:2982
    #5 0x555555a3e147 in PyObject_Malloc Objects/obmalloc.c:1400
    #6 0x555555a6ea6f in _PyObject_MallocWithType Include/internal/pycore_object_alloc.h:46
    #7 0x555555a6ea6f in _PyType_AllocNoTrack Objects/typeobject.c:2041
    #8 0x555555a6ebf8 in PyType_GenericAlloc Objects/typeobject.c:2070
    #9 0x555555a47924 in make_new_set Objects/setobject.c:1086
    #10 0x555555a4964c in set_vectorcall Objects/setobject.c:2379
    #11 0x555555bb4e2c in _PyEval_EvalFrameDefault Python/generated_cases.c.h:1123
    #12 0x555555bdc0ae in _PyEval_EvalFrame Include/internal/pycore_ceval.h:119
    #13 0x555555bdc28b in _PyEval_Vector Python/ceval.c:1816
    #14 0x55555595a39b in _PyFunction_Vectorcall Objects/call.c:413
    #15 0x55555595d4eb in _PyObject_VectorcallDictTstate Objects/call.c:135
    #16 0x55555595d910 in _PyObject_Call_Prepend Objects/call.c:504
    #17 0x555555a7debc in slot_tp_init Objects/typeobject.c:9816
    #18 0x555555a719e0 in type_call Objects/typeobject.c:1997
    #19 0x55555595a659 in _PyObject_MakeTpCall Objects/call.c:242
    #20 0x55555595a93a in _PyObject_VectorcallTstate Include/internal/pycore_call.h:166
    #21 0x55555595a96a in PyObject_CallNoArgs Objects/call.c:106
    #22 0x7ffff7004bcc in module_init Modules/_asynciomodule.c:3665
    #23 0x7ffff7004fb8 in module_exec Modules/_asynciomodule.c:3734
    #24 0x555555a06fa6 in PyModule_ExecDef Objects/moduleobject.c:491
    #25 0x555555c5d45a in exec_builtin_or_dynamic Python/import.c:815
    #26 0x555555c5d47d in _imp_exec_dynamic_impl Python/import.c:4770
    #27 0x555555c5de55 in _imp_exec_dynamic Python/clinic/import.c.h:513
    #28 0x555555a02aeb in cfunction_vectorcall_O Objects/methodobject.c:511
    #29 0x55555595db69 in _PyVectorcall_Call Objects/call.c:273
    #30 0x55555595e1a8 in _PyObject_Call Objects/call.c:348

SUMMARY: AddressSanitizer: heap-buffer-overflow Objects/setobject.c:505 in set_dealloc
Shadow bytes around the buggy address:
  0x5110001dcb00: fd fd fd fd fd fd fd fd fd fd fd fd fd fa fa fa
  0x5110001dcb80: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
  0x5110001dcc00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x5110001dcc80: 00 00 00 00 00 00 fa fa fa fa fa fa fa fa fa fa
  0x5110001dcd00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x5110001dcd80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00[fa]fa
  0x5110001dce00: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
  0x5110001dce80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x5110001dcf00: 00 00 00 00 00 00 fa fa fa fa fa fa fa fa fa fa
  0x5110001dcf80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x5110001dd000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==4613==ABORTING

CPython versions tested on:

CPython main branch

Operating systems tested on:

Linux

Output from running 'python -VV' on the command line:

Python 3.13.7 (main, Sep 14 2025, 14:06:10) [GCC 13.3.0]

Metadata

Metadata

Assignees

Labels

interpreter-core(Objects, Python, Grammar, and Parser dirs)type-crashA hard crash of the interpreter, possibly with a core dump

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions