
Commit 8c2e52e

eventpoll: don't decrement ep refcount while still holding the ep mutex
Jann Horn points out that epoll is decrementing the ep refcount and then doing a mutex_unlock(&ep->mtx); afterwards. That's very wrong, because it can lead to a use-after-free.

That pattern is actually fine for the very last reference, because the code in question will delay the actual call to "ep_free(ep)" until after it has unlocked the mutex.

But it's wrong for the much subtler "next to last" case when somebody *else* may also be dropping their reference and free the ep while we're still using the mutex.

Note that this is true even if that other user is also using the same ep mutex: mutexes, unlike spinlocks, can not be used for object ownership, even if they guarantee mutual exclusion.

A mutex "unlock" operation is not atomic, and as one user is still accessing the mutex as part of unlocking it, another user can come in and get the now released mutex and free the data structure while the first user is still cleaning up.

See our mutex documentation in Documentation/locking/mutex-design.rst, in particular the section [1] about semantics:

    "mutex_unlock() may access the mutex structure even after it has
     internally released the lock already - so it's not safe for
     another context to acquire the mutex and assume that the
     mutex_unlock() context is not using the structure anymore"

So if we drop our ep ref before the mutex unlock, but we weren't the last one, we may then unlock the mutex, another user comes in, drops _their_ reference and releases the 'ep' as it now has no users - all while the mutex_unlock() is still accessing it.

Fix this by simply moving the ep refcount dropping to outside the mutex: the refcount itself is atomic, and doesn't need mutex protection (that's the whole _point_ of refcounts: unlike mutexes, they are inherently about object lifetimes).

Reported-by: Jann Horn <[email protected]>
Link: https://docs.kernel.org/locking/mutex-design.html#semantics [1]
Cc: Alexander Viro <[email protected]>
Cc: Christian Brauner <[email protected]>
Cc: Jan Kara <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
1 parent f69f5aa commit 8c2e52e
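
The ordering the commit message argues for (finish the unlock first, drop the reference afterwards) can be sketched in userspace C. This is only an illustration under stated assumptions: `struct obj`, `obj_alloc()`, `obj_ref_dec_and_test()`, `obj_free()` and `obj_work_and_put()` are hypothetical names, not kernel APIs, and the pthread mutex stands in for the kernel mutex purely to show the ordering. Its unlock semantics are not claimed to match those of the kernel's `mutex_unlock()`.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for a refcounted object such as 'ep'. */
struct obj {
	pthread_mutex_t mtx;	/* protects 'items' only, never the lifetime */
	atomic_int refcount;	/* the lifetime is owned by the refcount */
	int items;
};

static struct obj *obj_alloc(int refs)
{
	struct obj *o = malloc(sizeof(*o));

	if (!o)
		return NULL;
	pthread_mutex_init(&o->mtx, NULL);
	atomic_init(&o->refcount, refs);
	o->items = 0;
	return o;
}

/* Returns true when the caller just dropped the last reference. */
static bool obj_ref_dec_and_test(struct obj *o)
{
	return atomic_fetch_sub(&o->refcount, 1) == 1;
}

static void obj_free(struct obj *o)
{
	assert(atomic_load(&o->refcount) == 0);
	pthread_mutex_destroy(&o->mtx);
	free(o);
}

/*
 * Do some locked work, then give up our reference.
 * Returns true if this call freed the object.
 */
static bool obj_work_and_put(struct obj *o)
{
	pthread_mutex_lock(&o->mtx);
	o->items++;
	pthread_mutex_unlock(&o->mtx);

	/*
	 * Drop the reference only after the unlock has returned. While we
	 * still hold a reference, nobody else can free 'o', so the unlock
	 * above never touches freed memory. Decrementing before the unlock
	 * would let another holder free 'o' while we are still unlocking.
	 */
	if (obj_ref_dec_and_test(o)) {
		obj_free(o);
		return true;
	}
	return false;
}
```

With two references, the first `obj_work_and_put()` call leaves the object alive and the second one frees it. In both calls the unlock completes while the caller still owns a reference.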

1 file changed, +5 -7 lines changed

fs/eventpoll.c

Lines changed: 5 additions & 7 deletions
@@ -828,22 +828,22 @@ static bool __ep_remove(struct eventpoll *ep, struct epitem *epi, bool force)
 	kfree_rcu(epi, rcu);
 
 	percpu_counter_dec(&ep->user->epoll_watches);
-	return ep_refcount_dec_and_test(ep);
+	return true;
 }
 
 /*
  * ep_remove variant for callers owing an additional reference to the ep
  */
 static void ep_remove_safe(struct eventpoll *ep, struct epitem *epi)
 {
-	WARN_ON_ONCE(__ep_remove(ep, epi, false));
+	if (__ep_remove(ep, epi, false))
+		WARN_ON_ONCE(ep_refcount_dec_and_test(ep));
 }
 
 static void ep_clear_and_put(struct eventpoll *ep)
 {
 	struct rb_node *rbp, *next;
 	struct epitem *epi;
-	bool dispose;
 
 	/* We need to release all tasks waiting for these file */
 	if (waitqueue_active(&ep->poll_wait))
@@ -876,10 +876,8 @@ static void ep_clear_and_put(struct eventpoll *ep)
 		cond_resched();
 	}
 
-	dispose = ep_refcount_dec_and_test(ep);
 	mutex_unlock(&ep->mtx);
-
-	if (dispose)
+	if (ep_refcount_dec_and_test(ep))
 		ep_free(ep);
 }
 
@@ -1100,7 +1098,7 @@ void eventpoll_release_file(struct file *file)
 		dispose = __ep_remove(ep, epi, true);
 		mutex_unlock(&ep->mtx);
 
-		if (dispose)
+		if (dispose && ep_refcount_dec_and_test(ep))
 			ep_free(ep);
 		goto again;
 	}
