Skip to content

Commit bc0c4d1

Browse files
committed
mm: check that mm is still valid in madvise()
IORING_OP_MADVISE can end up basically doing mprotect() on the VM of another process, which means that it can race with our crazy core dump handling which accesses the VM state without holding the mmap_sem (because it incorrectly thinks that it is the final user). This is clearly a core dumping problem, but we've never fixed it the right way, and instead have the notion of "check that the mm is still ok" using mmget_still_valid() after getting the mmap_sem for writing in any situation where we're not the original VM thread. See commit 04f5866 ("coredump: fix race condition between mmget_not_zero()/get_task_mm() and core dumping") for more background on this whole mmget_still_valid() thing. You might want to have a barf bag handy when you do. We're discussing just fixing this properly in the only remaining core dumping routines. But even if we do that, let's make do_madvise() do the right thing, and then when we fix core dumping, we can remove all these mmget_still_valid() checks. Reported-and-tested-by: Jann Horn <[email protected]> Fixes: c1ca757 ("io_uring: add IORING_OP_MADVISE") Acked-by: Jens Axboe <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
1 parent aee1a00 commit bc0c4d1

File tree

1 file changed

+18
-0
lines changed

1 file changed

+18
-0
lines changed

mm/madvise.c

Lines changed: 18 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -27,6 +27,7 @@
2727
#include <linux/swapops.h>
2828
#include <linux/shmem_fs.h>
2929
#include <linux/mmu_notifier.h>
30+
#include <linux/sched/mm.h>
3031

3132
#include <asm/tlb.h>
3233

@@ -1090,6 +1091,23 @@ int do_madvise(unsigned long start, size_t len_in, int behavior)
10901091
if (write) {
10911092
if (down_write_killable(&current->mm->mmap_sem))
10921093
return -EINTR;
1094+
1095+
/*
1096+
* We may have stolen the mm from another process
1097+
* that is undergoing core dumping.
1098+
*
1099+
* Right now that's io_ring, in the future it may
1100+
* be remote process management and not "current"
1101+
* at all.
1102+
*
1103+
* We need to fix core dumping to not do this,
1104+
* but for now we have the mmget_still_valid()
1105+
* model.
1106+
*/
1107+
if (!mmget_still_valid(current->mm)) {
1108+
up_write(&current->mm->mmap_sem);
1109+
return -EINTR;
1110+
}
10931111
} else {
10941112
down_read(&current->mm->mmap_sem);
10951113
}

0 commit comments

Comments
 (0)