Skip to content

Commit f81fdd0

Browse files
committed
mm: document warning in move_normal_pmd() and make it warn only once
Naresh Kamboju reported that the LTP tests can cause warnings on i386 going back all the way to v5.0, and bisected it to commit 2c91bd4 ("mm: speed up mremap by 20x on large regions"). The warning in move_normal_pmd() is actually mostly correct, but we have a very unusual special case at process creation time, when we may move the stack down with an overlapping mode (kind of like a "memmove()" except using the page tables). And when you have just the right condition of "move a large initial stack by the right alignment in the end, but with the early part of the move being only page-aligned", we'll be in a situation where we're trying to move a normal PMD entry on top of an already existing - but now empty - PMD entry. The warning is still worth having, in case it ever triggers other cases, and perhaps as a reminder that we could do the stack move case more efficiently (although it's clearly rare enough that it probably doesn't matter). But make it do WARN_ON_ONCE(), so that you can't flood the logs with it. And add a *big* comment above it to explain and remind us what's going on, because it took some figuring out to see how this could trigger. Kudos to Joel Fernandes for debugging this. Reported-by: Naresh Kamboju <[email protected]> Debugged-and-acked-by: Joel Fernandes <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Kirill A. Shutemov <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
1 parent 11ba468 commit f81fdd0

File tree

1 file changed

+21
-2
lines changed

1 file changed

+21
-2
lines changed

mm/mremap.c

Lines changed: 21 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -206,9 +206,28 @@ static bool move_normal_pmd(struct vm_area_struct *vma, unsigned long old_addr,
206206

207207
/*
208208
* The destination pmd shouldn't be established, free_pgtables()
209-
* should have release it.
209+
* should have released it.
210+
*
211+
* However, there's a case during execve() where we use mremap
212+
* to move the initial stack, and in that case the target area
213+
* may overlap the source area (always moving down).
214+
*
215+
* If everything is PMD-aligned, that works fine, as moving
216+
* each pmd down will clear the source pmd. But if we first
217+
* have a few 4kB-only pages that get moved down, and then
218+
* hit the "now the rest is PMD-aligned, let's do everything
219+
* one pmd at a time", we will still have the old (now empty
220+
* of any 4kB pages, but still there) PMD in the page table
221+
* tree.
222+
*
223+
* Warn on it once - because we really should try to figure
224+
* out how to do this better - but then say "I won't move
225+
* this pmd".
226+
*
227+
* One alternative might be to just unmap the target pmd at
228+
* this point, and verify that it really is empty. We'll see.
210229
*/
211-
if (WARN_ON(!pmd_none(*new_pmd)))
230+
if (WARN_ON_ONCE(!pmd_none(*new_pmd)))
212231
return false;
213232

214233
/*

0 commit comments

Comments
 (0)