Commit c444eb5

Andrea Arcangeli authored and Linus Torvalds committed
mm: thp: make the THP mapcount atomic against __split_huge_pmd_locked()
Write protect anon page faults require an accurate mapcount to decide whether to break the COW or not. This is implemented in the THP path with reuse_swap_page() -> page_trans_huge_map_swapcount()/page_trans_huge_mapcount().

If the COW triggers while the other processes sharing the page are under a huge pmd split, to get an accurate reading we must ensure the mapcount isn't computed while it's being transferred from the head page to the tail pages.

reuse_swap_page() already runs serialized by the page lock, so it's enough to take the page lock around __split_huge_pmd_locked() too, in order to add the missing serialization.

Note: the commit in "Fixes" is just to facilitate the backporting, because the code before that commit didn't try to do an accurate THP mapcount calculation and instead used page_count() to decide whether to COW or not. Both the page_count and the pin_count are THP-wide refcounts, so they're inaccurate if used in reuse_swap_page(). Reverting that commit (besides the unrelated fix to the local anon_vma assignment) would also have opened the window for memory corruption side effects in certain workloads, as documented in that commit's header.

Signed-off-by: Andrea Arcangeli <[email protected]>
Suggested-by: Jann Horn <[email protected]>
Reported-by: Jann Horn <[email protected]>
Acked-by: Kirill A. Shutemov <[email protected]>
Fixes: 6d0a07e ("mm: thp: calculate the mapcount correctly for THP pages during WP faults")
Cc: [email protected]
Signed-off-by: Linus Torvalds <[email protected]>
1 parent cb8e59c commit c444eb5

File tree

1 file changed: +28 −3 lines changed

mm/huge_memory.c

@@ -2385,6 +2385,8 @@ void __split_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
 {
 	spinlock_t *ptl;
 	struct mmu_notifier_range range;
+	bool was_locked = false;
+	pmd_t _pmd;
 
 	mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, vma, vma->vm_mm,
 				address & HPAGE_PMD_MASK,
@@ -2397,18 +2399,41 @@ void __split_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
 	 * pmd against. Otherwise we can end up replacing wrong page.
 	 */
 	VM_BUG_ON(freeze && !page);
-	if (page && page != pmd_page(*pmd))
-		goto out;
+	if (page) {
+		VM_WARN_ON_ONCE(!PageLocked(page));
+		was_locked = true;
+		if (page != pmd_page(*pmd))
+			goto out;
+	}
 
+repeat:
 	if (pmd_trans_huge(*pmd)) {
-		page = pmd_page(*pmd);
+		if (!page) {
+			page = pmd_page(*pmd);
+			if (unlikely(!trylock_page(page))) {
+				get_page(page);
+				_pmd = *pmd;
+				spin_unlock(ptl);
+				lock_page(page);
+				spin_lock(ptl);
+				if (unlikely(!pmd_same(*pmd, _pmd))) {
+					unlock_page(page);
+					put_page(page);
+					page = NULL;
+					goto repeat;
+				}
+				put_page(page);
+			}
+		}
 		if (PageMlocked(page))
 			clear_page_mlock(page);
 	} else if (!(pmd_devmap(*pmd) || is_pmd_migration_entry(*pmd)))
 		goto out;
 	__split_huge_pmd_locked(vma, pmd, range.start, freeze);
 out:
 	spin_unlock(ptl);
+	if (!was_locked && page)
+		unlock_page(page);
 	/*
 	 * No need to double call mmu_notifier->invalidate_range() callback.
 	 * They are 3 cases to consider inside __split_huge_pmd_locked():
