
Commit ddd0574

ioworker0 authored and akpm00 committed
mm/rmap: fix potential out-of-bounds page table access during batched unmap
As pointed out by David[1], the batched unmap logic in try_to_unmap_one() may read past the end of a PTE table when a large folio's PTE mappings are not fully contained within a single page table.

While this scenario might be rare, an issue triggerable from userspace must be fixed regardless of its likelihood.

This patch fixes the out-of-bounds access by refactoring the logic into a new helper, folio_unmap_pte_batch().

The new helper correctly calculates the safe batch size by capping the scan at both the VMA and PMD boundaries. To simplify the code, it also supports partial batching (i.e., any number of pages from 1 up to the calculated safe maximum), as there is no strong reason to special-case for fully mapped folios.

Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lore.kernel.org/linux-mm/[email protected] [1]
Fixes: 354dffd ("mm: support batched unmap for lazyfree large folios during reclamation")
Signed-off-by: Lance Yang <[email protected]>
Suggested-by: David Hildenbrand <[email protected]>
Reported-by: David Hildenbrand <[email protected]>
Closes: https://lore.kernel.org/linux-mm/[email protected]
Suggested-by: Barry Song <[email protected]>
Acked-by: Barry Song <[email protected]>
Reviewed-by: Lorenzo Stoakes <[email protected]>
Acked-by: David Hildenbrand <[email protected]>
Reviewed-by: Harry Yoo <[email protected]>
Cc: Baolin Wang <[email protected]>
Cc: Chris Li <[email protected]>
Cc: "Huang, Ying" <[email protected]>
Cc: Kairui Song <[email protected]>
Cc: Lance Yang <[email protected]>
Cc: Liam Howlett <[email protected]>
Cc: Mingzhe Yang <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Ryan Roberts <[email protected]>
Cc: Tangquan Zheng <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
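
To make the boundary math concrete, here is a minimal standalone C sketch (not the kernel code) of how capping the scan at the next PMD boundary bounds the batch. It assumes x86-64 constants (4 KiB pages, 512 PTEs per page table), a simplified stand-in for pmd_addr_end(), and hypothetical example addresses: a 64-page folio whose mapping starts 16 pages before a PMD boundary, where the old whole-folio scan would read 48 entries past the end of the PTE table.

#include <stdio.h>

#define PAGE_SHIFT	12
#define PAGE_SIZE	(1UL << PAGE_SHIFT)
#define PMD_SIZE	(512 * PAGE_SIZE)	/* one PTE table covers 2 MiB on x86-64 */
#define PMD_MASK	(~(PMD_SIZE - 1))

/*
 * Simplified stand-in for the kernel's pmd_addr_end(): the next PMD
 * boundary after addr, clamped to end.
 */
static unsigned long pmd_addr_end(unsigned long addr, unsigned long end)
{
	unsigned long boundary = (addr + PMD_SIZE) & PMD_MASK;

	return boundary < end ? boundary : end;
}

int main(void)
{
	unsigned long folio_nr = 64;		/* hypothetical 256 KiB large folio */
	unsigned long vm_end = 0x400000000UL;	/* hypothetical VMA end */
	unsigned long addr = 0x3ffff0000UL;	/* 16 pages before a PMD boundary */

	/* Pre-fix behaviour: scan folio_nr_pages() PTEs starting at addr. */
	unsigned long old_scan = folio_nr;

	/* Post-fix cap, mirroring what folio_unmap_pte_batch() computes. */
	unsigned long max_nr = (pmd_addr_end(addr, vm_end) - addr) >> PAGE_SHIFT;

	printf("old scan: %lu PTEs, safe maximum: %lu PTEs (%lu past the table)\n",
	       old_scan, max_nr, old_scan - max_nr);
	return 0;
}

Compiled and run, this prints "old scan: 64 PTEs, safe maximum: 16 PTEs (48 past the table)", which is the out-of-bounds window the new helper closes.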
1 parent c39b874 commit ddd0574

File tree

1 file changed: +28 -18 lines changed

mm/rmap.c

Lines changed: 28 additions & 18 deletions
@@ -1845,23 +1845,32 @@ void folio_remove_rmap_pud(struct folio *folio, struct page *page,
 #endif
 }
 
-/* We support batch unmapping of PTEs for lazyfree large folios */
-static inline bool can_batch_unmap_folio_ptes(unsigned long addr,
-			struct folio *folio, pte_t *ptep)
+static inline unsigned int folio_unmap_pte_batch(struct folio *folio,
+			struct page_vma_mapped_walk *pvmw,
+			enum ttu_flags flags, pte_t pte)
 {
 	const fpb_t fpb_flags = FPB_IGNORE_DIRTY | FPB_IGNORE_SOFT_DIRTY;
-	int max_nr = folio_nr_pages(folio);
-	pte_t pte = ptep_get(ptep);
+	unsigned long end_addr, addr = pvmw->address;
+	struct vm_area_struct *vma = pvmw->vma;
+	unsigned int max_nr;
+
+	if (flags & TTU_HWPOISON)
+		return 1;
+	if (!folio_test_large(folio))
+		return 1;
 
+	/* We may only batch within a single VMA and a single page table. */
+	end_addr = pmd_addr_end(addr, vma->vm_end);
+	max_nr = (end_addr - addr) >> PAGE_SHIFT;
+
+	/* We only support lazyfree batching for now ... */
 	if (!folio_test_anon(folio) || folio_test_swapbacked(folio))
-		return false;
+		return 1;
 	if (pte_unused(pte))
-		return false;
-	if (pte_pfn(pte) != folio_pfn(folio))
-		return false;
+		return 1;
 
-	return folio_pte_batch(folio, addr, ptep, pte, max_nr, fpb_flags, NULL,
-			       NULL, NULL) == max_nr;
+	return folio_pte_batch(folio, addr, pvmw->pte, pte, max_nr, fpb_flags,
+			       NULL, NULL, NULL);
 }
 
 /*
@@ -2024,9 +2033,7 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 			if (pte_dirty(pteval))
 				folio_mark_dirty(folio);
 		} else if (likely(pte_present(pteval))) {
-			if (folio_test_large(folio) && !(flags & TTU_HWPOISON) &&
-			    can_batch_unmap_folio_ptes(address, folio, pvmw.pte))
-				nr_pages = folio_nr_pages(folio);
+			nr_pages = folio_unmap_pte_batch(folio, &pvmw, flags, pteval);
 			end_addr = address + nr_pages * PAGE_SIZE;
 			flush_cache_range(vma, address, end_addr);
 
@@ -2206,13 +2213,16 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 			hugetlb_remove_rmap(folio);
 		} else {
 			folio_remove_rmap_ptes(folio, subpage, nr_pages, vma);
-			folio_ref_sub(folio, nr_pages - 1);
 		}
 		if (vma->vm_flags & VM_LOCKED)
 			mlock_drain_local();
-		folio_put(folio);
-		/* We have already batched the entire folio */
-		if (nr_pages > 1)
+		folio_put_refs(folio, nr_pages);
+
+		/*
+		 * If we are sure that we batched the entire folio and cleared
+		 * all PTEs, we can just optimize and stop right here.
+		 */
+		if (nr_pages == folio_nr_pages(folio))
 			goto walk_done;
 		continue;
 walk_abort:
