Commit c7b1850

mjkravetz authored and torvalds committed
hugetlb: don't pass page cache pages to restore_reserve_on_error
syzbot hit kernel BUG at fs/hugetlbfs/inode.c:532 as described in [1].  This BUG triggers if the HPageRestoreReserve flag is set on a page in the page cache.  It should never be set, as the routine huge_add_to_page_cache explicitly clears the flag after adding a page to the cache.

The only code other than huge page allocation which sets the flag is restore_reserve_on_error.  It will potentially set the flag in rare out of memory conditions.  syzbot was injecting errors to cause memory allocation failures which exercised this specific path.

The code in restore_reserve_on_error is doing the right thing.  However, there are instances where pages in the page cache were being passed to restore_reserve_on_error.  This is incorrect, as once a page goes into the cache its reservation information will not be modified until the page is removed from the cache.  Error paths do not remove pages from the cache, so even in the case of error the page will remain in the cache and no reservation adjustment is needed.

Modify the routines that potentially call restore_reserve_on_error with a page cache page to no longer do so.

Note on Fixes tag: Prior to commit 846be08 ("mm/hugetlb: expand restore_reserve_on_error functionality") the routine would not process page cache pages because the HPageRestoreReserve flag is not set on such pages, so this issue could not be triggered.  The code added by commit 846be08 is needed and correct; it exposed the incorrect calls to restore_reserve_on_error that are the root cause addressed by this commit.

[1] https://lore.kernel.org/linux-mm/[email protected]/

Link: https://lkml.kernel.org/r/[email protected]
Fixes: 846be08 ("mm/hugetlb: expand restore_reserve_on_error functionality")
Signed-off-by: Mike Kravetz <[email protected]>
Reported-by: <[email protected]>
Cc: Mina Almasry <[email protected]>
Cc: Axel Rasmussen <[email protected]>
Cc: Peter Xu <[email protected]>
Cc: Muchun Song <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Naoya Horiguchi <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
1 parent a7cb5d2 commit c7b1850
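
The shape of the fix, condensed: each fault path records whether it allocated the huge page itself and whether that page made it into the page cache, then calls restore_reserve_on_error() only for freshly allocated pages that never reached the cache.  Below is a minimal sketch of that guard; error_path_cleanup() is a hypothetical helper invented for illustration, not a function in the kernel tree.

/*
 * Sketch only, not kernel code.  The guard condition mirrors the one
 * this commit adds to hugetlb_no_page(); the helper itself is made up.
 */
static void error_path_cleanup(struct hstate *h, struct vm_area_struct *vma,
			       unsigned long addr, struct page *page,
			       bool new_page, bool new_pagecache_page)
{
	/*
	 * Reservation state only needs unwinding for pages this fault
	 * path allocated itself.  Once a page is in the page cache its
	 * reservation information stays untouched until the page is
	 * removed from the cache, and error paths never remove it, so
	 * cached pages must be skipped here.
	 */
	if (new_page && !new_pagecache_page)
		restore_reserve_on_error(h, vma, addr, page);
	put_page(page);
}

hugetlb_cow() encodes the same rule without a flag: on a successful "Break COW" page table update it already sets new_page = old_page so the old page is freed on the way out, which is why the new_page != old_page check in the diff below is enough to skip the restore.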


mm/hugetlb.c

Lines changed: 14 additions & 5 deletions
@@ -2476,7 +2476,7 @@ void restore_reserve_on_error(struct hstate *h, struct vm_area_struct *vma,
 		if (!rc) {
 			/*
 			 * This indicates there is an entry in the reserve map
-			 * added by alloc_huge_page.  We know it was added
+			 * not added by alloc_huge_page.  We know it was added
 			 * before the alloc_huge_page call, otherwise
 			 * HPageRestoreReserve would be set on the page.
 			 * Remove the entry so that a subsequent allocation
@@ -4660,7 +4660,9 @@ static vm_fault_t hugetlb_cow(struct mm_struct *mm, struct vm_area_struct *vma,
 	spin_unlock(ptl);
 	mmu_notifier_invalidate_range_end(&range);
 out_release_all:
-	restore_reserve_on_error(h, vma, haddr, new_page);
+	/* No restore in case of successful pagetable update (Break COW) */
+	if (new_page != old_page)
+		restore_reserve_on_error(h, vma, haddr, new_page);
 	put_page(new_page);
 out_release_old:
 	put_page(old_page);
@@ -4776,7 +4778,7 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
 	pte_t new_pte;
 	spinlock_t *ptl;
 	unsigned long haddr = address & huge_page_mask(h);
-	bool new_page = false;
+	bool new_page, new_pagecache_page = false;
 
 	/*
 	 * Currently, we are forced to kill the process in the event the
@@ -4799,6 +4801,7 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
 		goto out;
 
 retry:
+	new_page = false;
 	page = find_lock_page(mapping, idx);
 	if (!page) {
 		/* Check for page in userfault range */
@@ -4842,6 +4845,7 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
 					goto retry;
 				goto out;
 			}
+			new_pagecache_page = true;
 		} else {
 			lock_page(page);
 			if (unlikely(anon_vma_prepare(vma))) {
@@ -4926,7 +4930,9 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
 	spin_unlock(ptl);
backout_unlocked:
 	unlock_page(page);
-	restore_reserve_on_error(h, vma, haddr, page);
+	/* restore reserve for newly allocated pages not in page cache */
+	if (new_page && !new_pagecache_page)
+		restore_reserve_on_error(h, vma, haddr, page);
 	put_page(page);
 	goto out;
 }
@@ -5135,6 +5141,7 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
 	int ret = -ENOMEM;
 	struct page *page;
 	int writable;
+	bool new_pagecache_page = false;
 
 	if (is_continue) {
 		ret = -EFAULT;
@@ -5228,6 +5235,7 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
 		ret = huge_add_to_page_cache(page, mapping, idx);
 		if (ret)
 			goto out_release_nounlock;
+		new_pagecache_page = true;
 	}
 
 	ptl = huge_pte_lockptr(h, dst_mm, dst_pte);
@@ -5291,7 +5299,8 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
 	if (vm_shared || is_continue)
 		unlock_page(page);
out_release_nounlock:
-	restore_reserve_on_error(h, dst_vma, dst_addr, page);
+	if (!new_pagecache_page)
+		restore_reserve_on_error(h, dst_vma, dst_addr, page);
 	put_page(page);
 	goto out;
 }
