
Commit 131a79b

mjkravetz authored and akpm00 committed
hugetlb: fix vma lock handling during split vma and range unmapping
hugetlb: fix vma lock handling during split vma and range unmapping

Patch series "hugetlb: fixes for new vma lock series".

In review of the series "hugetlb: Use new vma lock for huge pmd sharing
synchronization", Miaohe Lin pointed out two key issues:

1) There is a race in the routine hugetlb_unmap_file_folio when locks
   are dropped and reacquired in the correct order [1].

2) With the switch to using the vma lock for fault/truncate
   synchronization, we need to make sure a lock exists for all
   VM_MAYSHARE vmas, not just vmas capable of pmd sharing.

These two issues are addressed here. In addition, having a vma lock
present in all VM_MAYSHARE vmas uncovered some issues around vma
splitting. Those are also addressed.

[1] https://lore.kernel.org/linux-mm/[email protected]/

This patch (of 3):

The hugetlb vma lock hangs off the vm_private_data field and is specific
to the vma. When vm_area_dup() is called as part of vma splitting, the
vma lock pointer is copied to the new vma. This will result in issues
such as double freeing of the structure. Update the hugetlb open vm_ops
to allocate a new vma lock for the new vma.

The routine __unmap_hugepage_range_final unconditionally unset
VM_MAYSHARE to prevent subsequent pmd sharing. hugetlb_vma_lock_free
attempted to anticipate this by checking both VM_MAYSHARE and VM_SHARED.
However, if only VM_MAYSHARE was set we would miss the free. With the
introduction of the vma lock, a vma can not participate in pmd sharing
if vm_private_data is NULL. Instead of clearing VM_MAYSHARE in
__unmap_hugepage_range_final, free the vma lock to prevent sharing.
Also, update the sharing code to make sure the vma lock is indeed a
condition for pmd sharing. hugetlb_vma_lock_free can then key off
VM_MAYSHARE and not miss any vmas.

Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Fixes: "hugetlb: add vma based lock for pmd sharing"
Signed-off-by: Mike Kravetz <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: "Aneesh Kumar K.V" <[email protected]>
Cc: Axel Rasmussen <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: James Houghton <[email protected]>
Cc: "Kirill A. Shutemov" <[email protected]>
Cc: Miaohe Lin <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Mina Almasry <[email protected]>
Cc: Muchun Song <[email protected]>
Cc: Naoya Horiguchi <[email protected]>
Cc: Pasha Tatashin <[email protected]>
Cc: Peter Xu <[email protected]>
Cc: Prakash Sangappa <[email protected]>
Cc: Sven Schnelle <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
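To make the first fix concrete, here is a minimal userspace sketch of the problem and the repair. It is not kernel code: struct vma, struct vma_lock, and the op_open()/op_close() helpers are hypothetical stand-ins for vm_area_dup(), hugetlb_vm_op_open()/hugetlb_vm_op_close(), and the lock hanging off vm_private_data; they only model the double free described above.

/* Minimal userspace model of the split/double-free issue (hypothetical types). */
#include <stdio.h>
#include <stdlib.h>

struct vma_lock { int refs; };                 /* stand-in for the kref'd vma lock */
struct vma { struct vma_lock *lock; };         /* lock plays the role of vm_private_data */

static struct vma_lock *vma_lock_alloc(void)
{
	struct vma_lock *l = malloc(sizeof(*l));

	l->refs = 1;
	return l;
}

static void vma_lock_put(struct vma_lock *l)
{
	if (l && --l->refs == 0)
		free(l);
}

/*
 * Fixed "open" callback: forget the pointer copied by the split and give
 * the new vma its own lock.  Without the NULL assignment, both halves of
 * a split keep pointing at the same lock and each close frees it once,
 * i.e. a double free.
 */
static void op_open(struct vma *new)
{
	new->lock = NULL;
	new->lock = vma_lock_alloc();
}

static void op_close(struct vma *vma)
{
	vma_lock_put(vma->lock);
	vma->lock = NULL;
}

int main(void)
{
	struct vma orig = { .lock = vma_lock_alloc() };
	struct vma new = orig;          /* like vm_area_dup(): the pointer is copied */

	op_open(&new);                  /* the fix: the new vma gets its own lock */
	op_close(&orig);
	op_close(&new);                 /* each vma frees only its own lock */
	puts("split handled without double free");
	return 0;
}

The hugetlb_vm_op_open() change in the diff below applies the same idea, limited to VM_MAYSHARE vmas, since only sharable mappings carry the lock.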
1 parent e4fea72 commit 131a79b

File tree: 2 files changed, +27 −20 lines changed


mm/hugetlb.c

Lines changed: 27 additions & 16 deletions
@@ -4612,7 +4612,14 @@ static void hugetlb_vm_op_open(struct vm_area_struct *vma)
 		kref_get(&resv->refs);
 	}
 
-	hugetlb_vma_lock_alloc(vma);
+	/*
+	 * vma_lock structure for sharable mappings is vma specific.
+	 * Clear old pointer (if copied via vm_area_dup) and create new.
+	 */
+	if (vma->vm_flags & VM_MAYSHARE) {
+		vma->vm_private_data = NULL;
+		hugetlb_vma_lock_alloc(vma);
+	}
 }
 
 static void hugetlb_vm_op_close(struct vm_area_struct *vma)
@@ -5168,19 +5175,23 @@ void __unmap_hugepage_range_final(struct mmu_gather *tlb,
 			  unsigned long end, struct page *ref_page,
 			  zap_flags_t zap_flags)
 {
+	hugetlb_vma_lock_write(vma);
+	i_mmap_lock_write(vma->vm_file->f_mapping);
+
 	__unmap_hugepage_range(tlb, vma, start, end, ref_page, zap_flags);
 
 	/*
-	 * Clear this flag so that x86's huge_pmd_share page_table_shareable
-	 * test will fail on a vma being torn down, and not grab a page table
-	 * on its way out.  We're lucky that the flag has such an appropriate
-	 * name, and can in fact be safely cleared here.  We could clear it
-	 * before the __unmap_hugepage_range above, but all that's necessary
-	 * is to clear it before releasing the i_mmap_rwsem.  This works
-	 * because in the context this is called, the VMA is about to be
-	 * destroyed and the i_mmap_rwsem is held.
+	 * Unlock and free the vma lock before releasing i_mmap_rwsem.  When
+	 * the vma_lock is freed, this makes the vma ineligible for pmd
+	 * sharing.  And, i_mmap_rwsem is required to set up pmd sharing.
+	 * This is important as page tables for this unmapped range will
+	 * be asynchronously deleted.  If the page tables are shared, there
+	 * will be issues when accessed by someone else.
 	 */
-	vma->vm_flags &= ~VM_MAYSHARE;
+	hugetlb_vma_unlock_write(vma);
+	hugetlb_vma_lock_free(vma);
+
+	i_mmap_unlock_write(vma->vm_file->f_mapping);
 }
 
 void unmap_hugepage_range(struct vm_area_struct *vma, unsigned long start,
@@ -6664,10 +6675,13 @@ static unsigned long page_table_shareable(struct vm_area_struct *svma,
 	/*
 	 * match the virtual addresses, permission and the alignment of the
 	 * page table page.
+	 *
+	 * Also, vma_lock (vm_private_data) is required for sharing.
 	 */
 	if (pmd_index(addr) != pmd_index(saddr) ||
 	    vm_flags != svm_flags ||
-	    !range_in_vma(svma, sbase, s_end))
+	    !range_in_vma(svma, sbase, s_end) ||
+	    !svma->vm_private_data)
 		return 0;
 
 	return saddr;
@@ -6817,12 +6831,9 @@ void hugetlb_vma_lock_release(struct kref *kref)
 static void hugetlb_vma_lock_free(struct vm_area_struct *vma)
 {
 	/*
-	 * Only present in sharable vmas.  See comment in
-	 * __unmap_hugepage_range_final about how VM_SHARED could
-	 * be set without VM_MAYSHARE.  As a result, we need to
-	 * check if either is set in the free path.
+	 * Only present in sharable vmas.
 	 */
-	if (!vma || !(vma->vm_flags & (VM_MAYSHARE | VM_SHARED)))
+	if (!vma || !__vma_shareable_flags_pmd(vma))
 		return;
 
 	if (vma->vm_private_data) {
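The new check in hugetlb_vma_lock_free() calls __vma_shareable_flags_pmd(), whose body is not shown in this diff. Based only on the commit message (the lock exists only in VM_MAYSHARE vmas, and pmd sharing requires a non-NULL vm_private_data), a plausible sketch of what the helper tests is:

/* Plausible sketch only; the real helper lives elsewhere in mm/hugetlb.c
 * and may differ in detail. */
static inline bool __vma_shareable_flags_pmd(struct vm_area_struct *vma)
{
	/* Sharable mapping that still carries a vma_lock in vm_private_data. */
	return (vma->vm_flags & VM_MAYSHARE) && vma->vm_private_data;
}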

mm/memory.c

Lines changed: 0 additions & 4 deletions
@@ -1685,12 +1685,8 @@ static void unmap_single_vma(struct mmu_gather *tlb,
 		if (vma->vm_file) {
 			zap_flags_t zap_flags = details ?
 			    details->zap_flags : 0;
-			hugetlb_vma_lock_write(vma);
-			i_mmap_lock_write(vma->vm_file->f_mapping);
 			__unmap_hugepage_range_final(tlb, vma, start, end,
 						     NULL, zap_flags);
-			i_mmap_unlock_write(vma->vm_file->f_mapping);
-			hugetlb_vma_unlock_write(vma);
 		}
 	} else
 		unmap_page_range(tlb, vma, start, end, details);
