Commit 949f5c1

davidhildenbrand authored and borntraeger committed
s390/mm: fix VMA and page table handling code in storage key handling functions
There are multiple things broken about our storage key handling functions:

1. We should not walk/touch page tables outside of VMA boundaries when
   holding only the mmap sem in read mode. Evil user space can modify the
   VMA layout just before this function runs and e.g., trigger races with
   page table removal code since commit dd2283f ("mm: mmap: zap pages
   with read mmap_sem in munmap"). gfn_to_hva() will only translate using
   KVM memory regions, but won't validate the VMA.

2. We should not allocate page tables outside of VMA boundaries: if evil
   user space decides to map hugetlbfs to these ranges, bad things will
   happen because we suddenly have PTE or PMD page tables where we
   shouldn't have them.

3. We don't handle large PUDs that might suddenly appear inside our page
   table hierarchy.

Don't manually allocate page tables, properly validate that we have a VMA and bail out on pud_large(). All callers of page table handling functions, except get_guest_storage_key(), call fixup_user_fault() in case they receive an -EFAULT and retry; this will allocate the necessary page tables if required.

To keep get_guest_storage_key() working as expected and not requiring kvm_s390_get_skeys() to call fixup_user_fault(), distinguish between "there is simply no page table or huge page yet and the key is assumed to be 0" and "this is a fault to be reported".

Although commit 637ff9e ("s390/mm: Add huge pmd storage key handling") introduced most of the affected code, it was actually already broken before when using get_locked_pte() without any VMA checks.

Note: Ever since commit 637ff9e ("s390/mm: Add huge pmd storage key handling") we can no longer set a guest storage key (for example from QEMU during VM live migration) without actually resolving a fault. Although we would have created most page tables, we would choke on the !pmd_present(), requiring a call to fixup_user_fault(). I would have thought that this is problematic in combination with postcopy live migration ... but nobody noticed and this patch doesn't change the situation. So maybe it's just fine.

Fixes: 9fcf93b ("KVM: S390: Create helper function get_guest_storage_key")
Fixes: 24d5dd0 ("s390/kvm: Provide function for setting the guest storage key")
Fixes: a7e19ab ("KVM: s390: handle missing storage-key facility")
Signed-off-by: David Hildenbrand <[email protected]>
Reviewed-by: Claudio Imbrenda <[email protected]>
Acked-by: Heiko Carstens <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Christian Borntraeger <[email protected]>
1 parent fe3d100 commit 949f5c1

1 file changed: +39 −18 lines changed

arch/s390/mm/pgtable.c

Lines changed: 39 additions & 18 deletions
```diff
@@ -429,22 +429,36 @@ static inline pmd_t pmdp_flush_lazy(struct mm_struct *mm,
 }
 
 #ifdef CONFIG_PGSTE
-static pmd_t *pmd_alloc_map(struct mm_struct *mm, unsigned long addr)
+static int pmd_lookup(struct mm_struct *mm, unsigned long addr, pmd_t **pmdp)
 {
+	struct vm_area_struct *vma;
 	pgd_t *pgd;
 	p4d_t *p4d;
 	pud_t *pud;
-	pmd_t *pmd;
+
+	/* We need a valid VMA, otherwise this is clearly a fault. */
+	vma = vma_lookup(mm, addr);
+	if (!vma)
+		return -EFAULT;
 
 	pgd = pgd_offset(mm, addr);
-	p4d = p4d_alloc(mm, pgd, addr);
-	if (!p4d)
-		return NULL;
-	pud = pud_alloc(mm, p4d, addr);
-	if (!pud)
-		return NULL;
-	pmd = pmd_alloc(mm, pud, addr);
-	return pmd;
+	if (!pgd_present(*pgd))
+		return -ENOENT;
+
+	p4d = p4d_offset(pgd, addr);
+	if (!p4d_present(*p4d))
+		return -ENOENT;
+
+	pud = pud_offset(p4d, addr);
+	if (!pud_present(*pud))
+		return -ENOENT;
+
+	/* Large PUDs are not supported yet. */
+	if (pud_large(*pud))
+		return -EFAULT;
+
+	*pmdp = pmd_offset(pud, addr);
+	return 0;
 }
 #endif
```

```diff
@@ -778,8 +792,7 @@ int set_guest_storage_key(struct mm_struct *mm, unsigned long addr,
 	pmd_t *pmdp;
 	pte_t *ptep;
 
-	pmdp = pmd_alloc_map(mm, addr);
-	if (unlikely(!pmdp))
+	if (pmd_lookup(mm, addr, &pmdp))
 		return -EFAULT;
 
 	ptl = pmd_lock(mm, pmdp);
```
```diff
@@ -881,8 +894,7 @@ int reset_guest_reference_bit(struct mm_struct *mm, unsigned long addr)
 	pte_t *ptep;
 	int cc = 0;
 
-	pmdp = pmd_alloc_map(mm, addr);
-	if (unlikely(!pmdp))
+	if (pmd_lookup(mm, addr, &pmdp))
 		return -EFAULT;
 
 	ptl = pmd_lock(mm, pmdp);
```
```diff
@@ -935,15 +947,24 @@ int get_guest_storage_key(struct mm_struct *mm, unsigned long addr,
 	pmd_t *pmdp;
 	pte_t *ptep;
 
-	pmdp = pmd_alloc_map(mm, addr);
-	if (unlikely(!pmdp))
+	/*
+	 * If we don't have a PTE table and if there is no huge page mapped,
+	 * the storage key is 0.
+	 */
+	*key = 0;
+
+	switch (pmd_lookup(mm, addr, &pmdp)) {
+	case -ENOENT:
+		return 0;
+	case 0:
+		break;
+	default:
 		return -EFAULT;
+	}
 
 	ptl = pmd_lock(mm, pmdp);
 	if (!pmd_present(*pmdp)) {
-		/* Not yet mapped memory has a zero key */
 		spin_unlock(ptl);
-		*key = 0;
 		return 0;
 	}
 
```