Skip to content

Commit 58a20a9

Browse files
sean-jcbonzini
authored andcommitted
KVM: x86/mmu: Zap only SPs that shadow gPTEs when deleting memslot
When performing a targeted zap on memslot removal, zap only MMU pages that shadow guest PTEs, as zapping all SPs that "match" the gfn is inexact and unnecessary. Furthermore, for_each_gfn_valid_sp() arguably shouldn't exist, because it doesn't do what most people would it expect it to do. The "round gfn for level" adjustment that is done for direct SPs (no gPTE) means that the exact gfn comparison will not get a match, even when a SP does "cover" a gfn, or was even created specifically for a gfn. For memslot deletion specifically, KVM's behavior will vary significantly based on the size and alignment of a memslot, and in weird ways. E.g. for a 4KiB memslot, KVM will zap more SPs if the slot is 1GiB aligned than if it's only 4KiB aligned. And as described below, zapping SPs in the aligned case overzaps for direct MMUs, as odds are good the upper-level SPs are serving other memslots. To iterate over all potentially-relevant gfns, KVM would need to make a pass over the hash table for each level, with the gfn used for lookup rounded for said level. And then check that the SP is of the correct level, too, e.g. to avoid over-zapping. But even then, KVM would massively overzap, as processing every level is all but guaranteed to zap SPs that serve other memslots, especially if the memslot being removed is relatively small. KVM could mitigate that issue by processing only levels that can be possible guest huge pages, i.e. are less likely to be re-used for other memslot, but while somewhat logical, that's quite arbitrary and would be a bit of a mess to implement. So, zap only SPs with gPTEs, as the resulting behavior is easy to describe, is predictable, and is explicitly minimal, i.e. KVM only zaps SPs that absolutely must be zapped. Cc: Yan Zhao <[email protected]> Signed-off-by: Sean Christopherson <[email protected]> Reviewed-by: Yan Zhao <[email protected]> Tested-by: Yan Zhao <[email protected]> Message-ID: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
1 parent 8e690b8 commit 58a20a9

File tree

1 file changed

+6
-10
lines changed

1 file changed

+6
-10
lines changed

arch/x86/kvm/mmu/mmu.c

Lines changed: 6 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -1884,14 +1884,10 @@ static bool sp_has_gptes(struct kvm_mmu_page *sp)
18841884
if (is_obsolete_sp((_kvm), (_sp))) { \
18851885
} else
18861886

1887-
#define for_each_gfn_valid_sp(_kvm, _sp, _gfn) \
1887+
#define for_each_gfn_valid_sp_with_gptes(_kvm, _sp, _gfn) \
18881888
for_each_valid_sp(_kvm, _sp, \
18891889
&(_kvm)->arch.mmu_page_hash[kvm_page_table_hashfn(_gfn)]) \
1890-
if ((_sp)->gfn != (_gfn)) {} else
1891-
1892-
#define for_each_gfn_valid_sp_with_gptes(_kvm, _sp, _gfn) \
1893-
for_each_gfn_valid_sp(_kvm, _sp, _gfn) \
1894-
if (!sp_has_gptes(_sp)) {} else
1890+
if ((_sp)->gfn != (_gfn) || !sp_has_gptes(_sp)) {} else
18951891

18961892
static bool kvm_sync_page_check(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp)
18971893
{
@@ -7063,15 +7059,15 @@ static void kvm_mmu_zap_memslot_pages_and_flush(struct kvm *kvm,
70637059

70647060
/*
70657061
* Since accounting information is stored in struct kvm_arch_memory_slot,
7066-
* shadow pages deletion (e.g. unaccount_shadowed()) requires that all
7067-
* gfns with a shadow page have a corresponding memslot. Do so before
7068-
* the memslot goes away.
7062+
* all MMU pages that are shadowing guest PTEs must be zapped before the
7063+
* memslot is deleted, as freeing such pages after the memslot is freed
7064+
* will result in use-after-free, e.g. in unaccount_shadowed().
70697065
*/
70707066
for (i = 0; i < slot->npages; i++) {
70717067
struct kvm_mmu_page *sp;
70727068
gfn_t gfn = slot->base_gfn + i;
70737069

7074-
for_each_gfn_valid_sp(kvm, sp, gfn)
7070+
for_each_gfn_valid_sp_with_gptes(kvm, sp, gfn)
70757071
kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list);
70767072

70777073
if (need_resched() || rwlock_needbreak(&kvm->mmu_lock)) {

0 commit comments

Comments
 (0)