Commit 1383279

sean-jc authored and bonzini committed
KVM: x86: Allow guest to set EFER.NX=1 on non-PAE 32-bit kernels
Remove an ancient restriction that disallowed exposing EFER.NX to the guest if EFER.NX=0 on the host, even if NX is fully supported by the CPU. The motivation of the check, added by commit 2cc5156 ("KVM: VMX: Avoid saving and restoring msr_efer on lightweight vmexit"), was to rule out the case of host.EFER.NX=0 and guest.EFER.NX=1 so that KVM could run the guest with the host's EFER.NX and thus avoid context switching EFER if the only divergence was the NX bit.

Fast forward to today, and KVM has long since stopped running the guest with the host's EFER.NX. Not only does KVM context switch EFER if host.EFER.NX=1 && guest.EFER.NX=0, KVM also forces host.EFER.NX=0 && guest.EFER.NX=1 when using shadow paging (to emulate SMEP). Furthermore, the entire motivation for the restriction was made obsolete over a decade ago when Intel added dedicated host and guest EFER fields in the VMCS (Nehalem timeframe), which reduced the overhead of context switching EFER from 400+ cycles (2 * WRMSR + 1 * RDMSR) to a mere ~2 cycles.

In practice, the removed restriction only affects non-PAE 32-bit kernels, as EFER.NX is set during boot if NX is supported and the kernel will use PAE paging (32-bit or 64-bit), regardless of whether or not the kernel will actually use NX itself (mark PTEs non-executable).

Alternatively and/or complementarily, startup_32_smp() in head_32.S could be modified to set EFER.NX=1 regardless of paging mode, thus eliminating the scenario where NX is supported but not enabled. However, that runs the risk of breaking non-KVM non-PAE kernels (though the risk is very, very low as there are no known EFER.NX errata), and also eliminates an easy-to-use mechanism for stressing KVM's handling of guest vs. host EFER across nested virtualization transitions.

Suggested-by: Paolo Bonzini <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
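The policy change described in the commit message can be sketched in standalone C. This is an illustrative model only, not KVM code: the function names `guest_nx_before` and `guest_nx_after` are hypothetical, and the only fact borrowed from the architecture is that EFER.NX (NXE) is bit 11 of the EFER MSR.

```c
#include <stdbool.h>
#include <stdint.h>

/* EFER.NXE is bit 11 of the EFER MSR. */
#define EFER_NX (1ULL << 11)

/*
 * Hypothetical model of the old policy: NX was advertised to the guest
 * only if the CPU supports NX *and* the host had EFER.NX set.
 */
static bool guest_nx_before(bool cpu_has_nx, uint64_t host_efer)
{
	return cpu_has_nx && (host_efer & EFER_NX) != 0;
}

/*
 * Hypothetical model of the new policy: host EFER.NX no longer matters,
 * only whether the CPU supports NX.
 */
static bool guest_nx_after(bool cpu_has_nx, uint64_t host_efer)
{
	(void)host_efer; /* intentionally ignored after this commit */
	return cpu_has_nx;
}
```

Under this model, the two policies differ only for a host whose CPU supports NX but runs with EFER.NX=0, which per the commit message is the non-PAE 32-bit kernel case.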
1 parent d5aaad6 commit 1383279

1 file changed: +1 -27 lines changed

arch/x86/kvm/cpuid.c

Lines changed: 1 addition & 27 deletions
@@ -208,30 +208,6 @@ static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 	kvm_mmu_after_set_cpuid(vcpu);
 }
 
-static int is_efer_nx(void)
-{
-	return host_efer & EFER_NX;
-}
-
-static void cpuid_fix_nx_cap(struct kvm_vcpu *vcpu)
-{
-	int i;
-	struct kvm_cpuid_entry2 *e, *entry;
-
-	entry = NULL;
-	for (i = 0; i < vcpu->arch.cpuid_nent; ++i) {
-		e = &vcpu->arch.cpuid_entries[i];
-		if (e->function == 0x80000001) {
-			entry = e;
-			break;
-		}
-	}
-	if (entry && cpuid_entry_has(entry, X86_FEATURE_NX) && !is_efer_nx()) {
-		cpuid_entry_clear(entry, X86_FEATURE_NX);
-		printk(KERN_INFO "kvm: guest NX capability removed\n");
-	}
-}
-
 int cpuid_query_maxphyaddr(struct kvm_vcpu *vcpu)
 {
 	struct kvm_cpuid_entry2 *best;
@@ -302,7 +278,6 @@ int kvm_vcpu_ioctl_set_cpuid(struct kvm_vcpu *vcpu,
 	vcpu->arch.cpuid_entries = e2;
 	vcpu->arch.cpuid_nent = cpuid->nent;
 
-	cpuid_fix_nx_cap(vcpu);
 	kvm_update_cpuid_runtime(vcpu);
 	kvm_vcpu_after_set_cpuid(vcpu);

@@ -401,7 +376,6 @@ static __always_inline void kvm_cpu_cap_mask(enum cpuid_leafs leaf, u32 mask)
 
 void kvm_set_cpu_caps(void)
 {
-	unsigned int f_nx = is_efer_nx() ? F(NX) : 0;
 #ifdef CONFIG_X86_64
 	unsigned int f_gbpages = F(GBPAGES);
 	unsigned int f_lm = F(LM);
@@ -515,7 +489,7 @@ void kvm_set_cpu_caps(void)
 		F(CX8) | F(APIC) | 0 /* Reserved */ | F(SYSCALL) |
 		F(MTRR) | F(PGE) | F(MCA) | F(CMOV) |
 		F(PAT) | F(PSE36) | 0 /* Reserved */ |
-		f_nx | 0 /* Reserved */ | F(MMXEXT) | F(MMX) |
+		F(NX) | 0 /* Reserved */ | F(MMXEXT) | F(MMX) |
 		F(FXSR) | F(FXSR_OPT) | f_gbpages | F(RDTSCP) |
 		0 /* Reserved */ | f_lm | F(3DNOWEXT) | F(3DNOW)
 	);
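
For readers who want to experiment with what the removed cpuid_fix_nx_cap() did, the following is a minimal userspace sketch of the same idea: look up the CPUID 0x80000001 entry and clear the NX bit (EDX bit 20) when the host runs with EFER.NX=0. The struct `cpuid_entry` and the functions `find_entry` and `fix_nx_cap_old` are hypothetical stand-ins, not the kernel's `kvm_cpuid_entry2` or its helpers.

```c
#include <stddef.h>
#include <stdint.h>

/* CPUID.80000001H:EDX bit 20 reports NX support. */
#define NX_EDX_MASK (1u << 20)

/* Hypothetical, simplified stand-in for a guest CPUID table entry. */
struct cpuid_entry {
	uint32_t function;
	uint32_t eax, ebx, ecx, edx;
};

/* Return the first entry for the given CPUID function, or NULL. */
static struct cpuid_entry *find_entry(struct cpuid_entry *entries, size_t n,
				      uint32_t function)
{
	for (size_t i = 0; i < n; i++) {
		if (entries[i].function == function)
			return &entries[i];
	}
	return NULL;
}

/*
 * Model of the removed fixup: if the guest table advertises NX but the
 * host has EFER.NX=0, strip NX from the guest. Returns 1 if the bit was
 * cleared, 0 otherwise.
 */
static int fix_nx_cap_old(struct cpuid_entry *entries, size_t n,
			  int host_efer_nx)
{
	struct cpuid_entry *entry = find_entry(entries, n, 0x80000001u);

	if (entry && (entry->edx & NX_EDX_MASK) && !host_efer_nx) {
		entry->edx &= ~NX_EDX_MASK;
		return 1;
	}
	return 0;
}
```

After this commit the equivalent of `fix_nx_cap_old` is simply never called, so the guest's CPUID table is left as userspace supplied it.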
