
Commit 3f6821a

KVM: x86: Forcibly leave nested if RSM to L2 hits shutdown
Leave nested mode before synthesizing shutdown (a.k.a. TRIPLE_FAULT) if RSM fails when resuming L2 (a.k.a. guest mode). Architecturally, shutdown on RSM occurs _before_ the transition back to guest mode on both Intel and AMD.

On Intel, per the SDM pseudocode, SMRAM state is loaded before critical VMX state:

  restore state normally from SMRAM;
  ...
  CR4.VMXE := value stored internally;
  IF internal storage indicates that the logical processor had been in
  VMX operation (root or non-root)
  THEN
      enter VMX operation (root or non-root);
      restore VMX-critical state as defined in Section 32.14.1;
      ...
      restore current VMCS pointer;
  FI;

AMD's APM is both less clearcut and more explicit. Because AMD CPUs save VMCB and guest state in SMRAM itself, given the lack of anything in the APM to indicate a shutdown in guest mode is possible, a straightforward reading of the clause on invalid state is that _what_ state is invalid is irrelevant, i.e. all roads lead to shutdown:

  An RSM causes a processor shutdown if an invalid-state condition is
  found in the SMRAM state-save area.

This fixes a bug found by syzkaller where synthesizing shutdown for L2 led to a nested VM-Exit (if L1 is intercepting shutdown), which in turn caused KVM to complain about trying to cancel a nested VM-Enter (see commit 759cbd5 ("KVM: x86: nSVM/nVMX: set nested_run_pending on VM entry which is a result of RSM")).

Note, Paolo pointed out that KVM shouldn't set nested_run_pending until after loading SMRAM state. But as above, that's only half the story: KVM shouldn't transition to guest mode either. Unfortunately, fixing that mess requires rewriting the nVMX and nSVM RSM flows to not piggyback their nested VM-Enter flows, as executing the nested VM-Enter flows after loading state from SMRAM would clobber much of said state. For now, add a FIXME to call out that transitioning to guest mode before loading state from SMRAM is wrong.
Link: https://lore.kernel.org/all/CABgObfYaUHXyRmsmg8UjRomnpQ0Jnaog9-L2gMjsjkqChjDYUQ@mail.gmail.com
Reported-by: [email protected]
Closes: https://lore.kernel.org/all/[email protected]
Reported-by: Zheyu Ma <[email protected]>
Closes: https://lore.kernel.org/all/CAMhUBjmXMYsEoVYw_M8hSZjBMHh24i88QYm-RY6HDta5YZ7Wgw@mail.gmail.com
Analyzed-by: Michal Wilczynski <[email protected]>
Cc: Kishen Maloor <[email protected]>
Reviewed-by: Maxim Levitsky <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sean Christopherson <[email protected]>
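The ordering problem the commit describes can be illustrated with a small standalone model. This is a toy sketch, not KVM code: all names (`toy_vcpu`, `rsm_to_l2`, `deliver_shutdown`) are hypothetical, and it only models the control flow, namely that KVM enters guest mode before loading SMRAM state, so a failed load must force the vCPU back out of guest mode before the shutdown is delivered, or L1 sees an architecturally impossible SHUTDOWN VM-Exit.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the bug and fix; none of this is actual KVM code. */
#define X86EMUL_CONTINUE      0
#define X86EMUL_UNHANDLEABLE  1

struct toy_vcpu {
	bool guest_mode;              /* vCPU was resumed into L2 */
	bool l1_intercepts_shutdown;  /* L1 asked to intercept SHUTDOWN */
};

enum outcome { L1_SHUTDOWN, NESTED_VMEXIT };

/* What happens when a shutdown (TRIPLE_FAULT) is later synthesized. */
static enum outcome deliver_shutdown(struct toy_vcpu *v)
{
	if (v->guest_mode && v->l1_intercepts_shutdown)
		return NESTED_VMEXIT;  /* architecturally impossible for RSM */
	return L1_SHUTDOWN;
}

static enum outcome rsm_to_l2(struct toy_vcpu *v, bool load_fails, bool fixed)
{
	/* KVM transitions to guest mode first (the FIXME in the patch). */
	v->guest_mode = true;

	int ret = load_fails ? X86EMUL_UNHANDLEABLE : X86EMUL_CONTINUE;

	/* The fix: leave nested mode before the shutdown is synthesized. */
	if (fixed && ret != X86EMUL_CONTINUE && v->guest_mode)
		v->guest_mode = false;  /* models kvm_leave_nested() */

	return deliver_shutdown(v);
}
```

Running the model with a failing state load shows the unfixed path handing L1 a nested VM-Exit, while the fixed path delivers shutdown to L1, matching the architectural behavior quoted from the SDM and APM above.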
1 parent 1876dd6 commit 3f6821a

3 files changed: +25 -11 lines changed

arch/x86/kvm/smm.c (19 additions, 5 deletions)

@@ -624,17 +624,31 @@ int emulator_leave_smm(struct x86_emulate_ctxt *ctxt)
 #endif
 
 	/*
-	 * Give leave_smm() a chance to make ISA-specific changes to the vCPU
-	 * state (e.g. enter guest mode) before loading state from the SMM
-	 * state-save area.
+	 * FIXME: When resuming L2 (a.k.a. guest mode), the transition to guest
+	 * mode should happen _after_ loading state from SMRAM.  However, KVM
+	 * piggybacks the nested VM-Enter flows (which is wrong for many other
+	 * reasons), and so nSVM/nVMX would clobber state that is loaded from
+	 * SMRAM and from the VMCS/VMCB.
 	 */
 	if (kvm_x86_call(leave_smm)(vcpu, &smram))
 		return X86EMUL_UNHANDLEABLE;
 
 #ifdef CONFIG_X86_64
 	if (guest_cpuid_has(vcpu, X86_FEATURE_LM))
-		return rsm_load_state_64(ctxt, &smram.smram64);
+		ret = rsm_load_state_64(ctxt, &smram.smram64);
 	else
 #endif
-		return rsm_load_state_32(ctxt, &smram.smram32);
+		ret = rsm_load_state_32(ctxt, &smram.smram32);
+
+	/*
+	 * If RSM fails and triggers shutdown, architecturally the shutdown
+	 * occurs *before* the transition to guest mode.  But due to KVM's
+	 * flawed handling of RSM to L2 (see above), the vCPU may already be
+	 * in_guest_mode().  Force the vCPU out of guest mode before delivering
+	 * the shutdown, so that L1 enters shutdown instead of seeing a VM-Exit
+	 * that architecturally shouldn't be possible.
+	 */
+	if (ret != X86EMUL_CONTINUE && is_guest_mode(vcpu))
+		kvm_leave_nested(vcpu);
+
+	return ret;
 }

arch/x86/kvm/x86.c (0 additions, 6 deletions)

@@ -833,12 +833,6 @@ static void kvm_queue_exception_vmexit(struct kvm_vcpu *vcpu, unsigned int vecto
 	ex->payload = payload;
 }
 
-/* Forcibly leave the nested mode in cases like a vCPU reset */
-static void kvm_leave_nested(struct kvm_vcpu *vcpu)
-{
-	kvm_x86_ops.nested_ops->leave_nested(vcpu);
-}
-
 static void kvm_multiple_exception(struct kvm_vcpu *vcpu,
 				   unsigned nr, bool has_error, u32 error_code,
 				   bool has_payload, unsigned long payload, bool reinject)
arch/x86/kvm/x86.h (6 additions, 0 deletions)

@@ -108,6 +108,12 @@ static inline unsigned int __shrink_ple_window(unsigned int val,
 void kvm_service_local_tlb_flush_requests(struct kvm_vcpu *vcpu);
 int kvm_check_nested_events(struct kvm_vcpu *vcpu);
 
+/* Forcibly leave the nested mode in cases like a vCPU reset */
+static inline void kvm_leave_nested(struct kvm_vcpu *vcpu)
+{
+	kvm_x86_ops.nested_ops->leave_nested(vcpu);
+}
+
 static inline bool kvm_vcpu_has_run(struct kvm_vcpu *vcpu)
 {
 	return vcpu->arch.last_vmentry_cpu != -1;
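The x86.h change moves kvm_leave_nested() from a file-local function in x86.c to a static inline helper in the shared header, so smm.c can call it too. The sketch below is a hypothetical, self-contained illustration of that pattern (a vendor callback reached through an ops table, wrapped by an inline header helper); the `toy_*` names are invented and only mirror the shape of the KVM code.

```c
#include <assert.h>

/* Toy sketch of the ops-table indirection behind kvm_leave_nested(). */
struct toy_vcpu { int nested; };

struct toy_nested_ops {
	void (*leave_nested)(struct toy_vcpu *vcpu);
};

struct toy_x86_ops {
	const struct toy_nested_ops *nested_ops;
};

/* Stand-in for the vendor (nSVM/nVMX) implementation. */
static void toy_vendor_leave_nested(struct toy_vcpu *vcpu)
{
	vcpu->nested = 0;
}

static const struct toy_nested_ops toy_nested_ops = {
	.leave_nested = toy_vendor_leave_nested,
};

static struct toy_x86_ops toy_x86_ops = {
	.nested_ops = &toy_nested_ops,
};

/* Mirrors the static inline wrapper the commit adds to x86.h: any
 * translation unit that includes the header can force the vCPU out of
 * nested mode without seeing the vendor implementation. */
static inline void toy_leave_nested(struct toy_vcpu *vcpu)
{
	toy_x86_ops.nested_ops->leave_nested(vcpu);
}
```

Keeping the wrapper `static inline` in the header avoids exporting a new symbol while letting both x86.c and smm.c share one definition.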
