Skip to content

Commit cd0e615

Browse files
sean-jcbonzini
authored andcommitted
KVM: nVMX: Synthesize TRIPLE_FAULT for L2 if emulation is required
Synthesize a triple fault if L2 guest state is invalid at the time of VM-Enter, which can happen if L1 modifies SMRAM or if userspace stuffs guest state via ioctls(), e.g. KVM_SET_SREGS. KVM should never emulate invalid guest state, since from L1's perspective, it's architecturally impossible for L2 to have invalid state while L2 is running in hardware. E.g. attempts to set CR0 or CR4 to unsupported values will either VM-Exit or #GP. Modifying vCPU state via RSM+SMRAM and ioctl() are the only paths that can trigger this scenario, as nested VM-Enter correctly rejects any attempt to enter L2 with invalid state. RSM is a straightforward case as (a) KVM follows AMD's SMRAM layout and behavior, and (b) Intel's SDM states that loading reserved CR0/CR4 bits via RSM results in shutdown, i.e. there is precedent for KVM's behavior. Following AMD's SMRAM layout is important as AMD's layout saves/restores the descriptor cache information, including CS.RPL and SS.RPL, and also defines all the fields relevant to invalid guest state as read-only, i.e. so long as the vCPU had valid state before the SMI, which is guaranteed for L2, RSM will generate valid state unless SMRAM was modified. Intel's layout saves/restores only the selector, which means that scenarios where the selector and cached RPL don't match, e.g. conforming code segments, would yield invalid guest state. Intel CPUs fudge around this issued by stuffing SS.RPL and CS.RPL on RSM. Per Intel's SDM on the "Default Treatment of RSM", paraphrasing for brevity: IF internal storage indicates that the [CPU was post-VMXON] THEN enter VMX operation (root or non-root); restore VMX-critical state as defined in Section 34.14.1; set to their fixed values any bits in CR0 and CR4 whose values must be fixed in VMX operation [unless coming from an unrestricted guest]; IF RFLAGS.VM = 0 AND (in VMX root operation OR the “unrestricted guest” VM-execution control is 0) THEN CS.RPL := SS.DPL; SS.RPL := SS.DPL; FI; restore current VMCS pointer; FI; Note that Intel CPUs also overwrite the fixed CR0/CR4 bits, whereas KVM will sythesize TRIPLE_FAULT in this scenario. KVM's behavior is allowed as both Intel and AMD define CR0/CR4 SMRAM fields as read-only, i.e. the only way for CR0 and/or CR4 to have illegal values is if they were modified by the L1 SMM handler, and Intel's SDM "SMRAM State Save Map" section states "modifying these registers will result in unpredictable behavior". KVM's ioctl() behavior is less straightforward. Because KVM allows ioctls() to be executed in any order, rejecting an ioctl() if it would result in invalid L2 guest state is not an option as KVM cannot know if a future ioctl() would resolve the invalid state, e.g. KVM_SET_SREGS, or drop the vCPU out of L2, e.g. KVM_SET_NESTED_STATE. Ideally, KVM would reject KVM_RUN if L2 contained invalid guest state, but that carries the risk of a false positive, e.g. if RSM loaded invalid guest state and KVM exited to userspace. Setting a flag/request to detect such a scenario is undesirable because (a) it's extremely unlikely to add value to KVM as a whole, and (b) KVM would need to consider ioctl() interactions with such a flag, e.g. if userspace migrated the vCPU while the flag were set. Cc: [email protected] Signed-off-by: Sean Christopherson <[email protected]> Message-Id: <[email protected]> Reviewed-by: Maxim Levitsky <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
1 parent a80dfc0 commit cd0e615

File tree

1 file changed

+24
-8
lines changed

1 file changed

+24
-8
lines changed

arch/x86/kvm/vmx/vmx.c

Lines changed: 24 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -5882,18 +5882,14 @@ static int __vmx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath)
58825882
vmx_flush_pml_buffer(vcpu);
58835883

58845884
/*
5885-
* We should never reach this point with a pending nested VM-Enter, and
5886-
* more specifically emulation of L2 due to invalid guest state (see
5887-
* below) should never happen as that means we incorrectly allowed a
5888-
* nested VM-Enter with an invalid vmcs12.
5885+
* KVM should never reach this point with a pending nested VM-Enter.
5886+
* More specifically, short-circuiting VM-Entry to emulate L2 due to
5887+
* invalid guest state should never happen as that means KVM knowingly
5888+
* allowed a nested VM-Enter with an invalid vmcs12. More below.
58895889
*/
58905890
if (KVM_BUG_ON(vmx->nested.nested_run_pending, vcpu->kvm))
58915891
return -EIO;
58925892

5893-
/* If guest state is invalid, start emulating */
5894-
if (vmx->emulation_required)
5895-
return handle_invalid_guest_state(vcpu);
5896-
58975893
if (is_guest_mode(vcpu)) {
58985894
/*
58995895
* PML is never enabled when running L2, bail immediately if a
@@ -5915,10 +5911,30 @@ static int __vmx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath)
59155911
*/
59165912
nested_mark_vmcs12_pages_dirty(vcpu);
59175913

5914+
/*
5915+
* Synthesize a triple fault if L2 state is invalid. In normal
5916+
* operation, nested VM-Enter rejects any attempt to enter L2
5917+
* with invalid state. However, those checks are skipped if
5918+
* state is being stuffed via RSM or KVM_SET_NESTED_STATE. If
5919+
* L2 state is invalid, it means either L1 modified SMRAM state
5920+
* or userspace provided bad state. Synthesize TRIPLE_FAULT as
5921+
* doing so is architecturally allowed in the RSM case, and is
5922+
* the least awful solution for the userspace case without
5923+
* risking false positives.
5924+
*/
5925+
if (vmx->emulation_required) {
5926+
nested_vmx_vmexit(vcpu, EXIT_REASON_TRIPLE_FAULT, 0, 0);
5927+
return 1;
5928+
}
5929+
59185930
if (nested_vmx_reflect_vmexit(vcpu))
59195931
return 1;
59205932
}
59215933

5934+
/* If guest state is invalid, start emulating. L2 is handled above. */
5935+
if (vmx->emulation_required)
5936+
return handle_invalid_guest_state(vcpu);
5937+
59225938
if (exit_reason.failed_vmentry) {
59235939
dump_vmcs(vcpu);
59245940
vcpu->run->exit_reason = KVM_EXIT_FAIL_ENTRY;

0 commit comments

Comments
 (0)