Commit 89f9edf

Kullu14 authored and sean-jc committed
KVM: SVM: Add support for KVM_CAP_X86_BUS_LOCK_EXIT on SVM CPUs
Add support for KVM_CAP_X86_BUS_LOCK_EXIT on SVM CPUs with Bus Lock Threshold, which is close enough to VMX's Bus Lock Detection VM-Exit to allow reusing KVM_CAP_X86_BUS_LOCK_EXIT.

The biggest difference between the two features is that Threshold is fault-like, whereas Detection is trap-like. To allow the guest to make forward progress, Threshold provides a per-VMCB counter which is decremented every time a bus lock occurs, and a VM-Exit is triggered if and only if the counter is '0'.

To provide Detection-like semantics, initialize the counter to '0', i.e. exit on every bus lock, and when re-executing the guilty instruction, set the counter to '1' to effectively step past the instruction.

Note, in the unlikely scenario that re-executing the instruction doesn't trigger a bus lock, e.g. because the guest has changed memory types or patched the guilty instruction, the bus lock counter will be left at '1', i.e. the guest will be able to do a bus lock on a different instruction. In a perfect world, KVM would ensure the counter is '0' if the guest has made forward progress, e.g. if RIP has changed. But trying to close that hole would incur non-trivial complexity, for marginal benefit; the intent of KVM_CAP_X86_BUS_LOCK_EXIT is to allow userspace to rate-limit bus locks, not to allow for precise detection of problematic guest code. And, it's simply not feasible to fully close the hole, e.g. if an interrupt arrives before the original instruction can re-execute, the guest could step past a different bus lock.

Suggested-by: Sean Christopherson <[email protected]>
Signed-off-by: Manali Shukla <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
[sean: fix typo in comment]
Signed-off-by: Sean Christopherson <[email protected]>
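The counter semantics described in the commit message can be illustrated with a small standalone model. This is only a sketch, not kernel code: `struct toy_vmcb`, `toy_bus_lock_attempt()` and `toy_run_bus_locks()` are hypothetical names invented for illustration, and the model assumes nothing beyond what the message states (a bus lock exits when the counter is '0', otherwise the counter is decremented and the instruction completes).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the per-VMCB state; NOT the kernel's struct vmcb. */
struct toy_vmcb {
	unsigned int bus_lock_counter;
};

/*
 * Model one attempt to execute a bus-locking instruction.
 * Returns true if the attempt causes a VM-Exit (fault-like: the
 * instruction does not complete), false if the instruction completes.
 */
bool toy_bus_lock_attempt(struct toy_vmcb *vmcb)
{
	assert(vmcb != NULL);

	if (vmcb->bus_lock_counter == 0)
		return true;

	vmcb->bus_lock_counter--;
	return false;
}

/*
 * Detection-like emulation for n back-to-back bus-locking instructions:
 * the counter starts at 0, each exit is followed by setting the counter
 * to 1 and re-executing the same instruction, which then completes and
 * drops the counter back to 0. Returns the number of VM-Exits observed.
 */
unsigned int toy_run_bus_locks(struct toy_vmcb *vmcb, unsigned int n)
{
	unsigned int exits = 0;
	unsigned int i;

	for (i = 0; i < n; i++) {
		if (!toy_bus_lock_attempt(vmcb))
			continue;

		exits++;
		/* Resume: allow exactly one bus lock to step past. */
		vmcb->bus_lock_counter = 1;
		(void)toy_bus_lock_attempt(vmcb);
	}
	return exits;
}
```

In this model, every bus-locking instruction produces exactly one exit as long as the re-executed instruction really does take a bus lock. If it does not, the counter stays at '1' and a later bus lock passes unnoticed, which is the hole the commit message describes.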
1 parent 827547b commit 89f9edf

4 files changed: +78 -0 lines changed


Documentation/virt/kvm/api.rst

Lines changed: 5 additions & 0 deletions
@@ -7989,6 +7989,11 @@ apply some other policy-based mitigation. When exiting to userspace, KVM sets
 KVM_RUN_X86_BUS_LOCK in vcpu-run->flags, and conditionally sets the exit_reason
 to KVM_EXIT_X86_BUS_LOCK.
 
+Due to differences in the underlying hardware implementation, the vCPU's RIP at
+the time of exit diverges between Intel and AMD. On Intel hosts, RIP points at
+the next instruction, i.e. the exit is trap-like. On AMD hosts, RIP points at
+the offending instruction, i.e. the exit is fault-like.
+
 Note! Detected bus locks may be coincident with other exits to userspace, i.e.
 KVM_RUN_X86_BUS_LOCK should be checked regardless of the primary exit reason if
 userspace wants to take action on all detected bus locks.
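From userspace's point of view, the note above means the flag, not the exit reason, is what signals a bus lock. A minimal sketch of that check follows. `struct toy_kvm_run` and the `TOY_*` constants are hypothetical stand-ins for `struct kvm_run` and the definitions in `<linux/kvm.h>`; the numeric values are illustrative placeholders, and a real VMM should take them from the uapi header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder values for illustration; real code uses <linux/kvm.h>. */
#define TOY_KVM_EXIT_IO			2
#define TOY_KVM_EXIT_X86_BUS_LOCK	33
#define TOY_KVM_RUN_X86_BUS_LOCK	(1u << 1)

/* Toy subset of the shared run structure; NOT the real struct kvm_run. */
struct toy_kvm_run {
	uint32_t exit_reason;
	uint16_t flags;
};

/*
 * A bus lock may be reported together with an unrelated exit reason, so
 * test the flag rather than comparing exit_reason.
 */
bool toy_bus_lock_detected(const struct toy_kvm_run *run)
{
	assert(run != NULL);

	return (run->flags & TOY_KVM_RUN_X86_BUS_LOCK) != 0;
}
```

A VMM that rate-limits bus locks would call a check like this after every return from KVM_RUN. If it also inspects RIP, it has to account for the vendor difference documented above: on AMD hosts RIP is the offending instruction, on Intel hosts it is the instruction after it.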

arch/x86/kvm/svm/nested.c

Lines changed: 34 additions & 0 deletions
@@ -678,6 +678,33 @@ static void nested_vmcb02_prepare_control(struct vcpu_svm *svm,
 	vmcb02->control.iopm_base_pa = vmcb01->control.iopm_base_pa;
 	vmcb02->control.msrpm_base_pa = vmcb01->control.msrpm_base_pa;
 
+	/*
+	 * Stash vmcb02's counter if the guest hasn't moved past the guilty
+	 * instruction; otherwise, reset the counter to '0'.
+	 *
+	 * In order to detect if L2 has made forward progress or not, track the
+	 * RIP at which a bus lock has occurred on a per-vmcb12 basis. If RIP
+	 * is changed, guest has clearly made forward progress, bus_lock_counter
+	 * still remained '1', so reset bus_lock_counter to '0'. Eg. In the
+	 * scenario, where a buslock happened in L1 before VMRUN, the bus lock
+	 * firmly happened on an instruction in the past. Even if vmcb01's
+	 * counter is still '1', (because the guilty instruction got patched),
+	 * the vCPU has clearly made forward progress and so KVM should reset
+	 * vmcb02's counter to '0'.
+	 *
+	 * If the RIP hasn't changed, stash the bus lock counter at nested VMRUN
+	 * to prevent the same guilty instruction from triggering a VM-Exit. Eg.
+	 * if userspace rate-limits the vCPU, then it's entirely possible that
+	 * L1's tick interrupt is pending by the time userspace re-runs the
+	 * vCPU. If KVM unconditionally clears the counter on VMRUN, then when
+	 * L1 re-enters L2, the same instruction will trigger a VM-Exit and the
+	 * entire cycle start over.
+	 */
+	if (vmcb02->save.rip && (svm->nested.ctl.bus_lock_rip == vmcb02->save.rip))
+		vmcb02->control.bus_lock_counter = 1;
+	else
+		vmcb02->control.bus_lock_counter = 0;
+
 	/* Done at vmrun: asid. */
 
 	/* Also overwritten later if necessary. */
@@ -1039,6 +1066,13 @@ int nested_svm_vmexit(struct vcpu_svm *svm)
 
 	}
 
+	/*
+	 * Invalidate bus_lock_rip unless KVM is still waiting for the guest
+	 * to make forward progress before re-enabling bus lock detection.
+	 */
+	if (!vmcb02->control.bus_lock_counter)
+		svm->nested.ctl.bus_lock_rip = INVALID_GPA;
+
 	nested_svm_copy_common_state(svm->nested.vmcb02.ptr, svm->vmcb01.ptr);
 
 	svm_switch_vmcb(svm, &svm->vmcb01);
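The two nested.c hunks above work as a pair: one decides the counter on nested VMRUN, the other decides whether to forget the recorded RIP on nested VM-Exit. The sketch below models just that logic in isolation. `struct toy_nested`, `toy_nested_vmrun()`, `toy_nested_vmexit()` and `TOY_INVALID_GPA` are hypothetical names for illustration, with the all-ones sentinel assumed to play the role of `INVALID_GPA`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed sentinel standing in for the kernel's INVALID_GPA. */
#define TOY_INVALID_GPA	(~(uint64_t)0)

/* Toy state; NOT the kernel's vcpu_svm or vmcb layout. */
struct toy_nested {
	uint64_t bus_lock_rip;		/* RIP recorded at the last L2 bus lock exit */
	uint64_t vmcb02_rip;		/* RIP that L2 is about to run at */
	unsigned int vmcb02_counter;	/* models vmcb02's bus lock counter */
};

/* Mirrors the check added to nested_vmcb02_prepare_control(). */
void toy_nested_vmrun(struct toy_nested *n)
{
	assert(n != NULL);

	if (n->vmcb02_rip && n->bus_lock_rip == n->vmcb02_rip)
		n->vmcb02_counter = 1;	/* same instruction: let it step past */
	else
		n->vmcb02_counter = 0;	/* forward progress: exit on next bus lock */
}

/* Mirrors the check added to nested_svm_vmexit(). */
void toy_nested_vmexit(struct toy_nested *n)
{
	assert(n != NULL);

	if (!n->vmcb02_counter)
		n->bus_lock_rip = TOY_INVALID_GPA;
}
```

If L2 is re-entered at the RIP recorded by the last bus lock exit, the counter is primed to '1' and the recorded RIP survives an intervening nested exit. Once L2 resumes at any other RIP, the counter is reset to '0' and the recorded RIP is invalidated on the next nested exit.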

arch/x86/kvm/svm/svm.c

Lines changed: 38 additions & 0 deletions
@@ -1384,6 +1384,9 @@ static void init_vmcb(struct kvm_vcpu *vcpu)
 		svm->vmcb->control.int_ctl |= V_GIF_ENABLE_MASK;
 	}
 
+	if (vcpu->kvm->arch.bus_lock_detection_enabled)
+		svm_set_intercept(svm, INTERCEPT_BUSLOCK);
+
 	if (sev_guest(vcpu->kvm))
 		sev_init_vmcb(svm);
 
@@ -3306,6 +3309,37 @@ static int invpcid_interception(struct kvm_vcpu *vcpu)
 	return kvm_handle_invpcid(vcpu, type, gva);
 }
 
+static inline int complete_userspace_buslock(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_svm *svm = to_svm(vcpu);
+
+	/*
+	 * If userspace has NOT changed RIP, then KVM's ABI is to let the guest
+	 * execute the bus-locking instruction. Set the bus lock counter to '1'
+	 * to effectively step past the bus lock.
+	 */
+	if (kvm_is_linear_rip(vcpu, vcpu->arch.cui_linear_rip))
+		svm->vmcb->control.bus_lock_counter = 1;
+
+	return 1;
+}
+
+static int bus_lock_exit(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_svm *svm = to_svm(vcpu);
+
+	vcpu->run->exit_reason = KVM_EXIT_X86_BUS_LOCK;
+	vcpu->run->flags |= KVM_RUN_X86_BUS_LOCK;
+
+	vcpu->arch.cui_linear_rip = kvm_get_linear_rip(vcpu);
+	vcpu->arch.complete_userspace_io = complete_userspace_buslock;
+
+	if (is_guest_mode(vcpu))
+		svm->nested.ctl.bus_lock_rip = vcpu->arch.cui_linear_rip;
+
+	return 0;
+}
+
 static int (*const svm_exit_handlers[])(struct kvm_vcpu *vcpu) = {
 	[SVM_EXIT_READ_CR0] = cr_interception,
 	[SVM_EXIT_READ_CR3] = cr_interception,
@@ -3375,6 +3409,7 @@ static int (*const svm_exit_handlers[])(struct kvm_vcpu *vcpu) = {
 	[SVM_EXIT_INVPCID] = invpcid_interception,
 	[SVM_EXIT_IDLE_HLT] = kvm_emulate_halt,
 	[SVM_EXIT_NPF] = npf_interception,
+	[SVM_EXIT_BUS_LOCK] = bus_lock_exit,
 	[SVM_EXIT_RSM] = rsm_interception,
 	[SVM_EXIT_AVIC_INCOMPLETE_IPI] = avic_incomplete_ipi_interception,
 	[SVM_EXIT_AVIC_UNACCELERATED_ACCESS] = avic_unaccelerated_access_interception,
@@ -5377,6 +5412,9 @@ static __init void svm_set_cpu_caps(void)
 		kvm_cpu_cap_set(X86_FEATURE_SVME_ADDR_CHK);
 	}
 
+	if (cpu_feature_enabled(X86_FEATURE_BUS_LOCK_THRESHOLD))
+		kvm_caps.has_bus_lock_exit = true;
+
 	/* CPUID 0x80000008 */
 	if (boot_cpu_has(X86_FEATURE_LS_CFG_SSBD) ||
 	    boot_cpu_has(X86_FEATURE_AMD_SSBD))
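The exit and completion handlers added to svm.c split the work across a round trip to userspace: `bus_lock_exit()` records the RIP and returns to userspace, and `complete_userspace_buslock()` primes the counter on re-entry only if RIP is unchanged. A rough standalone model of that handshake is sketched below. `struct toy_vcpu` and both `toy_*` functions are hypothetical, and the model compares plain RIP values, whereas the kernel compares linear RIPs via `kvm_is_linear_rip()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy vCPU state; NOT the kernel's struct kvm_vcpu. */
struct toy_vcpu {
	uint64_t rip;			/* current guest RIP (userspace may change it) */
	uint64_t cui_linear_rip;	/* RIP recorded at the bus lock exit */
	unsigned int bus_lock_counter;	/* models the VMCB bus lock counter */
};

/*
 * Models bus_lock_exit(): remember where the bus lock happened.
 * Returns 0, which in KVM's exit-handler convention means
 * "exit to userspace".
 */
int toy_bus_lock_exit(struct toy_vcpu *vcpu)
{
	assert(vcpu != NULL);

	vcpu->cui_linear_rip = vcpu->rip;
	return 0;
}

/*
 * Models complete_userspace_buslock(): if userspace left RIP alone, let
 * the guilty instruction step past one bus lock. Returns 1, i.e.
 * "resume the guest".
 */
int toy_complete_userspace_buslock(struct toy_vcpu *vcpu)
{
	assert(vcpu != NULL);

	if (vcpu->rip == vcpu->cui_linear_rip)
		vcpu->bus_lock_counter = 1;
	return 1;
}
```

If userspace skips or redirects the instruction by writing a new RIP before re-running the vCPU, the counter stays at '0' in this model, so the next bus lock exits again.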

arch/x86/kvm/svm/svm.h

Lines changed: 1 addition & 0 deletions
@@ -173,6 +173,7 @@ struct vmcb_ctrl_area_cached {
 	u64 nested_cr3;
 	u64 virt_ext;
 	u32 clean;
+	u64 bus_lock_rip;
 	union {
 #if IS_ENABLED(CONFIG_HYPERV) || IS_ENABLED(CONFIG_KVM_HYPERV)
 		struct hv_vmcb_enlightenments hv_enlightenments;
