Commit 85542ad
tpressure authored and sean-jc committed
KVM: x86: Add KVM_RUN_X86_GUEST_MODE kvm_run flag
When a vCPU is interrupted by a signal while running a nested guest, KVM will exit to userspace with L2 state. However, userspace has no way to know whether it sees L1 or L2 state (besides calling KVM_GET_STATS_FD, which does not have a stable ABI).

This causes multiple problems:

The simplest one is L2 state corruption when userspace marks the sregs as dirty. See this mailing list thread [1] for a complete discussion.

Another problem is that if userspace decides to continue by emulating instructions, it will unknowingly emulate with L2 state as if L1 doesn't exist, which can be considered a weird guest escape.

Introduce a new flag, KVM_RUN_X86_GUEST_MODE, in the kvm_run data structure, which is set when the vCPU exited while running a nested guest. Also introduce a new capability, KVM_CAP_X86_GUEST_MODE, to advertise the functionality to userspace.

[1] https://lore.kernel.org/kvm/[email protected]/T/#m280aadcb2e10ae02c191a7dc4ed4b711a74b1f55

Signed-off-by: Thomas Prescher <[email protected]>
Signed-off-by: Julian Stecklina <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sean Christopherson <[email protected]>
1 parent 508f0c7 commit 85542ad

4 files changed: +22 -0 lines changed

Documentation/virt/kvm/api.rst

Lines changed: 17 additions & 0 deletions
@@ -6419,6 +6419,9 @@ affect the device's behavior. Current defined flags::
   #define KVM_RUN_X86_SMM (1 << 0)
   /* x86, set if bus lock detected in VM */
   #define KVM_RUN_X86_BUS_LOCK (1 << 1)
+  /* x86, set if the VCPU is executing a nested (L2) guest */
+  #define KVM_RUN_X86_GUEST_MODE (1 << 2)
+
   /* arm64, set for KVM_EXIT_DEBUG */
   #define KVM_DEBUG_ARCH_HSR_HIGH_VALID (1 << 0)

@@ -8089,6 +8092,20 @@ by KVM_CHECK_EXTENSION.
 Note: Userspace is responsible for correctly configuring CPUID 0x15, a.k.a. the
 core crystal clock frequency, if a non-zero CPUID 0x15 is exposed to the guest.

+7.36 KVM_CAP_X86_GUEST_MODE
+------------------------------
+
+:Architectures: x86
+:Returns: Informational only, -EINVAL on direct KVM_ENABLE_CAP.
+
+The presence of this capability indicates that KVM_RUN will update the
+KVM_RUN_X86_GUEST_MODE bit in kvm_run.flags to indicate whether the
+vCPU was executing nested guest code when it exited.
+
+KVM exits with the register state of either the L1 or L2 guest
+depending on which executed at the time of an exit. Userspace must
+take care to differentiate between these cases.
+
 8. Other capabilities.
 ======================
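
The documentation above asks userspace to tell L1 and L2 register state apart. The following is a minimal sketch of how a VMM might do that; it is not part of this commit. The `state_owner` enum and `classify_exit()` helper are hypothetical names, and the flag values are copied from the patch so the block compiles on its own (a real VMM would take them from `<linux/kvm.h>` and read `flags` from the mmap'ed `struct kvm_run`).

```c
#include <stdbool.h>
#include <stdint.h>

/* Values mirror the defines in this patch; guarded in case real headers are present. */
#ifndef KVM_RUN_X86_SMM
#define KVM_RUN_X86_SMM        (1 << 0)
#endif
#ifndef KVM_RUN_X86_GUEST_MODE
#define KVM_RUN_X86_GUEST_MODE (1 << 2)
#endif

/* Hypothetical classification of whose register state userspace is looking at. */
enum state_owner {
    STATE_UNKNOWN, /* capability absent: the flag bit carries no information */
    STATE_L1,
    STATE_L2,
};

/*
 * have_cap:  result of probing KVM_CAP_X86_GUEST_MODE (assumed done at startup).
 * run_flags: kvm_run.flags as observed after KVM_RUN returned.
 */
static enum state_owner classify_exit(bool have_cap, uint16_t run_flags)
{
    if (!have_cap)
        return STATE_UNKNOWN;
    return (run_flags & KVM_RUN_X86_GUEST_MODE) ? STATE_L2 : STATE_L1;
}
```

One possible policy, under these assumptions, is to mark sregs dirty or emulate instructions only when the result is `STATE_L1`, which avoids the two problems described in the commit message.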

arch/x86/include/uapi/asm/kvm.h

Lines changed: 1 addition & 0 deletions
@@ -106,6 +106,7 @@ struct kvm_ioapic_state {

 #define KVM_RUN_X86_SMM (1 << 0)
 #define KVM_RUN_X86_BUS_LOCK (1 << 1)
+#define KVM_RUN_X86_GUEST_MODE (1 << 2)

 /* for KVM_GET_REGS and KVM_SET_REGS */
 struct kvm_regs {

arch/x86/kvm/x86.c

Lines changed: 3 additions & 0 deletions
@@ -4704,6 +4704,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_VM_DISABLE_NX_HUGE_PAGES:
 	case KVM_CAP_IRQFD_RESAMPLE:
 	case KVM_CAP_MEMORY_FAULT_INFO:
+	case KVM_CAP_X86_GUEST_MODE:
 		r = 1;
 		break;
 	case KVM_CAP_X86_APIC_BUS_CYCLES_NS:
@@ -10277,6 +10278,8 @@ static void post_kvm_run_save(struct kvm_vcpu *vcpu)

 	if (is_smm(vcpu))
 		kvm_run->flags |= KVM_RUN_X86_SMM;
+	if (is_guest_mode(vcpu))
+		kvm_run->flags |= KVM_RUN_X86_GUEST_MODE;
 }

 static void update_cr8_intercept(struct kvm_vcpu *vcpu)
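
To illustrate the effect of the second hunk, here is a simplified userspace model of the flag computation. It is a sketch, not kernel code: `struct mock_vcpu` and `mock_exit_flags()` are hypothetical stand-ins for the vCPU state that `is_smm()` and `is_guest_mode()` consult, and starting from a cleared flags word is an assumption of the model.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values mirror the defines in this patch; guarded in case real headers are present. */
#ifndef KVM_RUN_X86_SMM
#define KVM_RUN_X86_SMM        (1 << 0)
#endif
#ifndef KVM_RUN_X86_GUEST_MODE
#define KVM_RUN_X86_GUEST_MODE (1 << 2)
#endif

/* Hypothetical stand-in for the state behind is_smm() and is_guest_mode(). */
struct mock_vcpu {
    bool in_smm;
    bool in_guest_mode;
};

/* Models the two independent checks in the hunk above. */
static uint16_t mock_exit_flags(const struct mock_vcpu *vcpu)
{
    uint16_t flags = 0; /* assumption: flags start out clear for each run */

    if (vcpu->in_smm)
        flags |= KVM_RUN_X86_SMM;
    if (vcpu->in_guest_mode)
        flags |= KVM_RUN_X86_GUEST_MODE;
    return flags;
}
```

In this model the two bits are set independently of each other, so userspace should test `KVM_RUN_X86_GUEST_MODE` with a bitwise AND rather than comparing the whole flags word.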

include/uapi/linux/kvm.h

Lines changed: 1 addition & 0 deletions
@@ -918,6 +918,7 @@ struct kvm_enable_cap {
 #define KVM_CAP_GUEST_MEMFD 234
 #define KVM_CAP_VM_TYPES 235
 #define KVM_CAP_X86_APIC_BUS_CYCLES_NS 236
+#define KVM_CAP_X86_GUEST_MODE 237

 struct kvm_irq_routing_irqchip {
 	__u32 irqchip;
