Skip to content

Commit c09dd2b

Browse files
committed
Merge branch 'kvm-redo-enable-virt' into HEAD
Register KVM's cpuhp and syscore callbacks when enabling virtualization in hardware, as the sole purpose of said callbacks is to disable and re-enable virtualization as needed. The primary motivation for this series is to simplify dealing with enabling virtualization for Intel's TDX, which needs to enable virtualization when kvm-intel.ko is loaded, i.e. long before the first VM is created. That said, this is a nice cleanup on its own. By registering the callbacks on-demand, the callbacks themselves don't need to check kvm_usage_count, because their very existence implies a non-zero count. Patch 1 (re)adds a dedicated lock for kvm_usage_count. This avoids a lock ordering issue between cpus_read_lock() and kvm_lock. The lock ordering issue still exist in very rare cases, and will be fixed for good by switching vm_list to an (S)RCU-protected list. Signed-off-by: Paolo Bonzini <[email protected]>
2 parents 55f50b2 + 590b09b commit c09dd2b

File tree

18 files changed

+251
-198
lines changed

18 files changed

+251
-198
lines changed

Documentation/admin-guide/kernel-parameters.txt

Lines changed: 17 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -2648,6 +2648,23 @@
26482648

26492649
Default is Y (on).
26502650

2651+
kvm.enable_virt_at_load=[KVM,ARM64,LOONGARCH,MIPS,RISCV,X86]
2652+
If enabled, KVM will enable virtualization in hardware
2653+
when KVM is loaded, and disable virtualization when KVM
2654+
is unloaded (if KVM is built as a module).
2655+
2656+
If disabled, KVM will dynamically enable and disable
2657+
virtualization on-demand when creating and destroying
2658+
VMs, i.e. on the 0=>1 and 1=>0 transitions of the
2659+
number of VMs.
2660+
2661+
Enabling virtualization at module lode avoids potential
2662+
latency for creation of the 0=>1 VM, as KVM serializes
2663+
virtualization enabling across all online CPUs. The
2664+
"cost" of enabling virtualization when KVM is loaded,
2665+
is that doing so may interfere with using out-of-tree
2666+
hypervisors that want to "own" virtualization hardware.
2667+
26512668
kvm.enable_vmware_backdoor=[KVM] Support VMware backdoor PV interface.
26522669
Default is false (don't support).
26532670

Documentation/virt/kvm/locking.rst

Lines changed: 23 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -11,6 +11,8 @@ The acquisition orders for mutexes are as follows:
1111

1212
- cpus_read_lock() is taken outside kvm_lock
1313

14+
- kvm_usage_lock is taken outside cpus_read_lock()
15+
1416
- kvm->lock is taken outside vcpu->mutex
1517

1618
- kvm->lock is taken outside kvm->slots_lock and kvm->irq_lock
@@ -24,6 +26,12 @@ The acquisition orders for mutexes are as follows:
2426
are taken on the waiting side when modifying memslots, so MMU notifiers
2527
must not take either kvm->slots_lock or kvm->slots_arch_lock.
2628

29+
cpus_read_lock() vs kvm_lock:
30+
- Taking cpus_read_lock() outside of kvm_lock is problematic, despite that
31+
being the official ordering, as it is quite easy to unknowingly trigger
32+
cpus_read_lock() while holding kvm_lock. Use caution when walking vm_list,
33+
e.g. avoid complex operations when possible.
34+
2735
For SRCU:
2836

2937
- ``synchronize_srcu(&kvm->srcu)`` is called inside critical sections
@@ -227,10 +235,16 @@ time it will be set using the Dirty tracking mechanism described above.
227235
:Type: mutex
228236
:Arch: any
229237
:Protects: - vm_list
230-
- kvm_usage_count
238+
239+
``kvm_usage_lock``
240+
^^^^^^^^^^^^^^^^^^
241+
242+
:Type: mutex
243+
:Arch: any
244+
:Protects: - kvm_usage_count
231245
- hardware virtualization enable/disable
232-
:Comment: KVM also disables CPU hotplug via cpus_read_lock() during
233-
enable/disable.
246+
:Comment: Exists to allow taking cpus_read_lock() while kvm_usage_count is
247+
protected, which simplifies the virtualization enabling logic.
234248

235249
``kvm->mn_invalidate_lock``
236250
^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -290,11 +304,12 @@ time it will be set using the Dirty tracking mechanism described above.
290304
wakeup.
291305

292306
``vendor_module_lock``
293-
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
307+
^^^^^^^^^^^^^^^^^^^^^^
294308
:Type: mutex
295309
:Arch: x86
296310
:Protects: loading a vendor module (kvm_amd or kvm_intel)
297-
:Comment: Exists because using kvm_lock leads to deadlock. cpu_hotplug_lock is
298-
taken outside of kvm_lock, e.g. in KVM's CPU online/offline callbacks, and
299-
many operations need to take cpu_hotplug_lock when loading a vendor module,
300-
e.g. updating static calls.
311+
:Comment: Exists because using kvm_lock leads to deadlock. kvm_lock is taken
312+
in notifiers, e.g. __kvmclock_cpufreq_notifier(), that may be invoked while
313+
cpu_hotplug_lock is held, e.g. from cpufreq_boost_trigger_state(), and many
314+
operations need to take cpu_hotplug_lock when loading a vendor module, e.g.
315+
updating static calls.

arch/arm64/kvm/arm.c

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -2164,7 +2164,7 @@ static void cpu_hyp_uninit(void *discard)
21642164
}
21652165
}
21662166

2167-
int kvm_arch_hardware_enable(void)
2167+
int kvm_arch_enable_virtualization_cpu(void)
21682168
{
21692169
/*
21702170
* Most calls to this function are made with migration
@@ -2184,7 +2184,7 @@ int kvm_arch_hardware_enable(void)
21842184
return 0;
21852185
}
21862186

2187-
void kvm_arch_hardware_disable(void)
2187+
void kvm_arch_disable_virtualization_cpu(void)
21882188
{
21892189
kvm_timer_cpu_down();
21902190
kvm_vgic_cpu_down();
@@ -2380,7 +2380,7 @@ static int __init do_pkvm_init(u32 hyp_va_bits)
23802380

23812381
/*
23822382
* The stub hypercalls are now disabled, so set our local flag to
2383-
* prevent a later re-init attempt in kvm_arch_hardware_enable().
2383+
* prevent a later re-init attempt in kvm_arch_enable_virtualization_cpu().
23842384
*/
23852385
__this_cpu_write(kvm_hyp_initialized, 1);
23862386
preempt_enable();

arch/loongarch/kvm/main.c

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -261,7 +261,7 @@ long kvm_arch_dev_ioctl(struct file *filp,
261261
return -ENOIOCTLCMD;
262262
}
263263

264-
int kvm_arch_hardware_enable(void)
264+
int kvm_arch_enable_virtualization_cpu(void)
265265
{
266266
unsigned long env, gcfg = 0;
267267

@@ -300,7 +300,7 @@ int kvm_arch_hardware_enable(void)
300300
return 0;
301301
}
302302

303-
void kvm_arch_hardware_disable(void)
303+
void kvm_arch_disable_virtualization_cpu(void)
304304
{
305305
write_csr_gcfg(0);
306306
write_csr_gstat(0);

arch/mips/include/asm/kvm_host.h

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -728,8 +728,8 @@ struct kvm_mips_callbacks {
728728
int (*handle_fpe)(struct kvm_vcpu *vcpu);
729729
int (*handle_msa_disabled)(struct kvm_vcpu *vcpu);
730730
int (*handle_guest_exit)(struct kvm_vcpu *vcpu);
731-
int (*hardware_enable)(void);
732-
void (*hardware_disable)(void);
731+
int (*enable_virtualization_cpu)(void);
732+
void (*disable_virtualization_cpu)(void);
733733
int (*check_extension)(struct kvm *kvm, long ext);
734734
int (*vcpu_init)(struct kvm_vcpu *vcpu);
735735
void (*vcpu_uninit)(struct kvm_vcpu *vcpu);

arch/mips/kvm/mips.c

Lines changed: 4 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -125,14 +125,14 @@ int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu)
125125
return 1;
126126
}
127127

128-
int kvm_arch_hardware_enable(void)
128+
int kvm_arch_enable_virtualization_cpu(void)
129129
{
130-
return kvm_mips_callbacks->hardware_enable();
130+
return kvm_mips_callbacks->enable_virtualization_cpu();
131131
}
132132

133-
void kvm_arch_hardware_disable(void)
133+
void kvm_arch_disable_virtualization_cpu(void)
134134
{
135-
kvm_mips_callbacks->hardware_disable();
135+
kvm_mips_callbacks->disable_virtualization_cpu();
136136
}
137137

138138
int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)

arch/mips/kvm/vz.c

Lines changed: 4 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -2869,7 +2869,7 @@ static unsigned int kvm_vz_resize_guest_vtlb(unsigned int size)
28692869
return ret + 1;
28702870
}
28712871

2872-
static int kvm_vz_hardware_enable(void)
2872+
static int kvm_vz_enable_virtualization_cpu(void)
28732873
{
28742874
unsigned int mmu_size, guest_mmu_size, ftlb_size;
28752875
u64 guest_cvmctl, cvmvmconfig;
@@ -2983,7 +2983,7 @@ static int kvm_vz_hardware_enable(void)
29832983
return 0;
29842984
}
29852985

2986-
static void kvm_vz_hardware_disable(void)
2986+
static void kvm_vz_disable_virtualization_cpu(void)
29872987
{
29882988
u64 cvmvmconfig;
29892989
unsigned int mmu_size;
@@ -3280,8 +3280,8 @@ static struct kvm_mips_callbacks kvm_vz_callbacks = {
32803280
.handle_msa_disabled = kvm_trap_vz_handle_msa_disabled,
32813281
.handle_guest_exit = kvm_trap_vz_handle_guest_exit,
32823282

3283-
.hardware_enable = kvm_vz_hardware_enable,
3284-
.hardware_disable = kvm_vz_hardware_disable,
3283+
.enable_virtualization_cpu = kvm_vz_enable_virtualization_cpu,
3284+
.disable_virtualization_cpu = kvm_vz_disable_virtualization_cpu,
32853285
.check_extension = kvm_vz_check_extension,
32863286
.vcpu_init = kvm_vz_vcpu_init,
32873287
.vcpu_uninit = kvm_vz_vcpu_uninit,

arch/riscv/kvm/main.c

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -20,7 +20,7 @@ long kvm_arch_dev_ioctl(struct file *filp,
2020
return -EINVAL;
2121
}
2222

23-
int kvm_arch_hardware_enable(void)
23+
int kvm_arch_enable_virtualization_cpu(void)
2424
{
2525
csr_write(CSR_HEDELEG, KVM_HEDELEG_DEFAULT);
2626
csr_write(CSR_HIDELEG, KVM_HIDELEG_DEFAULT);
@@ -35,7 +35,7 @@ int kvm_arch_hardware_enable(void)
3535
return 0;
3636
}
3737

38-
void kvm_arch_hardware_disable(void)
38+
void kvm_arch_disable_virtualization_cpu(void)
3939
{
4040
kvm_riscv_aia_disable();
4141

arch/x86/include/asm/kvm-x86-ops.h

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -14,8 +14,8 @@ BUILD_BUG_ON(1)
1414
* be __static_call_return0.
1515
*/
1616
KVM_X86_OP(check_processor_compatibility)
17-
KVM_X86_OP(hardware_enable)
18-
KVM_X86_OP(hardware_disable)
17+
KVM_X86_OP(enable_virtualization_cpu)
18+
KVM_X86_OP(disable_virtualization_cpu)
1919
KVM_X86_OP(hardware_unsetup)
2020
KVM_X86_OP(has_emulated_msr)
2121
KVM_X86_OP(vcpu_after_set_cpuid)

arch/x86/include/asm/kvm_host.h

Lines changed: 5 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -36,6 +36,7 @@
3636
#include <asm/kvm_page_track.h>
3737
#include <asm/kvm_vcpu_regs.h>
3838
#include <asm/hyperv-tlfs.h>
39+
#include <asm/reboot.h>
3940

4041
#define __KVM_HAVE_ARCH_VCPU_DEBUGFS
4142

@@ -1629,8 +1630,10 @@ struct kvm_x86_ops {
16291630

16301631
int (*check_processor_compatibility)(void);
16311632

1632-
int (*hardware_enable)(void);
1633-
void (*hardware_disable)(void);
1633+
int (*enable_virtualization_cpu)(void);
1634+
void (*disable_virtualization_cpu)(void);
1635+
cpu_emergency_virt_cb *emergency_disable_virtualization_cpu;
1636+
16341637
void (*hardware_unsetup)(void);
16351638
bool (*has_emulated_msr)(struct kvm *kvm, u32 index);
16361639
void (*vcpu_after_set_cpuid)(struct kvm_vcpu *vcpu);

0 commit comments

Comments
 (0)