Commit a549688

Merge branch 'kvm-late-6.1-fixes' into HEAD
x86:

* several fixes to nested VMX execution controls
* fixes and clarification to the documentation for Xen emulation
* do not unnecessarily release a pmu event with zero period
* MMU fixes
* fix Coverity warning in kvm_hv_flush_tlb()

selftests:

* fixes for the ucall mechanism in selftests
* other fixes mostly related to compilation with clang
2 parents 1b929c0 + 129c48c commit a549688

28 files changed: 290 additions, 287 deletions

Documentation/virt/kvm/api.rst

Lines changed: 26 additions & 20 deletions
@@ -5343,9 +5343,9 @@ KVM_XEN_ATTR_TYPE_SHARED_INFO
   32 vCPUs in the shared_info page, KVM does not automatically do so
   and instead requires that KVM_XEN_VCPU_ATTR_TYPE_VCPU_INFO be used
   explicitly even when the vcpu_info for a given vCPU resides at the
-  "default" location in the shared_info page. This is because KVM is
-  not aware of the Xen CPU id which is used as the index into the
-  vcpu_info[] array, so cannot know the correct default location.
+  "default" location in the shared_info page. This is because KVM may
+  not be aware of the Xen CPU id which is used as the index into the
+  vcpu_info[] array, so may not know the correct default location.
 
   Note that the shared info page may be constantly written to by KVM;
   it contains the event channel bitmap used to deliver interrupts to
@@ -5356,23 +5356,29 @@ KVM_XEN_ATTR_TYPE_SHARED_INFO
   any vCPU has been running or any event channel interrupts can be
   routed to the guest.
 
+  Setting the gfn to KVM_XEN_INVALID_GFN will disable the shared info
+  page.
+
 KVM_XEN_ATTR_TYPE_UPCALL_VECTOR
   Sets the exception vector used to deliver Xen event channel upcalls.
   This is the HVM-wide vector injected directly by the hypervisor
   (not through the local APIC), typically configured by a guest via
-  HVM_PARAM_CALLBACK_IRQ.
+  HVM_PARAM_CALLBACK_IRQ. This can be disabled again (e.g. for guest
+  SHUTDOWN_soft_reset) by setting it to zero.
 
 KVM_XEN_ATTR_TYPE_EVTCHN
   This attribute is available when the KVM_CAP_XEN_HVM ioctl indicates
   support for KVM_XEN_HVM_CONFIG_EVTCHN_SEND features. It configures
   an outbound port number for interception of EVTCHNOP_send requests
-  from the guest. A given sending port number may be directed back
-  to a specified vCPU (by APIC ID) / port / priority on the guest,
-  or to trigger events on an eventfd. The vCPU and priority can be
-  changed by setting KVM_XEN_EVTCHN_UPDATE in a subsequent call,
-  but other fields cannot change for a given sending port. A port
-  mapping is removed by using KVM_XEN_EVTCHN_DEASSIGN in the flags
-  field.
+  from the guest. A given sending port number may be directed back to
+  a specified vCPU (by APIC ID) / port / priority on the guest, or to
+  trigger events on an eventfd. The vCPU and priority can be changed
+  by setting KVM_XEN_EVTCHN_UPDATE in a subsequent call, but other
+  fields cannot change for a given sending port. A port mapping is
+  removed by using KVM_XEN_EVTCHN_DEASSIGN in the flags field. Passing
+  KVM_XEN_EVTCHN_RESET in the flags field removes all interception of
+  outbound event channels. The values of the flags field are mutually
+  exclusive and cannot be combined as a bitmask.
 
 KVM_XEN_ATTR_TYPE_XEN_VERSION
   This attribute is available when the KVM_CAP_XEN_HVM ioctl indicates
@@ -5388,7 +5394,7 @@ KVM_XEN_ATTR_TYPE_RUNSTATE_UPDATE_FLAG
   support for KVM_XEN_HVM_CONFIG_RUNSTATE_UPDATE_FLAG. It enables the
   XEN_RUNSTATE_UPDATE flag which allows guest vCPUs to safely read
   other vCPUs' vcpu_runstate_info. Xen guests enable this feature via
-  the VM_ASST_TYPE_runstate_update_flag of the HYPERVISOR_vm_assist
+  the VMASST_TYPE_runstate_update_flag of the HYPERVISOR_vm_assist
   hypercall.
 
 4.127 KVM_XEN_HVM_GET_ATTR
@@ -5446,15 +5452,18 @@ KVM_XEN_VCPU_ATTR_TYPE_VCPU_INFO
   As with the shared_info page for the VM, the corresponding page may be
   dirtied at any time if event channel interrupt delivery is enabled, so
   userspace should always assume that the page is dirty without relying
-  on dirty logging.
+  on dirty logging. Setting the gpa to KVM_XEN_INVALID_GPA will disable
+  the vcpu_info.
 
 KVM_XEN_VCPU_ATTR_TYPE_VCPU_TIME_INFO
   Sets the guest physical address of an additional pvclock structure
   for a given vCPU. This is typically used for guest vsyscall support.
+  Setting the gpa to KVM_XEN_INVALID_GPA will disable the structure.
 
 KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_ADDR
   Sets the guest physical address of the vcpu_runstate_info for a given
   vCPU. This is how a Xen guest tracks CPU state such as steal time.
+  Setting the gpa to KVM_XEN_INVALID_GPA will disable the runstate area.
 
 KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_CURRENT
   Sets the runstate (RUNSTATE_running/_runnable/_blocked/_offline) of
@@ -5487,15 +5496,17 @@ KVM_XEN_VCPU_ATTR_TYPE_TIMER
   This attribute is available when the KVM_CAP_XEN_HVM ioctl indicates
   support for KVM_XEN_HVM_CONFIG_EVTCHN_SEND features. It sets the
   event channel port/priority for the VIRQ_TIMER of the vCPU, as well
-  as allowing a pending timer to be saved/restored.
+  as allowing a pending timer to be saved/restored. Setting the timer
+  port to zero disables kernel handling of the singleshot timer.
 
 KVM_XEN_VCPU_ATTR_TYPE_UPCALL_VECTOR
   This attribute is available when the KVM_CAP_XEN_HVM ioctl indicates
   support for KVM_XEN_HVM_CONFIG_EVTCHN_SEND features. It sets the
   per-vCPU local APIC upcall vector, configured by a Xen guest with
   the HVMOP_set_evtchn_upcall_vector hypercall. This is typically
   used by Windows guests, and is distinct from the HVM-wide upcall
-  vector configured with HVM_PARAM_CALLBACK_IRQ.
+  vector configured with HVM_PARAM_CALLBACK_IRQ. It is disabled by
+  setting the vector to zero.
 
 
 4.129 KVM_XEN_VCPU_GET_ATTR
@@ -6577,11 +6588,6 @@ Please note that the kernel is allowed to use the kvm_run structure as the
 primary storage for certain register types. Therefore, the kernel may use the
 values in kvm_run even if the corresponding bit in kvm_dirty_regs is not set.
 
-::
-
-  };
-
-
 
 6. Capabilities that can be enabled on vCPUs
 ============================================

Documentation/virt/kvm/locking.rst

Lines changed: 14 additions & 5 deletions
@@ -16,17 +16,26 @@ The acquisition orders for mutexes are as follows:
 - kvm->slots_lock is taken outside kvm->irq_lock, though acquiring
   them together is quite rare.
 
-- Unlike kvm->slots_lock, kvm->slots_arch_lock is released before
-  synchronize_srcu(&kvm->srcu).  Therefore kvm->slots_arch_lock
-  can be taken inside a kvm->srcu read-side critical section,
-  while kvm->slots_lock cannot.
-
 - kvm->mn_active_invalidate_count ensures that pairs of
   invalidate_range_start() and invalidate_range_end() callbacks
   use the same memslots array.  kvm->slots_lock and kvm->slots_arch_lock
   are taken on the waiting side in install_new_memslots, so MMU notifiers
   must not take either kvm->slots_lock or kvm->slots_arch_lock.
 
+For SRCU:
+
+- ``synchronize_srcu(&kvm->srcu)`` is called _inside_
+  the kvm->slots_lock critical section, therefore kvm->slots_lock
+  cannot be taken inside a kvm->srcu read-side critical section.
+  Instead, kvm->slots_arch_lock is released before the call
+  to ``synchronize_srcu()`` and _can_ be taken inside a
+  kvm->srcu read-side critical section.
+
+- kvm->lock is taken inside kvm->srcu, therefore
+  ``synchronize_srcu(&kvm->srcu)`` cannot be called inside
+  a kvm->lock critical section.  If you cannot delay the
+  call until after kvm->lock is released, use ``call_srcu``.
+
 On x86:
 
 - vcpu->mutex is taken outside kvm->arch.hyperv.hv_lock

MAINTAINERS

Lines changed: 1 addition & 1 deletion
@@ -11468,7 +11468,7 @@ F:	arch/x86/kvm/hyperv.*
 F:	arch/x86/kvm/kvm_onhyperv.*
 F:	arch/x86/kvm/svm/hyperv.*
 F:	arch/x86/kvm/svm/svm_onhyperv.*
-F:	arch/x86/kvm/vmx/evmcs.*
+F:	arch/x86/kvm/vmx/hyperv.*
 
 KVM X86 Xen (KVM/Xen)
 M:	David Woodhouse <[email protected]>

arch/x86/kvm/hyperv.c

Lines changed: 36 additions & 27 deletions
@@ -1769,6 +1769,7 @@ static bool hv_is_vp_in_sparse_set(u32 vp_id, u64 valid_bank_mask, u64 sparse_ba
 }
 
 struct kvm_hv_hcall {
+	/* Hypercall input data */
 	u64 param;
 	u64 ingpa;
 	u64 outgpa;
u64 outgpa;
@@ -1779,12 +1780,21 @@ struct kvm_hv_hcall {
 	bool fast;
 	bool rep;
 	sse128_t xmm[HV_HYPERCALL_MAX_XMM_REGISTERS];
+
+	/*
+	 * Current read offset when KVM reads hypercall input data gradually,
+	 * either offset in bytes from 'ingpa' for regular hypercalls or the
+	 * number of already consumed 'XMM halves' for 'fast' hypercalls.
+	 */
+	union {
+		gpa_t data_offset;
+		int consumed_xmm_halves;
+	};
 };
 
 
 static int kvm_hv_get_hc_data(struct kvm *kvm, struct kvm_hv_hcall *hc,
-			      u16 orig_cnt, u16 cnt_cap, u64 *data,
-			      int consumed_xmm_halves, gpa_t offset)
+			      u16 orig_cnt, u16 cnt_cap, u64 *data)
 {
 	/*
 	 * Preserve the original count when ignoring entries via a "cap", KVM
@@ -1799,11 +1809,11 @@ static int kvm_hv_get_hc_data(struct kvm *kvm, struct kvm_hv_hcall *hc,
 	 * Each XMM holds two sparse banks, but do not count halves that
 	 * have already been consumed for hypercall parameters.
 	 */
-	if (orig_cnt > 2 * HV_HYPERCALL_MAX_XMM_REGISTERS - consumed_xmm_halves)
+	if (orig_cnt > 2 * HV_HYPERCALL_MAX_XMM_REGISTERS - hc->consumed_xmm_halves)
 		return HV_STATUS_INVALID_HYPERCALL_INPUT;
 
 	for (i = 0; i < cnt; i++) {
-		j = i + consumed_xmm_halves;
+		j = i + hc->consumed_xmm_halves;
 		if (j % 2)
 			data[i] = sse128_hi(hc->xmm[j / 2]);
 		else
@@ -1812,27 +1822,24 @@ static int kvm_hv_get_hc_data(struct kvm *kvm, struct kvm_hv_hcall *hc,
 		return 0;
 	}
 
-	return kvm_read_guest(kvm, hc->ingpa + offset, data,
+	return kvm_read_guest(kvm, hc->ingpa + hc->data_offset, data,
 			      cnt * sizeof(*data));
 }
 
 static u64 kvm_get_sparse_vp_set(struct kvm *kvm, struct kvm_hv_hcall *hc,
-				 u64 *sparse_banks, int consumed_xmm_halves,
-				 gpa_t offset)
+				 u64 *sparse_banks)
 {
 	if (hc->var_cnt > HV_MAX_SPARSE_VCPU_BANKS)
 		return -EINVAL;
 
 	/* Cap var_cnt to ignore banks that cannot contain a legal VP index. */
 	return kvm_hv_get_hc_data(kvm, hc, hc->var_cnt, KVM_HV_MAX_SPARSE_VCPU_SET_BITS,
-				  sparse_banks, consumed_xmm_halves, offset);
+				  sparse_banks);
 }
 
-static int kvm_hv_get_tlb_flush_entries(struct kvm *kvm, struct kvm_hv_hcall *hc, u64 entries[],
-					int consumed_xmm_halves, gpa_t offset)
+static int kvm_hv_get_tlb_flush_entries(struct kvm *kvm, struct kvm_hv_hcall *hc, u64 entries[])
 {
-	return kvm_hv_get_hc_data(kvm, hc, hc->rep_cnt, hc->rep_cnt,
-				  entries, consumed_xmm_halves, offset);
+	return kvm_hv_get_hc_data(kvm, hc, hc->rep_cnt, hc->rep_cnt, entries);
 }
 
 static void hv_tlb_flush_enqueue(struct kvm_vcpu *vcpu,
@@ -1926,8 +1933,6 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *vcpu, struct kvm_hv_hcall *hc)
 	struct kvm_vcpu *v;
 	unsigned long i;
 	bool all_cpus;
-	int consumed_xmm_halves = 0;
-	gpa_t data_offset;
 
 	/*
 	 * The Hyper-V TLFS doesn't allow more than HV_MAX_SPARSE_VCPU_BANKS
@@ -1955,12 +1960,12 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *vcpu, struct kvm_hv_hcall *hc)
 			flush.address_space = hc->ingpa;
 			flush.flags = hc->outgpa;
 			flush.processor_mask = sse128_lo(hc->xmm[0]);
-			consumed_xmm_halves = 1;
+			hc->consumed_xmm_halves = 1;
 		} else {
 			if (unlikely(kvm_read_guest(kvm, hc->ingpa,
 						    &flush, sizeof(flush))))
 				return HV_STATUS_INVALID_HYPERCALL_INPUT;
-			data_offset = sizeof(flush);
+			hc->data_offset = sizeof(flush);
 		}
 
 		trace_kvm_hv_flush_tlb(flush.processor_mask,
@@ -1985,12 +1990,12 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *vcpu, struct kvm_hv_hcall *hc)
 			flush_ex.flags = hc->outgpa;
 			memcpy(&flush_ex.hv_vp_set,
 			       &hc->xmm[0], sizeof(hc->xmm[0]));
-			consumed_xmm_halves = 2;
+			hc->consumed_xmm_halves = 2;
 		} else {
 			if (unlikely(kvm_read_guest(kvm, hc->ingpa, &flush_ex,
 						    sizeof(flush_ex))))
 				return HV_STATUS_INVALID_HYPERCALL_INPUT;
-			data_offset = sizeof(flush_ex);
+			hc->data_offset = sizeof(flush_ex);
 		}
 
 		trace_kvm_hv_flush_tlb_ex(flush_ex.hv_vp_set.valid_bank_mask,
@@ -2009,8 +2014,7 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *vcpu, struct kvm_hv_hcall *hc)
 		if (!hc->var_cnt)
 			goto ret_success;
 
-		if (kvm_get_sparse_vp_set(kvm, hc, sparse_banks,
-					  consumed_xmm_halves, data_offset))
+		if (kvm_get_sparse_vp_set(kvm, hc, sparse_banks))
 			return HV_STATUS_INVALID_HYPERCALL_INPUT;
 	}
 
@@ -2021,17 +2025,18 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *vcpu, struct kvm_hv_hcall *hc)
 		 * consumed_xmm_halves to make sure TLB flush entries are read
 		 * from the correct offset.
 		 */
-		data_offset += hc->var_cnt * sizeof(sparse_banks[0]);
-		consumed_xmm_halves += hc->var_cnt;
+		if (hc->fast)
+			hc->consumed_xmm_halves += hc->var_cnt;
+		else
+			hc->data_offset += hc->var_cnt * sizeof(sparse_banks[0]);
 	}
 
 	if (hc->code == HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE ||
 	    hc->code == HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE_EX ||
 	    hc->rep_cnt > ARRAY_SIZE(__tlb_flush_entries)) {
 		tlb_flush_entries = NULL;
 	} else {
-		if (kvm_hv_get_tlb_flush_entries(kvm, hc, __tlb_flush_entries,
-						 consumed_xmm_halves, data_offset))
+		if (kvm_hv_get_tlb_flush_entries(kvm, hc, __tlb_flush_entries))
 			return HV_STATUS_INVALID_HYPERCALL_INPUT;
 		tlb_flush_entries = __tlb_flush_entries;
 	}
@@ -2180,9 +2185,13 @@ static u64 kvm_hv_send_ipi(struct kvm_vcpu *vcpu, struct kvm_hv_hcall *hc)
 		if (!hc->var_cnt)
 			goto ret_success;
 
-		if (kvm_get_sparse_vp_set(kvm, hc, sparse_banks, 1,
-					  offsetof(struct hv_send_ipi_ex,
-						   vp_set.bank_contents)))
+		if (!hc->fast)
+			hc->data_offset = offsetof(struct hv_send_ipi_ex,
+						   vp_set.bank_contents);
+		else
+			hc->consumed_xmm_halves = 1;
+
+		if (kvm_get_sparse_vp_set(kvm, hc, sparse_banks))
 			return HV_STATUS_INVALID_HYPERCALL_INPUT;
 	}
 
arch/x86/kvm/irq_comm.c

Lines changed: 3 additions & 2 deletions
@@ -426,8 +426,9 @@ void kvm_scan_ioapic_routes(struct kvm_vcpu *vcpu,
 		kvm_set_msi_irq(vcpu->kvm, entry, &irq);
 
 		if (irq.trig_mode &&
-		    kvm_apic_match_dest(vcpu, NULL, APIC_DEST_NOSHORT,
-					irq.dest_id, irq.dest_mode))
+		    (kvm_apic_match_dest(vcpu, NULL, APIC_DEST_NOSHORT,
+					 irq.dest_id, irq.dest_mode) ||
+		     kvm_apic_pending_eoi(vcpu, irq.vector)))
 			__set_bit(irq.vector, ioapic_handled_vectors);
 	}
 }

arch/x86/kvm/lapic.h

Lines changed: 2 additions & 2 deletions
@@ -188,11 +188,11 @@ static inline bool lapic_in_kernel(struct kvm_vcpu *vcpu)
 
 extern struct static_key_false_deferred apic_hw_disabled;
 
-static inline int kvm_apic_hw_enabled(struct kvm_lapic *apic)
+static inline bool kvm_apic_hw_enabled(struct kvm_lapic *apic)
 {
 	if (static_branch_unlikely(&apic_hw_disabled.key))
 		return apic->vcpu->arch.apic_base & MSR_IA32_APICBASE_ENABLE;
-	return MSR_IA32_APICBASE_ENABLE;
+	return true;
 }
 
 extern struct static_key_false_deferred apic_sw_disabled;

arch/x86/kvm/mmu/spte.h

Lines changed: 1 addition & 1 deletion
@@ -363,7 +363,7 @@ static __always_inline bool is_rsvd_spte(struct rsvd_bits_validate *rsvd_check,
  * A shadow-present leaf SPTE may be non-writable for 4 possible reasons:
  *
  * 1. To intercept writes for dirty logging. KVM write-protects huge pages
- *    so that they can be split be split down into the dirty logging
+ *    so that they can be split down into the dirty logging
  *    granularity (4KiB) whenever the guest writes to them. KVM also
  *    write-protects 4KiB pages so that writes can be recorded in the dirty log
  *    (e.g. if not using PML). SPTEs are write-protected for dirty logging
