
Commit c7de79e

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
 "ARM:

   - Fix use of u64_replace_bits() in adjusting the guest's view of
     MDCR_EL2.HPMN

  RISC-V:

   - Fix an issue related to timer cleanup when exiting to user-space

   - Fix a race-condition in updating interrupts enabled for the guest
     when IMSIC is hardware-virtualized

  x86:

   - Reject KVM_SET_TSC_KHZ for guests with a protected TSC (currently
     only TDX)

   - Ensure struct kvm_tdx_capabilities fields that are not explicitly
     set by KVM are zeroed

  Documentation:

   - Explain how KVM contributions should be made testable

   - Fix a formatting goof in the TDX documentation"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: TDX: Don't report base TDVMCALLs
  KVM: VMX: Ensure unused kvm_tdx_capabilities fields are zeroed out
  KVM: Documentation: document how KVM is tested
  KVM: Documentation: minimal updates to review-checklist.rst
  KVM: x86: Reject KVM_SET_TSC_KHZ vCPU ioctl for TSC protected guest
  RISC-V: KVM: Move HGEI[E|P] CSR access to IMSIC virtualization
  RISC-V: KVM: Disable vstimecmp before exiting to user-space
  Documentation: KVM: Fix unexpected unindent warning
  KVM: arm64: Fix enforcement of upper bound on MDCR_EL2.HPMN
2 parents e347810 + 4b7d440

11 files changed, 180 insertions(+), 70 deletions(-)


Documentation/virt/kvm/api.rst

Lines changed: 9 additions & 2 deletions
@@ -2008,6 +2008,13 @@ If the KVM_CAP_VM_TSC_CONTROL capability is advertised, this can also
 be used as a vm ioctl to set the initial tsc frequency of subsequently
 created vCPUs.
 
+For TSC protected Confidential Computing (CoCo) VMs where TSC frequency
+is configured once at VM scope and remains unchanged during VM's
+lifetime, the vm ioctl should be used to configure the TSC frequency
+and the vcpu ioctl is not supported.
+
+Example of such CoCo VMs: TDX guests.
+
 4.56 KVM_GET_TSC_KHZ
 --------------------
 
@@ -7230,8 +7237,8 @@ inputs and outputs of the TDVMCALL. Currently the following values of
 placed in fields from ``r11`` to ``r14`` of the ``get_tdvmcall_info``
 field of the union.
 
-* ``TDVMCALL_SETUP_EVENT_NOTIFY_INTERRUPT``: the guest has requested to
-  set up a notification interrupt for vector ``vector``.
+* ``TDVMCALL_SETUP_EVENT_NOTIFY_INTERRUPT``: the guest has requested to
+  set up a notification interrupt for vector ``vector``.
 
 KVM may add support for more values in the future that may cause a userspace
 exit, even without calls to ``KVM_ENABLE_CAP`` or similar. In this case,
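
(Illustration only, not part of the patch: a minimal userspace sketch of the VM-scope TSC frequency flow the hunk above documents. The 2.5 GHz value and the bare-bones error handling are assumptions; for a TSC protected guest such as TDX, the per-vCPU KVM_SET_TSC_KHZ is what the pulled fix now rejects.)

	/* sketch.c - configure the TSC frequency once at VM scope */
	#include <fcntl.h>
	#include <stdio.h>
	#include <sys/ioctl.h>
	#include <linux/kvm.h>

	int main(void)
	{
		int kvm = open("/dev/kvm", O_RDWR);
		int vm = ioctl(kvm, KVM_CREATE_VM, 0);	/* plain VM type for simplicity */

		/* The VM-scope ioctl is only available with KVM_CAP_VM_TSC_CONTROL. */
		if (vm < 0 || ioctl(vm, KVM_CHECK_EXTENSION, KVM_CAP_VM_TSC_CONTROL) <= 0) {
			fprintf(stderr, "no VM-scope TSC control\n");
			return 1;
		}

		/* Set the frequency before any vCPU exists (illustrative 2.5 GHz). */
		if (ioctl(vm, KVM_SET_TSC_KHZ, 2500000) < 0) {
			perror("KVM_SET_TSC_KHZ");
			return 1;
		}

		/* Subsequently created vCPUs inherit the VM-scope TSC frequency. */
		int vcpu = ioctl(vm, KVM_CREATE_VCPU, 0);
		return vcpu < 0;
	}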

Documentation/virt/kvm/review-checklist.rst

Lines changed: 88 additions & 7 deletions
@@ -7,7 +7,7 @@ Review checklist for kvm patches
 1. The patch must follow Documentation/process/coding-style.rst and
    Documentation/process/submitting-patches.rst.
 
-2. Patches should be against kvm.git master branch.
+2. Patches should be against kvm.git master or next branches.
 
 3. If the patch introduces or modifies a new userspace API:
    - the API must be documented in Documentation/virt/kvm/api.rst
@@ -18,10 +18,10 @@ Review checklist for kvm patches
 5. New features must default to off (userspace should explicitly request them).
    Performance improvements can and should default to on.
 
-6. New cpu features should be exposed via KVM_GET_SUPPORTED_CPUID2
+6. New cpu features should be exposed via KVM_GET_SUPPORTED_CPUID2,
+   or its equivalent for non-x86 architectures
 
-7. Emulator changes should be accompanied by unit tests for qemu-kvm.git
-   kvm/test directory.
+7. The feature should be testable (see below).
 
 8. Changes should be vendor neutral when possible. Changes to common code
    are better than duplicating changes to vendor code.
@@ -36,6 +36,87 @@ Review checklist for kvm patches
 11. New guest visible features must either be documented in a hardware manual
     or be accompanied by documentation.
 
-12. Features must be robust against reset and kexec - for example, shared
-    host/guest memory must be unshared to prevent the host from writing to
-    guest memory that the guest has not reserved for this purpose.
+Testing of KVM code
+-------------------
+
+All features contributed to KVM, and in many cases bugfixes too, should be
+accompanied by some kind of tests and/or enablement in open source guests
+and VMMs. KVM is covered by multiple test suites:
+
+*Selftests*
+  These are low level tests that allow granular testing of kernel APIs.
+  This includes API failure scenarios, invoking APIs after specific
+  guest instructions, and testing multiple calls to ``KVM_CREATE_VM``
+  within a single test. They are included in the kernel tree at
+  ``tools/testing/selftests/kvm``.
+
+``kvm-unit-tests``
+  A collection of small guests that test CPU and emulated device features
+  from a guest's perspective. They run under QEMU or ``kvmtool``, and
+  are generally not KVM-specific: they can be run with any accelerator
+  that QEMU supports or even on bare metal, making it possible to compare
+  behavior across hypervisors and processor families.
+
+Functional test suites
+  Various sets of functional tests exist, such as QEMU's ``tests/functional``
+  suite and `avocado-vt <https://avocado-vt.readthedocs.io/en/latest/>`__.
+  These typically involve running a full operating system in a virtual
+  machine.
+
+The best testing approach depends on the feature's complexity and
+operation. Here are some examples and guidelines:
+
+New instructions (no new registers or APIs)
+  The corresponding CPU features (if applicable) should be made available
+  in QEMU. If the instructions require emulation support or other code in
+  KVM, it is worth adding coverage to ``kvm-unit-tests`` or selftests;
+  the latter can be a better choice if the instructions relate to an API
+  that already has good selftest coverage.
+
+New hardware features (new registers, no new APIs)
+  These should be tested via ``kvm-unit-tests``; this more or less implies
+  supporting them in QEMU and/or ``kvmtool``. In some cases selftests
+  can be used instead, similar to the previous case, or specifically to
+  test corner cases in guest state save/restore.
+
+Bug fixes and performance improvements
+  These usually do not introduce new APIs, but it's worth sharing
+  any benchmarks and tests that will validate your contribution,
+  ideally in the form of regression tests. Tests and benchmarks
+  can be included in either ``kvm-unit-tests`` or selftests, depending
+  on the specifics of your change. Selftests are especially useful for
+  regression tests because they are included directly in Linux's tree.
+
+Large scale internal changes
+  While it's difficult to provide a single policy, you should ensure that
+  the changed code is covered by either ``kvm-unit-tests`` or selftests.
+  In some cases the affected code is run for all guests and functional
+  tests suffice. Explain your testing process in the cover letter,
+  as that can help identify gaps in existing test suites.
+
+New APIs
+  It is important to demonstrate your use case. This can be as simple as
+  explaining that the feature is already in use on bare metal, or it can be
+  a proof-of-concept implementation in userspace. The latter need not be
+  open source, though that is of course preferable for easier testing.
+  Selftests should test corner cases of the APIs, and should also cover
+  basic host and guest operation if no open source VMM uses the feature.
+
+Bigger features, usually spanning host and guest
+  These should be supported by Linux guests, with limited exceptions for
+  Hyper-V features that are testable on Windows guests. It is strongly
+  suggested that the feature be usable with an open source host VMM, such
+  as at least one of QEMU or crosvm, and guest firmware. Selftests should
+  test at least API error cases. Guest operation can be covered by
+  either selftests or ``kvm-unit-tests`` (this is especially important for
+  paravirtualized and Windows-only features). Strong selftest coverage
+  can also be a replacement for implementation in an open source VMM,
+  but this is generally not recommended.
+
+Following the above suggestions for testing in selftests and
+``kvm-unit-tests`` will make it easier for the maintainers to review
+and accept your code. In fact, even before you contribute your changes
+upstream it will make it easier for you to develop for KVM.
+
+Of course, the KVM maintainers reserve the right to require more tests,
+though they may also waive the requirement from time to time.
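
(Illustrative addition, not part of the patch: the *Selftests* entry above mentions low level API-failure testing. The sketch below shows that idea with raw KVM ioctls rather than the selftest framework's helpers; the chosen failure case, an out-of-range vCPU id, is an assumption.)

	/* api_failure_sketch.c - minimal API-failure style check */
	#include <fcntl.h>
	#include <stdio.h>
	#include <sys/ioctl.h>
	#include <linux/kvm.h>

	int main(void)
	{
		int kvm = open("/dev/kvm", O_RDWR);

		if (kvm < 0 || ioctl(kvm, KVM_GET_API_VERSION, 0) != KVM_API_VERSION) {
			fprintf(stderr, "KVM unavailable or API version mismatch\n");
			return 1;
		}

		int vm = ioctl(kvm, KVM_CREATE_VM, 0);
		if (vm < 0) {
			perror("KVM_CREATE_VM");
			return 1;
		}

		/* Failure scenario: a wildly out-of-range vCPU id must be rejected. */
		if (ioctl(vm, KVM_CREATE_VCPU, 1 << 30) >= 0) {
			fprintf(stderr, "bogus vCPU id unexpectedly accepted\n");
			return 1;
		}

		/* A valid id still works. */
		return ioctl(vm, KVM_CREATE_VCPU, 0) < 0;
	}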

arch/arm64/kvm/sys_regs.c

Lines changed: 1 addition & 1 deletion
@@ -2624,7 +2624,7 @@ static bool access_mdcr(struct kvm_vcpu *vcpu,
 	 */
 	if (hpmn > vcpu->kvm->arch.nr_pmu_counters) {
 		hpmn = vcpu->kvm->arch.nr_pmu_counters;
-		u64_replace_bits(val, hpmn, MDCR_EL2_HPMN);
+		u64p_replace_bits(&val, hpmn, MDCR_EL2_HPMN);
 	}
 
 	__vcpu_assign_sys_reg(vcpu, MDCR_EL2, val);
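
(Context for the one-liner above, added for illustration: u64_replace_bits() returns the updated value and leaves its argument untouched, so the old call silently discarded the clamped HPMN, while u64p_replace_bits() writes back through the pointer. The sketch below uses a made-up field mask; the real helpers come from <linux/bitfield.h>.)

	#include <linux/bitfield.h>
	#include <linux/bits.h>
	#include <linux/types.h>

	#define EXAMPLE_HPMN	GENMASK_ULL(4, 0)	/* HPMN occupies bits [4:0] */

	static u64 clamp_hpmn_example(u64 val, u64 hpmn)
	{
		/* Returns the new value; 'val' itself is unchanged, so using this
		 * as a bare statement (as the old code did) has no effect. */
		u64 updated = u64_replace_bits(val, hpmn, EXAMPLE_HPMN);

		/* Writes the new value back through the pointer, which is what
		 * access_mdcr() actually needs. */
		u64p_replace_bits(&val, hpmn, EXAMPLE_HPMN);

		return updated == val ? val : 0;	/* both paths now agree */
	}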

arch/riscv/include/asm/kvm_aia.h

Lines changed: 3 additions & 1 deletion
@@ -87,6 +87,9 @@ DECLARE_STATIC_KEY_FALSE(kvm_riscv_aia_available);
 
 extern struct kvm_device_ops kvm_riscv_aia_device_ops;
 
+bool kvm_riscv_vcpu_aia_imsic_has_interrupt(struct kvm_vcpu *vcpu);
+void kvm_riscv_vcpu_aia_imsic_load(struct kvm_vcpu *vcpu, int cpu);
+void kvm_riscv_vcpu_aia_imsic_put(struct kvm_vcpu *vcpu);
 void kvm_riscv_vcpu_aia_imsic_release(struct kvm_vcpu *vcpu);
 int kvm_riscv_vcpu_aia_imsic_update(struct kvm_vcpu *vcpu);
 
@@ -161,7 +164,6 @@ void kvm_riscv_aia_destroy_vm(struct kvm *kvm);
 int kvm_riscv_aia_alloc_hgei(int cpu, struct kvm_vcpu *owner,
			     void __iomem **hgei_va, phys_addr_t *hgei_pa);
 void kvm_riscv_aia_free_hgei(int cpu, int hgei);
-void kvm_riscv_aia_wakeon_hgei(struct kvm_vcpu *owner, bool enable);
 
 void kvm_riscv_aia_enable(void);
 void kvm_riscv_aia_disable(void);

arch/riscv/include/asm/kvm_host.h

Lines changed: 3 additions & 0 deletions
@@ -306,6 +306,9 @@ static inline bool kvm_arch_pmi_in_guest(struct kvm_vcpu *vcpu)
 	return IS_ENABLED(CONFIG_GUEST_PERF_EVENTS) && !!vcpu;
 }
 
+static inline void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu) {}
+static inline void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu) {}
+
 #define KVM_RISCV_GSTAGE_TLB_MIN_ORDER		12
 
 void kvm_riscv_local_hfence_gvma_vmid_gpa(unsigned long vmid,

arch/riscv/kvm/aia.c

Lines changed: 7 additions & 44 deletions
@@ -30,28 +30,6 @@ unsigned int kvm_riscv_aia_nr_hgei;
 unsigned int kvm_riscv_aia_max_ids;
 DEFINE_STATIC_KEY_FALSE(kvm_riscv_aia_available);
 
-static int aia_find_hgei(struct kvm_vcpu *owner)
-{
-	int i, hgei;
-	unsigned long flags;
-	struct aia_hgei_control *hgctrl = get_cpu_ptr(&aia_hgei);
-
-	raw_spin_lock_irqsave(&hgctrl->lock, flags);
-
-	hgei = -1;
-	for (i = 1; i <= kvm_riscv_aia_nr_hgei; i++) {
-		if (hgctrl->owners[i] == owner) {
-			hgei = i;
-			break;
-		}
-	}
-
-	raw_spin_unlock_irqrestore(&hgctrl->lock, flags);
-
-	put_cpu_ptr(&aia_hgei);
-	return hgei;
-}
-
 static inline unsigned long aia_hvictl_value(bool ext_irq_pending)
 {
 	unsigned long hvictl;
@@ -95,7 +73,6 @@ void kvm_riscv_vcpu_aia_sync_interrupts(struct kvm_vcpu *vcpu)
 
 bool kvm_riscv_vcpu_aia_has_interrupts(struct kvm_vcpu *vcpu, u64 mask)
 {
-	int hgei;
 	unsigned long seip;
 
 	if (!kvm_riscv_aia_available())
@@ -114,11 +91,7 @@ bool kvm_riscv_vcpu_aia_has_interrupts(struct kvm_vcpu *vcpu, u64 mask)
 	if (!kvm_riscv_aia_initialized(vcpu->kvm) || !seip)
 		return false;
 
-	hgei = aia_find_hgei(vcpu);
-	if (hgei > 0)
-		return !!(ncsr_read(CSR_HGEIP) & BIT(hgei));
-
-	return false;
+	return kvm_riscv_vcpu_aia_imsic_has_interrupt(vcpu);
 }
 
 void kvm_riscv_vcpu_aia_update_hvip(struct kvm_vcpu *vcpu)
@@ -164,6 +137,9 @@ void kvm_riscv_vcpu_aia_load(struct kvm_vcpu *vcpu, int cpu)
 		csr_write(CSR_HVIPRIO2H, csr->hviprio2h);
 #endif
 	}
+
+	if (kvm_riscv_aia_initialized(vcpu->kvm))
+		kvm_riscv_vcpu_aia_imsic_load(vcpu, cpu);
 }
 
 void kvm_riscv_vcpu_aia_put(struct kvm_vcpu *vcpu)
@@ -174,6 +150,9 @@ void kvm_riscv_vcpu_aia_put(struct kvm_vcpu *vcpu)
 	if (!kvm_riscv_aia_available())
 		return;
 
+	if (kvm_riscv_aia_initialized(vcpu->kvm))
+		kvm_riscv_vcpu_aia_imsic_put(vcpu);
+
 	if (kvm_riscv_nacl_available()) {
 		nsh = nacl_shmem();
 		csr->vsiselect = nacl_csr_read(nsh, CSR_VSISELECT);
@@ -472,22 +451,6 @@ void kvm_riscv_aia_free_hgei(int cpu, int hgei)
 	raw_spin_unlock_irqrestore(&hgctrl->lock, flags);
 }
 
-void kvm_riscv_aia_wakeon_hgei(struct kvm_vcpu *owner, bool enable)
-{
-	int hgei;
-
-	if (!kvm_riscv_aia_available())
-		return;
-
-	hgei = aia_find_hgei(owner);
-	if (hgei > 0) {
-		if (enable)
-			csr_set(CSR_HGEIE, BIT(hgei));
-		else
-			csr_clear(CSR_HGEIE, BIT(hgei));
-	}
-}
-
 static irqreturn_t hgei_interrupt(int irq, void *dev_id)
 {
 	int i;

arch/riscv/kvm/aia_imsic.c

Lines changed: 45 additions & 0 deletions
@@ -676,6 +676,48 @@ static void imsic_swfile_update(struct kvm_vcpu *vcpu,
 	imsic_swfile_extirq_update(vcpu);
 }
 
+bool kvm_riscv_vcpu_aia_imsic_has_interrupt(struct kvm_vcpu *vcpu)
+{
+	struct imsic *imsic = vcpu->arch.aia_context.imsic_state;
+	unsigned long flags;
+	bool ret = false;
+
+	/*
+	 * The IMSIC SW-file directly injects interrupt via hvip so
+	 * only check for interrupt when IMSIC VS-file is being used.
+	 */
+
+	read_lock_irqsave(&imsic->vsfile_lock, flags);
+	if (imsic->vsfile_cpu > -1)
+		ret = !!(csr_read(CSR_HGEIP) & BIT(imsic->vsfile_hgei));
+	read_unlock_irqrestore(&imsic->vsfile_lock, flags);
+
+	return ret;
+}
+
+void kvm_riscv_vcpu_aia_imsic_load(struct kvm_vcpu *vcpu, int cpu)
+{
+	/*
+	 * No need to explicitly clear HGEIE CSR bits because the
+	 * hgei interrupt handler (aka hgei_interrupt()) will always
+	 * clear it for us.
+	 */
+}
+
+void kvm_riscv_vcpu_aia_imsic_put(struct kvm_vcpu *vcpu)
+{
+	struct imsic *imsic = vcpu->arch.aia_context.imsic_state;
+	unsigned long flags;
+
+	if (!kvm_vcpu_is_blocking(vcpu))
+		return;
+
+	read_lock_irqsave(&imsic->vsfile_lock, flags);
+	if (imsic->vsfile_cpu > -1)
+		csr_set(CSR_HGEIE, BIT(imsic->vsfile_hgei));
+	read_unlock_irqrestore(&imsic->vsfile_lock, flags);
+}
+
 void kvm_riscv_vcpu_aia_imsic_release(struct kvm_vcpu *vcpu)
 {
 	unsigned long flags;
@@ -781,6 +823,9 @@ int kvm_riscv_vcpu_aia_imsic_update(struct kvm_vcpu *vcpu)
 	 * producers to the new IMSIC VS-file.
 	 */
 
+	/* Ensure HGEIE CSR bit is zero before using the new IMSIC VS-file */
+	csr_clear(CSR_HGEIE, BIT(new_vsfile_hgei));
+
 	/* Zero-out new IMSIC VS-file */
 	imsic_vsfile_local_clear(new_vsfile_hgei, imsic->nr_hw_eix);
 

arch/riscv/kvm/vcpu.c

Lines changed: 0 additions & 10 deletions
@@ -207,16 +207,6 @@ int kvm_cpu_has_pending_timer(struct kvm_vcpu *vcpu)
 	return kvm_riscv_vcpu_timer_pending(vcpu);
 }
 
-void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu)
-{
-	kvm_riscv_aia_wakeon_hgei(vcpu, true);
-}
-
-void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu)
-{
-	kvm_riscv_aia_wakeon_hgei(vcpu, false);
-}
-
 int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu)
 {
 	return (kvm_riscv_vcpu_has_interrupts(vcpu, -1UL) &&

arch/riscv/kvm/vcpu_timer.c

Lines changed: 16 additions & 0 deletions
@@ -345,8 +345,24 @@ void kvm_riscv_vcpu_timer_save(struct kvm_vcpu *vcpu)
 	/*
	 * The vstimecmp CSRs are saved by kvm_riscv_vcpu_timer_sync()
	 * upon every VM exit so no need to save here.
+	 *
+	 * If the VS-timer expires when no VCPU is running on a host CPU,
+	 * then WFI executed by such a host CPU will be an effective NOP,
+	 * resulting in no power savings. This is because, as per the
+	 * RISC-V Privileged specification: "WFI is also required to
+	 * resume execution for locally enabled interrupts pending at any
+	 * privilege level, regardless of the global interrupt enable at
+	 * each privilege level."
+	 *
+	 * To address the above issue, the vstimecmp CSR must be set to
+	 * -1UL here when the VCPU is scheduled out or exits to user space.
	 */
 
+	csr_write(CSR_VSTIMECMP, -1UL);
+#if defined(CONFIG_32BIT)
+	csr_write(CSR_VSTIMECMPH, -1UL);
+#endif
+
 	/* timer should be enabled for the remaining operations */
 	if (unlikely(!t->init_done))
 		return;
