
Commit 14edff8

Merge tag 'kvmarm-5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm updates for Linux 5.5:

- Allow non-ISV data aborts to be reported to userspace
- Allow injection of data aborts from userspace
- Expose stolen time to guests
- GICv4 performance improvements
- vgic ITS emulation fixes
- Simplify FWB handling
- Enable halt poll counters
- Make the emulated timer PREEMPT_RT compliant

Conflicts:
	include/uapi/linux/kvm.h

2 parents 992edea + cd7056a commit 14edff8


55 files changed, 1016 additions and 276 deletions

Documentation/admin-guide/kernel-parameters.txt

Lines changed: 3 additions & 3 deletions
Original file line number | Diff line number | Diff line change
@@ -3083,9 +3083,9 @@
30833083
[X86,PV_OPS] Disable paravirtualized VMware scheduler
30843084
clock and use the default one.
30853085

3086-
no-steal-acc [X86,KVM] Disable paravirtualized steal time accounting.
3087-
steal time is computed, but won't influence scheduler
3088-
behaviour
3086+
no-steal-acc [X86,KVM,ARM64] Disable paravirtualized steal time
3087+
accounting. steal time is computed, but won't
3088+
influence scheduler behaviour
30893089

30903090
nolapic [X86-32,APIC] Do not enable or use the local APIC.
30913091

Documentation/virt/kvm/api.txt

Lines changed: 54 additions & 1 deletion
@@ -1002,12 +1002,18 @@ Specifying exception.has_esr on a system that does not support it will return
10021002
-EINVAL. Setting anything other than the lower 24bits of exception.serror_esr
10031003
will return -EINVAL.
10041004

1005+
It is not possible to read back a pending external abort (injected via
1006+
KVM_SET_VCPU_EVENTS or otherwise) because such an exception is always delivered
1007+
directly to the virtual CPU.
1008+
1009+
10051010
struct kvm_vcpu_events {
10061011
struct {
10071012
__u8 serror_pending;
10081013
__u8 serror_has_esr;
1014+
__u8 ext_dabt_pending;
10091015
/* Align it to 8 bytes */
1010-
__u8 pad[6];
1016+
__u8 pad[5];
10111017
__u64 serror_esr;
10121018
} exception;
10131019
__u32 reserved[12];
@@ -1051,9 +1057,23 @@ contain a valid state and shall be written into the VCPU.
10511057

10521058
ARM/ARM64:
10531059

1060+
User space may need to inject several types of events to the guest.
1061+
10541062
Set the pending SError exception state for this VCPU. It is not possible to
10551063
'cancel' an SError that has been made pending.
10561064

1065+
If the guest performed an access to I/O memory which could not be handled by
1066+
userspace, for example because of missing instruction syndrome decode
1067+
information or because there is no device mapped at the accessed IPA, then
1068+
userspace can ask the kernel to inject an external abort using the address
1069+
from the exiting fault on the VCPU. It is a programming error to set
1070+
ext_dabt_pending after an exit which was not either KVM_EXIT_MMIO or
1071+
KVM_EXIT_ARM_NISV. This feature is only available if the system supports
1072+
KVM_CAP_ARM_INJECT_EXT_DABT. This is a helper which provides commonality in
1073+
how userspace reports accesses for the above cases to guests, across different
1074+
userspace implementations. Nevertheless, userspace can still emulate all Arm
1075+
exceptions by manipulating individual registers using the KVM_SET_ONE_REG API.
1076+
10571077
See KVM_GET_VCPU_EVENTS for the data structure.
10581078

10591079

@@ -4471,6 +4491,39 @@ Hyper-V SynIC state change. Notification is used to remap SynIC
44714491
event/message pages and to enable/disable SynIC messages/events processing
44724492
in userspace.
44734493

4494+
/* KVM_EXIT_ARM_NISV */
4495+
struct {
4496+
__u64 esr_iss;
4497+
__u64 fault_ipa;
4498+
} arm_nisv;
4499+
4500+
Used on arm and arm64 systems. If a guest accesses memory not in a memslot,
4501+
KVM will typically return to userspace and ask it to do MMIO emulation on its
4502+
behalf. However, for certain classes of instructions, no instruction decode
4503+
(direction, length of memory access) is provided, and fetching and decoding
4504+
the instruction from the VM is overly complicated to live in the kernel.
4505+
4506+
Historically, when this situation occurred, KVM would print a warning and kill
4507+
the VM. KVM assumed that if the guest accessed non-memslot memory, it was
4508+
trying to do I/O, which just couldn't be emulated, and the warning message was
4509+
phrased accordingly. However, what happened more often was that a guest bug
4510+
caused access outside the guest memory areas which should lead to a more
4511+
meaningful warning message and an external abort in the guest, if the access
4512+
did not fall within an I/O window.
4513+
4514+
Userspace implementations can query for KVM_CAP_ARM_NISV_TO_USER, and enable
4515+
this capability at VM creation. Once this is done, these types of errors will
4516+
instead return to userspace with KVM_EXIT_ARM_NISV, with the valid bits from
4517+
the HSR (arm) and ESR_EL2 (arm64) in the esr_iss field, and the faulting IPA
4518+
in the fault_ipa field. Userspace can either fix up the access if it's
4519+
actually an I/O access by decoding the instruction from guest memory (if it's
4520+
very brave) and continue executing the guest, or it can decide to suspend,
4521+
dump, or restart the guest.
4522+
4523+
Note that KVM does not skip the faulting instruction as it does for
4524+
KVM_EXIT_MMIO, but userspace has to emulate any change to the processing state
4525+
if it decides to decode and emulate the instruction.
4526+
44744527
/* Fix the size of the union. */
44754528
char padding[256];
44764529
};
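
The KVM_EXIT_ARM_NISV text above leaves the policy to userspace: fix up the access if it hits an I/O window, otherwise report it to the guest. A minimal sketch of that decision follows. It is not taken from this commit or from any VMM. `struct arm_nisv_exit` is a local mirror of the documented payload, and `struct io_window`, `enum nisv_action` and `nisv_decide()` are hypothetical names introduced for illustration.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>
#include <stdint.h>

/* Local mirror of the arm_nisv exit payload documented above. */
struct arm_nisv_exit {
	uint64_t esr_iss;    /* valid syndrome bits from HSR / ESR_EL2 */
	uint64_t fault_ipa;  /* faulting intermediate physical address */
};
static_assert(sizeof(struct arm_nisv_exit) == 16, "two __u64 fields");

/* Hypothetical description of one emulated I/O window in the VMM. */
struct io_window {
	uint64_t base;
	uint64_t size;
};

enum nisv_action {
	NISV_INJECT_EXT_DABT,  /* no device at the IPA: set ext_dabt_pending */
	NISV_NEEDS_DECODE,     /* device present: decode the instruction or give up */
};

/* Decide how to react to a non-ISV exit, based only on the faulting IPA. */
enum nisv_action nisv_decide(const struct arm_nisv_exit *exit,
			     const struct io_window *wins, size_t nr_wins)
{
	for (size_t i = 0; i < nr_wins; i++) {
		/* The subtraction form avoids overflow in base + size. */
		if (exit->fault_ipa >= wins[i].base &&
		    exit->fault_ipa - wins[i].base < wins[i].size)
			return NISV_NEEDS_DECODE;
	}
	return NISV_INJECT_EXT_DABT;
}
```

In a real VMM the NISV_INJECT_EXT_DABT branch would presumably set ext_dabt_pending through KVM_SET_VCPU_EVENTS, provided KVM_CAP_ARM_INJECT_EXT_DABT is available. The NISV_NEEDS_DECODE branch has to emulate the instruction itself, because KVM does not skip it.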

Documentation/virt/kvm/arm/pvtime.rst

Lines changed: 80 additions & 0 deletions
@@ -0,0 +1,80 @@
1+
.. SPDX-License-Identifier: GPL-2.0
2+
3+
Paravirtualized time support for arm64
4+
======================================
5+
6+
Arm specification DEN0057/A defines a standard for paravirtualised time
7+
support for AArch64 guests:
8+
9+
https://developer.arm.com/docs/den0057/a
10+
11+
KVM/arm64 implements the stolen time part of this specification by providing
12+
some hypervisor service calls to support a paravirtualized guest obtaining a
13+
view of the amount of time stolen from its execution.
14+
15+
Two new SMCCC compatible hypercalls are defined:
16+
17+
* PV_TIME_FEATURES: 0xC5000020
18+
* PV_TIME_ST: 0xC5000021
19+
20+
These are only available in the SMC64/HVC64 calling convention as
21+
paravirtualized time is not available to 32 bit Arm guests. The existence of
22+
the PV_TIME_FEATURES hypercall should be probed using the SMCCC 1.1 ARCH_FEATURES
23+
mechanism before calling it.
24+
25+
PV_TIME_FEATURES
26+
============= ======== ==========
27+
Function ID: (uint32) 0xC5000020
28+
PV_call_id: (uint32) The function to query for support.
29+
Currently only PV_TIME_ST is supported.
30+
Return value: (int64) NOT_SUPPORTED (-1) or SUCCESS (0) if the relevant
31+
PV-time feature is supported by the hypervisor.
32+
============= ======== ==========
33+
34+
PV_TIME_ST
35+
============= ======== ==========
36+
Function ID: (uint32) 0xC5000021
37+
Return value: (int64) IPA of the stolen time data structure for this
38+
VCPU. On failure:
39+
NOT_SUPPORTED (-1)
40+
============= ======== ==========
41+
42+
The IPA returned by PV_TIME_ST should be mapped by the guest as normal memory
43+
with inner and outer write back caching attributes, in the inner shareable
44+
domain. A total of 16 bytes from the IPA returned are guaranteed to be
45+
meaningfully filled by the hypervisor (see structure below).
46+
47+
PV_TIME_ST returns the structure for the calling VCPU.
48+
49+
Stolen Time
50+
-----------
51+
52+
The structure pointed to by the PV_TIME_ST hypercall is as follows:
53+
54+
+-------------+-------------+-------------+----------------------------+
55+
| Field | Byte Length | Byte Offset | Description |
56+
+=============+=============+=============+============================+
57+
| Revision | 4 | 0 | Must be 0 for version 1.0 |
58+
+-------------+-------------+-------------+----------------------------+
59+
| Attributes | 4 | 4 | Must be 0 |
60+
+-------------+-------------+-------------+----------------------------+
61+
| Stolen time | 8 | 8 | Stolen time in unsigned |
62+
| | | | nanoseconds indicating how |
63+
| | | | much time this VCPU thread |
64+
| | | | was involuntarily not |
65+
| | | | running on a physical CPU. |
66+
+-------------+-------------+-------------+----------------------------+
67+
68+
All values in the structure are stored little-endian.
69+
70+
The structure will be updated by the hypervisor prior to scheduling a VCPU. It
71+
will be present within a reserved region of the normal memory given to the
72+
guest. The guest should not attempt to write into this memory. There is a
73+
structure per VCPU of the guest.
74+
75+
It is advisable that one or more 64k pages are set aside for the purpose of
76+
these structures and not used for other purposes; this enables the guest to map
77+
the region using 64k pages and avoids conflicting attributes with other memory.
78+
79+
For the user space interface see Documentation/virt/kvm/devices/vcpu.txt
80+
section "3. GROUP: KVM_ARM_VCPU_PVTIME_CTRL".
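
The stolen time table above fixes a 16-byte little-endian layout. The sketch below shows one way a reader of that memory might decode it without depending on host endianness. The names `struct pv_stolen_time`, `load_le()` and `pv_stolen_time_parse()` are illustrative assumptions and are not part of the specification or of this commit.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>

/* Decoded view of the 16-byte stolen time structure (illustrative names). */
struct pv_stolen_time {
	uint32_t revision;    /* byte offset 0, must be 0 for version 1.0 */
	uint32_t attributes;  /* byte offset 4, must be 0 */
	uint64_t stolen_ns;   /* byte offset 8, stolen time in nanoseconds */
};
static_assert(sizeof(struct pv_stolen_time) == 16, "matches the documented size");

/* Read an n-byte little-endian integer regardless of host byte order. */
static uint64_t load_le(const uint8_t *p, unsigned int n)
{
	uint64_t v = 0;

	for (unsigned int i = 0; i < n; i++)
		v |= (uint64_t)p[i] << (8 * i);
	return v;
}

/* Returns 0 on success, -1 if revision or attributes are not 0. */
int pv_stolen_time_parse(const uint8_t buf[16], struct pv_stolen_time *out)
{
	out->revision   = (uint32_t)load_le(buf + 0, 4);
	out->attributes = (uint32_t)load_le(buf + 4, 4);
	out->stolen_ns  = load_le(buf + 8, 8);
	return (out->revision == 0 && out->attributes == 0) ? 0 : -1;
}
```

A guest would typically keep the previous stolen_ns value and account only the difference. Because the hypervisor updates the structure concurrently, a real guest would also need a single atomic 64-bit read of the stolen time field rather than the byte-wise loop used here for clarity.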

Documentation/virt/kvm/devices/vcpu.txt

Lines changed: 14 additions & 0 deletions
@@ -60,3 +60,17 @@ time to use the number provided for a given timer, overwriting any previously
6060
configured values on other VCPUs. Userspace should configure the interrupt
6161
numbers on at least one VCPU after creating all VCPUs and before running any
6262
VCPUs.
63+
64+
3. GROUP: KVM_ARM_VCPU_PVTIME_CTRL
65+
Architectures: ARM64
66+
67+
3.1 ATTRIBUTE: KVM_ARM_VCPU_PVTIME_IPA
68+
Parameters: 64-bit base address
69+
Returns: -ENXIO: Stolen time not implemented
70+
-EEXIST: Base address already set for this VCPU
71+
-EINVAL: Base address not 64 byte aligned
72+
73+
Specifies the base address of the stolen time structure for this VCPU. The
74+
base address must be 64 byte aligned and exist within a valid guest memory
75+
region. See Documentation/virt/kvm/arm/pvtime.rst for more information
76+
including the layout of the stolen time structure.
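
The three documented return codes for KVM_ARM_VCPU_PVTIME_IPA can be modelled in a few lines. This is a sketch of the documented contract only, not the kernel implementation. `struct pvtime_state`, `PVTIME_IPA_UNSET` and `pvtime_set_ipa()` are hypothetical, the order of the checks is an assumption, and the requirement that the address lies in a valid guest memory region is not modelled.

```c
#include <assert.h>   /* static_assert (C11) */
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define PVTIME_IPA_UNSET UINT64_MAX  /* hypothetical "not yet set" marker */
static_assert((PVTIME_IPA_UNSET & 63) != 0, "marker can never be a valid base");

struct pvtime_state {
	bool     supported;  /* stolen time implemented on this host */
	uint64_t base;       /* PVTIME_IPA_UNSET until configured */
};

/* Model of the documented error semantics for setting the base address. */
int pvtime_set_ipa(struct pvtime_state *st, uint64_t ipa)
{
	if (!st->supported)
		return -ENXIO;   /* stolen time not implemented */
	if (ipa & 63)
		return -EINVAL;  /* base address not 64 byte aligned */
	if (st->base != PVTIME_IPA_UNSET)
		return -EEXIST;  /* base address already set for this VCPU */
	st->base = ipa;
	return 0;
}
```

The -EEXIST case means the address is effectively write-once per VCPU, so a VMM would want to pick the final location before issuing the attribute call.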

arch/arm/include/asm/kvm_arm.h

Lines changed: 1 addition & 0 deletions
@@ -162,6 +162,7 @@
162162
#define HSR_ISV (_AC(1, UL) << HSR_ISV_SHIFT)
163163
#define HSR_SRT_SHIFT (16)
164164
#define HSR_SRT_MASK (0xf << HSR_SRT_SHIFT)
165+
#define HSR_CM (1 << 8)
165166
#define HSR_FSC (0x3f)
166167
#define HSR_FSC_TYPE (0x3c)
167168
#define HSR_SSE (1 << 21)

arch/arm/include/asm/kvm_emulate.h

Lines changed: 7 additions & 2 deletions
@@ -95,12 +95,12 @@ static inline unsigned long *vcpu_hcr(const struct kvm_vcpu *vcpu)
9595
return (unsigned long *)&vcpu->arch.hcr;
9696
}
9797

98-
static inline void vcpu_clear_wfe_traps(struct kvm_vcpu *vcpu)
98+
static inline void vcpu_clear_wfx_traps(struct kvm_vcpu *vcpu)
9999
{
100100
vcpu->arch.hcr &= ~HCR_TWE;
101101
}
102102

103-
static inline void vcpu_set_wfe_traps(struct kvm_vcpu *vcpu)
103+
static inline void vcpu_set_wfx_traps(struct kvm_vcpu *vcpu)
104104
{
105105
vcpu->arch.hcr |= HCR_TWE;
106106
}
@@ -167,6 +167,11 @@ static inline bool kvm_vcpu_dabt_isvalid(struct kvm_vcpu *vcpu)
167167
return kvm_vcpu_get_hsr(vcpu) & HSR_ISV;
168168
}
169169

170+
static inline unsigned long kvm_vcpu_dabt_iss_nisv_sanitized(const struct kvm_vcpu *vcpu)
171+
{
172+
return kvm_vcpu_get_hsr(vcpu) & (HSR_CM | HSR_WNR | HSR_FSC);
173+
}
174+
170175
static inline bool kvm_vcpu_dabt_iswrite(struct kvm_vcpu *vcpu)
171176
{
172177
return kvm_vcpu_get_hsr(vcpu) & HSR_WNR;
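
The new kvm_vcpu_dabt_iss_nisv_sanitized() helper keeps only three syndrome fields. The standalone sketch below reproduces that mask so the resulting bit pattern can be checked by hand. HSR_CM and HSR_FSC use the values visible in this diff. The value of HSR_WNR (bit 6) is not shown here and is an assumption, and `nisv_sanitize()` is a hypothetical stand-in for the kernel helper.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>

#define HSR_CM   (1u << 8)  /* cache maintenance, value from this diff */
#define HSR_WNR  (1u << 6)  /* write-not-read, value assumed (not in this diff) */
#define HSR_FSC  (0x3fu)    /* fault status code, value from this diff */

#define NISV_ISS_MASK (HSR_CM | HSR_WNR | HSR_FSC)
static_assert(NISV_ISS_MASK == 0x17f, "bits 8, 6 and 5:0");

/* Keep only the syndrome bits that stay meaningful without ISV. */
uint32_t nisv_sanitize(uint32_t hsr)
{
	return hsr & NISV_ISS_MASK;
}
```

Under these assumptions, bit 7 and everything above bit 8 are dropped, so fields such as the register number or access size never reach userspace in esr_iss.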

arch/arm/include/asm/kvm_host.h

Lines changed: 33 additions & 0 deletions
@@ -7,6 +7,7 @@
77
#ifndef __ARM_KVM_HOST_H__
88
#define __ARM_KVM_HOST_H__
99

10+
#include <linux/arm-smccc.h>
1011
#include <linux/errno.h>
1112
#include <linux/types.h>
1213
#include <linux/kvm_types.h>
@@ -38,6 +39,7 @@
3839
KVM_ARCH_REQ_FLAGS(0, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
3940
#define KVM_REQ_IRQ_PENDING KVM_ARCH_REQ(1)
4041
#define KVM_REQ_VCPU_RESET KVM_ARCH_REQ(2)
42+
#define KVM_REQ_RECORD_STEAL KVM_ARCH_REQ(3)
4143

4244
DECLARE_STATIC_KEY_FALSE(userspace_irqchip_in_use);
4345

@@ -76,6 +78,14 @@ struct kvm_arch {
7678

7779
/* Mandated version of PSCI */
7880
u32 psci_version;
81+
82+
/*
83+
* If we encounter a data abort without valid instruction syndrome
84+
* information, report this to user space. User space can (and
85+
* should) opt in to this feature if KVM_CAP_ARM_NISV_TO_USER is
86+
* supported.
87+
*/
88+
bool return_nisv_io_abort_to_user;
7989
};
8090

8191
#define KVM_NR_MEM_OBJS 40
@@ -323,6 +333,29 @@ static inline int kvm_arch_vm_ioctl_check_extension(struct kvm *kvm, long ext)
323333
int kvm_perf_init(void);
324334
int kvm_perf_teardown(void);
325335

336+
static inline long kvm_hypercall_pv_features(struct kvm_vcpu *vcpu)
337+
{
338+
return SMCCC_RET_NOT_SUPPORTED;
339+
}
340+
341+
static inline gpa_t kvm_init_stolen_time(struct kvm_vcpu *vcpu)
342+
{
343+
return GPA_INVALID;
344+
}
345+
346+
static inline void kvm_update_stolen_time(struct kvm_vcpu *vcpu)
347+
{
348+
}
349+
350+
static inline void kvm_arm_pvtime_vcpu_init(struct kvm_vcpu_arch *vcpu_arch)
351+
{
352+
}
353+
354+
static inline bool kvm_arm_is_pvtime_enabled(struct kvm_vcpu_arch *vcpu_arch)
355+
{
356+
return false;
357+
}
358+
326359
void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot);
327360

328361
struct kvm_vcpu *kvm_mpidr_to_vcpu(struct kvm *kvm, unsigned long mpidr);

arch/arm/include/uapi/asm/kvm.h

Lines changed: 2 additions & 1 deletion
@@ -131,8 +131,9 @@ struct kvm_vcpu_events {
131131
struct {
132132
__u8 serror_pending;
133133
__u8 serror_has_esr;
134+
__u8 ext_dabt_pending;
134135
/* Align it to 8 bytes */
135-
__u8 pad[6];
136+
__u8 pad[5];
136137
__u64 serror_esr;
137138
} exception;
138139
__u32 reserved[12];
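
Shrinking pad from 6 to 5 bytes is what keeps the UAPI layout stable when ext_dabt_pending is added. The sketch below checks that claim at compile time using local mirrors of the old and new `exception` member. The struct names and `ext_dabt_offset()` are illustrative, and fixed-width `<stdint.h>` types stand in for `__u8` and `__u64`.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

/* Mirror of the 'exception' member before this change. */
struct exception_old {
	uint8_t  serror_pending;
	uint8_t  serror_has_esr;
	uint8_t  pad[6];            /* align serror_esr to 8 bytes */
	uint64_t serror_esr;
};

/* Mirror of the 'exception' member after this change. */
struct exception_new {
	uint8_t  serror_pending;
	uint8_t  serror_has_esr;
	uint8_t  ext_dabt_pending;  /* takes over one former padding byte */
	uint8_t  pad[5];
	uint64_t serror_esr;
};

static_assert(offsetof(struct exception_old, serror_esr) == 8, "old offset");
static_assert(offsetof(struct exception_new, serror_esr) == 8, "new offset");
static_assert(sizeof(struct exception_old) == sizeof(struct exception_new),
	      "overall size is unchanged");

/* Byte offset of the new flag within the exception member. */
size_t ext_dabt_offset(void)
{
	return offsetof(struct exception_new, ext_dabt_pending);
}
```

Since the new flag occupies a byte that used to be padding, userspace that zero-initialises the structure before the ioctl never requests an abort by accident.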

arch/arm/kvm/Makefile

Lines changed: 1 addition & 1 deletion
@@ -24,7 +24,7 @@ obj-y += kvm-arm.o init.o interrupts.o
2424
obj-y += handle_exit.o guest.o emulate.o reset.o
2525
obj-y += coproc.o coproc_a15.o coproc_a7.o vgic-v3-coproc.o
2626
obj-y += $(KVM)/arm/arm.o $(KVM)/arm/mmu.o $(KVM)/arm/mmio.o
27-
obj-y += $(KVM)/arm/psci.o $(KVM)/arm/perf.o
27+
obj-y += $(KVM)/arm/psci.o $(KVM)/arm/perf.o $(KVM)/arm/hypercalls.o
2828
obj-y += $(KVM)/arm/aarch32.o
2929

3030
obj-y += $(KVM)/arm/vgic/vgic.o

arch/arm/kvm/guest.c

Lines changed: 14 additions & 0 deletions
@@ -21,6 +21,10 @@
2121
#define VCPU_STAT(x) { #x, offsetof(struct kvm_vcpu, stat.x), KVM_STAT_VCPU }
2222

2323
struct kvm_stats_debugfs_item debugfs_entries[] = {
24+
VCPU_STAT(halt_successful_poll),
25+
VCPU_STAT(halt_attempted_poll),
26+
VCPU_STAT(halt_poll_invalid),
27+
VCPU_STAT(halt_wakeup),
2428
VCPU_STAT(hvc_exit_stat),
2529
VCPU_STAT(wfe_exit_stat),
2630
VCPU_STAT(wfi_exit_stat),
@@ -255,6 +259,12 @@ int __kvm_arm_vcpu_get_events(struct kvm_vcpu *vcpu,
255259
{
256260
events->exception.serror_pending = !!(*vcpu_hcr(vcpu) & HCR_VA);
257261

262+
/*
263+
* We never return a pending ext_dabt here because we deliver it to
264+
* the virtual CPU directly when setting the event and it's no longer
265+
* 'pending' at this point.
266+
*/
267+
258268
return 0;
259269
}
260270

@@ -263,12 +273,16 @@ int __kvm_arm_vcpu_set_events(struct kvm_vcpu *vcpu,
263273
{
264274
bool serror_pending = events->exception.serror_pending;
265275
bool has_esr = events->exception.serror_has_esr;
276+
bool ext_dabt_pending = events->exception.ext_dabt_pending;
266277

267278
if (serror_pending && has_esr)
268279
return -EINVAL;
269280
else if (serror_pending)
270281
kvm_inject_vabt(vcpu);
271282

283+
if (ext_dabt_pending)
284+
kvm_inject_dabt(vcpu, kvm_vcpu_get_hfar(vcpu));
285+
272286
return 0;
273287
}
274288
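
The control flow of the 32-bit __kvm_arm_vcpu_set_events() hunk above has one consequence that is easy to miss: a request rejected with -EINVAL injects nothing, not even a requested external abort. The mock below mirrors that flow with counters in place of kvm_inject_vabt() and kvm_inject_dabt(). `struct mock_vcpu`, `struct mock_exception` and `mock_set_events()` are hypothetical test scaffolding, not kernel code.

```c
#include <assert.h>   /* static_assert (C11) */
#include <errno.h>
#include <stdbool.h>

/* Counters standing in for the real injection helpers. */
struct mock_vcpu {
	int vabt_injected;
	int dabt_injected;
};

/* Reduced stand-in for the exception flags of kvm_vcpu_events. */
struct mock_exception {
	bool serror_pending;
	bool serror_has_esr;
	bool ext_dabt_pending;
};
static_assert(sizeof(struct mock_exception) == 3 * sizeof(bool), "three flags");

/* Same branch structure as the hunk above. */
int mock_set_events(struct mock_vcpu *vcpu, const struct mock_exception *e)
{
	if (e->serror_pending && e->serror_has_esr)
		return -EINVAL;           /* rejected before anything is injected */
	else if (e->serror_pending)
		vcpu->vabt_injected++;    /* kvm_inject_vabt() in the kernel */

	if (e->ext_dabt_pending)
		vcpu->dabt_injected++;    /* kvm_inject_dabt() in the kernel */

	return 0;
}
```

A valid request may carry both an SError and an external abort, in which case both are injected in that order.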
