
Commit 0f099dc

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Paolo Bonzini:
 "ARM:

   - Ensure perf events programmed to count during guest execution are
     actually enabled before entering the guest in the nVHE configuration

   - Restore out-of-range handler for stage-2 translation faults

   - Several fixes to stage-2 TLB invalidations to avoid stale
     translations, possibly including partial walk caches

   - Fix early handling of architectural VHE-only systems to ensure E2H
     is appropriately set

   - Correct a format specifier warning in the arch_timer selftest

   - Make the KVM banner message correctly handle all of the possible
     configurations

  RISC-V:

   - Remove redundant semicolon in num_isa_ext_regs()

   - Fix APLIC setipnum_le/be write emulation

   - Fix APLIC in_clrip[x] read emulation

  x86:

   - Fix a bug in KVM_SET_CPUID{2,} where KVM looks at the wrong CPUID
     entries (old vs. new) and ultimately neglects to clear PV_UNHALT
     from vCPUs with HLT-exiting disabled

   - Documentation fixes for SEV

   - Fix compat ABI for KVM_MEMORY_ENCRYPT_OP

   - Fix a 14-year-old goof in a declaration shared by host and guest;
     the enabled field used by Linux when running as a guest pushes the
     size of "struct kvm_vcpu_pv_apf_data" from 64 to 68 bytes. This is
     really inconsequential because KVM never consumes anything beyond
     the first 64 bytes, but the resulting struct does not match the
     documentation

  Selftests:

   - Fix spelling mistake in arch_timer selftest"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (25 commits)
  KVM: arm64: Rationalise KVM banner output
  arm64: Fix early handling of FEAT_E2H0 not being implemented
  KVM: arm64: Ensure target address is granule-aligned for range TLBI
  KVM: arm64: Use TLBI_TTL_UNKNOWN in __kvm_tlb_flush_vmid_range()
  KVM: arm64: Don't pass a TLBI level hint when zapping table entries
  KVM: arm64: Don't defer TLB invalidation when zapping table entries
  KVM: selftests: Fix __GUEST_ASSERT() format warnings in ARM's arch timer test
  KVM: arm64: Fix out-of-IPA space translation fault handling
  KVM: arm64: Fix host-programmed guest events in nVHE
  RISC-V: KVM: Fix APLIC in_clrip[x] read emulation
  RISC-V: KVM: Fix APLIC setipnum_le/be write emulation
  RISC-V: KVM: Remove second semicolon
  KVM: selftests: Fix spelling mistake "trigged" -> "triggered"
  Documentation: kvm/sev: clarify usage of KVM_MEMORY_ENCRYPT_OP
  Documentation: kvm/sev: separate description of firmware
  KVM: SEV: fix compat ABI for KVM_MEMORY_ENCRYPT_OP
  KVM: selftests: Check that PV_UNHALT is cleared when HLT exiting is disabled
  KVM: x86: Use actual kvm_cpuid.base for clearing KVM_FEATURE_PV_UNHALT
  KVM: x86: Introduce __kvm_get_hypervisor_cpuid() helper
  KVM: SVM: Return -EINVAL instead of -EBUSY on attempt to re-init SEV/SEV-ES
  ...
2 parents 701b389 + 9bc60f7 commit 0f099dc

File tree

21 files changed (+254, -122 lines)


Documentation/virt/kvm/x86/amd-memory-encryption.rst

Lines changed: 24 additions & 18 deletions
@@ -46,21 +46,16 @@ SEV hardware uses ASIDs to associate a memory encryption key with a VM.
 Hence, the ASID for the SEV-enabled guests must be from 1 to a maximum value
 defined in the CPUID 0x8000001f[ecx] field.
 
-SEV Key Management
-==================
+The KVM_MEMORY_ENCRYPT_OP ioctl
+===============================
 
-The SEV guest key management is handled by a separate processor called the AMD
-Secure Processor (AMD-SP). Firmware running inside the AMD-SP provides a secure
-key management interface to perform common hypervisor activities such as
-encrypting bootstrap code, snapshot, migrating and debugging the guest. For more
-information, see the SEV Key Management spec [api-spec]_
-
-The main ioctl to access SEV is KVM_MEMORY_ENCRYPT_OP. If the argument
-to KVM_MEMORY_ENCRYPT_OP is NULL, the ioctl returns 0 if SEV is enabled
-and ``ENOTTY`` if it is disabled (on some older versions of Linux,
-the ioctl runs normally even with a NULL argument, and therefore will
-likely return ``EFAULT``). If non-NULL, the argument to KVM_MEMORY_ENCRYPT_OP
-must be a struct kvm_sev_cmd::
+The main ioctl to access SEV is KVM_MEMORY_ENCRYPT_OP, which operates on
+the VM file descriptor. If the argument to KVM_MEMORY_ENCRYPT_OP is NULL,
+the ioctl returns 0 if SEV is enabled and ``ENOTTY`` if it is disabled
+(on some older versions of Linux, the ioctl tries to run normally even
+with a NULL argument, and therefore will likely return ``EFAULT`` instead
+of zero if SEV is enabled). If non-NULL, the argument to
+KVM_MEMORY_ENCRYPT_OP must be a struct kvm_sev_cmd::
 
         struct kvm_sev_cmd {
                 __u32 id;
@@ -87,10 +82,6 @@ guests, such as launching, running, snapshotting, migrating and decommissioning.
 The KVM_SEV_INIT command is used by the hypervisor to initialize the SEV platform
 context. In a typical workflow, this command should be the first command issued.
 
-The firmware can be initialized either by using its own non-volatile storage or
-the OS can manage the NV storage for the firmware using the module parameter
-``init_ex_path``. If the file specified by ``init_ex_path`` does not exist or
-is invalid, the OS will create or override the file with output from PSP.
 
 Returns: 0 on success, -negative on error
 
@@ -434,6 +425,21 @@ issued by the hypervisor to make the guest ready for execution.
 
 Returns: 0 on success, -negative on error
 
+Firmware Management
+===================
+
+The SEV guest key management is handled by a separate processor called the AMD
+Secure Processor (AMD-SP). Firmware running inside the AMD-SP provides a secure
+key management interface to perform common hypervisor activities such as
+encrypting bootstrap code, snapshot, migrating and debugging the guest. For more
+information, see the SEV Key Management spec [api-spec]_
+
+The AMD-SP firmware can be initialized either by using its own non-volatile
+storage or the OS can manage the NV storage for the firmware using
+parameter ``init_ex_path`` of the ``ccp`` module. If the file specified
+by ``init_ex_path`` does not exist or is invalid, the OS will create or
+override the file with PSP non-volatile storage.
+
 References
 ==========
 

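The NULL-argument probe documented above maps directly onto a userspace check. A minimal sketch, assuming a VM file descriptor obtained via KVM_CREATE_VM; the EFAULT branch follows the older-kernel caveat in the text:

    #include <errno.h>
    #include <sys/ioctl.h>
    #include <linux/kvm.h>

    /* Returns 1 if SEV is enabled, 0 if disabled, -1 on unexpected error.
     * Per the documentation: 0 => enabled, ENOTTY => disabled; older
     * kernels may dereference the NULL argument and fail with EFAULT. */
    static int sev_enabled(int vm_fd)
    {
            if (ioctl(vm_fd, KVM_MEMORY_ENCRYPT_OP, NULL) == 0)
                    return 1;
            if (errno == ENOTTY)
                    return 0;
            if (errno == EFAULT)    /* older kernel, SEV likely enabled */
                    return 1;
            return -1;
    }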
Documentation/virt/kvm/x86/msr.rst

Lines changed: 9 additions & 10 deletions
@@ -193,8 +193,8 @@ data:
 	Asynchronous page fault (APF) control MSR.
 
 	Bits 63-6 hold 64-byte aligned physical address of a 64 byte memory area
-	which must be in guest RAM and must be zeroed. This memory is expected
-	to hold a copy of the following structure::
+	which must be in guest RAM. This memory is expected to hold the
+	following structure::
 
 	  struct kvm_vcpu_pv_apf_data {
 		/* Used for 'page not present' events delivered via #PF */
@@ -204,7 +204,6 @@ data:
 		__u32 token;
 
 		__u8 pad[56];
-		__u32 enabled;
 	  };
 
 	Bits 5-4 of the MSR are reserved and should be zero. Bit 0 is set to 1
@@ -232,14 +231,14 @@ data:
 	as regular page fault, guest must reset 'flags' to '0' before it does
 	something that can generate normal page fault.
 
-	Bytes 5-7 of 64 byte memory location ('token') will be written to by the
+	Bytes 4-7 of 64 byte memory location ('token') will be written to by the
 	hypervisor at the time of APF 'page ready' event injection. The content
-	of these bytes is a token which was previously delivered as 'page not
-	present' event. The event indicates the page in now available. Guest is
-	supposed to write '0' to 'token' when it is done handling 'page ready'
-	event and to write 1' to MSR_KVM_ASYNC_PF_ACK after clearing the location;
-	writing to the MSR forces KVM to re-scan its queue and deliver the next
-	pending notification.
+	of these bytes is a token which was previously delivered in CR2 as
+	'page not present' event. The event indicates the page is now available.
+	Guest is supposed to write '0' to 'token' when it is done handling
+	'page ready' event and to write '1' to MSR_KVM_ASYNC_PF_ACK after
+	clearing the location; writing to the MSR forces KVM to re-scan its
+	queue and deliver the next pending notification.
 
 	Note, MSR_KVM_ASYNC_PF_INT MSR specifying the interrupt vector for 'page
 	ready' APF delivery needs to be written to before enabling APF mechanism

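The "64 to 68 bytes" arithmetic from the merge message can be verified directly. A standalone sketch; the struct names are illustrative, but the field layout comes from the diff above:

    #include <stdio.h>
    #include <stdint.h>

    /* Old layout: the guest-only 'enabled' field spilled past the
     * documented 64-byte area (4 + 4 + 56 + 4 = 68 bytes). */
    struct apf_data_old {
            uint32_t flags;
            uint32_t token;
            uint8_t  pad[56];
            uint32_t enabled;
    };

    /* New layout matches the documentation: exactly 64 bytes. */
    struct apf_data_new {
            uint32_t flags;
            uint32_t token;
            uint8_t  pad[56];
    };

    int main(void)
    {
            printf("old: %zu bytes\n", sizeof(struct apf_data_old)); /* 68 */
            printf("new: %zu bytes\n", sizeof(struct apf_data_new)); /* 64 */
            return 0;
    }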
arch/arm64/kernel/head.S

Lines changed: 16 additions & 13 deletions
@@ -291,6 +291,21 @@ SYM_INNER_LABEL(init_el2, SYM_L_LOCAL)
 	blr	x2
 0:
 	mov_q	x0, HCR_HOST_NVHE_FLAGS
+
+	/*
+	 * Compliant CPUs advertise their VHE-onlyness with
+	 * ID_AA64MMFR4_EL1.E2H0 < 0. HCR_EL2.E2H can be
+	 * RES1 in that case. Publish the E2H bit early so that
+	 * it can be picked up by the init_el2_state macro.
+	 *
+	 * Fruity CPUs seem to have HCR_EL2.E2H set to RAO/WI, but
+	 * don't advertise it (they predate this relaxation).
+	 */
+	mrs_s	x1, SYS_ID_AA64MMFR4_EL1
+	tbz	x1, #(ID_AA64MMFR4_EL1_E2H0_SHIFT + ID_AA64MMFR4_EL1_E2H0_WIDTH - 1), 1f
+
+	orr	x0, x0, #HCR_E2H
+1:
 	msr	hcr_el2, x0
 	isb
 
@@ -303,22 +318,10 @@ SYM_INNER_LABEL(init_el2, SYM_L_LOCAL)
 
 	mov_q	x1, INIT_SCTLR_EL1_MMU_OFF
 
-	/*
-	 * Compliant CPUs advertise their VHE-onlyness with
-	 * ID_AA64MMFR4_EL1.E2H0 < 0. HCR_EL2.E2H can be
-	 * RES1 in that case.
-	 *
-	 * Fruity CPUs seem to have HCR_EL2.E2H set to RES1, but
-	 * don't advertise it (they predate this relaxation).
-	 */
-	mrs_s	x0, SYS_ID_AA64MMFR4_EL1
-	ubfx	x0, x0, #ID_AA64MMFR4_EL1_E2H0_SHIFT, #ID_AA64MMFR4_EL1_E2H0_WIDTH
-	tbnz	x0, #(ID_AA64MMFR4_EL1_E2H0_SHIFT + ID_AA64MMFR4_EL1_E2H0_WIDTH - 1), 1f
-
 	mrs	x0, hcr_el2
 	and	x0, x0, #HCR_E2H
 	cbz	x0, 2f
-1:
+
 	/* Set a sane SCTLR_EL1, the VHE way */
 	pre_disable_mmu_workaround
 	msr_s	SYS_SCTLR_EL12, x1

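The tbz in the new hunk works because E2H0 is a signed field: "E2H0 < 0" is equivalent to the field's most significant bit (bit SHIFT + WIDTH - 1 of the whole register) being set. A minimal C sketch of that test, with illustrative shift/width values standing in for the real generated sysreg constants:

    #include <stdbool.h>
    #include <stdint.h>

    /* Illustrative values only; the real constants live in the arm64
     * sysreg headers. E2H0 is a signed 4-bit field. */
    #define E2H0_SHIFT 24
    #define E2H0_WIDTH 4

    /* "E2H0 < 0" <=> the top bit of the field is set, i.e. bit
     * (SHIFT + WIDTH - 1) of the register -- exactly what tbz tests. */
    static bool vhe_only(uint64_t id_aa64mmfr4)
    {
            return id_aa64mmfr4 & (1ULL << (E2H0_SHIFT + E2H0_WIDTH - 1));
    }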
arch/arm64/kvm/arm.c

Lines changed: 5 additions & 8 deletions
@@ -2597,14 +2597,11 @@ static __init int kvm_arm_init(void)
 	if (err)
 		goto out_hyp;
 
-	if (is_protected_kvm_enabled()) {
-		kvm_info("Protected nVHE mode initialized successfully\n");
-	} else if (in_hyp_mode) {
-		kvm_info("VHE mode initialized successfully\n");
-	} else {
-		char mode = cpus_have_final_cap(ARM64_KVM_HVHE) ? 'h' : 'n';
-		kvm_info("Hyp mode (%cVHE) initialized successfully\n", mode);
-	}
+	kvm_info("%s%sVHE mode initialized successfully\n",
+		 in_hyp_mode ? "" : (is_protected_kvm_enabled() ?
+				     "Protected " : "Hyp "),
+		 in_hyp_mode ? "" : (cpus_have_final_cap(ARM64_KVM_HVHE) ?
+				     "h" : "n"));
 
 	/*
 	 * FIXME: Do something reasonable if kvm_init() fails after pKVM

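The consolidated format string covers every configuration the old if/else chain handled, including the previously mishandled protected-hVHE case. A userspace sketch with stand-in booleans for the kernel predicates, showing what each configuration prints:

    #include <stdio.h>
    #include <stdbool.h>

    /* Stand-ins for in_hyp_mode, is_protected_kvm_enabled() and
     * cpus_have_final_cap(ARM64_KVM_HVHE); values are illustrative. */
    static void banner(bool in_hyp_mode, bool protected_mode, bool hvhe)
    {
            printf("%s%sVHE mode initialized successfully\n",
                   in_hyp_mode ? "" : (protected_mode ? "Protected " : "Hyp "),
                   in_hyp_mode ? "" : (hvhe ? "h" : "n"));
    }

    int main(void)
    {
            banner(true, false, false);  /* "VHE mode ..." */
            banner(false, true, false);  /* "Protected nVHE mode ..." */
            banner(false, false, false); /* "Hyp nVHE mode ..." */
            banner(false, false, true);  /* "Hyp hVHE mode ..." */
            return 0;
    }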
arch/arm64/kvm/hyp/nvhe/tlb.c

Lines changed: 2 additions & 1 deletion
@@ -154,7 +154,8 @@ void __kvm_tlb_flush_vmid_range(struct kvm_s2_mmu *mmu,
 	/* Switch to requested VMID */
 	__tlb_switch_to_guest(mmu, &cxt, false);
 
-	__flush_s2_tlb_range_op(ipas2e1is, start, pages, stride, 0);
+	__flush_s2_tlb_range_op(ipas2e1is, start, pages, stride,
+				TLBI_TTL_UNKNOWN);
 
 	dsb(ish);
 	__tlbi(vmalle1is);

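Why a sentinel instead of 0: with LPA2, level 0 becomes a legitimate walk level, so a literal 0 can no longer double as "level unknown". A sketch of the distinction, assuming TLBI_TTL_UNKNOWN is an out-of-band sentinel (current kernels define it as INT_MAX, but treat that and the encoding below as assumptions):

    #include <limits.h>

    /* Illustrative sentinel mirroring the kernel's definition. */
    #define TLBI_TTL_UNKNOWN INT_MAX

    /* Encode the TTL hint for a TLBI: a known level lets hardware narrow
     * its walk-cache invalidation; an unknown level must encode "no
     * information" (TTL == 0) rather than pretend the level is 0. */
    static unsigned int ttl_encode(int level)
    {
            if (level == TLBI_TTL_UNKNOWN)
                    return 0;  /* TTL == 0 means "no information" */
            return 0x4 | (unsigned int)level;  /* illustrative 4KiB-granule encoding */
    }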
arch/arm64/kvm/hyp/pgtable.c

Lines changed: 15 additions & 8 deletions
@@ -528,7 +528,7 @@ static int hyp_unmap_walker(const struct kvm_pgtable_visit_ctx *ctx,
 
 		kvm_clear_pte(ctx->ptep);
 		dsb(ishst);
-		__tlbi_level(vae2is, __TLBI_VADDR(ctx->addr, 0), ctx->level);
+		__tlbi_level(vae2is, __TLBI_VADDR(ctx->addr, 0), TLBI_TTL_UNKNOWN);
 	} else {
 		if (ctx->end - ctx->addr < granule)
 			return -EINVAL;
@@ -843,12 +843,15 @@ static bool stage2_try_break_pte(const struct kvm_pgtable_visit_ctx *ctx,
 		 * Perform the appropriate TLB invalidation based on the
 		 * evicted pte value (if any).
 		 */
-		if (kvm_pte_table(ctx->old, ctx->level))
-			kvm_tlb_flush_vmid_range(mmu, ctx->addr,
-						kvm_granule_size(ctx->level));
-		else if (kvm_pte_valid(ctx->old))
+		if (kvm_pte_table(ctx->old, ctx->level)) {
+			u64 size = kvm_granule_size(ctx->level);
+			u64 addr = ALIGN_DOWN(ctx->addr, size);
+
+			kvm_tlb_flush_vmid_range(mmu, addr, size);
+		} else if (kvm_pte_valid(ctx->old)) {
 			kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, mmu,
 				     ctx->addr, ctx->level);
+		}
 	}
 
 	if (stage2_pte_is_counted(ctx->old))
@@ -896,9 +899,13 @@ static void stage2_unmap_put_pte(const struct kvm_pgtable_visit_ctx *ctx,
 	if (kvm_pte_valid(ctx->old)) {
 		kvm_clear_pte(ctx->ptep);
 
-		if (!stage2_unmap_defer_tlb_flush(pgt))
-			kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, mmu,
-					ctx->addr, ctx->level);
+		if (kvm_pte_table(ctx->old, ctx->level)) {
+			kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, mmu, ctx->addr,
+				     TLBI_TTL_UNKNOWN);
+		} else if (!stage2_unmap_defer_tlb_flush(pgt)) {
+			kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, mmu, ctx->addr,
+				     ctx->level);
+		}
 	}
 
 	mm_ops->put_page(ctx->ptep);

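The range TLBI needs a base address aligned to the range being invalidated; ALIGN_DOWN rounds the address down to the granule boundary so the whole block the zapped table entry covered is invalidated. A self-contained check of that arithmetic (the 2MiB granule is just an example):

    #include <assert.h>
    #include <stdint.h>

    /* Same arithmetic as the kernel's ALIGN_DOWN for power-of-two sizes. */
    #define ALIGN_DOWN(x, a) ((x) & ~((uint64_t)(a) - 1))

    int main(void)
    {
            uint64_t granule = 2ULL << 20;  /* e.g. a 2MiB block */

            /* An address inside the block rounds down to the block base,
             * so the range TLBI covers everything the entry mapped. */
            assert(ALIGN_DOWN(0x40012345ULL, granule) == 0x40000000ULL);
            return 0;
    }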
arch/arm64/kvm/hyp/vhe/tlb.c

Lines changed: 2 additions & 1 deletion
@@ -171,7 +171,8 @@ void __kvm_tlb_flush_vmid_range(struct kvm_s2_mmu *mmu,
 	/* Switch to requested VMID */
 	__tlb_switch_to_guest(mmu, &cxt);
 
-	__flush_s2_tlb_range_op(ipas2e1is, start, pages, stride, 0);
+	__flush_s2_tlb_range_op(ipas2e1is, start, pages, stride,
+				TLBI_TTL_UNKNOWN);
 
 	dsb(ish);
 	__tlbi(vmalle1is);

arch/arm64/kvm/mmu.c

Lines changed: 1 addition & 1 deletion
@@ -1637,7 +1637,7 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu)
 	fault_ipa = kvm_vcpu_get_fault_ipa(vcpu);
 	is_iabt = kvm_vcpu_trap_is_iabt(vcpu);
 
-	if (esr_fsc_is_permission_fault(esr)) {
+	if (esr_fsc_is_translation_fault(esr)) {
 		/* Beyond sanitised PARange (which is the IPA limit) */
 		if (fault_ipa >= BIT_ULL(get_kvm_ipa_limit())) {
 			kvm_inject_size_fault(vcpu);

arch/riscv/kvm/aia_aplic.c

Lines changed: 31 additions & 6 deletions
@@ -137,11 +137,21 @@ static void aplic_write_pending(struct aplic *aplic, u32 irq, bool pending)
 	raw_spin_lock_irqsave(&irqd->lock, flags);
 
 	sm = irqd->sourcecfg & APLIC_SOURCECFG_SM_MASK;
-	if (!pending &&
-	    ((sm == APLIC_SOURCECFG_SM_LEVEL_HIGH) ||
-	     (sm == APLIC_SOURCECFG_SM_LEVEL_LOW)))
+	if (sm == APLIC_SOURCECFG_SM_INACTIVE)
 		goto skip_write_pending;
 
+	if (sm == APLIC_SOURCECFG_SM_LEVEL_HIGH ||
+	    sm == APLIC_SOURCECFG_SM_LEVEL_LOW) {
+		if (!pending)
+			goto skip_write_pending;
+		if ((irqd->state & APLIC_IRQ_STATE_INPUT) &&
+		    sm == APLIC_SOURCECFG_SM_LEVEL_LOW)
+			goto skip_write_pending;
+		if (!(irqd->state & APLIC_IRQ_STATE_INPUT) &&
+		    sm == APLIC_SOURCECFG_SM_LEVEL_HIGH)
+			goto skip_write_pending;
+	}
+
 	if (pending)
 		irqd->state |= APLIC_IRQ_STATE_PENDING;
 	else
@@ -187,16 +197,31 @@ static void aplic_write_enabled(struct aplic *aplic, u32 irq, bool enabled)
 
 static bool aplic_read_input(struct aplic *aplic, u32 irq)
 {
-	bool ret;
-	unsigned long flags;
+	u32 sourcecfg, sm, raw_input, irq_inverted;
 	struct aplic_irq *irqd;
+	unsigned long flags;
+	bool ret = false;
 
 	if (!irq || aplic->nr_irqs <= irq)
 		return false;
 	irqd = &aplic->irqs[irq];
 
 	raw_spin_lock_irqsave(&irqd->lock, flags);
-	ret = (irqd->state & APLIC_IRQ_STATE_INPUT) ? true : false;
+
+	sourcecfg = irqd->sourcecfg;
+	if (sourcecfg & APLIC_SOURCECFG_D)
+		goto skip;
+
+	sm = sourcecfg & APLIC_SOURCECFG_SM_MASK;
+	if (sm == APLIC_SOURCECFG_SM_INACTIVE)
+		goto skip;
+
+	raw_input = (irqd->state & APLIC_IRQ_STATE_INPUT) ? 1 : 0;
+	irq_inverted = (sm == APLIC_SOURCECFG_SM_LEVEL_LOW ||
			sm == APLIC_SOURCECFG_SM_EDGE_FALL) ? 1 : 0;
+	ret = !!(raw_input ^ irq_inverted);
+
+skip:
 	raw_spin_unlock_irqrestore(&irqd->lock, flags);
 
 	return ret;

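The in_clrip[x] fix boils down to reporting the rectified input: active-low and falling-edge source modes invert the raw wire state, and inactive or detached sources read as zero. A minimal sketch of just the inversion rule, with an enum standing in for the APLIC_SOURCECFG_SM_* constants:

    #include <stdbool.h>

    /* Stand-ins for the APLIC_SOURCECFG_SM_* source-mode constants. */
    enum source_mode {
            SM_EDGE_RISE,
            SM_EDGE_FALL,
            SM_LEVEL_HIGH,
            SM_LEVEL_LOW,
    };

    /* Rectified input as read through in_clrip[x]: active-low modes
     * (low level, falling edge) invert the raw wire state. */
    static bool rectified_input(bool raw_input, enum source_mode sm)
    {
            bool inverted = (sm == SM_LEVEL_LOW || sm == SM_EDGE_FALL);

            return raw_input ^ inverted;
    }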
arch/riscv/kvm/vcpu_onereg.c

Lines changed: 1 addition & 1 deletion
@@ -986,7 +986,7 @@ static int copy_isa_ext_reg_indices(const struct kvm_vcpu *vcpu,
 
 static inline unsigned long num_isa_ext_regs(const struct kvm_vcpu *vcpu)
 {
-	return copy_isa_ext_reg_indices(vcpu, NULL);;
+	return copy_isa_ext_reg_indices(vcpu, NULL);
 }
 
 static int copy_sbi_ext_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices)
