
Commit 31a88c8

gkurz authored and paulusmack committed
KVM: PPC: Book3S HV: XIVE: Free previous EQ page when setting up a new one
The EQ page is allocated by the guest and then passed to the hypervisor with the H_INT_SET_QUEUE_CONFIG hcall. A reference is taken on the page before handing it over to the HW. This reference is dropped either when the guest issues the H_INT_RESET hcall or when the KVM device is released. However, the guest can legitimately call H_INT_SET_QUEUE_CONFIG several times, either to reset the EQ (vCPU hot unplug) or to set a new EQ (guest reboot). In both cases the existing EQ page reference is leaked because we simply overwrite it in the XIVE queue structure without calling put_page().

This is especially visible when the guest memory is backed with huge pages: start a VM up to the guest userspace, either reboot it or unplug a vCPU, quit QEMU. The leak is observed by comparing the value of HugePages_Free in /proc/meminfo before and after the VM is run.

Ideally we'd want the XIVE code to handle the EQ page de-allocation at the platform level. This isn't the case right now because the various XIVE drivers have different allocation needs. It might be worth introducing hooks for this purpose instead of exposing XIVE internals to the drivers, but this would be a large piece of work, to be done later.

In the meantime, for easier backport, fix both vCPU unplug and guest reboot leaks by introducing a wrapper around xive_native_configure_queue() that does the necessary cleanup.

Reported-by: Satheesh Rajendran <[email protected]>
Cc: [email protected] # v5.2
Fixes: 13ce329 ("KVM: PPC: Book3S HV: XIVE: Add controls for the EQ configuration")
Signed-off-by: Cédric Le Goater <[email protected]>
Signed-off-by: Greg Kurz <[email protected]>
Tested-by: Lijun Pan <[email protected]>
Signed-off-by: Paul Mackerras <[email protected]>
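The HugePages_Free comparison described in the commit message can be scripted. The following is a minimal user-space sketch, not part of the patch: the helper names `hugepages_free_from()` and `hugepages_free()` are hypothetical, and the code assumes the usual `HugePages_Free: <count>` line format of /proc/meminfo on Linux.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: extract the HugePages_Free count from a
 * /proc/meminfo-style text buffer. Returns -1 if the field is missing
 * or cannot be parsed.
 */
long hugepages_free_from(const char *meminfo)
{
	const char *key = "HugePages_Free:";
	const char *p = strstr(meminfo, key);
	long val;

	if (!p)
		return -1;
	if (sscanf(p + strlen(key), "%ld", &val) != 1)
		return -1;
	return val;
}

/*
 * Hypothetical helper: read the live value from /proc/meminfo.
 * Returns -1 if the file cannot be opened (for example on non-Linux
 * systems). Assumes the file fits in the local buffer.
 */
long hugepages_free(void)
{
	char buf[8192];
	size_t n;
	FILE *f = fopen("/proc/meminfo", "r");

	if (!f)
		return -1;
	n = fread(buf, 1, sizeof(buf) - 1, f);
	fclose(f);
	buf[n] = '\0';
	return hugepages_free_from(buf);
}
```

Calling `hugepages_free()` once before starting the VM and once after quitting QEMU, then comparing the two values, mirrors the manual check: a lower value afterwards suggests that huge pages are still pinned.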
1 parent 55d7004 commit 31a88c8


arch/powerpc/kvm/book3s_xive_native.c

Lines changed: 22 additions & 9 deletions
@@ -50,6 +50,24 @@ static void kvmppc_xive_native_cleanup_queue(struct kvm_vcpu *vcpu, int prio)
 	}
 }
 
+static int kvmppc_xive_native_configure_queue(u32 vp_id, struct xive_q *q,
+					      u8 prio, __be32 *qpage,
+					      u32 order, bool can_escalate)
+{
+	int rc;
+	__be32 *qpage_prev = q->qpage;
+
+	rc = xive_native_configure_queue(vp_id, q, prio, qpage, order,
+					 can_escalate);
+	if (rc)
+		return rc;
+
+	if (qpage_prev)
+		put_page(virt_to_page(qpage_prev));
+
+	return rc;
+}
+
 void kvmppc_xive_native_cleanup_vcpu(struct kvm_vcpu *vcpu)
 {
 	struct kvmppc_xive_vcpu *xc = vcpu->arch.xive_vcpu;
@@ -575,19 +593,14 @@ static int kvmppc_xive_native_set_queue_config(struct kvmppc_xive *xive,
 		q->guest_qaddr = 0;
 		q->guest_qshift = 0;
 
-		rc = xive_native_configure_queue(xc->vp_id, q, priority,
-						 NULL, 0, true);
+		rc = kvmppc_xive_native_configure_queue(xc->vp_id, q, priority,
+							NULL, 0, true);
 		if (rc) {
 			pr_err("Failed to reset queue %d for VCPU %d: %d\n",
 			       priority, xc->server_num, rc);
 			return rc;
 		}
 
-		if (q->qpage) {
-			put_page(virt_to_page(q->qpage));
-			q->qpage = NULL;
-		}
-
 		return 0;
 	}
 
@@ -646,8 +659,8 @@ static int kvmppc_xive_native_set_queue_config(struct kvmppc_xive *xive,
 	 * OPAL level because the use of END ESBs is not supported by
 	 * Linux.
 	 */
-	rc = xive_native_configure_queue(xc->vp_id, q, priority,
-					 (__be32 *) qaddr, kvm_eq.qshift, true);
+	rc = kvmppc_xive_native_configure_queue(xc->vp_id, q, priority,
+					(__be32 *) qaddr, kvm_eq.qshift, true);
 	if (rc) {
 		pr_err("Failed to configure queue %d for VCPU %d: %d\n",
 		       priority, xc->server_num, rc);
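
The reference handling in the wrapper can be illustrated outside the kernel. The following is a toy user-space model, not the kernel code: every `toy_*` name is hypothetical, a plain integer stands in for the page reference count, and the `fail` parameter is an assumption used to simulate a failing low-level call. The model also assumes the low-level call leaves the queue untouched when it fails.

```c
#include <stddef.h>

/* Toy stand-ins for struct page and struct xive_q (hypothetical). */
struct toy_page { int refcount; };
struct toy_queue { struct toy_page *qpage; };

/* Models taking and dropping a page reference. */
void toy_get_page(struct toy_page *p) { p->refcount++; }
void toy_put_page(struct toy_page *p) { p->refcount--; }

/*
 * Models the low-level configure call: on success it overwrites
 * q->qpage without dropping any reference; a nonzero 'fail' simulates
 * an error, in which case the queue is left unchanged.
 */
int toy_configure_raw(struct toy_queue *q, struct toy_page *page, int fail)
{
	if (fail)
		return -1;
	q->qpage = page;
	return 0;
}

/*
 * Models the wrapper added by the patch: remember the previous page,
 * configure the queue, and drop the old reference only on success.
 */
int toy_configure(struct toy_queue *q, struct toy_page *page, int fail)
{
	struct toy_page *prev = q->qpage;
	int rc = toy_configure_raw(q, page, fail);

	if (rc)
		return rc;
	if (prev)
		toy_put_page(prev);
	return 0;
}
```

Saving the previous pointer before the call matters because the low-level function overwrites it on success. Dropping the reference only after success means a failed reconfiguration leaves the old page both installed and referenced.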
