Commit 4f3a6c4

Merge branch 'for-next/vcpu-hotplug' into for-next/core

* for-next/vcpu-hotplug: (21 commits)
  : arm64 support for virtual CPU hotplug (ACPI)
  irqchip/gic-v3: Fix 'broken_rdists' unused warning when !SMP and !ACPI
  arm64: Kconfig: Fix dependencies to enable ACPI_HOTPLUG_CPU
  cpumask: Add enabled cpumask for present CPUs that can be brought online
  arm64: document virtual CPU hotplug's expectations
  arm64: Kconfig: Enable hotplug CPU on arm64 if ACPI_PROCESSOR is enabled.
  arm64: arch_register_cpu() variant to check if an ACPI handle is now available.
  arm64: psci: Ignore DENIED CPUs
  irqchip/gic-v3: Add support for ACPI's disabled but 'online capable' CPUs
  irqchip/gic-v3: Don't return errors from gic_acpi_match_gicc()
  arm64: acpi: Harden get_cpu_for_acpi_id() against missing CPU entry
  arm64: acpi: Move get_cpu_for_acpi_id() to a header
  ACPI: Add post_eject to struct acpi_scan_handler for cpu hotplug
  ACPI: scan: switch to flags for acpi_scan_check_and_detach()
  ACPI: processor: Register deferred CPUs from acpi_processor_get_info()
  ACPI: processor: Add acpi_get_processor_handle() helper
  ACPI: processor: Move checks and availability of acpi_processor earlier
  ACPI: processor: Fix memory leaks in error paths of processor_add()
  ACPI: processor: Return an error if acpi_processor_get_info() fails in processor_add()
  ACPI: processor: Drop duplicated check on _STA (enabled + present)
  cpu: Do not warn on arch_register_cpu() returning -EPROBE_DEFER
  ...

2 parents 3346c56 + 0804020 commit 4f3a6c4

20 files changed

+404
-135
lines changed

Documentation/ABI/testing/sysfs-devices-system-cpu

Lines changed: 6 additions & 0 deletions
@@ -694,3 +694,9 @@ Description:
 		(RO) indicates whether or not the kernel directly supports
 		modifying the crash elfcorehdr for CPU hot un/plug and/or
 		on/offline changes.
+
+What:		/sys/devices/system/cpu/enabled
+Date:		Nov 2022
+Contact:	Linux kernel mailing list <[email protected]>
+Description:
+		(RO) the list of CPUs that can be brought online.
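
As a rough illustration of how user space might consume the new `enabled` file, the sketch below counts the CPUs named in a string in the kernel's usual CPU list format (for example `0-3,5`). It assumes the file uses the same list format as the neighbouring `online` and `present` files, and `count_cpus_in_list()` is a hypothetical helper written for this example, not an existing API.

```c
#include <stdlib.h>

/*
 * Count the CPUs named in a CPU list string such as "0-3,5".
 * A trailing newline, as sysfs files normally carry, is accepted.
 * Returns -1 on malformed input.
 *
 * Hypothetical helper for illustration only.
 */
static int count_cpus_in_list(const char *s)
{
	int count = 0;

	while (*s && *s != '\n') {
		char *end;
		long first, last;

		/* Every element starts with a CPU number. */
		first = strtol(s, &end, 10);
		if (end == s || first < 0)
			return -1;
		last = first;
		s = end;

		/* An optional "-N" turns the element into a range. */
		if (*s == '-') {
			s++;
			last = strtol(s, &end, 10);
			if (end == s || last < first)
				return -1;
			s = end;
		}

		count += (int)(last - first + 1);

		/* Elements are separated by commas. */
		if (*s == ',')
			s++;
		else if (*s && *s != '\n')
			return -1;
	}

	return count;
}
```

A caller would typically read one line from `/sys/devices/system/cpu/enabled` with `fopen()` and `fgets()` and pass the buffer to this function.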
Lines changed: 79 additions & 0 deletions
@@ -0,0 +1,79 @@
+.. SPDX-License-Identifier: GPL-2.0
+.. _cpuhp_index:
+
+====================
+CPU Hotplug and ACPI
+====================
+
+CPU hotplug in the arm64 world is commonly used to describe the kernel taking
+CPUs online/offline using PSCI. This document is about ACPI firmware allowing
+CPUs that were not available during boot to be added to the system later.
+
+``possible`` and ``present`` refer to the state of the CPU as seen by Linux.
+
+
+CPU Hotplug on physical systems - CPUs not present at boot
+----------------------------------------------------------
+
+Physical systems need to mark a CPU that is ``possible`` but not ``present`` as
+being ``present``. An example would be a dual socket machine, where the package
+in one of the sockets can be replaced while the system is running.
+
+This is not supported.
+
+In the arm64 world CPUs are not a single device but a slice of the system.
+There are no systems that support the physical addition (or removal) of CPUs
+while the system is running, and ACPI is not able to sufficiently describe
+them.
+
+e.g. New CPUs come with new caches, but the platform's cache topology is
+described in a static table, the PPTT. How caches are shared between CPUs is
+not discoverable, and must be described by firmware.
+
+e.g. The GIC redistributor for each CPU must be accessed by the driver during
+boot to discover the system wide supported features. ACPI's MADT GICC
+structures can describe a redistributor associated with a disabled CPU, but
+can't describe whether the redistributor is accessible, only that it is not
+'always on'.
+
+arm64's ACPI tables assume that everything described is ``present``.
+
+
+CPU Hotplug on virtual systems - CPUs not enabled at boot
+---------------------------------------------------------
+
+Virtual systems have the advantage that all the properties the system will
+ever have can be described at boot. There are no power-domain considerations
+as such devices are emulated.
+
+CPU Hotplug on virtual systems is supported. It is distinct from physical
+CPU Hotplug as all resources are described as ``present``, but CPUs may be
+marked as disabled by firmware. Only the CPU's online/offline behaviour is
+influenced by firmware. An example is where a virtual machine boots with a
+single CPU, and additional CPUs are added once a cloud orchestrator deploys
+the workload.
+
+For a virtual machine, the VMM (e.g. Qemu) plays the part of firmware.
+
+Virtual hotplug is implemented as a firmware policy affecting which CPUs can be
+brought online. Firmware can enforce its policy via PSCI's return codes, e.g.
+``DENIED``.
+
+The ACPI tables must describe all the resources of the virtual machine. CPUs
+that firmware wishes to disable either from boot (or later) should not be
+``enabled`` in the MADT GICC structures, but should have the ``online capable``
+bit set, to indicate they can be enabled later. The boot CPU must be marked as
+``enabled``. The 'always on' GICR structure must be used to describe the
+redistributors.
+
+CPUs described as ``online capable`` but not ``enabled`` can be set to enabled
+by the DSDT's Processor object's _STA method. On virtual systems the _STA method
+must always report the CPU as ``present``. Changes to the firmware policy can
+be notified to the OS via device-check or eject-request.
+
+CPUs described as ``enabled`` in the static table should not have their _STA
+modified dynamically by firmware. Soft-restart features such as kexec will
+re-read the static properties of the system from these static tables, and
+may malfunction if these no longer describe the running system. Linux will
+re-discover the dynamic properties of the system from the _STA method later
+during boot.
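
The MADT GICC flag rules described in the document above can be sketched as a small classifier. This is a user-space model, not kernel code: the bit positions (bit 0 for ``enabled``, bit 3 for ``online capable``) are an assumption to check against the ACPI specification, and `classify_gicc()` and the enum are hypothetical names introduced for the example.

```c
#include <stdint.h>

/* Assumed MADT GICC flag bits; verify against the ACPI specification. */
#define GICC_FLAG_ENABLED		(1u << 0)
#define GICC_FLAG_ONLINE_CAPABLE	(1u << 3)

/* Hypothetical classification used only by this sketch. */
enum gicc_boot_state {
	GICC_UNUSABLE,		/* neither bit set: entry is skipped */
	GICC_ENABLED_AT_BOOT,	/* usable from boot, e.g. the boot CPU */
	GICC_HOTPLUGGABLE,	/* disabled now, may be enabled later via _STA */
};

/*
 * Classify one GICC entry by its flags. If both bits are set the sketch
 * lets "enabled" win; that tie-break is a choice made here, not something
 * the document above specifies.
 */
static enum gicc_boot_state classify_gicc(uint32_t flags)
{
	if (flags & GICC_FLAG_ENABLED)
		return GICC_ENABLED_AT_BOOT;
	if (flags & GICC_FLAG_ONLINE_CAPABLE)
		return GICC_HOTPLUGGABLE;
	return GICC_UNUSABLE;
}
```

Under this model a VM that boots with one CPU and can grow to four would describe one entry that classifies as `GICC_ENABLED_AT_BOOT` and three that classify as `GICC_HOTPLUGGABLE`.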

Documentation/arch/arm64/index.rst

Lines changed: 1 addition & 0 deletions
@@ -13,6 +13,7 @@ ARM64 Architecture
     asymmetric-32bit
     booting
     cpu-feature-registers
+    cpu-hotplug
     elf_hwcaps
     hugetlbpage
     kdump

arch/arm64/Kconfig

Lines changed: 1 addition & 0 deletions
@@ -5,6 +5,7 @@ config ARM64
 	select ACPI_CCA_REQUIRED if ACPI
 	select ACPI_GENERIC_GSI if ACPI
 	select ACPI_GTDT if ACPI
+	select ACPI_HOTPLUG_CPU if ACPI_PROCESSOR && HOTPLUG_CPU
 	select ACPI_IORT if ACPI
 	select ACPI_REDUCED_HARDWARE_ONLY if ACPI
 	select ACPI_MCFG if (ACPI && PCI)

arch/arm64/include/asm/acpi.h

Lines changed: 12 additions & 0 deletions
@@ -119,6 +119,18 @@ static inline u32 get_acpi_id_for_cpu(unsigned int cpu)
 	return acpi_cpu_get_madt_gicc(cpu)->uid;
 }

+static inline int get_cpu_for_acpi_id(u32 uid)
+{
+	int cpu;
+
+	for (cpu = 0; cpu < nr_cpu_ids; cpu++)
+		if (acpi_cpu_get_madt_gicc(cpu) &&
+		    uid == get_acpi_id_for_cpu(cpu))
+			return cpu;
+
+	return -EINVAL;
+}
+
 static inline void arch_fix_phys_package_id(int num, u32 slot) { }
 void __init acpi_init_cpus(void);
 int apei_claim_sea(struct pt_regs *regs);
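
The hardening in `get_cpu_for_acpi_id()` above (checking that a MADT entry exists before comparing UIDs) can be modelled in user space roughly as follows. The table, its size, and the use of a NULL pointer to represent a logical CPU with no entry are assumptions made for this sketch; the diff does not show how the kernel represents a missing entry.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_NR_CPU_IDS 4	/* stand-in for nr_cpu_ids */

/* Minimal stand-in for the MADT GICC structure: only the UID matters here. */
struct model_gicc {
	uint32_t uid;
};

/* Stand-in for the per-CPU MADT table; NULL models a missing entry. */
static const struct model_gicc *model_gicc_table[MODEL_NR_CPU_IDS];

/*
 * Return the logical CPU whose entry carries @uid, or -EINVAL.
 * Without the NULL check, a CPU slot with no entry would be dereferenced.
 */
static int model_get_cpu_for_acpi_id(uint32_t uid)
{
	int cpu;

	for (cpu = 0; cpu < MODEL_NR_CPU_IDS; cpu++)
		if (model_gicc_table[cpu] &&
		    model_gicc_table[cpu]->uid == uid)
			return cpu;

	return -EINVAL;
}
```

With only slots 0 and 2 populated, a lookup walks past the empty slots safely and reports `-EINVAL` for any UID that no populated slot carries.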

arch/arm64/kernel/acpi.c

Lines changed: 22 additions & 0 deletions
@@ -30,6 +30,7 @@
 #include <linux/pgtable.h>

 #include <acpi/ghes.h>
+#include <acpi/processor.h>
 #include <asm/cputype.h>
 #include <asm/cpu_ops.h>
 #include <asm/daifflags.h>
@@ -438,3 +439,24 @@ void arch_reserve_mem_area(acpi_physical_address addr, size_t size)
 {
 	memblock_mark_nomap(addr, size);
 }
+
+#ifdef CONFIG_ACPI_HOTPLUG_CPU
+int acpi_map_cpu(acpi_handle handle, phys_cpuid_t physid, u32 apci_id,
+		 int *pcpu)
+{
+	/* If an error code is passed in this stub can't fix it */
+	if (*pcpu < 0) {
+		pr_warn_once("Unable to map CPU to valid ID\n");
+		return *pcpu;
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL(acpi_map_cpu);
+
+int acpi_unmap_cpu(int cpu)
+{
+	return 0;
+}
+EXPORT_SYMBOL(acpi_unmap_cpu);
+#endif /* CONFIG_ACPI_HOTPLUG_CPU */

arch/arm64/kernel/acpi_numa.c

Lines changed: 0 additions & 11 deletions
@@ -34,17 +34,6 @@ int __init acpi_numa_get_nid(unsigned int cpu)
 	return acpi_early_node_map[cpu];
 }

-static inline int get_cpu_for_acpi_id(u32 uid)
-{
-	int cpu;
-
-	for (cpu = 0; cpu < nr_cpu_ids; cpu++)
-		if (uid == get_acpi_id_for_cpu(cpu))
-			return cpu;
-
-	return -EINVAL;
-}
-
 static int __init acpi_parse_gicc_pxm(union acpi_subtable_headers *header,
 				      const unsigned long end)
 {

arch/arm64/kernel/psci.c

Lines changed: 1 addition & 1 deletion
@@ -40,7 +40,7 @@ static int cpu_psci_cpu_boot(unsigned int cpu)
 {
 	phys_addr_t pa_secondary_entry = __pa_symbol(secondary_entry);
 	int err = psci_ops.cpu_on(cpu_logical_map(cpu), pa_secondary_entry);
-	if (err)
+	if (err && err != -EPERM)
 		pr_err("failed to boot CPU%d (%d)\n", cpu, err);

 	return err;
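
The `-EPERM` special case above is easier to follow with the error translation spelled out. The sketch below is a user-space model that assumes PSCI's `DENIED` return code (-3) is translated to `-EPERM` before it reaches `cpu_psci_cpu_boot()`; that translation lives outside this diff, so treat the mapping table as an assumption. Both function names are hypothetical.

```c
#include <errno.h>
#include <stdbool.h>

/* PSCI return codes used by this sketch (values assumed from the PSCI spec). */
#define MODEL_PSCI_RET_SUCCESS		0
#define MODEL_PSCI_RET_NOT_SUPPORTED	(-1)
#define MODEL_PSCI_RET_INVALID_PARAMS	(-2)
#define MODEL_PSCI_RET_DENIED		(-3)

/* Assumed translation from a PSCI return code to a Linux errno. */
static int model_psci_to_errno(int psci_ret)
{
	switch (psci_ret) {
	case MODEL_PSCI_RET_SUCCESS:
		return 0;
	case MODEL_PSCI_RET_NOT_SUPPORTED:
		return -EOPNOTSUPP;
	case MODEL_PSCI_RET_DENIED:
		return -EPERM;
	default:
		return -EINVAL;
	}
}

/*
 * Mirror of the condition in the hunk above: a CPU that firmware policy
 * has denied is an expected outcome, so it is not reported as an error.
 */
static bool model_boot_failure_should_log(int err)
{
	return err && err != -EPERM;
}
```

The error is still returned to the caller in every case; only the log message is suppressed for a denied CPU.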

arch/arm64/kernel/smp.c

Lines changed: 57 additions & 2 deletions
@@ -129,7 +129,8 @@ int __cpu_up(unsigned int cpu, struct task_struct *idle)
 	/* Now bring the CPU into our world */
 	ret = boot_secondary(cpu, idle);
 	if (ret) {
-		pr_err("CPU%u: failed to boot: %d\n", cpu, ret);
+		if (ret != -EPERM)
+			pr_err("CPU%u: failed to boot: %d\n", cpu, ret);
 		return ret;
 	}

@@ -507,6 +508,59 @@ static int __init smp_cpu_setup(int cpu)
 static bool bootcpu_valid __initdata;
 static unsigned int cpu_count = 1;

+int arch_register_cpu(int cpu)
+{
+	acpi_handle acpi_handle = acpi_get_processor_handle(cpu);
+	struct cpu *c = &per_cpu(cpu_devices, cpu);
+
+	if (!acpi_disabled && !acpi_handle &&
+	    IS_ENABLED(CONFIG_ACPI_HOTPLUG_CPU))
+		return -EPROBE_DEFER;
+
+#ifdef CONFIG_ACPI_HOTPLUG_CPU
+	/* For now block anything that looks like physical CPU Hotplug */
+	if (invalid_logical_cpuid(cpu) || !cpu_present(cpu)) {
+		pr_err_once("Changing CPU present bit is not supported\n");
+		return -ENODEV;
+	}
+#endif
+
+	/*
+	 * Availability of the acpi handle is sufficient to establish
+	 * that _STA has already been checked. No need to recheck here.
+	 */
+	c->hotpluggable = arch_cpu_is_hotpluggable(cpu);
+
+	return register_cpu(c, cpu);
+}
+
+#ifdef CONFIG_ACPI_HOTPLUG_CPU
+void arch_unregister_cpu(int cpu)
+{
+	acpi_handle acpi_handle = acpi_get_processor_handle(cpu);
+	struct cpu *c = &per_cpu(cpu_devices, cpu);
+	acpi_status status;
+	unsigned long long sta;
+
+	if (!acpi_handle) {
+		pr_err_once("Removing a CPU without associated ACPI handle\n");
+		return;
+	}
+
+	status = acpi_evaluate_integer(acpi_handle, "_STA", NULL, &sta);
+	if (ACPI_FAILURE(status))
+		return;
+
+	/* For now do not allow anything that looks like physical CPU HP */
+	if (cpu_present(cpu) && !(sta & ACPI_STA_DEVICE_PRESENT)) {
+		pr_err_once("Changing CPU present bit is not supported\n");
+		return;
+	}
+
+	unregister_cpu(c);
+}
+#endif /* CONFIG_ACPI_HOTPLUG_CPU */
+
 #ifdef CONFIG_ACPI
 static struct acpi_madt_generic_interrupt cpu_madt_gicc[NR_CPUS];

@@ -527,7 +581,8 @@ acpi_map_gic_cpu_interface(struct acpi_madt_generic_interrupt *processor)
 {
 	u64 hwid = processor->arm_mpidr;

-	if (!acpi_gicc_is_usable(processor)) {
+	if (!(processor->flags &
+	      (ACPI_MADT_ENABLED | ACPI_MADT_GICC_ONLINE_CAPABLE))) {
 		pr_debug("skipping disabled CPU entry with 0x%llx MPIDR\n", hwid);
 		return;
 	}
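
The checks in `arch_register_cpu()` and `arch_unregister_cpu()` above can be restated as two pure decision functions, which makes the ordering of the checks easier to see. This is a user-space model under stated assumptions: the struct and function names are hypothetical, `EPROBE_DEFER` is defined locally because it is a kernel-internal errno (assumed value 517), and the present bit of `_STA` is assumed to be bit 0.

```c
#include <errno.h>
#include <stdbool.h>

#define MODEL_EPROBE_DEFER	517	/* kernel-internal errno, assumed value */
#define MODEL_STA_PRESENT	0x01ull	/* assumed ACPI_STA_DEVICE_PRESENT bit */

/* Hypothetical snapshot of the inputs the kernel functions consult. */
struct model_cpu_state {
	bool acpi_disabled;	/* booted without ACPI (e.g. device tree) */
	bool hotplug_config;	/* CONFIG_ACPI_HOTPLUG_CPU is enabled */
	bool has_acpi_handle;	/* a processor handle is already known */
	bool valid_logical_id;	/* !invalid_logical_cpuid(cpu) */
	bool present;		/* cpu_present(cpu) */
};

/* Returns 0 when registration would proceed, else a negative errno. */
static int model_register_decision(const struct model_cpu_state *s)
{
	/* No handle yet: try again once the ACPI processor driver has run. */
	if (!s->acpi_disabled && !s->has_acpi_handle && s->hotplug_config)
		return -MODEL_EPROBE_DEFER;

	/* Anything that looks like physical hotplug is rejected. */
	if (s->hotplug_config && (!s->valid_logical_id || !s->present))
		return -ENODEV;

	return 0;
}

/* Returns true when unregistration would proceed. */
static bool model_unregister_allowed(const struct model_cpu_state *s,
				     bool sta_read_ok, unsigned long long sta)
{
	if (!s->has_acpi_handle || !sta_read_ok)
		return false;

	/* Firmware clearing "present" would be physical hotplug: refuse. */
	if (s->present && !(sta & MODEL_STA_PRESENT))
		return false;

	return true;
}
```

The model leaves out the side effects (logging, setting `hotpluggable`, and the actual `register_cpu()`/`unregister_cpu()` calls) and keeps only the control flow.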
