
Commit ca7e917

Merge tag 'x86-apic-2024-03-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 APIC updates from Thomas Gleixner:

 "Rework of APIC enumeration and topology evaluation.

  The current implementation has a couple of shortcomings:

   - It fails to handle hybrid systems correctly.

   - The APIC registration code which handles CPU number assignments is
     in the middle of the APIC code and detached from the topology
     evaluation.

   - The various mechanisms which enumerate APICs, ACPI, MPPARSE and
     guest specific ones, tweak global variables as they see fit or in
     case of XENPV just hack around the generic mechanisms completely.

   - The CPUID topology evaluation code is sprinkled all over the vendor
     code and reevaluates global variables on every hotplug operation.

   - There is no way to analyze topology on the boot CPU before bringing
     up the APs. This causes problems for infrastructure like PERF which
     needs to size certain aspects upfront or could be simplified if
     that would be possible.

   - The APIC admission and CPU number association logic is
     incomprehensible and overly complex and needs to be kept around
     after boot instead of completing this right after the APIC
     enumeration.

  This update addresses these shortcomings with the following changes:

   - Rework the CPUID evaluation code so it is common for all vendors
     and provides information about the APIC ID segments in a uniform
     way independent of the number of segments (Thread, Core, Module,
     ..., Die, Package) so that this information can be computed instead
     of rewriting global variables of dubious value over and over.

   - A few cleanups and simplifications of the APIC, IO/APIC and related
     interfaces to prepare for the topology evaluation changes.

   - Separation of the parser stages so the early evaluation which tries
     to find the APIC address can be separately overridden from the late
     evaluation which enumerates and registers the local APIC as further
     preparation for sanitizing the topology evaluation.

   - A new registration and admission logic which

       - encapsulates the inner workings so that parsers and guest logic
         can no longer fiddle in it

       - uses the APIC ID segments to build topology bitmaps at
         registration time

       - provides a sane admission logic

       - allows to detect the crash kernel case, where CPU0 does not run
         on the real BSP, automatically. This is required to prevent
         sending INIT/SIPI sequences to the real BSP which would reset
         the whole machine. This was so far handled by a tedious command
         line parameter, which does not even work in nested crash
         scenarios.

       - Associates CPU number after the enumeration completed and
         prevents the late registration of APICs, which was somehow
         tolerated before.

   - Converting all parsers and guest enumeration mechanisms over to the
     new interfaces.

     This allows to get rid of all global variable tweaking from the
     parsers and enumeration mechanisms and sanitizes the XEN[PV]
     handling so it can use CPUID evaluation for the first time.

   - Mopping up existing sins by taking the information from the APIC ID
     segment bitmaps.

     This evaluates hybrid systems correctly on the boot CPU and allows
     for cleanups and fixes in the related drivers, e.g. PERF.

  The series has been extensively tested and the minimal late fallout
  due to a broken ACPI/MADT table has been addressed by tightening the
  admission logic further"

* tag 'x86-apic-2024-03-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (76 commits)
  x86/topology: Ignore non-present APIC IDs in a present package
  x86/apic: Build the x86 topology enumeration functions on UP APIC builds too
  smp: Provide 'setup_max_cpus' definition on UP too
  smp: Avoid 'setup_max_cpus' namespace collision/shadowing
  x86/bugs: Use fixed addressing for VERW operand
  x86/cpu/topology: Get rid of cpuinfo::x86_max_cores
  x86/cpu/topology: Provide __num_[cores|threads]_per_package
  x86/cpu/topology: Rename topology_max_die_per_package()
  x86/cpu/topology: Rename smp_num_siblings
  x86/cpu/topology: Retrieve cores per package from topology bitmaps
  x86/cpu/topology: Use topology logical mapping mechanism
  x86/cpu/topology: Provide logical pkg/die mapping
  x86/cpu/topology: Simplify cpu_mark_primary_thread()
  x86/cpu/topology: Mop up primary thread mask handling
  x86/cpu/topology: Use topology bitmaps for sizing
  x86/cpu/topology: Let XEN/PV use topology from CPUID/MADT
  x86/xen/smp_pv: Count number of vCPUs early
  x86/cpu/topology: Assign hotpluggable CPUIDs during init
  x86/cpu/topology: Reject unknown APIC IDs on ACPI hotplug
  x86/topology: Add a mechanism to track topology via APIC IDs
  ...
2 parents d08c407 + f0551af commit ca7e917
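
Note on "APIC ID segments": the merge message describes each topology level as a segment of the APIC ID, so that per-level IDs can be computed rather than stored in globals. The following is a minimal userspace sketch of that idea, not the kernel's implementation. The names `struct apic_id_layout`, `level_id()` and `package_id()` are hypothetical, only three sub-package levels are modelled, and all shift values are assumed to be smaller than 32.

```c
#include <stdint.h>

/* Hypothetical topology levels, lowest APIC ID bits first. */
enum topo_level { TOPO_THREAD, TOPO_CORE, TOPO_DIE, TOPO_LEVELS };

/*
 * shift[l] is the number of low APIC ID bits consumed by level l and all
 * levels below it, i.e. the bit position at which the next level starts.
 * Assumption: every shift value is smaller than 32.
 */
struct apic_id_layout {
	unsigned int shift[TOPO_LEVELS];
};

/* ID of the unit at 'level' relative to its parent unit. */
static uint32_t level_id(const struct apic_id_layout *l, uint32_t apicid,
			 enum topo_level level)
{
	unsigned int lo = level == TOPO_THREAD ? 0 : l->shift[level - 1];
	unsigned int width = l->shift[level] - lo;

	return (apicid >> lo) & ((1u << width) - 1);
}

/* Package ID: all bits above the highest sub-package segment. */
static uint32_t package_id(const struct apic_id_layout *l, uint32_t apicid)
{
	return apicid >> l->shift[TOPO_LEVELS - 1];
}
```

With a layout of one thread bit, three core bits and one die bit (shifts 1, 4, 5), APIC ID 59 decodes to thread 1, core 5, die 1 in package 1.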

86 files changed: +1555 -1569 lines changed

Documentation/admin-guide/kdump/kdump.rst

Lines changed: 2 additions & 5 deletions
@@ -191,9 +191,7 @@ Dump-capture kernel config options (Arch Dependent, i386 and x86_64)
    CPU is enough for kdump kernel to dump vmcore on most of systems.
 
    However, you can also specify nr_cpus=X to enable multiple processors
-   in kdump kernel. In this case, "disable_cpu_apicid=" is needed to
-   tell kdump kernel which cpu is 1st kernel's BSP. Please refer to
-   admin-guide/kernel-parameters.txt for more details.
+   in kdump kernel.
 
    With CONFIG_SMP=n, the above things are not related.
 
@@ -454,8 +452,7 @@ Notes on loading the dump-capture kernel:
   to use multi-thread programs with it, such as parallel dump feature of
   makedumpfile. Otherwise, the multi-thread program may have a great
   performance degradation. To enable multi-cpu support, you should bring up an
-  SMP dump-capture kernel and specify maxcpus/nr_cpus, disable_cpu_apicid=[X]
-  options while loading it.
+  SMP dump-capture kernel and specify maxcpus/nr_cpus options while loading it.
 
 * For s390x there are two kdump modes: If a ELF header is specified with
   the elfcorehdr= kernel parameter, it is used by the kdump kernel as it

Documentation/admin-guide/kernel-parameters.txt

Lines changed: 0 additions & 9 deletions
@@ -1095,15 +1095,6 @@
 			Disable TLBIE instruction. Currently does not work
 			with KVM, with HASH MMU, or with coherent accelerators.
 
-	disable_cpu_apicid= [X86,APIC,SMP]
-			Format: <int>
-			The number of initial APIC ID for the
-			corresponding CPU to be disabled at boot,
-			mostly used for the kdump 2nd kernel to
-			disable BSP to wake up multiple CPUs without
-			causing system reset or hang due to sending
-			INIT from AP to BSP.
-
 	disable_ddw [PPC/PSERIES,EARLY]
 			Disable Dynamic DMA Window support. Use this
 			to workaround buggy firmware.
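
Note on the removed `disable_cpu_apicid=` parameter: the merge message says the crash kernel case, where CPU0 does not run on the real BSP, is now detected automatically so that no INIT/SIPI sequence is sent to the real BSP. This excerpt does not show how the detection is implemented. One plausible signal is the BSP flag (bit 8) of the IA32_APIC_BASE MSR, and the sketch below assumes exactly that. The helper names and the `real_bsp_apicid` parameter are hypothetical and exist only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 8 of the IA32_APIC_BASE MSR is the BSP flag. */
#define APIC_BASE_BSP_BIT	(1ull << 8)

/*
 * Hypothetical helper: 'apic_base_msr' is the IA32_APIC_BASE value read on
 * the CPU that boots the kernel. A clear BSP flag means the kernel was
 * started on an AP, which is the typical crash kernel situation.
 */
static bool booted_on_real_bsp(uint64_t apic_base_msr)
{
	return (apic_base_msr & APIC_BASE_BSP_BIT) != 0;
}

/*
 * Hypothetical admission check: when the kernel does not run on the real
 * BSP, the APIC ID assumed to belong to the real BSP must never be the
 * target of an INIT/SIPI sequence.
 */
static bool may_send_init(uint64_t apic_base_msr, uint32_t target_apicid,
			  uint32_t real_bsp_apicid)
{
	if (booted_on_real_bsp(apic_base_msr))
		return true;
	return target_apicid != real_bsp_apicid;
}
```

The point of the sketch is that the decision needs only state the boot CPU can read itself, which is why no command line parameter is required.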

Documentation/arch/x86/topology.rst

Lines changed: 9 additions & 15 deletions
@@ -47,17 +47,21 @@ AMD nomenclature for package is 'Node'.
 
 Package-related topology information in the kernel:
 
-  - cpuinfo_x86.x86_max_cores:
+  - topology_num_threads_per_package()
 
-    The number of cores in a package. This information is retrieved via CPUID.
+    The number of threads in a package.
 
-  - cpuinfo_x86.x86_max_dies:
+  - topology_num_cores_per_package()
 
-    The number of dies in a package. This information is retrieved via CPUID.
+    The number of cores in a package.
+
+  - topology_max_dies_per_package()
+
+    The maximum number of dies in a package.
 
   - cpuinfo_x86.topo.die_id:
 
-    The physical ID of the die. This information is retrieved via CPUID.
+    The physical ID of the die.
 
   - cpuinfo_x86.topo.pkg_id:
 
@@ -96,16 +100,6 @@ are SMT- or CMT-type threads.
 AMDs nomenclature for a CMT core is "Compute Unit". The kernel always uses
 "core".
 
-Core-related topology information in the kernel:
-
-  - smp_num_siblings:
-
-    The number of threads in a core. The number of threads in a package can be
-    calculated by::
-
-	threads_per_package = cpuinfo_x86.x86_max_cores * smp_num_siblings
-
-
 Threads
 =======
 A thread is a single scheduling unit. It's the equivalent to a logical Linux
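
Note on the removed formula: `threads_per_package = cpuinfo_x86.x86_max_cores * smp_num_siblings` only holds when every core has the same number of threads. The sketch below is a worked example of why that breaks on a hybrid package. The core mix used in the usage note is an illustrative assumption, and `struct core_type` and both helpers are hypothetical names, not kernel interfaces.

```c
/* Hypothetical description of one core type in a hybrid package. */
struct core_type {
	unsigned int cores;
	unsigned int threads_per_core;
};

/* Old-style estimate: total cores times the maximum SMT width. */
static unsigned int threads_by_multiplication(const struct core_type *t, int n)
{
	unsigned int cores = 0, max_smt = 0;

	for (int i = 0; i < n; i++) {
		cores += t[i].cores;
		if (t[i].threads_per_core > max_smt)
			max_smt = t[i].threads_per_core;
	}
	return cores * max_smt;
}

/* Counting per core type yields the actual number of threads. */
static unsigned int threads_by_counting(const struct core_type *t, int n)
{
	unsigned int threads = 0;

	for (int i = 0; i < n; i++)
		threads += t[i].cores * t[i].threads_per_core;
	return threads;
}
```

For an assumed package with 8 two-thread cores and 8 single-thread cores, the multiplication yields 32 while the package really has 24 threads.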

arch/x86/events/amd/core.c

Lines changed: 1 addition & 1 deletion
@@ -579,7 +579,7 @@ static void amd_pmu_cpu_starting(int cpu)
 	if (!x86_pmu.amd_nb_constraints)
 		return;
 
-	nb_id = topology_die_id(cpu);
+	nb_id = topology_amd_node_id(cpu);
 	WARN_ON_ONCE(nb_id == BAD_APICID);
 
 	for_each_online_cpu(i) {

arch/x86/events/intel/cstate.c

Lines changed: 1 addition & 1 deletion
@@ -834,7 +834,7 @@ static int __init cstate_init(void)
 	}
 
 	if (has_cstate_pkg) {
-		if (topology_max_die_per_package() > 1) {
+		if (topology_max_dies_per_package() > 1) {
 			err = perf_pmu_register(&cstate_pkg_pmu,
 						"cstate_die", -1);
 		} else {

arch/x86/events/intel/uncore.c

Lines changed: 1 addition & 1 deletion
@@ -1893,7 +1893,7 @@ static int __init intel_uncore_init(void)
 		return -ENODEV;
 
 	__uncore_max_dies =
-		topology_max_packages() * topology_max_die_per_package();
+		topology_max_packages() * topology_max_dies_per_package();
 
 	id = x86_match_cpu(intel_uncore_match);
 	if (!id) {

arch/x86/events/intel/uncore_nhmex.c

Lines changed: 2 additions & 2 deletions
@@ -1221,8 +1221,8 @@ void nhmex_uncore_cpu_init(void)
 		uncore_nhmex = true;
 	else
 		nhmex_uncore_mbox.event_descs = wsmex_uncore_mbox_events;
-	if (nhmex_uncore_cbox.num_boxes > boot_cpu_data.x86_max_cores)
-		nhmex_uncore_cbox.num_boxes = boot_cpu_data.x86_max_cores;
+	if (nhmex_uncore_cbox.num_boxes > topology_num_cores_per_package())
+		nhmex_uncore_cbox.num_boxes = topology_num_cores_per_package();
 	uncore_msr_uncores = nhmex_msr_uncores;
 }
 /* end of Nehalem-EX uncore support */

arch/x86/events/intel/uncore_snb.c

Lines changed: 4 additions & 4 deletions
@@ -364,8 +364,8 @@ static struct intel_uncore_type *snb_msr_uncores[] = {
 void snb_uncore_cpu_init(void)
 {
 	uncore_msr_uncores = snb_msr_uncores;
-	if (snb_uncore_cbox.num_boxes > boot_cpu_data.x86_max_cores)
-		snb_uncore_cbox.num_boxes = boot_cpu_data.x86_max_cores;
+	if (snb_uncore_cbox.num_boxes > topology_num_cores_per_package())
+		snb_uncore_cbox.num_boxes = topology_num_cores_per_package();
 }
 
 static void skl_uncore_msr_init_box(struct intel_uncore_box *box)
@@ -428,8 +428,8 @@ static struct intel_uncore_type *skl_msr_uncores[] = {
 void skl_uncore_cpu_init(void)
 {
 	uncore_msr_uncores = skl_msr_uncores;
-	if (skl_uncore_cbox.num_boxes > boot_cpu_data.x86_max_cores)
-		skl_uncore_cbox.num_boxes = boot_cpu_data.x86_max_cores;
+	if (skl_uncore_cbox.num_boxes > topology_num_cores_per_package())
+		skl_uncore_cbox.num_boxes = topology_num_cores_per_package();
 	snb_uncore_arb.ops = &skl_uncore_msr_ops;
 }
 

arch/x86/events/intel/uncore_snbep.c

Lines changed: 9 additions & 9 deletions
@@ -1172,8 +1172,8 @@ static struct intel_uncore_type *snbep_msr_uncores[] = {
 
 void snbep_uncore_cpu_init(void)
 {
-	if (snbep_uncore_cbox.num_boxes > boot_cpu_data.x86_max_cores)
-		snbep_uncore_cbox.num_boxes = boot_cpu_data.x86_max_cores;
+	if (snbep_uncore_cbox.num_boxes > topology_num_cores_per_package())
+		snbep_uncore_cbox.num_boxes = topology_num_cores_per_package();
 	uncore_msr_uncores = snbep_msr_uncores;
 }
 
@@ -1406,7 +1406,7 @@ static int topology_gidnid_map(int nodeid, u32 gidnid)
 	 */
 	for (i = 0; i < 8; i++) {
 		if (nodeid == GIDNIDMAP(gidnid, i)) {
-			if (topology_max_die_per_package() > 1)
+			if (topology_max_dies_per_package() > 1)
 				die_id = i;
 			else
 				die_id = topology_phys_to_logical_pkg(i);
@@ -1845,8 +1845,8 @@ static struct intel_uncore_type *ivbep_msr_uncores[] = {
 
 void ivbep_uncore_cpu_init(void)
 {
-	if (ivbep_uncore_cbox.num_boxes > boot_cpu_data.x86_max_cores)
-		ivbep_uncore_cbox.num_boxes = boot_cpu_data.x86_max_cores;
+	if (ivbep_uncore_cbox.num_boxes > topology_num_cores_per_package())
+		ivbep_uncore_cbox.num_boxes = topology_num_cores_per_package();
 	uncore_msr_uncores = ivbep_msr_uncores;
 }
 
@@ -2917,8 +2917,8 @@ static bool hswep_has_limit_sbox(unsigned int device)
 
 void hswep_uncore_cpu_init(void)
 {
-	if (hswep_uncore_cbox.num_boxes > boot_cpu_data.x86_max_cores)
-		hswep_uncore_cbox.num_boxes = boot_cpu_data.x86_max_cores;
+	if (hswep_uncore_cbox.num_boxes > topology_num_cores_per_package())
+		hswep_uncore_cbox.num_boxes = topology_num_cores_per_package();
 
 	/* Detect 6-8 core systems with only two SBOXes */
 	if (hswep_has_limit_sbox(HSWEP_PCU_DID))
@@ -3280,8 +3280,8 @@ static struct event_constraint bdx_uncore_pcu_constraints[] = {
 
 void bdx_uncore_cpu_init(void)
 {
-	if (bdx_uncore_cbox.num_boxes > boot_cpu_data.x86_max_cores)
-		bdx_uncore_cbox.num_boxes = boot_cpu_data.x86_max_cores;
+	if (bdx_uncore_cbox.num_boxes > topology_num_cores_per_package())
+		bdx_uncore_cbox.num_boxes = topology_num_cores_per_package();
 	uncore_msr_uncores = bdx_msr_uncores;
 
 	/* Detect systems with no SBOXes */

arch/x86/events/rapl.c

Lines changed: 1 addition & 1 deletion
@@ -674,7 +674,7 @@ static const struct attribute_group *rapl_attr_update[] = {
 
 static int __init init_rapl_pmus(void)
 {
-	int maxdie = topology_max_packages() * topology_max_die_per_package();
+	int maxdie = topology_max_packages() * topology_max_dies_per_package();
 	size_t size;
 
 	size = sizeof(*rapl_pmus) + maxdie * sizeof(struct rapl_pmu *);
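
Note on the sizing in `init_rapl_pmus()`: the hunk computes an upper bound on the number of dies (packages times dies per package) and sizes a header plus one pointer slot per die. A userspace sketch of the same pattern with a flexible array member follows. `struct pmu_table` and `pmu_table_alloc()` are hypothetical stand-ins, `calloc()` replaces the kernel allocator, and the multiplication is assumed not to overflow.

```c
#include <stddef.h>
#include <stdlib.h>

struct pmu;	/* opaque per-die object, stand-in for struct rapl_pmu */

/* Stand-in for struct rapl_pmus: header plus one pointer slot per die. */
struct pmu_table {
	unsigned int maxdie;
	struct pmu *pmus[];	/* flexible array member */
};

/*
 * Allocate a zeroed table with packages * dies_per_package slots.
 * Assumption: the product is small enough not to overflow.
 */
static struct pmu_table *pmu_table_alloc(unsigned int packages,
					 unsigned int dies_per_package)
{
	unsigned int maxdie = packages * dies_per_package;
	size_t size = sizeof(struct pmu_table) + maxdie * sizeof(struct pmu *);
	struct pmu_table *t = calloc(1, size);

	if (t)
		t->maxdie = maxdie;
	return t;
}
```

Because both factors are maxima, the table can be allocated once up front and indexed by logical die ID later, which is the kind of upfront sizing the merge message mentions for PERF.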
