Commit fdfa588

dianders authored and Will Deacon committed
arm64: smp: smp_send_stop() and crash_smp_send_stop() should try non-NMI first
When testing hard lockup handling on my sc7180-trogdor-lazor device with pseudo-NMI enabled, with serial console enabled and with kgdb disabled, I found that the stack crawls printed to the serial console ended up as a jumbled mess. After rebooting, the pstore-based console looked fine though. Also, enabling kgdb to trap the panic made the console look fine and avoided the mess.

After a bit of tracking down, I came to the conclusion that this was what was happening:

1. The panic path was stopping all other CPUs with panic_other_cpus_shutdown().
2. At least one of those other CPUs was in the middle of printing to the serial console and was holding the console port's lock, which is grabbed with "irqsave". ...but since we were stopping with an NMI we didn't care about the "irqsave" and interrupted anyway.
3. Since we stopped the CPU while it was holding the lock, it would never release it.
4. All future calls to output to the console would end up failing to get the lock in qcom_geni_serial_console_write().

This isn't _totally_ unexpected at panic time, but it's a code path that's not well tested, hard to get right, and apparently doesn't work terribly well on the Qualcomm geni serial driver. The Qualcomm geni serial driver was fixed to be a bit better in commit 9e957a1 ("serial: qcom-geni: Don't cancel/abort if we can't get the port lock"), but it's nice not to get into this situation in the first place.

Taking a page from what x86 appears to do in native_stop_other_cpus(), do this:

1. First, try to stop other CPUs with a normal IPI and wait one second. This gives them a chance to leave critical sections.
2. If CPUs fail to stop, then retry with an NMI, but give a much lower timeout since there's no good reason for a CPU not to react quickly to an NMI.

This works well and avoids the corrupted console and (presumably) could help avoid other similar issues.

In order to do this, we need to do a little re-organization of our IPIs since we don't have any more free IDs.
Do what was suggested in previous conversations and combine "stop" and "crash stop". That frees up an IPI, so now we can have a "stop" and a "stop NMI".

In order to do this we also need a slight change in the way we keep track of which CPUs still need to be stopped. We need to know specifically which CPUs haven't stopped yet when we fall back to NMI, but in the "crash stop" case the "cpu_online_mask" isn't updated as CPUs go down. This is why that code path had an atomic count of the number of CPUs left. Solve this by also updating the "cpu_online_mask" for crash stops.

All of the above lets us combine the logic for the "stop" and "crash stop" code, which appeared to have a bunch of arbitrary implementation differences. Aside from the change above, where we try a normal IPI and then an NMI, the combined function has a few subtle differences:

* In the normal smp_send_stop(), if we fail to stop one or more CPUs then we won't include the current CPU (the one running smp_send_stop()) in the error message.
* In crash_smp_send_stop(), if we fail to stop some CPUs we'll print the CPUs that we failed to stop instead of printing all _but_ the current running CPU.
* In crash_smp_send_stop(), we will now only print "SMP: stopping secondary CPUs" if (system_state <= SYSTEM_RUNNING).

Fixes: d740251 ("arm64: smp: IPI_CPU_STOP and IPI_CPU_CRASH_STOP should try for NMI")
Signed-off-by: Douglas Anderson <[email protected]>
Link: https://lore.kernel.org/r/20240821145353.v3.1.Id4817adef610302554b8aa42b090d57270dc119c@changeid
Signed-off-by: Will Deacon <[email protected]>
1 parent 93b81ab commit fdfa588

File tree: 1 file changed, +97 −63 lines

arch/arm64/kernel/smp.c

Lines changed: 97 additions & 63 deletions
@@ -68,7 +68,7 @@ enum ipi_msg_type {
 	IPI_RESCHEDULE,
 	IPI_CALL_FUNC,
 	IPI_CPU_STOP,
-	IPI_CPU_CRASH_STOP,
+	IPI_CPU_STOP_NMI,
 	IPI_TIMER,
 	IPI_IRQ_WORK,
 	NR_IPI,
@@ -85,6 +85,8 @@ static int ipi_irq_base __ro_after_init;
 static int nr_ipi __ro_after_init = NR_IPI;
 static struct irq_desc *ipi_desc[MAX_IPI] __ro_after_init;
 
+static bool crash_stop;
+
 static void ipi_setup(int cpu);
 
 #ifdef CONFIG_HOTPLUG_CPU
@@ -821,7 +823,7 @@ static const char *ipi_types[MAX_IPI] __tracepoint_string = {
 	[IPI_RESCHEDULE]	= "Rescheduling interrupts",
 	[IPI_CALL_FUNC]		= "Function call interrupts",
 	[IPI_CPU_STOP]		= "CPU stop interrupts",
-	[IPI_CPU_CRASH_STOP]	= "CPU stop (for crash dump) interrupts",
+	[IPI_CPU_STOP_NMI]	= "CPU stop NMIs",
 	[IPI_TIMER]		= "Timer broadcast interrupts",
 	[IPI_IRQ_WORK]		= "IRQ work interrupts",
 	[IPI_CPU_BACKTRACE]	= "CPU backtrace interrupts",
@@ -865,9 +867,9 @@ void arch_irq_work_raise(void)
 }
 #endif
 
-static void __noreturn local_cpu_stop(void)
+static void __noreturn local_cpu_stop(unsigned int cpu)
 {
-	set_cpu_online(smp_processor_id(), false);
+	set_cpu_online(cpu, false);
 
 	local_daif_mask();
 	sdei_mask_local_cpu();
@@ -881,21 +883,26 @@
  */
 void __noreturn panic_smp_self_stop(void)
 {
-	local_cpu_stop();
+	local_cpu_stop(smp_processor_id());
 }
 
-#ifdef CONFIG_KEXEC_CORE
-static atomic_t waiting_for_crash_ipi = ATOMIC_INIT(0);
-#endif
-
 static void __noreturn ipi_cpu_crash_stop(unsigned int cpu, struct pt_regs *regs)
 {
 #ifdef CONFIG_KEXEC_CORE
+	/*
+	 * Use local_daif_mask() instead of local_irq_disable() to make sure
+	 * that pseudo-NMIs are disabled. The "crash stop" code starts with
+	 * an IRQ and falls back to NMI (which might be pseudo). If the IRQ
+	 * finally goes through right as we're timing out then the NMI could
+	 * interrupt us. It's better to prevent the NMI and let the IRQ
+	 * finish since the pt_regs will be better.
+	 */
+	local_daif_mask();
+
 	crash_save_cpu(regs, cpu);
 
-	atomic_dec(&waiting_for_crash_ipi);
+	set_cpu_online(cpu, false);
 
-	local_irq_disable();
 	sdei_mask_local_cpu();
 
 	if (IS_ENABLED(CONFIG_HOTPLUG_CPU))
@@ -960,14 +967,12 @@ static void do_handle_IPI(int ipinr)
 		break;
 
 	case IPI_CPU_STOP:
-		local_cpu_stop();
-		break;
-
-	case IPI_CPU_CRASH_STOP:
-		if (IS_ENABLED(CONFIG_KEXEC_CORE)) {
+	case IPI_CPU_STOP_NMI:
+		if (IS_ENABLED(CONFIG_KEXEC_CORE) && crash_stop) {
 			ipi_cpu_crash_stop(cpu, get_irq_regs());
-
 			unreachable();
+		} else {
+			local_cpu_stop(cpu);
 		}
 		break;
 
@@ -1022,8 +1027,7 @@ static bool ipi_should_be_nmi(enum ipi_msg_type ipi)
 		return false;
 
 	switch (ipi) {
-	case IPI_CPU_STOP:
-	case IPI_CPU_CRASH_STOP:
+	case IPI_CPU_STOP_NMI:
 	case IPI_CPU_BACKTRACE:
 	case IPI_KGDB_ROUNDUP:
 		return true;
@@ -1136,79 +1140,109 @@ static inline unsigned int num_other_online_cpus(void)
 
 void smp_send_stop(void)
 {
+	static unsigned long stop_in_progress;
+	cpumask_t mask;
 	unsigned long timeout;
 
-	if (num_other_online_cpus()) {
-		cpumask_t mask;
+	/*
+	 * If this cpu is the only one alive at this point in time, online or
+	 * not, there are no stop messages to be sent around, so just back out.
+	 */
+	if (num_other_online_cpus() == 0)
+		goto skip_ipi;
 
-		cpumask_copy(&mask, cpu_online_mask);
-		cpumask_clear_cpu(smp_processor_id(), &mask);
+	/* Only proceed if this is the first CPU to reach this code */
+	if (test_and_set_bit(0, &stop_in_progress))
+		return;
 
-		if (system_state <= SYSTEM_RUNNING)
-			pr_crit("SMP: stopping secondary CPUs\n");
-		smp_cross_call(&mask, IPI_CPU_STOP);
-	}
+	/*
+	 * Send an IPI to all currently online CPUs except the CPU running
+	 * this code.
+	 *
+	 * NOTE: we don't do anything here to prevent other CPUs from coming
+	 * online after we snapshot `cpu_online_mask`. Ideally, the calling code
+	 * should do something to prevent other CPUs from coming up. This code
+	 * can be called in the panic path and thus it doesn't seem wise to
+	 * grab the CPU hotplug mutex ourselves. Worst case:
+	 * - If a CPU comes online as we're running, we'll likely notice it
+	 *   during the 1 second wait below and then we'll catch it when we try
+	 *   with an NMI (assuming NMIs are enabled) since we re-snapshot the
+	 *   mask before sending an NMI.
+	 * - If we leave the function and see that CPUs are still online we'll
+	 *   at least print a warning. Especially without NMIs this function
+	 *   isn't foolproof anyway so calling code will just have to accept
+	 *   the fact that there could be cases where a CPU can't be stopped.
+	 */
+	cpumask_copy(&mask, cpu_online_mask);
+	cpumask_clear_cpu(smp_processor_id(), &mask);
 
-	/* Wait up to one second for other CPUs to stop */
+	if (system_state <= SYSTEM_RUNNING)
+		pr_crit("SMP: stopping secondary CPUs\n");
+
+	/*
+	 * Start with a normal IPI and wait up to one second for other CPUs to
+	 * stop. We do this first because it gives other processors a chance
+	 * to exit critical sections / drop locks and makes the rest of the
+	 * stop process (especially console flush) more robust.
+	 */
+	smp_cross_call(&mask, IPI_CPU_STOP);
 	timeout = USEC_PER_SEC;
 	while (num_other_online_cpus() && timeout--)
 		udelay(1);
 
-	if (num_other_online_cpus())
+	/*
+	 * If CPUs are still online, try an NMI. There's no excuse for this to
+	 * be slow, so we only give them an extra 10 ms to respond.
+	 */
+	if (num_other_online_cpus() && ipi_should_be_nmi(IPI_CPU_STOP_NMI)) {
+		smp_rmb();
+		cpumask_copy(&mask, cpu_online_mask);
+		cpumask_clear_cpu(smp_processor_id(), &mask);
+
+		pr_info("SMP: retry stop with NMI for CPUs %*pbl\n",
+			cpumask_pr_args(&mask));
+
+		smp_cross_call(&mask, IPI_CPU_STOP_NMI);
+		timeout = USEC_PER_MSEC * 10;
+		while (num_other_online_cpus() && timeout--)
+			udelay(1);
+	}
+
+	if (num_other_online_cpus()) {
+		smp_rmb();
+		cpumask_copy(&mask, cpu_online_mask);
+		cpumask_clear_cpu(smp_processor_id(), &mask);
+
 		pr_warn("SMP: failed to stop secondary CPUs %*pbl\n",
-			cpumask_pr_args(cpu_online_mask));
+			cpumask_pr_args(&mask));
+	}
 
+skip_ipi:
 	sdei_mask_local_cpu();
 }
 
 #ifdef CONFIG_KEXEC_CORE
 void crash_smp_send_stop(void)
 {
-	static int cpus_stopped;
-	cpumask_t mask;
-	unsigned long timeout;
-
 	/*
 	 * This function can be called twice in panic path, but obviously
 	 * we execute this only once.
+	 *
+	 * We use this same boolean to tell whether the IPI we send was a
+	 * stop or a "crash stop".
 	 */
-	if (cpus_stopped)
+	if (crash_stop)
 		return;
+	crash_stop = 1;
 
-	cpus_stopped = 1;
+	smp_send_stop();
 
-	/*
-	 * If this cpu is the only one alive at this point in time, online or
-	 * not, there are no stop messages to be sent around, so just back out.
-	 */
-	if (num_other_online_cpus() == 0)
-		goto skip_ipi;
-
-	cpumask_copy(&mask, cpu_online_mask);
-	cpumask_clear_cpu(smp_processor_id(), &mask);
-
-	atomic_set(&waiting_for_crash_ipi, num_other_online_cpus());
-
-	pr_crit("SMP: stopping secondary CPUs\n");
-	smp_cross_call(&mask, IPI_CPU_CRASH_STOP);
-
-	/* Wait up to one second for other CPUs to stop */
-	timeout = USEC_PER_SEC;
-	while ((atomic_read(&waiting_for_crash_ipi) > 0) && timeout--)
-		udelay(1);
-
-	if (atomic_read(&waiting_for_crash_ipi) > 0)
-		pr_warn("SMP: failed to stop secondary CPUs %*pbl\n",
-			cpumask_pr_args(&mask));
-
-skip_ipi:
-	sdei_mask_local_cpu();
 	sdei_handler_abort();
 }
 
 bool smp_crash_stop_failed(void)
 {
-	return (atomic_read(&waiting_for_crash_ipi) > 0);
+	return num_other_online_cpus() != 0;
 }
 #endif
