kernel/smp: add a Kconfig to disable validation of `_current_cpu` #81042

ycsin · 2024-11-07T04:27:34Z

The assert validation was added in eefd3da to ensure that _current_cpu is used safely. However, since _current_cpu is used frequently in the kernel/scheduler, especially when SMP is enabled, the assertion has a negative impact on the performance.

This commit introduces a Kconfig to disable just the assert validation of the _current_cpu, without having to disable CONFIG_ASSERT entirely.

Benchmark

Show

latency_measure

scheduler

sys_kernel

ycsin · 2024-11-07T04:29:44Z

cc @srv-meta @akabaev

include/zephyr/kernel_structs.h

npitre · 2024-11-07T04:39:30Z

This is a bit dubious to me. If you care about performance then you should
disable assertions entirely. If you care about catching bugs with assertions
then you shouldn't be selective like this as according to Murphy's law
you'll miss a real bug the moment you start doing this.

ycsin · 2024-11-07T05:11:54Z

This is a bit dubious to me. If you care about performance then you should disable assertions entirely. If you care about catching bugs with assertions then you shouldn't be selective like this as according to Murphy's law you'll miss a real bug the moment you start doing this.

I agree with your concern.

But because of the difference this single test makes, and the test is there to make sure that _current_cpu isn't used in the wrong place (I hope testing/static analyzer/compiler catches this instead of runtime), I can't help but wonder if it's worth getting added to the list of CONFIG_SOMETHING_VALIDATE

cfriedt · 2024-11-07T12:30:32Z

I agree with @ycsin, but for slightly different reasons. Infrastructure actually depends on general assertions enabled to catch the 1-in-a-million type bugs where it would be physically impossible to reproduce by a human in 100 man-years of time. In those situations, we only have post-mortem analysis. However, production systems also need to be as fast as reasonably possible.

If sufficient testing has been done (both upstream and in pre-production or shadow tiers) to verify that SMP is bug-free (e.g. to some SLA like 99.999% of uptime), then I could see it being pretty useful to want to tune performance and eliminate this hotspot, while keeping general assertions enabled.

cfriedt · 2024-11-07T12:37:43Z

kernel/Kconfig.smp

I think the wording of this could be changed slightly. The general default y should suffice... although I do wish we had some canned phrases for "say y here, unless you really know what you're doing" without sounding condescending.

Suggested change

Use of the _current_cpu pointer cannot be done safely in a preemptible

context. If a thread is preempted and migrates to another CPU, the old

CPU record will be wrong. Add a validation assert to catch incorrect

usages of `_current_cpu`.

*If you do not know what is this, leave this selected.

The _current_cpu pointer should only ever be used in non-preemptible

contexts. When assertions are enabled in SMP systems, this option enables

extra runtime checks to validate _current_cpu usage.

cfriedt · 2024-11-07T12:40:08Z

kernel/Kconfig.smp

SMP can be enabled even with only 1 core, which is maybe not intuitive. I think it's for the situation with different RTOSes running on different cores.

Suggested change

default y

default y if MP_MAX_NUM_CPUS > 1

The assertion is compiled when CONFIG_SMP=y, this Kconfig is to simply compile out that assertion completely so I guess the number of CPUs shouldn't matter

Had another look in the morning and I get what you meant now - the test is necessary only when there's more than 1 CPUs, and can be disabled otherwise.

Another thought - maybe we should just change the CONFIG_SMP here to CONFIG_MP_MAX_NUM_CPUS > 1?

zephyr/include/zephyr/kernel_structs.h

Lines 254 to 268 in 6798064

#ifdef CONFIG_SMP

/* True if the current context can be preempted and migrated to

* another SMP CPU.

*/

bool z_smp_cpu_mobile(void);

#define _current_cpu ({ __ASSERT_NO_MSG(!z_smp_cpu_mobile()); \

arch_curr_cpu(); })

#define _current k_sched_current_thread_query()

#else

#define _current_cpu (&_kernel.cpus[0])

#define _current _kernel.cpus[0].current

#endif

include/zephyr/kernel_structs.h

npitre

Guys, sorry, you have it wrong.

does preemption matters, if arch_curr_cpu() is just a single instruction?

Of course it matters. What is the relevance of the returned value if one
instruction later you are preempted and moved to a different CPU? It is not
about retrieving the value itself, but more about the context in which the
value is used.

Imagine you get the current CPU pointer, then an IRQ happens with a reschedule,
and execution is resumed on another CPU, and the reason you wanted the current
CPU pointer was for getting at the current thread pointer. You'll get a pointer
for the thread that is now running on the CPU that used to be current before
preemption happened, not the new one.

The assertion ensures preemption was disabled. Removing this assertion could
allow very hard to reproduce bugs because you'd need to have the exact timing
for the race to happen, and then the buggy result would be random. So this
could go unnoticed for a long time and impossible to track.

I could see it being pretty useful to want to tune performance and
eliminate this hotspot, while keeping general assertions enabled.

In that case, either you look at what makes the assertion so costly (there
is no fundamental reason why it should be so prominent on the radar),
or you get rid of the hot spot altogether. If this is because of _current
usage then the right fix is to work on _current implementation just like
it is done elsewhere (e.g. dedicate a CPU register to it, store and retrieve
it from the TLS, etc). But weakening a safeguard for a very subtle,
unfrequent but nasty bug is not the way to go.

You'd prefer for compile time static analysis to take care of such issue?
I'm all for that as well. But this should come as a replacement not
just an open wish.

Otherwise, what will be the next hot spot on the radar? Will it justify
yet another assertion special exclusion config option? If it was done once,
it can be done twice. Where would such a trend stop?

npitre · 2024-11-07T20:12:56Z

Some observations about the assertion cost:

_current_cpu calls z_smp_cpu_mobile() in its assertion test.
This is the only user. No particular reason why this has to be an out-of-line
function. Inlining it would help with performance.
z_smp_cpu_mobile() determines whether preemption is possible by doing an
arch_irq_lock(), inspecting the key flag, and then unlocking. This is
quite a roundabout way of doing things.
In addition, arch_is_in_isr() is evaluated since no thread preemption
is achieved when processing IRQ handlers even if IRQs are not disabled.
Maybe it should be evaluated only if IRQs are known to be disabled?
And to top it off, arch_is_in_isr() for RISC-V goes like this:

        unsigned int key = arch_irq_lock();
        bool ret = arch_curr_cpu()->nested != 0U;
        arch_irq_unlock(key);
        return ret;

That makes for a lot of IRQ locking/unlocking. The whole thing could be
optimized with an arch_is_preemptible() implementation. On RISC-V, if this
weren't for PMP support, we could potentially leverage MSTATUS.MPP for that
purpose and turn this into a single inline asm statement with a test.

One question comes to mind: is there any _current_cpu usage that always
happens in IRQ handler context where IRQs are not disabled around it?
If not then the requirement for the assertion could be simplified to
"IRQs must always be disabled" or !arch_irq_enabled(). And on RISC-V
this would look like:

static inline bool arch_is_irq_enabled(void)
{
        unsigned long status;

        inline asm ("csrr %0, mstatus" : "=r" (status));
        return (status & MSTATUS_IEN) != 0;
}

That's 3 instructions if you count the CSRR, the test and the branch for
the assertion. Hardly a noticeable overhead anymore.

This kind of analysis is needed before dropping safeguards is considered.

ycsin · 2024-11-08T04:38:43Z

Of course it matters. What is the relevance of the returned value if one instruction later you are preempted and moved to a different CPU? It is not about retrieving the value itself, but more about the context in which the value is used.

Imagine you get the current CPU pointer, then an IRQ happens with a reschedule, and execution is resumed on another CPU, and the reason you wanted the current CPU pointer was for getting at the current thread pointer. You'll get a pointer for the thread that is now running on the CPU that used to be current before preemption happened, not the new one.

The assertion ensures preemption was disabled. Removing this assertion could
allow very hard to reproduce bugs because you'd need to have the exact timing
for the race to happen, and then the buggy result would be random. So this
could go unnoticed for a long time and impossible to track.

AFAICT, the assertion only guarantees that the context isn't preemptible at the very instant where _current_cpu is used, which is crucial if arch_curr_cpu() is more than one instruction and we do not want it to be preempted.

However, it can't guarantee the relevance of curr_cpu_id to the CPU that execute that line for the following example:

_cpu_t *curr_cpu;
int key;

key = arch_irq_lock();
...
curr_cpu = _current_cpu;
...
arch_irq_unlock(key);
...
int curr_cpu_id = curr_cpu->id;
...

I'm trying to make the point that the assertion test is not going to eliminate issues arising from preemption, instead it helps to make sure that the context is not preemptible as it is called, reducing racy issues is an indirect effect. But this is more like a code path/branch problem that can/should be eliminated early on with sufficient (coverage) tests, after that it should be safe to disable that runtime test to boost performance.

This PR isn't really advocating patches to randomly disable runtime tests, but provide a mean to disable this usage-related assertion test in runtime after sufficient testing are done.

This kind of analysis is needed before dropping safeguards is considered.

I must admit that I didn't try to make the test less painful, but IMHO these are two different topics that do not conflict - one tries to make the test lighter, so that less people notice it; the other one provides a choice to disable the test altogether, if it is not required anymore.

That said, this PR should be merged only if we agree that this runtime assertion test is there to ensure the correctness of usage (codepath-related) because we don't have an easy way to guarantee that otherwise. But sufficient coverage testing should render it redundant and therefore safe to be disabled.

The assert validation was added in eefd3da to ensure that `_current_cpu` is used safely. However, since `_current_cpu` is used frequently in the kernel/scheduler, especially when SMP is enabled, the assertion has a negative impact on the performance. This commit introduces a Kconfig to disable just the assert validation of the `_current_cpu`, without having to disable `CONFIG_ASSERT` entirely. Signed-off-by: Yong Cong Sin <[email protected]>

andyross

IMHO this is too complicated. "_current_cpu" is a kernel-only API. It has a validation layer around it because messing up and calling arch_curr_cpu() in contexts that can migrate is an easy mistake to make. And early in the SMP era this found real bugs.

But this code is all mature now. I don't see why we shouldn't just update all the users to use the underlying arch API directly.

Alternative framing of the argument: we had an API, it could be used dangerously so we wrapped it in a macro. But that macro has costs, so we now wrap it in a kconfig. Just undo both layers of wrapping, and if someone really wants this feature back let's let put it back as a single kconfig in the arch_curr_cpu() API.

ycsin · 2024-11-14T03:18:04Z

Just undo both layers of wrapping, and if someone really wants this feature back let's let put it back as a single kconfig in the arch_curr_cpu() API.

The z_smp_cpu_mobile() calls arch_is_in_isr() which then calls arch_curr_proc() in most of the implementation, placing z_smp_cpu_mobile() inside arch_curr_proc() causes infinite recursion (#81322). Unwrapping requires us to copy the implementation of arch_is_in_isr() into arch_curr_proc(), and pass its output into z_smp_cpu_mobile(bool nested).

github-actions · 2025-01-14T00:32:13Z

This pull request has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed otherwise this pull request will automatically be closed in 14 days. Note, that you can always re-open a closed pull request at any time.

ycsin added the area: Architectures label Nov 7, 2024

ycsin requested review from andyross, cfriedt, dcpleung, luchnikov, nashif and peter-mitsis November 7, 2024 04:28

ycsin added the area: SMP Symmetric multiprocessing label Nov 7, 2024

ycsin marked this pull request as ready for review November 7, 2024 04:29

zephyrbot added the area: Kernel label Nov 7, 2024

zephyrbot requested review from TaiJuWu, ceolin and npitre November 7, 2024 04:30

zephyrbot assigned andyross and peter-mitsis Nov 7, 2024

ycsin commented Nov 7, 2024

View reviewed changes

include/zephyr/kernel_structs.h Outdated Show resolved Hide resolved

ycsin force-pushed the pr/CONFIG_CURRENT_CPU_VALIDATE branch 2 times, most recently from e2f1b39 to 1bfcf5c Compare November 7, 2024 05:17

cfriedt reviewed Nov 7, 2024

View reviewed changes

ycsin commented Nov 7, 2024

View reviewed changes

include/zephyr/kernel_structs.h Outdated Show resolved Hide resolved

npitre requested changes Nov 7, 2024

View reviewed changes

ycsin force-pushed the pr/CONFIG_CURRENT_CPU_VALIDATE branch from 1bfcf5c to 0d18407 Compare November 8, 2024 04:50

andyross requested changes Nov 12, 2024

View reviewed changes

github-actions bot added the Stale label Jan 14, 2025

github-actions bot closed this Jan 28, 2025

-	  Use of the _current_cpu pointer cannot be done safely in a preemptible
-	  context. If a thread is preempted and migrates to another CPU, the old
-	  CPU record will be wrong. Add a validation assert to catch incorrect
-	  usages of `_current_cpu`.
-	  *If you do not know what is this, leave this selected.
+	  The _current_cpu pointer should only ever be used in non-preemptible
+	  contexts. When assertions are enabled in SMP systems, this option enables
+	  extra runtime checks to validate _current_cpu usage.

	#ifdef CONFIG_SMP

	/* True if the current context can be preempted and migrated to
	* another SMP CPU.
	*/
	bool z_smp_cpu_mobile(void);

	#define _current_cpu ({ __ASSERT_NO_MSG(!z_smp_cpu_mobile()); \
	arch_curr_cpu(); })
	#define _current k_sched_current_thread_query()

	#else
	#define _current_cpu (&_kernel.cpus[0])
	#define _current _kernel.cpus[0].current
	#endif

kernel/smp: add a Kconfig to disable validation of _current_cpu #81042

kernel/smp: add a Kconfig to disable validation of _current_cpu #81042

Uh oh!

Conversation

ycsin commented Nov 7, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Benchmark

latency_measure

scheduler

sys_kernel

Uh oh!

ycsin commented Nov 7, 2024

Uh oh!

Uh oh!

npitre commented Nov 7, 2024

Uh oh!

ycsin commented Nov 7, 2024

Uh oh!

cfriedt commented Nov 7, 2024

Uh oh!

cfriedt Nov 7, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

cfriedt Nov 7, 2024

Choose a reason for hiding this comment

Uh oh!

ycsin Nov 7, 2024

Choose a reason for hiding this comment

Uh oh!

ycsin Nov 8, 2024

Choose a reason for hiding this comment

Uh oh!

ycsin Nov 8, 2024

Choose a reason for hiding this comment

Uh oh!

Uh oh!

npitre left a comment

Choose a reason for hiding this comment

Uh oh!

npitre commented Nov 7, 2024

Uh oh!

ycsin commented Nov 8, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

andyross left a comment

Choose a reason for hiding this comment

Uh oh!

ycsin commented Nov 14, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions bot commented Jan 14, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

6 participants

kernel/smp: add a Kconfig to disable validation of `_current_cpu` #81042

kernel/smp: add a Kconfig to disable validation of `_current_cpu` #81042

ycsin commented Nov 7, 2024 •

edited

Loading

cfriedt Nov 7, 2024 •

edited

Loading

ycsin commented Nov 8, 2024 •

edited

Loading

ycsin commented Nov 14, 2024 •

edited

Loading