Skip to content

Conversation

@ycsin
Copy link
Member

@ycsin ycsin commented Nov 7, 2024

The assert validation was added in eefd3da to ensure that _current_cpu is used safely. However, since _current_cpu is used frequently in the kernel/scheduler, especially when SMP is enabled, the assertion has a negative impact on the performance.

This commit introduces a Kconfig to disable just the assert validation of the _current_cpu, without having to disable CONFIG_ASSERT entirely.


Benchmark

Show

latency_measure

newplot
newplot (1)
newplot (2)
newplot (3)
newplot (4)
newplot (5)

scheduler

newplot (6)
newplot (7)
newplot (8)

sys_kernel

newplot (9)
newplot (10)

@ycsin ycsin added the area: SMP Symmetric multiprocessing label Nov 7, 2024
@ycsin
Copy link
Member Author

ycsin commented Nov 7, 2024

cc @srv-meta @akabaev

@ycsin ycsin marked this pull request as ready for review November 7, 2024 04:29
@npitre
Copy link

npitre commented Nov 7, 2024

This is a bit dubious to me. If you care about performance then you should
disable assertions entirely. If you care about catching bugs with assertions
then you shouldn't be selective like this as according to Murphy's law
you'll miss a real bug the moment you start doing this.

@ycsin
Copy link
Member Author

ycsin commented Nov 7, 2024

This is a bit dubious to me. If you care about performance then you should disable assertions entirely. If you care about catching bugs with assertions then you shouldn't be selective like this as according to Murphy's law you'll miss a real bug the moment you start doing this.

I agree with your concern.

But because of the difference this single test makes, and the test is there to make sure that _current_cpu isn't used in the wrong place (I hope testing/static analyzer/compiler catches this instead of runtime), I can't help but wonder if it's worth getting added to the list of CONFIG_SOMETHING_VALIDATE

@ycsin ycsin force-pushed the pr/CONFIG_CURRENT_CPU_VALIDATE branch 2 times, most recently from e2f1b39 to 1bfcf5c Compare November 7, 2024 05:17
@cfriedt
Copy link
Member

cfriedt commented Nov 7, 2024

I agree with @ycsin, but for slightly different reasons. Infrastructure actually depends on general assertions enabled to catch the 1-in-a-million type bugs where it would be physically impossible to reproduce by a human in 100 man-years of time. In those situations, we only have post-mortem analysis. However, production systems also need to be as fast as reasonably possible.

If sufficient testing has been done (both upstream and in pre-production or shadow tiers) to verify that SMP is bug-free (e.g. to some SLA like 99.999% of uptime), then I could see it being pretty useful to want to tune performance and eliminate this hotspot, while keeping general assertions enabled.

Comment on lines 134 to 139
Copy link
Member

@cfriedt cfriedt Nov 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the wording of this could be changed slightly. The general default y should suffice... although I do wish we had some canned phrases for "say y here, unless you really know what you're doing" without sounding condescending.

Suggested change
Use of the _current_cpu pointer cannot be done safely in a preemptible
context. If a thread is preempted and migrates to another CPU, the old
CPU record will be wrong. Add a validation assert to catch incorrect
usages of `_current_cpu`.
*If you do not know what is this, leave this selected.
The _current_cpu pointer should only ever be used in non-preemptible
contexts. When assertions are enabled in SMP systems, this option enables
extra runtime checks to validate _current_cpu usage.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SMP can be enabled even with only 1 core, which is maybe not intuitive. I think it's for the situation with different RTOSes running on different cores.

Suggested change
default y
default y if MP_MAX_NUM_CPUS > 1

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The assertion is compiled when CONFIG_SMP=y, this Kconfig is to simply compile out that assertion completely so I guess the number of CPUs shouldn't matter

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Had another look in the morning and I get what you meant now - the test is necessary only when there's more than 1 CPUs, and can be disabled otherwise.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another thought - maybe we should just change the CONFIG_SMP here to CONFIG_MP_MAX_NUM_CPUS > 1?

#ifdef CONFIG_SMP
/* True if the current context can be preempted and migrated to
* another SMP CPU.
*/
bool z_smp_cpu_mobile(void);
#define _current_cpu ({ __ASSERT_NO_MSG(!z_smp_cpu_mobile()); \
arch_curr_cpu(); })
#define _current k_sched_current_thread_query()
#else
#define _current_cpu (&_kernel.cpus[0])
#define _current _kernel.cpus[0].current
#endif

Copy link

@npitre npitre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Guys, sorry, you have it wrong.

does preemption matters, if arch_curr_cpu() is just a single instruction?

Of course it matters. What is the relevance of the returned value if one
instruction later you are preempted and moved to a different CPU? It is not
about retrieving the value itself, but more about the context in which the
value is used.

Imagine you get the current CPU pointer, then an IRQ happens with a reschedule,
and execution is resumed on another CPU, and the reason you wanted the current
CPU pointer was for getting at the current thread pointer. You'll get a pointer
for the thread that is now running on the CPU that used to be current before
preemption happened, not the new one.

The assertion ensures preemption was disabled. Removing this assertion could
allow very hard to reproduce bugs because you'd need to have the exact timing
for the race to happen, and then the buggy result would be random. So this
could go unnoticed for a long time and impossible to track.

I could see it being pretty useful to want to tune performance and
eliminate this hotspot, while keeping general assertions enabled.

In that case, either you look at what makes the assertion so costly (there
is no fundamental reason why it should be so prominent on the radar),
or you get rid of the hot spot altogether. If this is because of _current
usage then the right fix is to work on _current implementation just like
it is done elsewhere (e.g. dedicate a CPU register to it, store and retrieve
it from the TLS, etc). But weakening a safeguard for a very subtle,
unfrequent but nasty bug is not the way to go.

You'd prefer for compile time static analysis to take care of such issue?
I'm all for that as well. But this should come as a replacement not
just an open wish.

Otherwise, what will be the next hot spot on the radar? Will it justify
yet another assertion special exclusion config option? If it was done once,
it can be done twice. Where would such a trend stop?

@npitre
Copy link

npitre commented Nov 7, 2024

Some observations about the assertion cost:

  • _current_cpu calls z_smp_cpu_mobile() in its assertion test.
    This is the only user. No particular reason why this has to be an out-of-line
    function. Inlining it would help with performance.

  • z_smp_cpu_mobile() determines whether preemption is possible by doing an
    arch_irq_lock(), inspecting the key flag, and then unlocking. This is
    quite a roundabout way of doing things.

  • In addition, arch_is_in_isr() is evaluated since no thread preemption
    is achieved when processing IRQ handlers even if IRQs are not disabled.
    Maybe it should be evaluated only if IRQs are known to be disabled?

  • And to top it off, arch_is_in_isr() for RISC-V goes like this:

        unsigned int key = arch_irq_lock();
        bool ret = arch_curr_cpu()->nested != 0U;
        arch_irq_unlock(key);
        return ret;

That makes for a lot of IRQ locking/unlocking. The whole thing could be
optimized with an arch_is_preemptible() implementation. On RISC-V, if this
weren't for PMP support, we could potentially leverage MSTATUS.MPP for that
purpose and turn this into a single inline asm statement with a test.

One question comes to mind: is there any _current_cpu usage that always
happens in IRQ handler context where IRQs are not disabled around it?
If not then the requirement for the assertion could be simplified to
"IRQs must always be disabled" or !arch_irq_enabled(). And on RISC-V
this would look like:

static inline bool arch_is_irq_enabled(void)
{
        unsigned long status;

        inline asm ("csrr %0, mstatus" : "=r" (status));
        return (status & MSTATUS_IEN) != 0;
}

That's 3 instructions if you count the CSRR, the test and the branch for
the assertion. Hardly a noticeable overhead anymore.

This kind of analysis is needed before dropping safeguards is considered.

@ycsin
Copy link
Member Author

ycsin commented Nov 8, 2024

Of course it matters. What is the relevance of the returned value if one instruction later you are preempted and moved to a different CPU? It is not about retrieving the value itself, but more about the context in which the value is used.

Imagine you get the current CPU pointer, then an IRQ happens with a reschedule, and execution is resumed on another CPU, and the reason you wanted the current CPU pointer was for getting at the current thread pointer. You'll get a pointer for the thread that is now running on the CPU that used to be current before preemption happened, not the new one.

The assertion ensures preemption was disabled. Removing this assertion could
allow very hard to reproduce bugs because you'd need to have the exact timing
for the race to happen, and then the buggy result would be random. So this
could go unnoticed for a long time and impossible to track.

AFAICT, the assertion only guarantees that the context isn't preemptible at the very instant where _current_cpu is used, which is crucial if arch_curr_cpu() is more than one instruction and we do not want it to be preempted.

However, it can't guarantee the relevance of curr_cpu_id to the CPU that execute that line for the following example:

_cpu_t *curr_cpu;
int key;

key = arch_irq_lock();
...
curr_cpu = _current_cpu;
...
arch_irq_unlock(key);
...
int curr_cpu_id = curr_cpu->id;
...

I'm trying to make the point that the assertion test is not going to eliminate issues arising from preemption, instead it helps to make sure that the context is not preemptible as it is called, reducing racy issues is an indirect effect. But this is more like a code path/branch problem that can/should be eliminated early on with sufficient (coverage) tests, after that it should be safe to disable that runtime test to boost performance.

This PR isn't really advocating patches to randomly disable runtime tests, but provide a mean to disable this usage-related assertion test in runtime after sufficient testing are done.

This kind of analysis is needed before dropping safeguards is considered.

I must admit that I didn't try to make the test less painful, but IMHO these are two different topics that do not conflict - one tries to make the test lighter, so that less people notice it; the other one provides a choice to disable the test altogether, if it is not required anymore.

That said, this PR should be merged only if we agree that this runtime assertion test is there to ensure the correctness of usage (codepath-related) because we don't have an easy way to guarantee that otherwise. But sufficient coverage testing should render it redundant and therefore safe to be disabled.

The assert validation was added in eefd3da to ensure that
`_current_cpu` is used safely. However, since `_current_cpu`
is used frequently in the kernel/scheduler, especially when
SMP is enabled, the assertion has a negative impact on the
performance.

This commit introduces a Kconfig to disable just the assert
validation of the `_current_cpu`, without having to disable
`CONFIG_ASSERT` entirely.

Signed-off-by: Yong Cong Sin <[email protected]>
@ycsin ycsin force-pushed the pr/CONFIG_CURRENT_CPU_VALIDATE branch from 1bfcf5c to 0d18407 Compare November 8, 2024 04:50
Copy link
Contributor

@andyross andyross left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO this is too complicated. "_current_cpu" is a kernel-only API. It has a validation layer around it because messing up and calling arch_curr_cpu() in contexts that can migrate is an easy mistake to make. And early in the SMP era this found real bugs.

But this code is all mature now. I don't see why we shouldn't just update all the users to use the underlying arch API directly.

Alternative framing of the argument: we had an API, it could be used dangerously so we wrapped it in a macro. But that macro has costs, so we now wrap it in a kconfig. Just undo both layers of wrapping, and if someone really wants this feature back let's let put it back as a single kconfig in the arch_curr_cpu() API.

@ycsin
Copy link
Member Author

ycsin commented Nov 14, 2024

Just undo both layers of wrapping, and if someone really wants this feature back let's let put it back as a single kconfig in the arch_curr_cpu() API.

The z_smp_cpu_mobile() calls arch_is_in_isr() which then calls arch_curr_proc() in most of the implementation, placing z_smp_cpu_mobile() inside arch_curr_proc() causes infinite recursion (#81322). Unwrapping requires us to copy the implementation of arch_is_in_isr() into arch_curr_proc(), and pass its output into z_smp_cpu_mobile(bool nested).

@github-actions
Copy link

This pull request has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed otherwise this pull request will automatically be closed in 14 days. Note, that you can always re-open a closed pull request at any time.

@github-actions github-actions bot added the Stale label Jan 14, 2025
@github-actions github-actions bot closed this Jan 28, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants