dcpleung
Member

@dcpleung dcpleung commented Oct 3, 2025

This allows adding the CPU ID to the number of NOPs in the custom arch_spin_relax(). With the same number of NOPs on all CPUs, they can all end up doing RCW transactions at the same time, over and over again, if they enter and exit the spin relax loop at the same time. This behavior has been observed under heavy context switching, such as in the SMP switching stress test. So add a new Kconfig option to fine-tune the relax loop behavior if needed. The new option adds the CPU ID to the number of NOPs, which introduces a minimal per-CPU offset to work around the situation described above.
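To illustrate why identical NOP counts can keep CPUs in lockstep, here is a minimal host-side sketch. It is a toy model and not the SoC code: it assumes each CPU retries its RCW transaction at a fixed period (in abstract ticks) starting from tick 0, and the helper name `count_contended_ticks()` and the period values are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Toy model (an assumption, not the real hardware): CPU i retries its
 * RCW transaction every period[i] ticks, starting from tick 0.
 * Returns how many ticks in 1..horizon see two or more CPUs retrying.
 */
static unsigned int count_contended_ticks(const uint32_t *period,
					  unsigned int num_cpus,
					  uint32_t horizon)
{
	unsigned int contended = 0;

	for (uint32_t t = 1; t <= horizon; t++) {
		unsigned int retrying = 0;

		for (unsigned int i = 0; i < num_cpus; i++) {
			assert(period[i] > 0);
			if (t % period[i] == 0) {
				retrying++;
			}
		}

		if (retrying >= 2) {
			contended++;
		}
	}

	return contended;
}
```

In this model, four CPUs that all use a period of 32 collide on every retry (31 contended ticks in the first 1000), while periods of 32, 33, 34 and 35 (a base of 32 plus the CPU ID) collide only once, at tick 544. Real bus timing is more complicated, so this only shows the direction of the effect.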

@dcpleung dcpleung linked an issue Oct 3, 2025 that may be closed by this pull request
1 task
@dcpleung
Member Author

dcpleung commented Oct 3, 2025

The compliance check fails because of spaces on the second line of an #if.

@dcpleung dcpleung marked this pull request as ready for review October 3, 2025 22:12
@zephyrbot zephyrbot added the labels area: Boards/SoCs, area: Tests, platform: Intel ADSP, and area: Kernel on Oct 3, 2025
register uint32_t remaining = CONFIG_SOC_SERIES_INTEL_ADSP_ACE_NUM_SPIN_RELAX_NOPS;

#if defined(CONFIG_SOC_SERIES_INTEL_ADSP_ACE_NUM_SPIN_RELAX_NOPS_ADD_CPU_ID)
remaining += arch_proc_id();
Contributor

Without going too deeply into this: is this enough? The variable is decremented by 4, so maybe arch_proc_id() * 4?

Member Author

I thought about it, but always adding 20 NOPs on CPU #5 seems like too much.
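For reference, the arithmetic behind the two options can be sketched as below. This is only an illustration under stated assumptions: the helper names `relax_nops()` and `relax_iterations()` are hypothetical, the base of 32 NOPs used in the note is an example value, and the loop is modeled as consuming `step` NOPs per iteration with rounding up, which may not match the real loop.

```c
#include <assert.h>
#include <stdint.h>

/* Total NOPs for a CPU: base count plus a scaled CPU ID offset. */
static uint32_t relax_nops(uint32_t base, uint32_t cpu_id, uint32_t scale)
{
	return base + cpu_id * scale;
}

/*
 * Iterations needed if each loop pass consumes `step` NOPs.
 * Rounding up is an assumption of this sketch.
 */
static uint32_t relax_iterations(uint32_t nops, uint32_t step)
{
	assert(step > 0);
	return (nops + step - 1) / step;
}
```

With an example base of 32, `relax_nops(32, 5, 1)` gives 37 and `relax_nops(32, 5, 4)` gives 52, which is the extra 20 NOPs on CPU #5 mentioned above. In this model with a step of 4, totals of 33 through 36 all need 9 iterations, while 32 needs 8.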

nashif previously approved these changes Oct 7, 2025
peter-mitsis previously approved these changes Oct 7, 2025
This allows adding the CPU ID to the number of NOPs in
the custom arch_spin_relax(). With the same number of NOPs
on all CPUs, they can all end up doing RCW transactions
at the same time, over and over again, if they enter and
exit the spin relax loop at the same time. This behavior
has been observed under heavy context switching, such as
in the SMP switching stress test. So add a new Kconfig
option to fine-tune the relax loop behavior if needed.
The new option adds the CPU ID to the number of NOPs,
which introduces a minimal per-CPU offset to work around
the situation described above.

Signed-off-by: Daniel Leung <[email protected]>
It has been observed that, during the switching stress test,
context switching becomes very slow when enough CPUs are
doing RCW transactions on the hardware bus (e.g. spin locks).
Since the number of NOPs is the same for all CPUs, they
simply enter and exit the relax loop at the same time,
and hit the bus with RCW transactions at the same time.
This is not exactly a deadlock, but it slows down execution
enough for the test to time out. The SoC layer has
added an option to offset the number of NOPs per CPU by
adding the CPU ID to the number of NOPs. So enable it for
the Intel ADSP boards to work around the above mentioned
issue. I haven't encountered another slowdown after turning
on this option. So hopefully this lowers the probability of
that happening such that a simple retry can pass the test.

Signed-off-by: Daniel Leung <[email protected]>
@dcpleung dcpleung dismissed stale reviews from peter-mitsis and nashif via bcc3c19 October 8, 2025 20:31
@dcpleung dcpleung force-pushed the intel_adsp/more_relax_nops branch from 4bafba1 to bcc3c19 Compare October 8, 2025 20:31
@dcpleung
Member Author

dcpleung commented Oct 8, 2025

Just to make checkpatch happy.



Development

Successfully merging this pull request may close these issues.

ace20_lnl: kernel.multiprocessing.smp test fail
