soc: intel_adsp/ace: allows more spin relax loop per CPU #96989
base: main
Compliance check fails due to having spaces on the second line of a
	register uint32_t remaining = CONFIG_SOC_SERIES_INTEL_ADSP_ACE_NUM_SPIN_RELAX_NOPS;

#if defined(CONFIG_SOC_SERIES_INTEL_ADSP_ACE_NUM_SPIN_RELAX_NOPS_ADD_CPU_ID)
	remaining += arch_proc_id();
Without going too deeply into this: is this enough? The variable is decremented by 4, so maybe arch_proc_id() * 4?
I thought about it, though I feel like always adding 20 NOPs on CPU #5 seems too much.
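The trade-off in this exchange can be made concrete with a little arithmetic. The following is a minimal sketch, assuming a hypothetical loop that retires 4 NOPs per pass and counts a final partial pass as a full one; `relax_passes`, `NOPS_PER_PASS`, and the `scale` parameter are illustrative names, not taken from the PR, and the real loop may handle leftover NOPs differently.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: every pass of the relax loop retires 4 NOPs and
 * subtracts 4 from the counter. A final partial pass counts as one pass.
 */
#define NOPS_PER_PASS 4U

/* Number of loop passes a CPU makes for a given base NOP count.
 * scale == 1 models "remaining += arch_proc_id()",
 * scale == 4 models the suggested "arch_proc_id() * 4".
 */
static uint32_t relax_passes(uint32_t base_nops, uint32_t cpu_id, uint32_t scale)
{
	assert(scale > 0U);

	uint32_t remaining = base_nops + cpu_id * scale;

	/* Round up so that a partial pass is still counted. */
	return (remaining + NOPS_PER_PASS - 1U) / NOPS_PER_PASS;
}
```

Under this model, with an illustrative base of 32 NOPs, a scale of 1 gives CPUs 1 through 4 the same number of passes, while a scale of 4 gives every CPU its own pass count at the cost of 20 extra NOPs on CPU #5. If the actual loop retires leftover NOPs one at a time, a plain CPU ID offset would already separate the CPUs by one NOP each.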
This allows adding the CPU ID to the number of NOPs in the custom arch_spin_relax(). With the same number of NOPs on every CPU, all of them can end up doing RCW transactions at the same time, over and over again, if they enter and exit the spin relax loop in lockstep. This behavior has been observed when doing lots of context switching, as in the SMP switching stress test.

So add a new Kconfig option to fine-tune the relax loop behavior if needed. The new option adds the CPU ID to the number of NOPs, which introduces a minimal per-CPU offset to work around the situation described above.

Signed-off-by: Daniel Leung <[email protected]>
It has been observed that, during the switching stress test, context switching becomes very slow when enough CPUs are doing RCW transactions on the hardware bus (e.g. for spin locks). Since the number of NOPs is the same for all CPUs, they simply enter and exit the relax loop at the same time, and hit the bus with RCW transactions at the same time. This is not exactly a deadlock, but it slows down execution enough for the test to time out.

The SoC layer has added an option to offset the number of NOPs per CPU by adding the CPU ID to the number of NOPs. So enable it for the Intel ADSP boards to work around the issue described above. I haven't encountered another slowdown after turning on this option, so hopefully this lowers the probability of it happening enough that a simple retry can pass the test.

Signed-off-by: Daniel Leung <[email protected]>
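The lockstep effect described in these commit messages can be illustrated with a toy model. This is only a sketch under strong assumptions: every CPU starts at tick 0, attempts one bus transaction, relaxes for its NOP count, and repeats, with no jitter or arbitration. `count_collisions` and all of its parameters are hypothetical names for illustration, not part of Zephyr.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model: CPU n attempts a bus transaction every `period` ticks, where
 * period is the relax NOP count (plus the CPU ID when add_cpu_id is set).
 * Returns how many ticks in [0, horizon) see two or more CPUs attempting
 * a transaction at once.
 */
static uint32_t count_collisions(uint32_t base_nops, uint32_t num_cpus,
				 int add_cpu_id, uint32_t horizon)
{
	uint32_t collisions = 0U;

	assert(base_nops > 0U && num_cpus > 0U);

	for (uint32_t t = 0U; t < horizon; t++) {
		uint32_t attempts = 0U;

		for (uint32_t cpu = 0U; cpu < num_cpus; cpu++) {
			/* Per-CPU offset models the new Kconfig option. */
			uint32_t period = base_nops + (add_cpu_id ? cpu : 0U);

			if (t % period == 0U) {
				attempts++;
			}
		}

		if (attempts > 1U) {
			collisions++;
		}
	}

	return collisions;
}
```

In this model, identical periods make the CPUs collide on every single attempt, while a one-NOP difference per CPU makes them drift apart and meet again only at common multiples of their periods. Real hardware is noisier than this, so the numbers are only meant to show the direction of the effect.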
4bafba1 to bcc3c19
Just to make checkpatch happy.