Harmonize entry points #17

robamu · 2025-03-25T10:48:49Z

One thing I am not sure about is how best to determine the CPU ID. The MPIDR register is implementation defined, and the individual affinity fields might have completely different meanings depending on the actual implementation (https://developer.arm.com/documentation/ddi0406/b/System-Level-Architecture/Virtual-Memory-System-Architecture--VMSA-/CP15-registers-for-a-VMSA-implementation/c0--Multiprocessor-Affinity-Register--MPIDR-?lang=en#CIHIEGCJ). The previous implementation in the Cortex-A run-time, which reads the least significant 2 bits of affinity 0, was just the Zynq7000-specific interpretation of the MPIDR register. I replaced this with reading all the bits in affinity 0 as well.

Should we still keep this implementation, where we assume that a value of 0 in the affinity 0 field means CPU 0?
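
For reference, the two masks being compared can be written out as plain bit operations. This is only a host-side sketch in Rust, not the assembly from the PR: the names `Affinity`, `decode_mpidr`, `cpu_id_zynq7000` and `cpu_id_aff0` are made up for illustration, and the field positions assume the 32-bit MPIDR layout (affinity 0 in bits [7:0], affinity 1 in bits [15:8], affinity 2 in bits [23:16]).

```rust
/// Affinity fields of a 32-bit MPIDR value.
/// Hypothetical helper type, not part of cortex-r-rt or cortex-a-rt.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Affinity {
    pub aff0: u8, // bits [7:0]
    pub aff1: u8, // bits [15:8]
    pub aff2: u8, // bits [23:16]
}

/// Split a raw MPIDR value into its three affinity fields.
pub fn decode_mpidr(mpidr: u32) -> Affinity {
    Affinity {
        aff0: (mpidr & 0xFF) as u8,
        aff1: ((mpidr >> 8) & 0xFF) as u8,
        aff2: ((mpidr >> 16) & 0xFF) as u8,
    }
}

/// Previous Cortex-A behaviour: only the two least significant bits.
pub fn cpu_id_zynq7000(mpidr: u32) -> u32 {
    mpidr & 0x3
}

/// Behaviour proposed here: all eight bits of affinity level 0.
pub fn cpu_id_aff0(mpidr: u32) -> u32 {
    mpidr & 0xFF
}
```

With these definitions, a raw value such as `0x8000_0004` yields 0 from the 2-bit mask but 4 from the 8-bit mask, which is the kind of mismatch the wider mask is meant to avoid. Whether affinity 0 equal to zero really identifies the boot core still depends on the implementation.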

jonathanpallant · 2025-03-25T14:26:32Z

cortex-r-rt/src/lib.rs

    _default_start:
+        // only allow cpu0 through for initialization
+        // Read MPIDR
+        mrc	    p15,0,r1,c0,c0,5


I didn't write it down, and there's no tool to help, but I've been trying to format the assembly like:

    some_label:
        bl      thing
        cmp     r1, #0  <- comma and space between arguments
        ^^^^^^^^ <- eight characters, with spaces for padding
    ^^^^ <- four spaces

jonathanpallant · 2025-03-25T14:29:25Z

cortex-r-rt/src/lib.rs

+    wait_loop:
+        wfe
+        // When Core 0 emits a SEV, the other cores will wake up.
+        // Load CPU ID, we are CPU0


You are not CPU 0, because if you were CPU 0 you'd skip this loop and go straight to initialize?

jonathanpallant · 2025-03-25T14:29:49Z

cortex-r-rt/src/lib.rs

+    wait_loop:
+        wfe
+        // When Core 0 emits a SEV, the other cores will wake up.
+        // Load CPU ID, we are CPU0


As previous comment

jonathanpallant · 2025-03-25T14:30:34Z

It may also be better to use numeric labels like 1 and branch targets like 1f (go forward to the label 1) or 1b (go backward to the label 1), because then you always get a unique symbol name generated. I worry that a label like initialize might be visible to the linker and clash with any other C / no_mangle function of the same name.

jonathanpallant · 2025-03-25T14:34:32Z

cortex-r-rt/src/lib.rs

+        mrc	    p15,0,r1,c0,c0,5
+        // Extract CPU ID bits by reading affinity level 0.
+        // For single-core systems, this should always be 0.
+        and	    r1, r1, #0xFF


Maybe we should check the bottom 16 bits are zero (affinity level 0 and affinity level 1)?

jonathanpallant · 2025-03-25T14:35:58Z

cortex-r-rt/src/lib.rs

+        // Load CPU ID, we are CPU0
+        mrc	    p15,0,r0,c0,c0,5
+        // Extract CPU ID bits.
+        and	    r0, r0, #0x3


https://developer.arm.com/documentation/100026/0104/System-Control/AArch32-register-descriptions/Multiprocessor-Affinity-Register?lang=en suggests the bottom 24 bits are important. But actually we might be running on a cluster that isn't cluster 0, so I would select maybe the bottom 16 bits?

robamu · 2025-03-26T09:30:30Z

I could check out the datasheets of some other Cortex-A devices about MPIDR usage, e.g. STM32MP, or Ultrascale+

jonathanpallant · 2025-03-26T09:59:39Z

> I could check out the datasheets of some other Cortex-A devices about MPIDR usage, e.g. STM32MP, or Ultrascale+

That would be great. I think it's OK to have a default for 80% of the people, if there's a clear mechanism to override the default for specific use-cases. Like, 'Hey, if MPIDR is 0x4000_0100 then that's my default boot core - spin all the others'. I don't know how to plug user-specified constants into the runtime though. Environment variables? Eww.
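
One possible shape for such an override is a mask/value pair that the start-up code compares against MPIDR. The Rust sketch below is purely illustrative: `BootCoreMatch` and its fields are hypothetical names, and it deliberately leaves open the unresolved question of how a user would pass the constants to the runtime.

```rust
/// Hypothetical description of which core should run initialization:
/// a core is the boot core when `(mpidr & mask) == value`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct BootCoreMatch {
    pub mask: u32,
    pub value: u32,
}

impl BootCoreMatch {
    /// Default discussed in this thread: affinity level 0 equal to zero.
    pub const DEFAULT: Self = Self { mask: 0xFF, value: 0 };

    /// Returns true if the given raw MPIDR value identifies the boot core.
    pub const fn is_boot_core(&self, mpidr: u32) -> bool {
        (mpidr & self.mask) == self.value
    }
}
```

The "MPIDR is 0x4000_0100" case would then be `BootCoreMatch { mask: 0xFFFF_FFFF, value: 0x4000_0100 }`, while the 16-bit variant suggested in the review would be `BootCoreMatch { mask: 0xFFFF, value: 0 }`.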

robamu · 2025-03-26T16:59:26Z

There is

https://github.com/ARM-software/arm-trusted-firmware/blob/master/include/arch/aarch32/arch.h where the CPU Mask is 0xFF.
https://github.com/ARM-software/arm-trusted-firmware/blob/master/include/arch/aarch64/arch.h where the CPU Mask is 0xFF.
Cortex-A53 manual: https://developer.arm.com/documentation/ddi0500/j/?lang=en p. 83 where the CPU ID is also the last byte.

I have not found a system with clusters yet, but wouldn't each cluster most probably run different software anyway?

jonathanpallant · 2025-03-26T18:35:13Z

I was thinking the bottom byte might be for hyper threading (then cores, then clusters), but now I think about it, Arm doesn't do that. So the bottom byte is probably fine then.

jonathanpallant · 2025-03-26T18:37:40Z

I'm going to try and make a multi-core example to verify this all works as expected.

> AN536 defaults to only creating a single CPU; this is the equivalent of the way the real FPGA image usually runs with the second Cortex-R52 held in halt via the initial SCC CFG_REG0 register setting. You can create the second CPU with -smp 2; both CPUs will then start execution immediately on startup.

(https://www.qemu.org/docs/master/system/arm/mps2.html)

jonathanpallant · 2025-03-26T18:45:42Z

    criticalup run cargo run --target=armv8r-none-eabihf -- -smp 2

This doesn't work because the second CPU never waits on the wfe. I've observed this before - I don't know if it's a qemu-system-arm bug, or if there's always an event pending, or what.

Perhaps spinning on a shared variable is safer than a WFE?

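wait:
        wfe                             // sleep until an event, if WFE actually blocks
        ldr     r0, =shared_variable    // address of the shared flag
        ldr     r0, [r0]                // read the current flag value
        cmp     r0, #0                  // still zero means not released yet
        beq     wait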

Then if they can WFE, they'll save power, and if not, it'll still work.
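
The same idea can be modelled on a host machine to check the logic. This is a rough simulation, not runtime code: threads stand in for secondary cores, `std::hint::spin_loop()` stands in for `wfe` (which, as observed above, may return immediately), and the function name `simulate_release` is invented for the example.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

/// Simulate `secondaries` cores waiting on a shared flag set by a boot core.
/// Returns how many simulated cores got past the wait loop.
pub fn simulate_release(secondaries: usize) -> usize {
    let go = Arc::new(AtomicBool::new(false));
    let released = Arc::new(AtomicUsize::new(0));

    let handles: Vec<_> = (0..secondaries)
        .map(|_| {
            let go = Arc::clone(&go);
            let released = Arc::clone(&released);
            thread::spawn(move || {
                // Re-check the flag after every wake-up, so the loop is
                // still correct if the "wfe" stand-in never blocks.
                while !go.load(Ordering::Acquire) {
                    std::hint::spin_loop();
                }
                released.fetch_add(1, Ordering::Relaxed);
            })
        })
        .collect();

    // Boot core: finish initialization, then publish the flag.
    // On real hardware this store would be followed by a `sev`.
    go.store(true, Ordering::Release);

    for handle in handles {
        handle.join().unwrap();
    }
    released.load(Ordering::Relaxed)
}
```

Because correctness depends only on the flag, the wait instruction becomes a power optimization rather than the synchronization mechanism itself.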

Does QEMU emulate WFE on Cortex-A SMP systems?

robamu · 2025-03-27T09:09:25Z

Hmm, I don't know. I have not found anything in the QEMU docs, so this is probably somewhere in the source code.
The core never waiting on wfe seems weird. Spinning on a shared variable as a safety feature? If this just happens in QEMU / because of a QEMU bug, maybe a loop in the test app's boot core method would also work?

There are definitely alternatives to WFE. I think oftentimes, this logic will be overridden for specific CPU families. For example, special handling is required for the Zynq7000. Maybe spinning would be a simpler default solution? It requires one shared / global variable though.

robamu · 2025-03-27T09:36:34Z

I looked for some more resources:

Code:

ARM trusted firmware BL1: https://github.com/ARM-software/arm-trusted-firmware/blob/master/bl1/aarch32/bl1_entrypoint.S
Cortex-A53 Aarch32 boot: https://github.com/Xilinx/embeddedsw/blob/master/lib/bsp/standalone/src/arm/ARMv8/32bit/gcc/boot.S
STM32MP1 boot project: https://github.com/4ms/mp1-boot/blob/main/src/startup.s
U-Boot Armv8: https://source.denx.de/u-boot/u-boot/-/blob/master/arch/arm/cpu/armv8/start.S?ref_type=heads
U-Boot Armv7: https://source.denx.de/u-boot/u-boot/-/blob/master/arch/arm/cpu/armv7/start.S?ref_type=heads

Docs:

STM32MPU wiki on this: https://wiki.st.com/stm32mpu/wiki/STM32_MPU_ROM_code_overview#Secondary_core_boot
The Ultrascale+ has a dedicated Platform Management Unit (PMU). I've only skimmed the docs, but it appears that this unit is responsible for the layered boot-up procedure of the dual-core R5F or the dual/quad-core Cortex-A53. I would assume that it just lets one processor boot, and the other cores are then woken by the SMP system if needed. The FSBL can only be run on core 0 of A53, or core 0 of R5F or R5F lockstep according to their FSBL wiki (https://xilinx-wiki.atlassian.net/wiki/spaces/A/pages/18842019/Zynq+UltraScale+FSBL#ZynqUltraScale+FSBL-OnwhatallprocessorcorescanFSBLrunon?).

Summary:

Zynq7000 boot seems to directly use the MPIDR, but I override the default boot routine anyway for other reasons. It uses a device-specific boot-up for secondary cores.
STM32MP ROM code parks all processors except the primary core inside an infinite loop and uses a device specific bootup routine for the secondary core
Ultrascale+ has a dedicated platform management unit for handling the bootup. Startup code can only run on dedicated cores.

Maybe it would be a better idea to just ignore the MPIDR/SMP aspect in the provided default boot routine, and assume that either the startup routine is overridden if this is an issue, or the HW takes care of executing the startup code with just one core. But then, boot_core does not really make much sense anymore.

jonathanpallant · 2025-03-27T14:56:42Z

Perhaps you're right - just let users add a bit of code to handle multi-core start-up. They can always call our default start-up routine once they've worked out which core they're on.

So we can close this PR, but what should we do with multi-core support in cortex-a-rt?

robamu · 2025-03-27T17:46:40Z

The support could be removed and it could be changed to also use kmain again, similar to the Cortex-R runtime

robamu · 2025-04-02T15:04:18Z

Replaced by #22

robamu force-pushed the harmonize-entry-points branch 2 times, most recently from 248831f to 0293e3e Compare March 25, 2025 11:00

jonathanpallant reviewed Mar 25, 2025


robamu force-pushed the harmonize-entry-points branch 2 times, most recently from 040977f to 6235f2e Compare March 26, 2025 09:26

harmonize entry points

ad06b54

robamu force-pushed the harmonize-entry-points branch from 6235f2e to ad06b54 Compare March 26, 2025 09:28

robamu closed this Apr 2, 2025
