microblaze: Fix -Os right shift optimization is allowed into delay slot #58
base: zephyr-gcc-14.3.0
During picolibc testing, it was found that `-Os` produces assembly code that the compiler squeezes into a single delay slot. Thus, only the first instruction (the one in the delay slot) emitted by this optimization is executed and the rest are skipped. This is a regression introduced by applying Microblaze gcc patches zephyrproject-rtos#24. This patch is a 14.3.0 equivalent of zephyrproject-rtos#37. Signed-off-by: Alp Sayin <[email protected]>
4f30e26
to
80f3cbf
Compare
Please open an sdk-ng PR that pulls this patch.
I lack the expertise on the microblaze arch to properly review this; but, based on the description provided, it looks reasonable.
Will do, I'll need an SDK to run ALL the tests as before anyways. All zephyr tests + picolibc test-suite.
Honestly, it's just assembly delay-slot out-of-order execution nonsense. I'm no compiler/linker expert either; I just found out what's causing the issue, made some educated guesses and flipped a switch. Machine instructions for branch could be suffixed with a If I remember correctly, when this optimization isn't enabled (no Apart from that; When The problem here is that the emission of the loop is marked as And because And so result of this patch is that a no-delay-branch machine instruction (
The patch looks good; didn't we do the same thing in the old SDK? Picolibc testing only builds microblazeel, it doesn't have crt0 code, semihosting code or a qemu script to run them. Let's get those implemented so we can get better test coverage before SDK 0.18 gets released. And, yes, running the picolibc tests during SDK build would be awesome. All that requires is access to the relevant qemu binary; picolibc contains all of the necessary code otherwise.
There is currently QEMU 8.2.2 included in the sdk-build Docker image for libc testing (used by the LLVM build scripts). We could re-use this for GCC libc testing as well.
To get the picolibc tests working, we'll need primitive crt0 and semihosting support that works with this qemu instance. I attempted to build that llvm version, but it's all a Yocto mess, which makes compiling locally difficult. Is there a simple git repository with the qemu source code I can just pull and build?
@keith-packard the qemu binary should still be inside the SDK produced here? no?:
For some time I've been using https://github.com/alpsayin/picolibc/tree/microblaze-hack-rebased to run the picolibc tests.
We never got to merge the fix ¯_(ツ)_/¯
Would you believe I didn't remember having done that? So many CPUs, so little time... Picolibc CI can either consume a Zephyr toolchain or a hand-built one from https://github.com/picolibc/picolibc-ci-tools. In a pinch, it can also use pre-built bits in a tarball it can download. I suspect it will be easiest to do the latter for now? Let's find a way to hand picolibc CI a toolchain and get that running.
Not really sure what you mean by this; do you mean the one in the Zephyr SDK host tools, which used to be indeed a Yocto mess until I moved all the patches back to the fork.
https://github.com/zephyrproject-rtos/qemu/tree/zephyr-qemu-v10.0.2
I wasn't remembering that picolibc already had all of the necessary bits to run the test suite and expected I'd need to debug picolibc, gcc and qemu all together, so I thought I needed to build them all from source locally. Now that @alpsayin reminded me about the existing picolibc support, none of that is required.
227ce1e
to
80f3cbf
Compare
During picolibc testing, it's found that `-Os` produces assembly code that the compiler squeezes into a single delay slot. Thus, only the first instruction emitted by this optimization is run and the rest is skipped.
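To make the failure mode concrete, here is a minimal sketch in C of a toy machine with a delayed branch. It is an illustration only: the opcodes, the `insn` struct, and the `run`, `run_squeezed`, and `run_fixed` helpers are all hypothetical and do not model the real MicroBlaze ISA or anything in GCC. The only property it captures is the one described above: a delayed branch executes exactly one following instruction before jumping, so a multi-instruction sequence placed in that slot loses everything after its first instruction.

```c
#include <assert.h>
#include <stddef.h>

/* Toy instruction set, purely illustrative; NOT the MicroBlaze ISA. */
typedef enum { OP_ADD, OP_BRANCH_DELAYED, OP_HALT } opcode;

typedef struct {
    opcode op;
    int imm; /* OP_ADD: value added to acc; OP_BRANCH_DELAYED: target index */
} insn;

/* Execute a program and return the accumulator.
   A delayed branch runs exactly ONE following instruction (its delay
   slot) and then continues at the branch target. For simplicity the
   slot is only honoured when it holds an OP_ADD. */
static int run(const insn *prog, size_t len)
{
    int acc = 0;
    size_t pc = 0;

    while (pc < len && prog[pc].op != OP_HALT) {
        if (prog[pc].op == OP_ADD) {
            acc += prog[pc].imm;
            pc++;
        } else { /* OP_BRANCH_DELAYED */
            size_t slot = pc + 1;
            if (slot < len && prog[slot].op == OP_ADD)
                acc += prog[slot].imm; /* only the slot instruction runs */
            pc = (size_t)prog[pc].imm;
        }
    }
    return acc;
}

/* A three-instruction sequence wrongly treated as one delay-slot filler:
   only the first ADD executes before control reaches HALT. */
static int run_squeezed(void)
{
    const insn prog[] = {
        { OP_BRANCH_DELAYED, 4 },
        { OP_ADD, 1 }, /* delay slot: executed */
        { OP_ADD, 1 }, /* skipped */
        { OP_ADD, 1 }, /* skipped */
        { OP_HALT, 0 },
    };
    return run(prog, sizeof prog / sizeof prog[0]);
}

/* The same sequence kept out of the delay slot: all three ADDs execute,
   and the slot holds a harmless "add 0" standing in for a nop. */
static int run_fixed(void)
{
    const insn prog[] = {
        { OP_ADD, 1 },
        { OP_ADD, 1 },
        { OP_ADD, 1 },
        { OP_BRANCH_DELAYED, 5 },
        { OP_ADD, 0 }, /* delay slot: nop stand-in */
        { OP_HALT, 0 },
    };
    return run(prog, sizeof prog / sizeof prog[0]);
}
```

Calling `run_squeezed()` yields a partial result because two of the three additions never run, while `run_fixed()` yields the full sum. In this toy model that is the difference between letting the multi-instruction pattern into the slot and keeping it out.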
The optimization is generated by
But the `arith` type is NOT disallowed from going into a delay slot (see below):
gcc/gcc/config/microblaze/microblaze.md
Line 466 in 428d8d7
"Optimized" code is between [191b8-191c8]
where `operands` are:
As a result, this code returns an `iy` (`r24`) value of `(whatever was in r24) - 1023`.
The fix is simple. I've redeclared this size-optimization as `multi`, which is
gcc/gcc/config/microblaze/microblaze.md
Line 2070 in 428d8d7
gcc/gcc/config/microblaze/microblaze.md
Line 2483 in 428d8d7
For context, non-size-optimized code is `N` copies of:
And as per the above optimization rule, if `N <= 5` we'll still get 5 copies of the `sra` instruction.
Originally tested via zephyrproject-rtos/sdk-ng#647
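As a rough illustration of the two code shapes discussed above (repeated single-bit shifts versus a loop around one single-bit shift), the following C sketch shows that both compute the same arithmetic right shift. The helper names `sra1`, `shift_loop`, and `shift_unrolled_5` are hypothetical and only mirror the idea; they are not what GCC emits, and the shift count of 20 is taken from the "loop 20x single bit right shift" note in the disassembly comments.

```c
#include <assert.h>
#include <stdint.h>

/* One single-bit arithmetic right shift, standing in for one `sra`.
   Negative values are handled explicitly so the result does not depend
   on how the host compiler implements >> on signed negatives. */
static int32_t sra1(int32_t x)
{
    return (x < 0) ? ~((~x) >> 1) : (x >> 1);
}

/* Size-optimized shape: a loop around a single one-bit shift. */
static int32_t shift_loop(int32_t x, unsigned n)
{
    while (n--)
        x = sra1(x);
    return x;
}

/* Unrolled shape for a small count: five copies of the one-bit shift. */
static int32_t shift_unrolled_5(int32_t x)
{
    x = sra1(x);
    x = sra1(x);
    x = sra1(x);
    x = sra1(x);
    x = sra1(x);
    return x;
}
```

For example, `shift_loop(x, 5)` and `shift_unrolled_5(x)` agree for any `x`, and `shift_loop(x, 20)` corresponds to the 20-iteration case. The loop form trades a few extra control-flow instructions for a much smaller body, which is why a size optimization would prefer it for large counts.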
Fix seems to have worked, new disassembly as below:
but also somehow loop 20x single bit right shift */
no delay slot */
After the jump
CC @keith-packard you can grab a copy of the toolchain from https://github.com/zephyrproject-rtos/sdk-ng/actions/runs/11407925329?pr=647 if it's of any interest.
p.s. new disassembly again but `monospace` cuz it hurts my eyes:
compared to: