-
Notifications
You must be signed in to change notification settings - Fork 15.1k
[AArch64] Skip storing of stack arguments when lowering tail calls #126735
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@llvm/pr-subscribers-llvm-selectiondag @llvm/pr-subscribers-backend-aarch64 Author: Guy David (guy-david) ChangesThis issue starts in the selection DAG and causes the backend to emit the following for a trivial tail call: I'm not too sure that checking for immutability of a specific stack object is a good enough of a gurantee, because as soon a tail-call is done lowering, This can be extended to the ARM backend as well. Full diff: https://github.com/llvm/llvm-project/pull/126735.diff 5 Files Affected:
|
| dyn_cast<FrameIndexSDNode>(LoadNode->getBasePtr())) { | ||
| MachineFrameInfo &MFI = MF.getFrameInfo(); | ||
| int FI = FINode->getIndex(); | ||
| if (LoadNode->getMemoryVT() == VA.getValVT() && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you should add a test for i1 args, whose memory type is different than their in-registers type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, probably needs a couple tests for zeroext/sext args to make sure we do the right thing for smaller-than-32-bits args. In particular, we should make sure that when caller/callee don't agree on sign extension, that we still emit the caller-required part of that, and when they do agree we may omit it.
|
Do we need the equivalent optimization in GlobalISel (AArch64CallLowering.cpp)? |
b08450d to
9a6844c
Compare
2e45964 to
c234686
Compare
|
We do, with |
c234686 to
e7d3666
Compare
fc34c7f to
86b3cef
Compare
21aa89f to
3f9c68b
Compare
3f9c68b to
6171655
Compare
6171655 to
a9e7579
Compare
|
@arsenm Any suggestion how to deal with GlobalISel generates: These truncations don't appear in SDAG and may not preserve the loaded value so we can't simply skip over them. |
The GlobalISel call lowering should essentially mirror what the DAG did. It probably should not have started by emitting the extending load, and emitted a non-extending s8 load with the assert_zext on the s8 value |
a9e7579 to
39a148d
Compare
|
Wasn't able to implement a full solution for GlobalISel and the lowering still emits a load-store for I thought of pushing through with KnownBits, but unfortunately an |
|
For the record, we discussed this offline and the current solution is fine for GlobalISel as an intermediate step. |
When possible, do not emit trivial load and stores to the same offset on the stack.
39a148d to
8acf1b3
Compare
…lvm#126735) This issue starts in the selection DAG and causes the backend to emit the following for a trivial tail call: ``` ldr w8, [sp] str w8, [sp] b func ``` I'm not too sure that checking for immutability of a specific stack object is a good enough of a gurantee, because as soon a tail-call is done lowering,`setHasTailCall()` is called and in that case perhaps a pass is allowed to change the value of the object in-memory? This can be extended to the ARM backend as well. Removed the `tailcall` keyword from a few other test assets, I'm assuming their original intent was left intact. (cherry picked from commit 4d4b7cc)
This issue starts in the selection DAG and causes the backend to emit the following for a trivial tail call:
I'm not too sure that checking for immutability of a specific stack object is a good enough of a gurantee, because as soon a tail-call is done lowering,
setHasTailCall()is called and in that case perhaps a pass is allowed to change the value of the object in-memory?This can be extended to the ARM backend as well.
Removed the
tailcallkeyword from a few other test assets, I'm assuming their original intent was left intact.