-
Notifications
You must be signed in to change notification settings - Fork 15.3k
[aarch64] XOR the frame pointer with the stack cookie when protecting the stack #161114
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
@llvm/pr-subscribers-llvm-globalisel @llvm/pr-subscribers-backend-aarch64 Author: Pan Tao (PanTao2) ChangesFull diff: https://github.com/llvm/llvm-project/pull/161114.diff 3 Files Affected:
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index a4c1e265f0e63..fffc060bbe7cd 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -28818,7 +28818,17 @@ void AArch64TargetLowering::ReplaceNodeResults(
bool AArch64TargetLowering::useLoadStackGuardNode(const Module &M) const {
if (Subtarget->isTargetAndroid() || Subtarget->isTargetFuchsia())
return TargetLowering::useLoadStackGuardNode(M);
- return true;
+ return false;
+}
+
+bool AArch64TargetLowering::useStackGuardXorFP() const { return true; }
+
+SDValue AArch64TargetLowering::emitStackGuardXorFP(SelectionDAG &DAG,
+ SDValue Val,
+ const SDLoc &DL) const {
+ return DAG.getNode(
+ ISD::XOR, DL, Val.getValueType(), Val,
+ DAG.getRegister(getStackPointerRegisterToSaveRestore(), MVT::i64));
}
unsigned AArch64TargetLowering::combineRepeatedFPDivisors() const {
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.h b/llvm/lib/Target/AArch64/AArch64ISelLowering.h
index d8072d15853ee..8073b0677a11b 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.h
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.h
@@ -349,6 +349,9 @@ class AArch64TargetLowering : public TargetLowering {
shouldExpandAtomicCmpXchgInIR(AtomicCmpXchgInst *AI) const override;
bool useLoadStackGuardNode(const Module &M) const override;
+ bool useStackGuardXorFP() const override;
+ SDValue emitStackGuardXorFP(SelectionDAG &DAG, SDValue Val,
+ const SDLoc &DL) const override;
TargetLoweringBase::LegalizeTypeAction
getPreferredVectorAction(MVT VT) const override;
diff --git a/llvm/lib/Target/AArch64/AArch64SelectionDAGInfo.cpp b/llvm/lib/Target/AArch64/AArch64SelectionDAGInfo.cpp
index d3b1aa621b61a..163de52386221 100644
--- a/llvm/lib/Target/AArch64/AArch64SelectionDAGInfo.cpp
+++ b/llvm/lib/Target/AArch64/AArch64SelectionDAGInfo.cpp
@@ -32,7 +32,7 @@ AArch64SelectionDAGInfo::AArch64SelectionDAGInfo()
void AArch64SelectionDAGInfo::verifyTargetNode(const SelectionDAG &DAG,
const SDNode *N) const {
- SelectionDAGGenTargetInfo::verifyTargetNode(DAG, N);
+ // SelectionDAGGenTargetInfo::verifyTargetNode(DAG, N);
#ifndef NDEBUG
// Some additional checks not yet implemented by verifyTargetNode.
|
|
@efriedma-quic Could you please review this PR? Is AArch64SelectionDAGInfo::verifyTargetNode() necessary? Only RISCV and AArch64 have implemented this function. |
| void AArch64SelectionDAGInfo::verifyTargetNode(const SelectionDAG &DAG, | ||
| const SDNode *N) const { | ||
| SelectionDAGGenTargetInfo::verifyTargetNode(DAG, N); | ||
| // SelectionDAGGenTargetInfo::verifyTargetNode(DAG, N); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure why you're commenting this out?
verifyTargetNode is there to allow targets to implement additional verification for target-specific nodes, to try to catch mistakes early. It isn't strictly necessary to implement it for new nodes, but it's nice to have.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your review and explanation!
There is fatal error:
LLVM ERROR: invalid node: operand #0 has invalid type; expected ch, got i64
t18: ch = AArch64ISD::BRCOND t3, BasicBlock:ch<entry 0x59e981fa35b0>, Constant:i32<1>, t16:1
t3: i64,ch = load<(volatile load (s64) from %stack.0.StackGuardSlot)> t0, FrameIndex:i64<0>, undef:i64
t16: i64,i32 = AArch64ISD::SUBS t8, t6
This fatal error is triggered by the following stack:
llvm::report_fatal_error
checkOperandType
llvm::SDNodeInfo::verifyNode "if (HasChain) checkOperandType(DAG, N, 0, MVT::Other);"
llvm::SelectionDAGGenTargetInfo::verifyTargetNode
llvm::AArch64SelectionDAGInfo::verifyTargetNode
What do you suggest in this situation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's an actual issue with the structure of the DAG: a node which is supposed to have a chain as the first operand has an operand that isn't a chain. I suspect there's some code that's supposed to return the chain of the load, but is accidentally returning the value of the load instead. (An SDValue is a struct with two members: an SDNode*, and an integer for which result of the SDNode is being referenced.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might not be relevant if you switch away from modifying useLoadStackGuardNode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is not relevant after removing the modification of useLoadStackGuardNode.
| ; WINDOWS-AARCH64: adrp x8, __security_cookie | ||
| ; WINDOWS-AARCH64: ldr x8, [x8, :lo12:__security_cookie] | ||
| ; WINDOWS-AARCH64: adrp x19, __security_cookie | ||
| ; WINDOWS-AARCH64: ldr x8, [x19, :lo12:__security_cookie] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's probably worth making the test check for the "eor" instruction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added test check for the "eor" instruction.
| @@ -28965,7 +28965,25 @@ void AArch64TargetLowering::ReplaceNodeResults( | |||
| bool AArch64TargetLowering::useLoadStackGuardNode(const Module &M) const { | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer not to disable useLoadStackGuardNode; it provides additional protection because it prevents later optimizations from incorrectly hoisting the address computations.
You can add the EOR instruction in AArch64InstrInfo::expandPostRAPseudo.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added the EOR instruction in AArch64InstrInfo::expandPostRAPseudo. Could you please further review it?
| if (Subtarget.getTargetTriple().isOSMSVCRT() && | ||
| !Subtarget.getTargetLowering() | ||
| ->getTargetMachine() | ||
| .Options.EnableGlobalISel) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need the EnableGlobalISel check here?
Probably needs a comment explaining why we're doing this specifically for for MSVCRT targets (to match MSVC).
Can we move this outside the if statement, so it doesn't depend on the code model? We should usually be using small code model for Windows targets, due to the inherent limits of PE-COFF, but some unusual configs like JITs can use other code models.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. Without the EnableGlobalISel check here, the test https://github.com/llvm/llvm-project/blob/main/llvm/test/CodeGen/AArch64/GlobalISel/irtranslator-stack-protector-windows.ll#L2 crash.
I added a simple comment.
Yes. I have moved this outside the if statement.
The eor instruction in the failure BB can not be added in AArch64InstrInfo::expandPostRAPseudo. Therefor, I have modified the existing interface emitStackGuardXorFP to add the eor instruction.
Could you please further review it?
| !Subtarget.getTargetLowering() | ||
| ->getTargetMachine() | ||
| .Options.EnableGlobalISel) { | ||
| BuildMI(MBB, MI, DL, get(AArch64::EORWrr), Reg) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| BuildMI(MBB, MI, DL, get(AArch64::EORWrr), Reg) | |
| BuildMI(MBB, MI, DL, get(AArch64::EORXrr), Reg) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
|
Hi @PanTao2 thanks for writing this patch. I have general comment first as I think we need some further consideration before going ahead with this change. |
|
Hi @omjavaid thanks for your comment. I think you prefer 'only matching semantics'. I think this is great.
Regarding 1, based on recent optimizations (PR #95904 and PR #136290), I think you prefer inline checks on aarch64. I think it's great. |
|
Hi @omjavaid ping. |
Yes, I think I tend to prefer inlining with sub (sp - cookie) |
Yes. This is correct. Lets mix the cookie using sub instruction on AArch64 while keep the existing inlining logic intact. |
…e from xor to mix
|
Hi @omjavaid I changed the mixed instruction from xor to sub, could you please review the new implementation? |
|
Hi @omjavaid ping, thanks. |
|
Hi @efriedma-quic, @omjavaid may be on vacation, could you please help to continue to review this PR? |
| ; WINDOWS-AARCH64: ldr x0, [sp, #8] | ||
| ; WINDOWS-AARCH64: ldr x8, [sp, #8] | ||
| ; WINDOWS-AARCH64: mov x9, sp | ||
| ; WINDOWS-AARCH64: add x0, x8, x9 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you commute the operands of the add? add x0, x8, sp is not a legal instruction, but add x0, sp, x8 is legal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for correcting.
| ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5 | ||
| ; RUN: llc < %s -mtriple=aarch64-w64-mingw32 | FileCheck %s --check-prefixes=CHECK,CHECK-SD | ||
| ; RUN: llc < %s -mtriple=aarch64-w64-mingw32 -global-isel | FileCheck %s --check-prefixes=CHECK,CHECK-GI | ||
| ; RUN: llc < %s -mtriple=aarch64-w64-mingw32 -global-isel | FileCheck %s --check-prefixes=CHECK-GI |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please regenerate with update_llc_test_checks.py ; don't edit check lines of autogenerated tests by hand.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I regenerate this file with update_llc_test_checks.py.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is update_llc_test_checks.py not happy if you write --check-prefixes=CHECK,CHECK-GI. If that doesn't work, this is fine, I guess.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for making unnecessary changes. I changed it back to --check-prefixes=CHECK,CHECK-GI and regenerated this file.
…pdate_llc_test_checks.py
| ->getTargetMachine() | ||
| .Options.EnableGlobalISel) { | ||
| BuildMI(MBB, MI, DL, get(AArch64::SUBXrr), Reg) | ||
| .addReg(AArch64::SP) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this actually produce the correct instruction? I'm not sure SP is in the right register class; I suspect you end up subtracting from zero. The wouldn't show up in assembly output, but it would show up in disassembly. (SUBXrx64 can refer to SP this way, though.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reply here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed SUBXrr to SUBXrx64. Could you please check it?
| ; WINDOWS-ARM64EC: ldr x0, [sp, #8] | ||
| ; WINDOWS-ARM64EC: ldr x8, [sp, #8] | ||
| ; WINDOWS-ARM64EC: mov x9, sp | ||
| ; WINDOWS-ARM64EC: add x0, x9, x8 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, hmm, this still doesn't actually remove the extra instruction because that would require using a different encoding (ADDXrs vs. ADDXrx64). Shouldn't be too hard to fix; see d337c1f .
🐧 Linux x64 Test Results
|
…e from xor to mix
…pdate_llc_test_checks.py
| if (FailureBB) { | ||
| return DAG.getNode( | ||
| ISD::ADD, DL, Val.getValueType(), | ||
| DAG.getRegister(getStackPointerRegisterToSaveRestore(), MVT::i64), Val); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please don't use the result of getRegister as an operand to a target-independent ISD::ADD. Register operands aren't values; they are intended only for use with specific operations which are expecting them.
It might appear to work here, sort-of, but other parts of the DAG aren't expecting it, and it will lead to weird results in general. And I'm not sure it actually is working correctly here... what's the actual encoding you're getting?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When running llvm/test/CodeGen/AArch64/stack-protector-target.ll, the SDValue returned by DAG.getRegister (getRegister(getStackPointerRegisterToSaveRestore(), MVT:: i64) is "t4: i64 = Register $physreg8".
DAG.getRegister wraps Register as SDValue, while DAG.getCopyFromReg (used in previous version) uses DAG.getRegister and adds a move (copy) instruction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved the target-independent ISD::ADD to the target-independent class SelectionDAG. Could you please further review it?
| ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5 | ||
| ; RUN: llc < %s -mtriple=aarch64-w64-mingw32 | FileCheck %s --check-prefixes=CHECK,CHECK-SD | ||
| ; RUN: llc < %s -mtriple=aarch64-w64-mingw32 -global-isel | FileCheck %s --check-prefixes=CHECK,CHECK-GI | ||
| ; RUN: llc < %s -mtriple=aarch64-w64-mingw32 -global-isel | FileCheck %s --check-prefixes=CHECK-GI |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is update_llc_test_checks.py not happy if you write --check-prefixes=CHECK,CHECK-GI. If that doesn't work, this is fine, I guess.
| ->getTargetMachine() | ||
| .Options.EnableGlobalISel) { | ||
| BuildMI(MBB, MI, DL, get(AArch64::SUBXrr), Reg) | ||
| .addReg(AArch64::SP) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reply here?
| } | ||
|
|
||
| SDValue getAddToReg(const SDLoc &dl, Register Reg, EVT VT, SDValue Delta) { | ||
| return getNode(ISD::ADD, dl, VT, getRegister(Reg, VT), Delta); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moving the code around doesn't solve the fundamental issue that you actually need to CopyFromReg for correctness.
Please take another look at d337c1f , specifically how it added the copyFromSP pattern.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I deleted the copyFromSp instruction by referring to d337c1f. Could you please check it?
| .addMemOperand(*MI.memoperands_begin()); | ||
| } | ||
| } | ||
| // To match MSVC |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe extend this comment a little to explain what exactly we're matching.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please review the added comment?
| bool AArch64DAGToDAGISel::SelectAddUXTXRegister(SDValue N, SDValue &Reg, | ||
| SDValue &Shift) { | ||
| if (N.getOpcode() != ISD::LOAD) | ||
| return false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any particular reason to check for a LOAD? The pattern seems like it should still work for non-LOAD operands.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the LOAD is not checked, there will be many unexpected changes. E.g., the following code
stur x10, [x29, #-200] // 8-byte Folded Spill
mov x15, sp
subs x10, x15, #16
will be changed to
stur x10, [x29, #-200]
mov x15, sp
add x10, x15, x8, uxtx
PanTao2
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm very sorry, I was informed that some of my comments were not sent out.
| if (Subtarget.getTargetTriple().isOSMSVCRT() && | ||
| !Subtarget.getTargetLowering() | ||
| ->getTargetMachine() | ||
| .Options.EnableGlobalISel) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. Without the EnableGlobalISel check here, the test https://github.com/llvm/llvm-project/blob/main/llvm/test/CodeGen/AArch64/GlobalISel/irtranslator-stack-protector-windows.ll#L2 crash.
I added a simple comment.
Yes. I have moved this outside the if statement.
The eor instruction in the failure BB can not be added in AArch64InstrInfo::expandPostRAPseudo. Therefor, I have modified the existing interface emitStackGuardXorFP to add the eor instruction.
Could you please further review it?
| !Subtarget.getTargetLowering() | ||
| ->getTargetMachine() | ||
| .Options.EnableGlobalISel) { | ||
| BuildMI(MBB, MI, DL, get(AArch64::EORWrr), Reg) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
| ; WINDOWS-AARCH64: ldr x0, [sp, #8] | ||
| ; WINDOWS-AARCH64: ldr x8, [sp, #8] | ||
| ; WINDOWS-AARCH64: mov x9, sp | ||
| ; WINDOWS-AARCH64: add x0, x8, x9 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for correcting.
| ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5 | ||
| ; RUN: llc < %s -mtriple=aarch64-w64-mingw32 | FileCheck %s --check-prefixes=CHECK,CHECK-SD | ||
| ; RUN: llc < %s -mtriple=aarch64-w64-mingw32 -global-isel | FileCheck %s --check-prefixes=CHECK,CHECK-GI | ||
| ; RUN: llc < %s -mtriple=aarch64-w64-mingw32 -global-isel | FileCheck %s --check-prefixes=CHECK-GI |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I regenerate this file with update_llc_test_checks.py.
| if (FailureBB) { | ||
| return DAG.getNode( | ||
| ISD::ADD, DL, Val.getValueType(), | ||
| DAG.getRegister(getStackPointerRegisterToSaveRestore(), MVT::i64), Val); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When running llvm/test/CodeGen/AArch64/stack-protector-target.ll, the SDValue returned by DAG.getRegister (getRegister(getStackPointerRegisterToSaveRestore(), MVT:: i64) is "t4: i64 = Register $physreg8".
DAG.getRegister wraps Register as SDValue, while DAG.getCopyFromReg (used in previous version) uses DAG.getRegister and adds a move (copy) instruction.
| if (FailureBB) { | ||
| return DAG.getNode( | ||
| ISD::ADD, DL, Val.getValueType(), | ||
| DAG.getRegister(getStackPointerRegisterToSaveRestore(), MVT::i64), Val); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved the target-independent ISD::ADD to the target-independent class SelectionDAG. Could you please further review it?
| ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5 | ||
| ; RUN: llc < %s -mtriple=aarch64-w64-mingw32 | FileCheck %s --check-prefixes=CHECK,CHECK-SD | ||
| ; RUN: llc < %s -mtriple=aarch64-w64-mingw32 -global-isel | FileCheck %s --check-prefixes=CHECK,CHECK-GI | ||
| ; RUN: llc < %s -mtriple=aarch64-w64-mingw32 -global-isel | FileCheck %s --check-prefixes=CHECK-GI |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for making unnecessary changes. I changed it back to --check-prefixes=CHECK,CHECK-GI and regenerated this file.
| } | ||
|
|
||
| SDValue getAddToReg(const SDLoc &dl, Register Reg, EVT VT, SDValue Delta) { | ||
| return getNode(ISD::ADD, dl, VT, getRegister(Reg, VT), Delta); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I deleted the copyFromSp instruction by referring to d337c1f. Could you please check it?
| ->getTargetMachine() | ||
| .Options.EnableGlobalISel) { | ||
| BuildMI(MBB, MI, DL, get(AArch64::SUBXrr), Reg) | ||
| .addReg(AArch64::SP) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed SUBXrr to SUBXrx64. Could you please check it?
This strengthens the guard and matches MSVC.
Fixes #156573 .