Skip to content

Conversation

@PanTao2
Copy link
Contributor

@PanTao2 PanTao2 commented Sep 29, 2025

This strengthens the guard and matches MSVC.

Fixes #156573 .

@llvmbot
Copy link
Member

llvmbot commented Sep 29, 2025

@llvm/pr-subscribers-llvm-globalisel
@llvm/pr-subscribers-llvm-selectiondag
@llvm/pr-subscribers-backend-x86

@llvm/pr-subscribers-backend-aarch64

Author: Pan Tao (PanTao2)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/161114.diff

3 Files Affected:

  • (modified) llvm/lib/Target/AArch64/AArch64ISelLowering.cpp (+11-1)
  • (modified) llvm/lib/Target/AArch64/AArch64ISelLowering.h (+3)
  • (modified) llvm/lib/Target/AArch64/AArch64SelectionDAGInfo.cpp (+1-1)
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index a4c1e265f0e63..fffc060bbe7cd 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -28818,7 +28818,17 @@ void AArch64TargetLowering::ReplaceNodeResults(
 bool AArch64TargetLowering::useLoadStackGuardNode(const Module &M) const {
   if (Subtarget->isTargetAndroid() || Subtarget->isTargetFuchsia())
     return TargetLowering::useLoadStackGuardNode(M);
-  return true;
+  return false;
+}
+
+bool AArch64TargetLowering::useStackGuardXorFP() const { return true; }
+
+SDValue AArch64TargetLowering::emitStackGuardXorFP(SelectionDAG &DAG,
+                                                   SDValue Val,
+                                                   const SDLoc &DL) const {
+  return DAG.getNode(
+      ISD::XOR, DL, Val.getValueType(), Val,
+      DAG.getRegister(getStackPointerRegisterToSaveRestore(), MVT::i64));
 }
 
 unsigned AArch64TargetLowering::combineRepeatedFPDivisors() const {
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.h b/llvm/lib/Target/AArch64/AArch64ISelLowering.h
index d8072d15853ee..8073b0677a11b 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.h
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.h
@@ -349,6 +349,9 @@ class AArch64TargetLowering : public TargetLowering {
   shouldExpandAtomicCmpXchgInIR(AtomicCmpXchgInst *AI) const override;
 
   bool useLoadStackGuardNode(const Module &M) const override;
+  bool useStackGuardXorFP() const override;
+  SDValue emitStackGuardXorFP(SelectionDAG &DAG, SDValue Val,
+                              const SDLoc &DL) const override;
   TargetLoweringBase::LegalizeTypeAction
   getPreferredVectorAction(MVT VT) const override;
 
diff --git a/llvm/lib/Target/AArch64/AArch64SelectionDAGInfo.cpp b/llvm/lib/Target/AArch64/AArch64SelectionDAGInfo.cpp
index d3b1aa621b61a..163de52386221 100644
--- a/llvm/lib/Target/AArch64/AArch64SelectionDAGInfo.cpp
+++ b/llvm/lib/Target/AArch64/AArch64SelectionDAGInfo.cpp
@@ -32,7 +32,7 @@ AArch64SelectionDAGInfo::AArch64SelectionDAGInfo()
 
 void AArch64SelectionDAGInfo::verifyTargetNode(const SelectionDAG &DAG,
                                                const SDNode *N) const {
-  SelectionDAGGenTargetInfo::verifyTargetNode(DAG, N);
+  //  SelectionDAGGenTargetInfo::verifyTargetNode(DAG, N);
 
 #ifndef NDEBUG
   // Some additional checks not yet implemented by verifyTargetNode.

@PanTao2
Copy link
Contributor Author

PanTao2 commented Oct 10, 2025

@efriedma-quic Could you please review this PR? Is AArch64SelectionDAGInfo::verifyTargetNode() necessary? Only RISCV and AArch64 have implemented this function.

void AArch64SelectionDAGInfo::verifyTargetNode(const SelectionDAG &DAG,
const SDNode *N) const {
SelectionDAGGenTargetInfo::verifyTargetNode(DAG, N);
// SelectionDAGGenTargetInfo::verifyTargetNode(DAG, N);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure why you're commenting this out?

verifyTargetNode is there to allow targets to implement additional verification for target-specific nodes, to try to catch mistakes early. It isn't strictly necessary to implement it for new nodes, but it's nice to have.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your review and explanation!
There is fatal error:

LLVM ERROR: invalid node: operand #0 has invalid type; expected ch, got i64
t18: ch = AArch64ISD::BRCOND t3, BasicBlock:ch<entry 0x59e981fa35b0>, Constant:i32<1>, t16:1
  t3: i64,ch = load<(volatile load (s64) from %stack.0.StackGuardSlot)> t0, FrameIndex:i64<0>, undef:i64
  t16: i64,i32 = AArch64ISD::SUBS t8, t6

This fatal error is triggered by the following stack:
llvm::report_fatal_error
checkOperandType
llvm::SDNodeInfo::verifyNode "if (HasChain) checkOperandType(DAG, N, 0, MVT::Other);"
llvm::SelectionDAGGenTargetInfo::verifyTargetNode
llvm::AArch64SelectionDAGInfo::verifyTargetNode

What do you suggest in this situation?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's an actual issue with the structure of the DAG: a node which is supposed to have a chain as the first operand has an operand that isn't a chain. I suspect there's some code that's supposed to return the chain of the load, but is accidentally returning the value of the load instead. (An SDValue is a struct with two members: an SDNode*, and an integer for which result of the SDNode is being referenced.)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might not be relevant if you switch away from modifying useLoadStackGuardNode.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is not relevant after removing the modification of useLoadStackGuardNode.

; WINDOWS-AARCH64: adrp x8, __security_cookie
; WINDOWS-AARCH64: ldr x8, [x8, :lo12:__security_cookie]
; WINDOWS-AARCH64: adrp x19, __security_cookie
; WINDOWS-AARCH64: ldr x8, [x19, :lo12:__security_cookie]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's probably worth making the test check for the "eor" instruction.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added test check for the "eor" instruction.

@@ -28965,7 +28965,25 @@ void AArch64TargetLowering::ReplaceNodeResults(
bool AArch64TargetLowering::useLoadStackGuardNode(const Module &M) const {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer not to disable useLoadStackGuardNode; it provides additional protection because it prevents later optimizations from incorrectly hoisting the address computations.

You can add the EOR instruction in AArch64InstrInfo::expandPostRAPseudo.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the EOR instruction in AArch64InstrInfo::expandPostRAPseudo. Could you please further review it?

if (Subtarget.getTargetTriple().isOSMSVCRT() &&
!Subtarget.getTargetLowering()
->getTargetMachine()
.Options.EnableGlobalISel) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need the EnableGlobalISel check here?

Probably needs a comment explaining why we're doing this specifically for for MSVCRT targets (to match MSVC).

Can we move this outside the if statement, so it doesn't depend on the code model? We should usually be using small code model for Windows targets, due to the inherent limits of PE-COFF, but some unusual configs like JITs can use other code models.

@llvmbot llvmbot added backend:X86 llvm:SelectionDAG SelectionDAGISel as well labels Oct 17, 2025
!Subtarget.getTargetLowering()
->getTargetMachine()
.Options.EnableGlobalISel) {
BuildMI(MBB, MI, DL, get(AArch64::EORWrr), Reg)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
BuildMI(MBB, MI, DL, get(AArch64::EORWrr), Reg)
BuildMI(MBB, MI, DL, get(AArch64::EORXrr), Reg)

}
// To match MSVC
if (Subtarget.getTargetTriple().isOSMSVCRT() &&
!Subtarget.getTargetLowering()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there some reason the isTargetMachO() check is in AArch64TargetLowering::useStackGuardXorFP, but not here?

@omjavaid
Copy link
Contributor

Hi @PanTao2 thanks for writing this patch. I have general comment first as I think we need some further consideration before going ahead with this change.
On Windows MSVC does this a bit differently on AArch64 vs x64. MSVC uses push pop helpers for /GS checks on AArch64 while on x64 it just inlines the checks and calls security_check_cookie at the end.
Also note MSVC on AArch64 mixes the cookie using sub (sp - cookie) but x64 uses xor (cookie ^ rsp) both do the same job but with diff tradeoffs. XOR is smaller and reversible and sub fits better with overall Arm's stack arithmetic. So if the goal here is to exactly mimic MSVC behavior maybe worth thinking which one is closer or better for LLVM side.... maybe just match the semantics not the exact instr choice what do you think?

@PanTao2
Copy link
Contributor Author

PanTao2 commented Oct 22, 2025

Hi @omjavaid thanks for your comment. I think you prefer 'only matching semantics'. I think this is great.
My understanding of the difference between aarch64 and x86_64 on Windows MSVC, please correct me if there are any errors.

  1. MSVC uses push pop helpers (calls _security{push|pop}_cookie) on aarch64, while it inlines the checks on x86_64.
  2. MSVC mixes the cookie using sub (sp - cookie) on aarch64, while it mixes the cookie using xor (cookie ^ rsp) on x86_64.

Regarding 1, based on recent optimizations (PR #95904 and PR #136290), I think you prefer inline checks on aarch64. I think it's great.
Regarding 2, thanks for your reminding, I think this is great that using sub (sp - cookie) for fitting better with overall Arm’s stack arithmetic.
What do you think of changing the implementation of this PR to inline sub (sp - cookie)?

@PanTao2
Copy link
Contributor Author

PanTao2 commented Oct 29, 2025

Hi @omjavaid ping.

@omjavaid
Copy link
Contributor

What do you think of changing the implementation of this PR to inline sub (sp - cookie)?

Yes, I think I tend to prefer inlining with sub (sp - cookie)

@omjavaid
Copy link
Contributor

My understanding of the difference between aarch64 and x86_64 on Windows MSVC, please correct me if there are any errors.

  1. MSVC uses push pop helpers (calls _security{push|pop}_cookie) on aarch64, while it inlines the checks on x86_64.
  2. MSVC mixes the cookie using sub (sp - cookie) on aarch64, while it mixes the cookie using xor (cookie ^ rsp) on x86_64.

Yes. This is correct. Lets mix the cookie using sub instruction on AArch64 while keep the existing inlining logic intact.

@PanTao2
Copy link
Contributor Author

PanTao2 commented Nov 3, 2025

Hi @omjavaid I changed the mixed instruction from xor to sub, could you please review the new implementation?

@PanTao2
Copy link
Contributor Author

PanTao2 commented Nov 10, 2025

Hi @omjavaid ping, thanks.

@PanTao2
Copy link
Contributor Author

PanTao2 commented Nov 13, 2025

Hi @efriedma-quic, @omjavaid may be on vacation, could you please help to continue to review this PR?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[stack guard] XOR the frame pointer with the stack cookie on Windows aarch64

4 participants