Skip to content

Conversation

@MacDue
Copy link
Member

@MacDue MacDue commented May 20, 2025

The conditional variants of SMSTART/STOP currently take the current PStateSM as a variadic value. This is not supported by the verification added in #140472 (which requires variadic values to be of type Register or RegisterMask), so this patch splits the the conditional variants into new COND_ nodes, where these extra parameters are fixed arguments.

Suggested in #140472 (comment)

Part of #140472.

…OP (NFC)

The conditional variants of SMSTART/STOP currently take the current
PStateSM as a variadic value. This is not supported by the verification
added in llvm#140472 (which requires variadic values to be of type Register
or RegisterMask), so this patch splits the the conditional variants into
new `COND_` nodes, where these extra parameters are fixed arguments.

Suggested in llvm#140472 (comment)

Part of llvm#140472.
@llvmbot
Copy link
Member

llvmbot commented May 20, 2025

@llvm/pr-subscribers-backend-aarch64

Author: Benjamin Maxwell (MacDue)

Changes

The conditional variants of SMSTART/STOP currently take the current PStateSM as a variadic value. This is not supported by the verification added in #140472 (which requires variadic values to be of type Register or RegisterMask), so this patch splits the the conditional variants into new COND_ nodes, where these extra parameters are fixed arguments.

Suggested in #140472 (comment)

Part of #140472.


Full diff: https://github.com/llvm/llvm-project/pull/140711.diff

4 Files Affected:

  • (modified) llvm/docs/AArch64SME.rst (+8-6)
  • (modified) llvm/lib/Target/AArch64/AArch64ISelLowering.cpp (+15-13)
  • (modified) llvm/lib/Target/AArch64/AArch64ISelLowering.h (+2)
  • (modified) llvm/lib/Target/AArch64/AArch64SMEInstrInfo.td (+18-10)
diff --git a/llvm/docs/AArch64SME.rst b/llvm/docs/AArch64SME.rst
index b5a01cb204b81..ac8ce32ddb9e6 100644
--- a/llvm/docs/AArch64SME.rst
+++ b/llvm/docs/AArch64SME.rst
@@ -213,12 +213,14 @@ Instruction Selection Nodes
 
 .. code-block:: none
 
-  AArch64ISD::SMSTART Chain, [SM|ZA|Both], CurrentState, ExpectedState[, RegMask]
-  AArch64ISD::SMSTOP  Chain, [SM|ZA|Both], CurrentState, ExpectedState[, RegMask]
-
-The ``SMSTART/SMSTOP`` nodes take ``CurrentState`` and ``ExpectedState`` operand for
-the case of a conditional SMSTART/SMSTOP. The instruction will only be executed
-if CurrentState != ExpectedState.
+  AArch64ISD::SMSTART Chain, [SM|ZA|Both][, RegMask]
+  AArch64ISD::SMSTOP  Chain, [SM|ZA|Both][, RegMask]
+  AArch64ISD::COND_SMSTART Chain, [SM|ZA|Both], CurrentState, ExpectedState[, RegMask]
+  AArch64ISD::COND_SMSTOP  Chain, [SM|ZA|Both], CurrentState, ExpectedState[, RegMask]
+
+The ``COND_SMSTART/COND_SMSTOP`` nodes additionally take ``CurrentState`` and
+``ExpectedState``, in this case the instruction will only be executed if
+``CurrentState != ExpectedState``.
 
 When ``CurrentState`` and ``ExpectedState`` can be evaluated at compile-time
 (i.e. they are both constants) then an unconditional ``smstart/smstop``
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index 293292d47dd48..d1000dd64bdf7 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -2726,6 +2726,8 @@ const char *AArch64TargetLowering::getTargetNodeName(unsigned Opcode) const {
     MAKE_CASE(AArch64ISD::VG_RESTORE)
     MAKE_CASE(AArch64ISD::SMSTART)
     MAKE_CASE(AArch64ISD::SMSTOP)
+    MAKE_CASE(AArch64ISD::COND_SMSTART)
+    MAKE_CASE(AArch64ISD::COND_SMSTOP)
     MAKE_CASE(AArch64ISD::RESTORE_ZA)
     MAKE_CASE(AArch64ISD::RESTORE_ZT)
     MAKE_CASE(AArch64ISD::SAVE_ZT)
@@ -6033,14 +6035,12 @@ SDValue AArch64TargetLowering::LowerINTRINSIC_VOID(SDValue Op,
     return DAG.getNode(
         AArch64ISD::SMSTART, DL, MVT::Other,
         Op->getOperand(0), // Chain
-        DAG.getTargetConstant((int32_t)(AArch64SVCR::SVCRZA), DL, MVT::i32),
-        DAG.getConstant(AArch64SME::Always, DL, MVT::i64));
+        DAG.getTargetConstant((int32_t)(AArch64SVCR::SVCRZA), DL, MVT::i32));
   case Intrinsic::aarch64_sme_za_disable:
     return DAG.getNode(
         AArch64ISD::SMSTOP, DL, MVT::Other,
         Op->getOperand(0), // Chain
-        DAG.getTargetConstant((int32_t)(AArch64SVCR::SVCRZA), DL, MVT::i32),
-        DAG.getConstant(AArch64SME::Always, DL, MVT::i64));
+        DAG.getTargetConstant((int32_t)(AArch64SVCR::SVCRZA), DL, MVT::i32));
   }
 }
 
@@ -8913,18 +8913,22 @@ SDValue AArch64TargetLowering::changeStreamingMode(SelectionDAG &DAG, SDLoc DL,
   SDValue RegMask = DAG.getRegisterMask(TRI->getSMStartStopCallPreservedMask());
   SDValue MSROp =
       DAG.getTargetConstant((int32_t)AArch64SVCR::SVCRSM, DL, MVT::i32);
-  SDValue ConditionOp = DAG.getTargetConstant(Condition, DL, MVT::i64);
-  SmallVector<SDValue> Ops = {Chain, MSROp, ConditionOp};
+  SmallVector<SDValue> Ops = {Chain, MSROp};
+  unsigned Opcode;
   if (Condition != AArch64SME::Always) {
+    SDValue ConditionOp = DAG.getTargetConstant(Condition, DL, MVT::i64);
+    Opcode = Enable ? AArch64ISD::COND_SMSTART : AArch64ISD::COND_SMSTOP;
     assert(PStateSM && "PStateSM should be defined");
+    Ops.push_back(ConditionOp);
     Ops.push_back(PStateSM);
+  } else {
+    Opcode = Enable ? AArch64ISD::SMSTART : AArch64ISD::SMSTOP;
   }
   Ops.push_back(RegMask);
 
   if (InGlue)
     Ops.push_back(InGlue);
 
-  unsigned Opcode = Enable ? AArch64ISD::SMSTART : AArch64ISD::SMSTOP;
   return DAG.getNode(Opcode, DL, DAG.getVTList(MVT::Other, MVT::Glue), Ops);
 }
 
@@ -9189,9 +9193,8 @@ AArch64TargetLowering::LowerCall(CallLoweringInfo &CLI,
 
   if (DisableZA)
     Chain = DAG.getNode(
-        AArch64ISD::SMSTOP, DL, MVT::Other, Chain,
-        DAG.getTargetConstant((int32_t)(AArch64SVCR::SVCRZA), DL, MVT::i32),
-        DAG.getConstant(AArch64SME::Always, DL, MVT::i64));
+        AArch64ISD::SMSTOP, DL, DAG.getVTList(MVT::Other, MVT::Glue), Chain,
+        DAG.getTargetConstant((int32_t)(AArch64SVCR::SVCRZA), DL, MVT::i32));
 
   // Adjust the stack pointer for the new arguments...
   // These operations are automatically eliminated by the prolog/epilog pass
@@ -9668,9 +9671,8 @@ AArch64TargetLowering::LowerCall(CallLoweringInfo &CLI,
   if (CallAttrs.requiresEnablingZAAfterCall())
     // Unconditionally resume ZA.
     Result = DAG.getNode(
-        AArch64ISD::SMSTART, DL, MVT::Other, Result,
-        DAG.getTargetConstant((int32_t)(AArch64SVCR::SVCRZA), DL, MVT::i32),
-        DAG.getConstant(AArch64SME::Always, DL, MVT::i64));
+        AArch64ISD::SMSTART, DL, DAG.getVTList(MVT::Other, MVT::Glue), Result,
+        DAG.getTargetConstant((int32_t)(AArch64SVCR::SVCRZA), DL, MVT::i32));
 
   if (ShouldPreserveZT0)
     Result =
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.h b/llvm/lib/Target/AArch64/AArch64ISelLowering.h
index c1e6d70099fa5..59a9d7d179778 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.h
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.h
@@ -73,6 +73,8 @@ enum NodeType : unsigned {
 
   SMSTART,
   SMSTOP,
+  COND_SMSTART,
+  COND_SMSTOP,
   RESTORE_ZA,
   RESTORE_ZT,
   SAVE_ZT,
diff --git a/llvm/lib/Target/AArch64/AArch64SMEInstrInfo.td b/llvm/lib/Target/AArch64/AArch64SMEInstrInfo.td
index 363ecee49c0f2..e7482da001074 100644
--- a/llvm/lib/Target/AArch64/AArch64SMEInstrInfo.td
+++ b/llvm/lib/Target/AArch64/AArch64SMEInstrInfo.td
@@ -10,12 +10,20 @@
 //
 //===----------------------------------------------------------------------===//
 
-def AArch64_smstart : SDNode<"AArch64ISD::SMSTART", SDTypeProfile<0, 2,
-                             [SDTCisInt<0>, SDTCisInt<0>]>,
+def AArch64_smstart : SDNode<"AArch64ISD::SMSTART", SDTypeProfile<0, 1,
+                             [SDTCisInt<0>]>,
                              [SDNPHasChain, SDNPSideEffect, SDNPVariadic,
                               SDNPOptInGlue, SDNPOutGlue]>;
-def AArch64_smstop  : SDNode<"AArch64ISD::SMSTOP", SDTypeProfile<0, 2,
-                             [SDTCisInt<0>, SDTCisInt<0>]>,
+def AArch64_smstop  : SDNode<"AArch64ISD::SMSTOP", SDTypeProfile<0, 1,
+                             [SDTCisInt<0>]>,
+                             [SDNPHasChain, SDNPSideEffect, SDNPVariadic,
+                              SDNPOptInGlue, SDNPOutGlue]>;
+def AArch64_cond_smstart : SDNode<"AArch64ISD::COND_SMSTART", SDTypeProfile<0, 3,
+                             [SDTCisInt<0>, SDTCisInt<1>, SDTCisInt<2>]>,
+                             [SDNPHasChain, SDNPSideEffect, SDNPVariadic,
+                              SDNPOptInGlue, SDNPOutGlue]>;
+def AArch64_cond_smstop  : SDNode<"AArch64ISD::COND_SMSTOP", SDTypeProfile<0, 3,
+                             [SDTCisInt<0>, SDTCisInt<1>, SDTCisInt<2>]>,
                              [SDNPHasChain, SDNPSideEffect, SDNPVariadic,
                               SDNPOptInGlue, SDNPOutGlue]>;
 def AArch64_restore_za : SDNode<"AArch64ISD::RESTORE_ZA", SDTypeProfile<0, 3,
@@ -305,15 +313,15 @@ def MSRpstatePseudo :
   let Defs = [VG];
 }
 
-def : Pat<(AArch64_smstart (i32 svcr_op:$pstate), (i64 timm0_31:$condition)),
-          (MSRpstatePseudo svcr_op:$pstate, 0b1, timm0_31:$condition)>;
-def : Pat<(AArch64_smstop (i32 svcr_op:$pstate), (i64 timm0_31:$condition)),
-          (MSRpstatePseudo svcr_op:$pstate, 0b0, timm0_31:$condition)>;
+def : Pat<(AArch64_cond_smstart (i32 svcr_op:$pstate), (i64 timm0_31:$condition), (i64 GPR64:$pstatesm)),
+          (MSRpstatePseudo svcr_op:$pstate, 0b1, timm0_31:$condition, GPR64:$pstatesm)>;
+def : Pat<(AArch64_cond_smstop (i32 svcr_op:$pstate), (i64 timm0_31:$condition), (i64 GPR64:$pstatesm)),
+          (MSRpstatePseudo svcr_op:$pstate, 0b0, timm0_31:$condition, GPR64:$pstatesm)>;
 
 // Unconditional start/stop
-def : Pat<(AArch64_smstart (i32 svcr_op:$pstate), (i64 /*AArch64SME::Always*/0)),
+def : Pat<(AArch64_smstart (i32 svcr_op:$pstate)),
           (MSRpstatesvcrImm1 svcr_op:$pstate, 0b1)>;
-def : Pat<(AArch64_smstop (i32 svcr_op:$pstate), (i64 /*AArch64SME::Always*/0)),
+def : Pat<(AArch64_smstop (i32 svcr_op:$pstate)),
           (MSRpstatesvcrImm1 svcr_op:$pstate, 0b0)>;
 
 

@MacDue MacDue merged commit 1a08aa2 into llvm:main May 21, 2025
14 checks passed
@MacDue MacDue deleted the split_smstop_start branch May 21, 2025 09:06
@llvm-ci
Copy link
Collaborator

llvm-ci commented May 21, 2025

LLVM Buildbot has detected a new failure on builder llvm-clang-x86_64-sie-win running on sie-win-worker while building llvm at step 4 "clean-build-dir".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/46/builds/17130

Here is the relevant piece of the build log for the reference
Step 4 (clean-build-dir) failure: Delete failed. (failure)
Step 6 (build-unified-tree) failure: build (failure)
...
[425/4925] Building CXX object tools\clang\utils\TableGen\CMakeFiles\clang-tblgen.dir\RISCVVEmitter.cpp.obj
[426/4925] Copying clang's avx10_2satcvtintrin.h...
[427/4925] Building CXX object tools\clang\utils\TableGen\CMakeFiles\clang-tblgen.dir\TableGen.cpp.obj
[428/4925] Copying clang's avx2intrin.h...
[429/4925] Building IntrinsicsARM.h...
[430/4925] Copying clang's avx512bf16intrin.h...
[431/4925] Building IntrinsicsAArch64.h...
[432/4925] Copying clang's avx512bitalgintrin.h...
[433/4925] Building IntrinsicsDirectX.h...
[434/4925] Linking CXX executable bin\FileCheck.exe
FAILED: bin/FileCheck.exe 
cmd.exe /C "cd . && "C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\Common7\IDE\CommonExtensions\Microsoft\CMake\CMake\bin\cmake.exe" -E vs_link_exe --intdir=utils\FileCheck\CMakeFiles\FileCheck.dir --rc=C:\PROGRA~2\WI3CF2~1\10\bin\100190~1.0\x64\rc.exe --mt=C:\PROGRA~2\WI3CF2~1\10\bin\100190~1.0\x64\mt.exe --manifests  -- C:\PROGRA~2\MICROS~1\2019\BUILDT~1\VC\Tools\MSVC\1429~1.301\bin\HostX64\x64\link.exe /nologo utils\FileCheck\CMakeFiles\FileCheck.dir\FileCheck.cpp.obj utils\FileCheck\CMakeFiles\FileCheck.dir\__\__\resources\windows_version_resource.rc.res  /out:bin\FileCheck.exe /implib:lib\FileCheck.lib /pdb:bin\FileCheck.pdb /version:0.0 /machine:x64 /STACK:10000000 /INCREMENTAL:NO /subsystem:console  lib\LLVMFileCheck.lib  lib\LLVMSupport.lib  psapi.lib  shell32.lib  ole32.lib  uuid.lib  advapi32.lib  ws2_32.lib  ntdll.lib  delayimp.lib  -delayload:shell32.dll  -delayload:ole32.dll  lib\LLVMDemangle.lib  kernel32.lib user32.lib gdi32.lib winspool.lib shell32.lib ole32.lib oleaut32.lib uuid.lib comdlg32.lib advapi32.lib && cd ."
LINK: command "C:\PROGRA~2\MICROS~1\2019\BUILDT~1\VC\Tools\MSVC\1429~1.301\bin\HostX64\x64\link.exe /nologo utils\FileCheck\CMakeFiles\FileCheck.dir\FileCheck.cpp.obj utils\FileCheck\CMakeFiles\FileCheck.dir\__\__\resources\windows_version_resource.rc.res /out:bin\FileCheck.exe /implib:lib\FileCheck.lib /pdb:bin\FileCheck.pdb /version:0.0 /machine:x64 /STACK:10000000 /INCREMENTAL:NO /subsystem:console lib\LLVMFileCheck.lib lib\LLVMSupport.lib psapi.lib shell32.lib ole32.lib uuid.lib advapi32.lib ws2_32.lib ntdll.lib delayimp.lib -delayload:shell32.dll -delayload:ole32.dll lib\LLVMDemangle.lib kernel32.lib user32.lib gdi32.lib winspool.lib shell32.lib ole32.lib oleaut32.lib uuid.lib comdlg32.lib advapi32.lib /MANIFEST /MANIFESTFILE:bin\FileCheck.exe.manifest" failed (exit code 1104) with the following output:
LINK : fatal error LNK1104: cannot open file 'bin\FileCheck.exe'

[435/4925] Copying clang's avx512bwintrin.h...
[436/4925] Copying clang's avx512cdintrin.h...
[437/4925] Copying clang's avx512dqintrin.h...
[438/4925] Building IntrinsicsBPF.h...
[439/4925] Copying clang's avx512fintrin.h...
[440/4925] Building IntrinsicsAMDGPU.h...
[441/4925] Building IntrinsicsMips.h...
[442/4925] Copying clang's avx512fp16intrin.h...
[443/4925] Building IntrinsicsHexagon.h...
[444/4925] Copying clang's avx512ifmaintrin.h...
[445/4925] Building IntrinsicsNVPTX.h...
[446/4925] Building IntrinsicsLoongArch.h...
[447/4925] Copying clang's avx512ifmavlintrin.h...
[448/4925] Building IntrinsicsR600.h...
[449/4925] Building IntrinsicImpl.inc...
[450/4925] Copying clang's avx512vbmi2intrin.h...
[451/4925] Building IntrinsicsRISCV.h...
[452/4925] Copying clang's avx512vbmiintrin.h...
[453/4925] Copying clang's avx512vbmivlintrin.h...
[454/4925] Copying clang's avx512vlbf16intrin.h...
[455/4925] Copying clang's avx512vlbitalgintrin.h...
[456/4925] Copying clang's avx512vlbwintrin.h...
[457/4925] Building IntrinsicsS390.h...
[458/4925] Copying clang's avx512vlcdintrin.h...
[459/4925] Building IntrinsicsVE.h...
[460/4925] Copying clang's avx512vldqintrin.h...
[461/4925] Copying clang's avx512vlfp16intrin.h...
[462/4925] Building IntrinsicsSPIRV.h...
[463/4925] Copying clang's avx512vlintrin.h...
[464/4925] Building IntrinsicsPowerPC.h...
[465/4925] Building IntrinsicsWebAssembly.h...
[466/4925] Copying clang's avx512vlvbmi2intrin.h...
[467/4925] Building IntrinsicsXCore.h...
[468/4925] Copying clang's avx512vlvnniintrin.h...

MacDue added a commit that referenced this pull request May 28, 2025
This continues s-barannikov's work TableGen-erating SDNode descriptions. 
This takes the initial patch from #119709 and moves documentation and the
rest of the AArch64ISD nodes to TableGen. Some issues were found by the
generated SDNode verification added in this patch. These issues have been 
described and fixed in the following PRs:

- #140706 
- #140711 
- #140713 
- #140715

---------

Co-authored-by: Sergei Barannikov <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants