Conversation

@davemgreen
Collaborator

FeatureUseScalarIncVL is a tuning feature, used to control whether addvl or add+cnt is used. It was previously added as a dependency of FeatureSVE2, an architecture feature, but this can be seen as a layering violation. The main disadvantage is that -use-scalar-inc-vl cannot be used without disabling sve2 and all dependent features.

This patch attempts to break that dependency, enabling the feature when SVE2 or SME is present and use-scalar-inc-vl is not otherwise specified, without requiring a hard dependency between the two. (Unfortunately, checking whether the feature was otherwise specified required searching the feature string.)

It also inverts the meaning of the feature to DontUseScalarIncVL, so that the default state is to use addvl, hopefully the more sensible default going forward.
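For readers unfamiliar with the trade-off, here is a rough sketch (operand choices are illustrative, not taken from the patch) of the two sequences for advancing a pointer by one SVE vector's worth of bytes:

```asm
// With scalar inc/dec preferred: a single instruction.
addvl   x0, x0, #1        // x0 += 1 * VL (in bytes)

// With add+cnt: materialise the vector length first.
cntb    x8                // x8 = number of bytes in one SVE vector
add     x0, x0, x8        // x0 += VL (in bytes)
```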

@llvmbot
Member

llvmbot commented Nov 1, 2024

@llvm/pr-subscribers-backend-aarch64

Author: David Green (davemgreen)

Changes



Full diff: https://github.com/llvm/llvm-project/pull/114568.diff

6 Files Affected:

  • (modified) llvm/lib/Target/AArch64/AArch64Features.td (+5-5)
  • (modified) llvm/lib/Target/AArch64/AArch64InstrInfo.td (+2-2)
  • (modified) llvm/lib/Target/AArch64/AArch64Subtarget.cpp (+7)
  • (modified) llvm/test/CodeGen/AArch64/sve-intrinsics-counting-elems-i32.ll (+1-1)
  • (modified) llvm/test/CodeGen/AArch64/sve-intrinsics-counting-elems.ll (+3-1)
  • (modified) llvm/test/CodeGen/AArch64/sve-vl-arith.ll (+2-1)
diff --git a/llvm/lib/Target/AArch64/AArch64Features.td b/llvm/lib/Target/AArch64/AArch64Features.td
index 642976cd3ea076..29da5e700854b5 100644
--- a/llvm/lib/Target/AArch64/AArch64Features.td
+++ b/llvm/lib/Target/AArch64/AArch64Features.td
@@ -358,12 +358,9 @@ def FeatureTHE : ExtensionWithMArch<"the", "THE", "FEAT_THE",
 //  Armv9.0 Architecture Extensions
 //===----------------------------------------------------------------------===//
 
-def FeatureUseScalarIncVL : SubtargetFeature<"use-scalar-inc-vl",
-  "UseScalarIncVL", "true", "Prefer inc/dec over add+cnt">;
-
 def FeatureSVE2 : ExtensionWithMArch<"sve2", "SVE2", "FEAT_SVE2",
   "Enable Scalable Vector Extension 2 (SVE2) instructions",
-  [FeatureSVE, FeatureUseScalarIncVL]>;
+  [FeatureSVE]>;
 
 def FeatureSVE2AES : ExtensionWithMArch<"sve2-aes", "SVE2AES",
   "FEAT_SVE_AES, FEAT_SVE_PMULL128",
@@ -403,7 +400,7 @@ def FeatureRME : Extension<"rme", "RME", "FEAT_RME",
   "Enable Realm Management Extension">;
 
 def FeatureSME : ExtensionWithMArch<"sme", "SME", "FEAT_SME",
-  "Enable Scalable Matrix Extension (SME)", [FeatureBF16, FeatureUseScalarIncVL]>;
+  "Enable Scalable Matrix Extension (SME)", [FeatureBF16]>;
 
 def FeatureSMEF64F64 : ExtensionWithMArch<"sme-f64f64", "SMEF64F64", "FEAT_SME_F64F64",
   "Enable Scalable Matrix Extension (SME) F64F64 instructions", [FeatureSME]>;
@@ -797,6 +794,9 @@ def FeatureUseFixedOverScalableIfEqualCost: SubtargetFeature<"use-fixed-over-sca
   "UseFixedOverScalableIfEqualCost", "true",
   "Prefer fixed width loop vectorization over scalable if the cost-model assigns equal costs">;
 
+def FeatureDontUseScalarIncVL : SubtargetFeature<"dont-use-scalar-inc-vl",
+  "DontUseScalarIncVL", "true", "Prefer add+cnt over inc/dec">;
+
 //===----------------------------------------------------------------------===//
 // Architectures.
 //
diff --git a/llvm/lib/Target/AArch64/AArch64InstrInfo.td b/llvm/lib/Target/AArch64/AArch64InstrInfo.td
index 424848252f6aa6..ff11224842e7bf 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrInfo.td
+++ b/llvm/lib/Target/AArch64/AArch64InstrInfo.td
@@ -375,9 +375,9 @@ def UseNegativeImmediates
     : Predicate<"false">, AssemblerPredicate<(all_of (not FeatureNoNegativeImmediates)),
                                              "NegativeImmediates">;
 
-def UseScalarIncVL : Predicate<"Subtarget->useScalarIncVL()">;
+def UseScalarIncVL : Predicate<"!Subtarget->dontUseScalarIncVL()">;
 
-def NoUseScalarIncVL : Predicate<"!Subtarget->useScalarIncVL()">;
+def NoUseScalarIncVL : Predicate<"Subtarget->dontUseScalarIncVL()">;
 
 def UseSVEFPLD1R : Predicate<"!Subtarget->noSVEFPLD1R()">;
 
diff --git a/llvm/lib/Target/AArch64/AArch64Subtarget.cpp b/llvm/lib/Target/AArch64/AArch64Subtarget.cpp
index 736d57e6ae2fd9..09dff6265cbacb 100644
--- a/llvm/lib/Target/AArch64/AArch64Subtarget.cpp
+++ b/llvm/lib/Target/AArch64/AArch64Subtarget.cpp
@@ -335,6 +335,13 @@ void AArch64Subtarget::initializeProperties(bool HasMinSize) {
 
   if (AArch64MinimumJumpTableEntries.getNumOccurrences() > 0 || !HasMinSize)
     MinimumJumpTableEntries = AArch64MinimumJumpTableEntries;
+
+  // If only SVE is present (not SVE2 or SME) and DontUseScalarIncVL is not
+  // otherwise set, enable it by default.
+  if (!hasSVE2() && !hasSME()) {
+    if (!getFeatureString().contains("dont-use-scalar-inc-vl"))
+      DontUseScalarIncVL = true;
+  }
 }
 
 AArch64Subtarget::AArch64Subtarget(const Triple &TT, StringRef CPU,
diff --git a/llvm/test/CodeGen/AArch64/sve-intrinsics-counting-elems-i32.ll b/llvm/test/CodeGen/AArch64/sve-intrinsics-counting-elems-i32.ll
index 5062a43da931f8..1855d3673f92c7 100644
--- a/llvm/test/CodeGen/AArch64/sve-intrinsics-counting-elems-i32.ll
+++ b/llvm/test/CodeGen/AArch64/sve-intrinsics-counting-elems-i32.ll
@@ -1,6 +1,6 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
 ; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve < %s | FileCheck %s -check-prefix=NO_SCALAR_INC
-; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve -mattr=+use-scalar-inc-vl < %s | FileCheck %s
+; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve -mattr=-dont-use-scalar-inc-vl < %s | FileCheck %s
 ; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve2 < %s | FileCheck %s
 
 ; INCB
diff --git a/llvm/test/CodeGen/AArch64/sve-intrinsics-counting-elems.ll b/llvm/test/CodeGen/AArch64/sve-intrinsics-counting-elems.ll
index 8f0a9eac87c27c..bf812823e86a99 100644
--- a/llvm/test/CodeGen/AArch64/sve-intrinsics-counting-elems.ll
+++ b/llvm/test/CodeGen/AArch64/sve-intrinsics-counting-elems.ll
@@ -1,8 +1,10 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
 ; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve < %s | FileCheck %s
-; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve -mattr=+use-scalar-inc-vl < %s | FileCheck %s -check-prefix=USE_SCALAR_INC
+; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve -mattr=-dont-use-scalar-inc-vl < %s | FileCheck %s -check-prefix=USE_SCALAR_INC
 ; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve2 < %s | FileCheck %s -check-prefix=USE_SCALAR_INC
 ; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sme -force-streaming < %s | FileCheck %s -check-prefix=USE_SCALAR_INC
+; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve2 -mattr=+dont-use-scalar-inc-vl < %s | FileCheck %s
+; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sme -mattr=+dont-use-scalar-inc-vl -force-streaming < %s | FileCheck %s
 
 ;
 ; CNTB
diff --git a/llvm/test/CodeGen/AArch64/sve-vl-arith.ll b/llvm/test/CodeGen/AArch64/sve-vl-arith.ll
index de2af590acd1e2..4ddb2fdf56787b 100644
--- a/llvm/test/CodeGen/AArch64/sve-vl-arith.ll
+++ b/llvm/test/CodeGen/AArch64/sve-vl-arith.ll
@@ -1,7 +1,8 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 2
 ; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve -verify-machineinstrs < %s | FileCheck %s -check-prefix=NO_SCALAR_INC
-; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve -mattr=+use-scalar-inc-vl -verify-machineinstrs < %s | FileCheck %s
+; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve -mattr=-dont-use-scalar-inc-vl -verify-machineinstrs < %s | FileCheck %s
 ; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve2 -verify-machineinstrs < %s | FileCheck %s
+; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve2 -mattr=+dont-use-scalar-inc-vl -verify-machineinstrs < %s | FileCheck %s -check-prefix=NO_SCALAR_INC
 
 define <vscale x 8 x i16> @inch_vec(<vscale x 8 x i16> %a) {
 ; NO_SCALAR_INC-LABEL: inch_vec:

def FeatureDontUseScalarIncVL : SubtargetFeature<"dont-use-scalar-inc-vl",
Collaborator

Can you implement this without changing the feature name?

Collaborator Author

Yeah, that should be OK. It should at least make the patch simpler, and I think it just involves inverting all the logic, i.e. turning the option on when we have SVE2/SME (as opposed to turning it off when we only have SVE1). Having it default to addvl felt like a better default. Do you have a particular reason not to want to change it?

(I'm not sure dont-use-scalar-inc-vl is the best name, to be honest. It might be better to have something like preferAddCntToAddvl to avoid the negatives and make it a little more explicit.)

Collaborator

I found the double negative in -mattr=-dont-use-scalar-inc-vl confusing, and the current logic is a bit simpler (although I'm not particularly happy with the explicit partial string matching, which I guess is the downside of doing it this way). A name change would avoid the double negative, but thinking about this: why don't we just make it an LLVM option rather than a feature?

That way, we can do something like:

if (ForceScalarIncVL.getNumOccurrences())
  UseScalarIncVL = ForceScalarIncVL.getValue();
else if (hasSVE2() || hasSME())
  UseScalarIncVL = true;
// else UseScalarIncVL = false; (would be default)

We can still make use of the UseScalarIncVL Predicate in TableGen, it just wouldn't be defined through a target feature.
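To make the suggested precedence concrete, here is a minimal, illustrative-only mock (it stands in for llvm::cl::opt, which is not reproduced here): an explicit command-line occurrence wins, otherwise the architecture-based default applies.

```cpp
// Mock of just enough of the cl::opt interface to show the precedence.
// Names and layout are illustrative, not the actual LLVM types.
struct MockForceOption {
  bool Value = false;
  int Occurrences = 0; // how many times the flag appeared on the command line
  int getNumOccurrences() const { return Occurrences; }
  bool getValue() const { return Value; }
};

// Mirrors the snippet in the comment above: explicit override beats default.
bool computeUseScalarIncVL(const MockForceOption &Force, bool HasSVE2,
                           bool HasSME) {
  if (Force.getNumOccurrences())
    return Force.getValue();  // explicitly forced on or off
  return HasSVE2 || HasSME;   // default: on for SVE2/SME, off for SVE-only
}
```

With this shape, the UseScalarIncVL predicate keeps working unchanged; only the source of truth moves from a subtarget feature to an option.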

Collaborator Author

Yeah, the string matching isn't the greatest. We only do it once though, and I believe it is the only way to check whether a feature was specified. I assume it was made a target feature so that we could enable it for certain CPUs if it proved useful. I'm not 100% sure we won't still find that useful in some cases, but we have progressed a bit in time now and it hasn't come up yet.

@davemgreen davemgreen force-pushed the gh-a64-DontUseScalarIncVL branch from 3f55cc6 to 08b4b33 Compare November 4, 2024 13:22
FeatureUseScalarIncVL is a tuning feature, used to control whether addvl or
add+cnt is used. It was previously added as a dependency of FeatureSVE2, an
architecture feature, but this can be seen as a layering violation. The main
disadvantage is that -use-scalar-inc-vl cannot be used without disabling sve2
and all dependent features.

This patch now replaces that with an option that, if unset, defaults to
hasSVE2 || hasSME, but is otherwise overridden by the option. The hope is that
no CPUs will rely on the tuning feature (or we can re-add it if needed).
@davemgreen davemgreen force-pushed the gh-a64-DontUseScalarIncVL branch from 08b4b33 to c8c4fc2 Compare November 6, 2024 15:51

bool useScalarIncVL() const;
Collaborator

nit: maybe add a comment to explain what useScalarIncVL means?

// If SVE2 or SME is present (we are not SVE-1 only) and UseScalarIncVL
// is not otherwise set, enable it by default.
if (UseScalarIncVL.getNumOccurrences())
return UseScalarIncVL;
Collaborator

I'd still recommend creating a separate variable for it in the Subtarget that gets initialised with the code you have here. That way a CPU could override it in AArch64Subtarget::initializeProperties (if there is ever a need for that), and the call to useScalarIncVL() can be inlined rather than ending up as a call. But this is just me nitpicking a bit :)
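As a sketch of that suggestion (class and member names here are made up for illustration, not the actual AArch64Subtarget members): compute the decision once during subtarget initialization, so the getter is a trivial, inlinable read with no option lookup on the hot path.

```cpp
// Illustrative-only stand-in for caching the decision in the subtarget.
class SubtargetSketch {
  bool CachedUseScalarIncVL = false;

public:
  // OptSet/OptValue stand in for "the override option was given, with this
  // value"; a CPU-specific tweak could also adjust the cached value here.
  void initializeProperties(bool OptSet, bool OptValue, bool HasSVE2,
                            bool HasSME) {
    // Explicit option wins; otherwise default on for SVE2/SME.
    CachedUseScalarIncVL = OptSet ? OptValue : (HasSVE2 || HasSME);
  }

  // Cheap accessor; a plain member load, easy to inline.
  bool useScalarIncVL() const { return CachedUseScalarIncVL; }
};
```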

Collaborator Author

Thanks - We would probably want to go the other way and reintroduce the target feature if we needed it for a specific CPU. The preferred way is for tuning features to control the options, which are then applied to the CPUs they are relevant to.

@davemgreen davemgreen closed this Nov 29, 2024
@davemgreen davemgreen deleted the gh-a64-DontUseScalarIncVL branch November 29, 2024 13:39
@davemgreen
Collaborator Author

This was 1d6d073.
