-
Notifications
You must be signed in to change notification settings - Fork 15.3k
[AArch64] Disconnect FeatureUseScalarIncVL from FeatureSVE2. #114568
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@llvm/pr-subscribers-backend-aarch64 Author: David Green (davemgreen) ChangesFeatureUseScalarIncVL is a tuning feature, used to control whether addvl or add+cnt is used. It was previously added as a dependency for FeatureSVE2, an architecture feature but this can be seen as a layering violation. The main disadvantage is that -use-scalar-inc-vl cannot be used without disabling sve2 and all dependant features. This patch attempt to break that, enabling the feature if SVE2 or SME is present and the use-scalar-inc-vl is not otherwise specified, but not requiring a hard dependency between the two. (Unfortunately I had to check if the value was present in the features string to check that). It also inverts the meaning of the feature to DontUseScalarIncVL, so that the default state is to use addvl, hopefully the more sensible default going forward. Full diff: https://github.com/llvm/llvm-project/pull/114568.diff 6 Files Affected:
diff --git a/llvm/lib/Target/AArch64/AArch64Features.td b/llvm/lib/Target/AArch64/AArch64Features.td
index 642976cd3ea076..29da5e700854b5 100644
--- a/llvm/lib/Target/AArch64/AArch64Features.td
+++ b/llvm/lib/Target/AArch64/AArch64Features.td
@@ -358,12 +358,9 @@ def FeatureTHE : ExtensionWithMArch<"the", "THE", "FEAT_THE",
// Armv9.0 Architecture Extensions
//===----------------------------------------------------------------------===//
-def FeatureUseScalarIncVL : SubtargetFeature<"use-scalar-inc-vl",
- "UseScalarIncVL", "true", "Prefer inc/dec over add+cnt">;
-
def FeatureSVE2 : ExtensionWithMArch<"sve2", "SVE2", "FEAT_SVE2",
"Enable Scalable Vector Extension 2 (SVE2) instructions",
- [FeatureSVE, FeatureUseScalarIncVL]>;
+ [FeatureSVE]>;
def FeatureSVE2AES : ExtensionWithMArch<"sve2-aes", "SVE2AES",
"FEAT_SVE_AES, FEAT_SVE_PMULL128",
@@ -403,7 +400,7 @@ def FeatureRME : Extension<"rme", "RME", "FEAT_RME",
"Enable Realm Management Extension">;
def FeatureSME : ExtensionWithMArch<"sme", "SME", "FEAT_SME",
- "Enable Scalable Matrix Extension (SME)", [FeatureBF16, FeatureUseScalarIncVL]>;
+ "Enable Scalable Matrix Extension (SME)", [FeatureBF16]>;
def FeatureSMEF64F64 : ExtensionWithMArch<"sme-f64f64", "SMEF64F64", "FEAT_SME_F64F64",
"Enable Scalable Matrix Extension (SME) F64F64 instructions", [FeatureSME]>;
@@ -797,6 +794,9 @@ def FeatureUseFixedOverScalableIfEqualCost: SubtargetFeature<"use-fixed-over-sca
"UseFixedOverScalableIfEqualCost", "true",
"Prefer fixed width loop vectorization over scalable if the cost-model assigns equal costs">;
+def FeatureDontUseScalarIncVL : SubtargetFeature<"dont-use-scalar-inc-vl",
+ "DontUseScalarIncVL", "true", "Prefer add+cnt over inc/dec">;
+
//===----------------------------------------------------------------------===//
// Architectures.
//
diff --git a/llvm/lib/Target/AArch64/AArch64InstrInfo.td b/llvm/lib/Target/AArch64/AArch64InstrInfo.td
index 424848252f6aa6..ff11224842e7bf 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrInfo.td
+++ b/llvm/lib/Target/AArch64/AArch64InstrInfo.td
@@ -375,9 +375,9 @@ def UseNegativeImmediates
: Predicate<"false">, AssemblerPredicate<(all_of (not FeatureNoNegativeImmediates)),
"NegativeImmediates">;
-def UseScalarIncVL : Predicate<"Subtarget->useScalarIncVL()">;
+def UseScalarIncVL : Predicate<"!Subtarget->dontUseScalarIncVL()">;
-def NoUseScalarIncVL : Predicate<"!Subtarget->useScalarIncVL()">;
+def NoUseScalarIncVL : Predicate<"Subtarget->dontUseScalarIncVL()">;
def UseSVEFPLD1R : Predicate<"!Subtarget->noSVEFPLD1R()">;
diff --git a/llvm/lib/Target/AArch64/AArch64Subtarget.cpp b/llvm/lib/Target/AArch64/AArch64Subtarget.cpp
index 736d57e6ae2fd9..09dff6265cbacb 100644
--- a/llvm/lib/Target/AArch64/AArch64Subtarget.cpp
+++ b/llvm/lib/Target/AArch64/AArch64Subtarget.cpp
@@ -335,6 +335,13 @@ void AArch64Subtarget::initializeProperties(bool HasMinSize) {
if (AArch64MinimumJumpTableEntries.getNumOccurrences() > 0 || !HasMinSize)
MinimumJumpTableEntries = AArch64MinimumJumpTableEntries;
+
+ // If only SVE is present (not SVE2 or SME) and DontUseScalarIncVL is not
+ // otherwise set, enable it by default.
+ if (!hasSVE2() && !hasSME()) {
+ if (!getFeatureString().contains("dont-use-scalar-inc-vl"))
+ DontUseScalarIncVL = true;
+ }
}
AArch64Subtarget::AArch64Subtarget(const Triple &TT, StringRef CPU,
diff --git a/llvm/test/CodeGen/AArch64/sve-intrinsics-counting-elems-i32.ll b/llvm/test/CodeGen/AArch64/sve-intrinsics-counting-elems-i32.ll
index 5062a43da931f8..1855d3673f92c7 100644
--- a/llvm/test/CodeGen/AArch64/sve-intrinsics-counting-elems-i32.ll
+++ b/llvm/test/CodeGen/AArch64/sve-intrinsics-counting-elems-i32.ll
@@ -1,6 +1,6 @@
; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve < %s | FileCheck %s -check-prefix=NO_SCALAR_INC
-; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve -mattr=+use-scalar-inc-vl < %s | FileCheck %s
+; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve -mattr=-dont-use-scalar-inc-vl < %s | FileCheck %s
; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve2 < %s | FileCheck %s
; INCB
diff --git a/llvm/test/CodeGen/AArch64/sve-intrinsics-counting-elems.ll b/llvm/test/CodeGen/AArch64/sve-intrinsics-counting-elems.ll
index 8f0a9eac87c27c..bf812823e86a99 100644
--- a/llvm/test/CodeGen/AArch64/sve-intrinsics-counting-elems.ll
+++ b/llvm/test/CodeGen/AArch64/sve-intrinsics-counting-elems.ll
@@ -1,8 +1,10 @@
; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve < %s | FileCheck %s
-; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve -mattr=+use-scalar-inc-vl < %s | FileCheck %s -check-prefix=USE_SCALAR_INC
+; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve -mattr=-dont-use-scalar-inc-vl < %s | FileCheck %s -check-prefix=USE_SCALAR_INC
; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve2 < %s | FileCheck %s -check-prefix=USE_SCALAR_INC
; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sme -force-streaming < %s | FileCheck %s -check-prefix=USE_SCALAR_INC
+; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve2 -mattr=+dont-use-scalar-inc-vl < %s | FileCheck %s
+; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sme -mattr=+dont-use-scalar-inc-vl -force-streaming < %s | FileCheck %s
;
; CNTB
diff --git a/llvm/test/CodeGen/AArch64/sve-vl-arith.ll b/llvm/test/CodeGen/AArch64/sve-vl-arith.ll
index de2af590acd1e2..4ddb2fdf56787b 100644
--- a/llvm/test/CodeGen/AArch64/sve-vl-arith.ll
+++ b/llvm/test/CodeGen/AArch64/sve-vl-arith.ll
@@ -1,7 +1,8 @@
; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 2
; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve -verify-machineinstrs < %s | FileCheck %s -check-prefix=NO_SCALAR_INC
-; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve -mattr=+use-scalar-inc-vl -verify-machineinstrs < %s | FileCheck %s
+; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve -mattr=-dont-use-scalar-inc-vl -verify-machineinstrs < %s | FileCheck %s
; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve2 -verify-machineinstrs < %s | FileCheck %s
+; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve2 -mattr=+dont-use-scalar-inc-vl -verify-machineinstrs < %s | FileCheck %s -check-prefix=NO_SCALAR_INC
define <vscale x 8 x i16> @inch_vec(<vscale x 8 x i16> %a) {
; NO_SCALAR_INC-LABEL: inch_vec:
|
| "UseFixedOverScalableIfEqualCost", "true", | ||
| "Prefer fixed width loop vectorization over scalable if the cost-model assigns equal costs">; | ||
|
|
||
| def FeatureDontUseScalarIncVL : SubtargetFeature<"dont-use-scalar-inc-vl", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you implement this without changing the feature name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that should be OK. It should at least make the patch simple and I think just involves inverting all the logic, so turning the option on when we have SVE2/SME (as opposed to turning it off when we have only SVE-1). Having it default to addvl felt like a better default. Do you have a particular reason to not want to change it?
(I'm not sure if dont-use-scalar-inc-vl is the best name, to be honest. It might be better to have something like preferAddCntToAddvl to avoid the negatives and make it a little more explicit)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found the double negative in -mattr=-dont-use-scalar-inc-vl confusing and the current logic is a bit simpler (although I'm not particularly happy with the explicit partial string matching, which I guess is the downside of doing it this way). A name change would avoid the double negative, but thinking about this; why don't we just make it an LLVM option rather than a feature?
That way, we can do something like:
if (ForceScalarIncVL.getNumOccurrences())
UseScalarIncVL = ForceScalarIncVL.getValue();
else if (hasSVE2() || hasSME())
UseScalarIncVL = true;
// else UseScalarIncVL = false; (would be default)
We can still make use of the UseScalarIncVL Predicate in TableGen, it just wouldn't be defined through a target feature.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah the string matching isn't the greatest. We only do it once though, and I believe this is the only way to check if a feature was specified. I assume that it was made a target feature so that we could add it to certain cpus if it is useful. I'm not 100% sure we won't find that would still be useful in some cases, but we have progressed a bit in time now and it hasn't come up yet.
3f55cc6 to
08b4b33
Compare
FeatureUseScalarIncVL is a tuning feature, used to control whether addvl or add+cnt is used. It was previously added as a dependency for FeatureSVE2, an architecture feature but this can be seen as a layering violation. The main disadvantage is that -use-scalar-inc-vl cannot be used without disabling sve2 and all dependant features. This patch now replaces that with an option that if unset defaults to hasSVE || hasSME, but is otherwise overriden by the option. The hope is that no cpus will rely on the tuning feature (or we can readdit if needed.
08b4b33 to
c8c4fc2
Compare
| return DefaultSVETFOpts; | ||
| } | ||
|
|
||
| bool useScalarIncVL() const; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: maybe add a comment to explain what useScalarIncVL means?
| // If SVE2 or SME is present (we are not SVE-1 only) and UseScalarIncVL | ||
| // is not otherwise set, enable it by default. | ||
| if (UseScalarIncVL.getNumOccurrences()) | ||
| return UseScalarIncVL; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd still recommend creating a separate variable for it in the Subtarget that gets initialised with the code you have here. That way a CPU could possibly override it in AArch64Subtarget::initializeProperties (if there is ever a need for that) and then the call to useScalarIncVL() can be inlined rather than ending up as a call. But this is just me nitpicking a bit :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks - We would probably want go the other way and reintroduce the target features if we needed it for a specific CPU. The preferred way is for tuning features to control the options, which are applied to given cpus that they are relevant to.
|
This was 1d6d073. |
FeatureUseScalarIncVL is a tuning feature, used to control whether addvl or add+cnt is used. It was previously added as a dependency for FeatureSVE2, an architecture feature but this can be seen as a layering violation. The main disadvantage is that -use-scalar-inc-vl cannot be used without disabling sve2 and all dependant features.
This patch attempt to break that, enabling the feature if SVE2 or SME is present and the use-scalar-inc-vl is not otherwise specified, but not requiring a hard dependency between the two. (Unfortunately I had to check if the value was present in the features string to check that).
It also inverts the meaning of the feature to DontUseScalarIncVL, so that the default state is to use addvl, hopefully the more sensible default going forward.