-
Notifications
You must be signed in to change notification settings - Fork 14.7k
[AArch64] Remove wrong processor feature #151289
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
[AArch64] Remove wrong processor feature #151289
Conversation
f76a5c5
to
30f404e
Compare
@llvm/pr-subscribers-backend-aarch64 Author: Tomer Shafir (tomershafir) ChangesFull diff: https://github.com/llvm/llvm-project/pull/151289.diff 2 Files Affected:
diff --git a/llvm/lib/Target/AArch64/AArch64Processors.td b/llvm/lib/Target/AArch64/AArch64Processors.td
index adc984ad795af..fa037df531028 100644
--- a/llvm/lib/Target/AArch64/AArch64Processors.td
+++ b/llvm/lib/Target/AArch64/AArch64Processors.td
@@ -318,7 +318,6 @@ def TuneAppleA7 : SubtargetFeature<"apple-a7", "ARMProcFamily", "AppleA7",
FeatureFuseAES, FeatureFuseCryptoEOR,
FeatureStorePairSuppress,
FeatureZCRegMoveGPR64,
- FeatureZCRegMoveFPR64,
FeatureZCZeroing,
FeatureZCZeroingFPWorkaround]>;
@@ -332,7 +331,6 @@ def TuneAppleA10 : SubtargetFeature<"apple-a10", "ARMProcFamily", "AppleA10",
FeatureFuseCryptoEOR,
FeatureStorePairSuppress,
FeatureZCRegMoveGPR64,
- FeatureZCRegMoveFPR64,
FeatureZCZeroing]>;
def TuneAppleA11 : SubtargetFeature<"apple-a11", "ARMProcFamily", "AppleA11",
@@ -345,7 +343,6 @@ def TuneAppleA11 : SubtargetFeature<"apple-a11", "ARMProcFamily", "AppleA11",
FeatureFuseCryptoEOR,
FeatureStorePairSuppress,
FeatureZCRegMoveGPR64,
- FeatureZCRegMoveFPR64,
FeatureZCZeroing]>;
def TuneAppleA12 : SubtargetFeature<"apple-a12", "ARMProcFamily", "AppleA12",
@@ -358,7 +355,6 @@ def TuneAppleA12 : SubtargetFeature<"apple-a12", "ARMProcFamily", "AppleA12",
FeatureFuseCryptoEOR,
FeatureStorePairSuppress,
FeatureZCRegMoveGPR64,
- FeatureZCRegMoveFPR64,
FeatureZCZeroing]>;
def TuneAppleA13 : SubtargetFeature<"apple-a13", "ARMProcFamily", "AppleA13",
@@ -371,7 +367,6 @@ def TuneAppleA13 : SubtargetFeature<"apple-a13", "ARMProcFamily", "AppleA13",
FeatureFuseCryptoEOR,
FeatureStorePairSuppress,
FeatureZCRegMoveGPR64,
- FeatureZCRegMoveFPR64,
FeatureZCZeroing]>;
def TuneAppleA14 : SubtargetFeature<"apple-a14", "ARMProcFamily", "AppleA14",
@@ -389,7 +384,6 @@ def TuneAppleA14 : SubtargetFeature<"apple-a14", "ARMProcFamily", "AppleA14",
FeatureFuseLiterals,
FeatureStorePairSuppress,
FeatureZCRegMoveGPR64,
- FeatureZCRegMoveFPR64,
FeatureZCZeroing]>;
def TuneAppleA15 : SubtargetFeature<"apple-a15", "ARMProcFamily", "AppleA15",
@@ -407,7 +401,6 @@ def TuneAppleA15 : SubtargetFeature<"apple-a15", "ARMProcFamily", "AppleA15",
FeatureFuseLiterals,
FeatureStorePairSuppress,
FeatureZCRegMoveGPR64,
- FeatureZCRegMoveFPR64,
FeatureZCZeroing]>;
def TuneAppleA16 : SubtargetFeature<"apple-a16", "ARMProcFamily", "AppleA16",
@@ -425,7 +418,6 @@ def TuneAppleA16 : SubtargetFeature<"apple-a16", "ARMProcFamily", "AppleA16",
FeatureFuseLiterals,
FeatureStorePairSuppress,
FeatureZCRegMoveGPR64,
- FeatureZCRegMoveFPR64,
FeatureZCZeroing]>;
def TuneAppleA17 : SubtargetFeature<"apple-a17", "ARMProcFamily", "AppleA17",
@@ -443,7 +435,6 @@ def TuneAppleA17 : SubtargetFeature<"apple-a17", "ARMProcFamily", "AppleA17",
FeatureFuseLiterals,
FeatureStorePairSuppress,
FeatureZCRegMoveGPR64,
- FeatureZCRegMoveFPR64,
FeatureZCZeroing]>;
def TuneAppleM4 : SubtargetFeature<"apple-m4", "ARMProcFamily", "AppleM4",
@@ -460,7 +451,6 @@ def TuneAppleM4 : SubtargetFeature<"apple-m4", "ARMProcFamily", "AppleM4",
FeatureFuseCryptoEOR,
FeatureFuseLiterals,
FeatureZCRegMoveGPR64,
- FeatureZCRegMoveFPR64,
FeatureZCZeroing
]>;
diff --git a/llvm/test/CodeGen/AArch64/arm64-zero-cycle-regmov-fpr.ll b/llvm/test/CodeGen/AArch64/arm64-zero-cycle-regmov-fpr.ll
deleted file mode 100644
index f422f96f33495..0000000000000
--- a/llvm/test/CodeGen/AArch64/arm64-zero-cycle-regmov-fpr.ll
+++ /dev/null
@@ -1,103 +0,0 @@
-; RUN: llc < %s -mtriple=arm64-linux-gnu | FileCheck %s -check-prefixes=NOTCPU-LINUX --match-full-lines
-; RUN: llc < %s -mtriple=arm64-apple-macosx -mcpu=generic | FileCheck %s -check-prefixes=NOTCPU-APPLE --match-full-lines
-; RUN: llc < %s -mtriple=arm64-apple-macosx -mcpu=apple-m1 | FileCheck %s -check-prefixes=CPU --match-full-lines
-; RUN: llc < %s -mtriple=arm64-apple-macosx -mcpu=apple-m1 -mattr=-zcm-fpr64 | FileCheck %s -check-prefixes=NOTATTR --match-full-lines
-; RUN: llc < %s -mtriple=arm64-apple-macosx -mattr=+zcm-fpr64 | FileCheck %s -check-prefixes=ATTR --match-full-lines
-
-define void @zero_cycle_regmov_FPR32(float %a, float %b, float %c, float %d) {
-entry:
-; CHECK-LABEL: t:
-; NOTCPU-LINUX: fmov s0, s2
-; NOTCPU-LINUX: fmov s1, s3
-; NOTCPU-LINUX: fmov [[REG2:s[0-9]+]], s3
-; NOTCPU-LINUX: fmov [[REG1:s[0-9]+]], s2
-; NOTCPU-LINUX-NEXT: bl {{_?foo_float}}
-; NOTCPU-LINUX: fmov s0, [[REG1]]
-; NOTCPU-LINUX: fmov s1, [[REG2]]
-
-; NOTCPU-APPLE: fmov s0, s2
-; NOTCPU-APPLE: fmov s1, s3
-; NOTCPU-APPLE: fmov [[REG2:s[0-9]+]], s3
-; NOTCPU-APPLE: fmov [[REG1:s[0-9]+]], s2
-; NOTCPU-APPLE-NEXT: bl {{_?foo_float}}
-; NOTCPU-APPLE: fmov s0, [[REG1]]
-; NOTCPU-APPLE: fmov s1, [[REG2]]
-
-; CPU: fmov [[REG2:d[0-9]+]], d3
-; CPU: fmov [[REG1:d[0-9]+]], d2
-; CPU: fmov d0, d2
-; CPU: fmov d1, d3
-; CPU-NEXT: bl {{_?foo_float}}
-; CPU: fmov d0, [[REG1]]
-; CPU: fmov d1, [[REG2]]
-
-; NOTATTR: fmov [[REG2:s[0-9]+]], s3
-; NOTATTR: fmov [[REG1:s[0-9]+]], s2
-; NOTATTR: fmov s0, s2
-; NOTATTR: fmov s1, s3
-; NOTATTR-NEXT: bl {{_?foo_float}}
-; NOTATTR: fmov s0, [[REG1]]
-; NOTATTR: fmov s1, [[REG2]]
-
-; ATTR: fmov d0, d2
-; ATTR: fmov d1, d3
-; ATTR: fmov [[REG2:d[0-9]+]], d3
-; ATTR: fmov [[REG1:d[0-9]+]], d2
-; ATTR-NEXT: bl {{_?foo_float}}
-; ATTR: fmov d0, [[REG1]]
-; ATTR: fmov d1, [[REG2]]
- %call = call float @foo_float(float %c, float %d)
- %call1 = call float @foo_float(float %c, float %d)
- unreachable
-}
-
-declare float @foo_float(float, float)
-
-define void @zero_cycle_regmov_FPR16(half %a, half %b, half %c, half %d) {
-entry:
-; CHECK-LABEL: t:
-; NOTCPU-LINUX: fmov s0, s2
-; NOTCPU-LINUX: fmov s1, s3
-; NOTCPU-LINUX: fmov [[REG2:s[0-9]+]], s3
-; NOTCPU-LINUX: fmov [[REG1:s[0-9]+]], s2
-; NOTCPU-LINUX-NEXT: bl {{_?foo_half}}
-; NOTCPU-LINUX: fmov s0, [[REG1]]
-; NOTCPU-LINUX: fmov s1, [[REG2]]
-
-; NOTCPU-APPLE: fmov s0, s2
-; NOTCPU-APPLE: fmov s1, s3
-; NOTCPU-APPLE: fmov [[REG2:s[0-9]+]], s3
-; NOTCPU-APPLE: fmov [[REG1:s[0-9]+]], s2
-; NOTCPU-APPLE-NEXT: bl {{_?foo_half}}
-; NOTCPU-APPLE: fmov s0, [[REG1]]
-; NOTCPU-APPLE: fmov s1, [[REG2]]
-
-; CPU: fmov [[REG2:d[0-9]+]], d3
-; CPU: fmov [[REG1:d[0-9]+]], d2
-; CPU: fmov d0, d2
-; CPU: fmov d1, d3
-; CPU-NEXT: bl {{_?foo_half}}
-; CPU: fmov d0, [[REG1]]
-; CPU: fmov d1, [[REG2]]
-
-; NOTATTR: fmov [[REG2:s[0-9]+]], s3
-; NOTATTR: fmov [[REG1:s[0-9]+]], s2
-; NOTATTR: fmov s0, s2
-; NOTATTR: fmov s1, s3
-; NOTATTR-NEXT: bl {{_?foo_half}}
-; NOTATTR: fmov s0, [[REG1]]
-; NOTATTR: fmov s1, [[REG2]]
-
-; ATTR: fmov d0, d2
-; ATTR: fmov d1, d3
-; ATTR: fmov [[REG2:d[0-9]+]], d3
-; ATTR: fmov [[REG1:d[0-9]+]], d2
-; ATTR-NEXT: bl {{_?foo_half}}
-; ATTR: fmov d0, [[REG1]]
-; ATTR: fmov d1, [[REG2]]
- %call = call half @foo_half(half %c, half %d)
- %call1 = call half @foo_half(half %c, half %d)
- unreachable
-}
-
-declare half @foo_half(half, half)
|
No opinion on the change itself but your PR description does need more detail. A link to the change that added it would be good too. |
30f404e
to
31d6c6f
Compare
@DavidSpickett Done |
Ok, better, but why is the feature wrong? If I do a downstream merge and see that this change made my benchmarks go one way or the other, that's what I'd be looking for in the commit message. Perhaps the intent was good you just want to implement it differently, perhaps you found it made zero difference and doesn't need to be there, whatever the reason is. |
Done @jroelofs Ping |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This sounds OK to me if it how the hardware works and someone from Apple agrees. It might be worth keeping the testing though.
@@ -1,103 +0,0 @@ | |||
; RUN: llc < %s -mtriple=arm64-linux-gnu | FileCheck %s -check-prefixes=NOTCPU-LINUX --match-full-lines |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removing this loses all testing for the feature. Can we keep just the testing for default and +zcm-fpr64?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, gonna keep NOTCPU-APPLE
, NOTCPU-LINUX
, ATTR
`fmov dX, dY` is not a preferred instruction. Previously introduced by: llvm#144152
31d6c6f
to
ae5084a
Compare
I agree with David here, even after the edits. The description in the commit message needs to say "why" we are making the change, as in: why is this feature wrong and not preferred? That's something that often cannot be inferred from the "what" of the rest of the patch. |
fmov dX, dY
is not a preferred instruction.Previously introduced by: #144152