[AArch64] Have isel just do neg directly #145185
@llvm/pr-subscribers-backend-aarch64
Author: AZero13 (AZero13)
Changes
This means we do not have to rely on register coalescing to clean this up, which matches GCC emitting instructions like neg at -O0.
Full diff: https://github.com/llvm/llvm-project/pull/145185.diff
3 Files Affected:
diff --git a/llvm/lib/Target/AArch64/AArch64FastISel.cpp b/llvm/lib/Target/AArch64/AArch64FastISel.cpp
index 9d74bb5a8661d..85f9140f34e68 100644
--- a/llvm/lib/Target/AArch64/AArch64FastISel.cpp
+++ b/llvm/lib/Target/AArch64/AArch64FastISel.cpp
@@ -1201,6 +1201,19 @@ Register AArch64FastISel::emitAddSub(bool UseAdd, MVT RetVT, const Value *LHS,
SI->getOpcode() == Instruction::AShr )
std::swap(LHS, RHS);
+ // Special case: sub 0, x -> neg x (use zero register directly)
+ if (!UseAdd && isa<Constant>(LHS) && cast<Constant>(LHS)->isNullValue()) {
+ Register RHSReg = getRegForValue(RHS);
+ if (!RHSReg)
+ return Register();
+
+ if (NeedExtend)
+ RHSReg = emitIntExt(SrcVT, RHSReg, RetVT, IsZExt);
+
+ Register ZeroReg = RetVT == MVT::i64 ? AArch64::XZR : AArch64::WZR;
+ return emitAddSub_rr(UseAdd, RetVT, ZeroReg, RHSReg, SetFlags, WantResult);
+ }
+
Register LHSReg = getRegForValue(LHS);
if (!LHSReg)
return Register();
diff --git a/llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp b/llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
index d55ff5acb3dca..cd6e7aa300043 100644
--- a/llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
+++ b/llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
@@ -4409,6 +4409,22 @@ MachineInstr *AArch64InstructionSelector::emitAddSub(
assert((Size == 32 || Size == 64) && "Expected a 32-bit or 64-bit type only");
bool Is32Bit = Size == 32;
+ // Special case: sub 0, x -> neg x (use zero register directly)
+ // Check if this is a SUB operation by examining the base register-register opcode
+ unsigned BaseOpc = AddrModeAndSizeToOpcode[2][Is32Bit];
+ bool IsSubtraction = (BaseOpc == AArch64::SUBWrr || BaseOpc == AArch64::SUBXrr ||
+ BaseOpc == AArch64::SUBSWrr || BaseOpc == AArch64::SUBSXrr);
+ if (IsSubtraction) {
+ if (auto LHSImm = getIConstantVRegValWithLookThrough(LHS.getReg(), MRI)) {
+ if (LHSImm->Value.isZero()) {
+ // Replace LHS with the appropriate zero register
+ Register ZeroReg = Is32Bit ? AArch64::WZR : AArch64::XZR;
+ MachineOperand ZeroMO = MachineOperand::CreateReg(ZeroReg, false);
+ return emitInstr(BaseOpc, {Dst}, {ZeroMO, RHS}, MIRBuilder);
+ }
+ }
+ }
+
// INSTRri form with positive arithmetic immediate.
if (auto Fns = selectArithImmed(RHS))
return emitInstr(AddrModeAndSizeToOpcode[0][Is32Bit], {Dst}, {LHS},
diff --git a/llvm/test/CodeGen/AArch64/instruct-neg.ll b/llvm/test/CodeGen/AArch64/instruct-neg.ll
new file mode 100644
index 0000000000000..4d60fb768901e
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/instruct-neg.ll
@@ -0,0 +1,11 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
+; RUN: llc -mtriple=aarch64-linux-gnu -O0 -fast-isel -fast-isel-abort=1 -verify-machineinstrs < %s | FileCheck %s --check-prefix=CHECK
+
+define i32 @negb(i32 %b) {
+; CHECK-LABEL: negb:
+; CHECK: // %bb.0:
+; CHECK-NEXT: neg w0, w0
+; CHECK-NEXT: ret
+ %sub = sub nsw i32 0, %b
+ ret i32 %sub
+}
✅ With the latest revision this PR passed the C/C++ code formatter.
a1fb407 to 09e056e
Hi - Can you give some context on why you are interested in fast-isel? We don't use it by default at -O0 any more (it might still get used by some JITs?). We don't tend to do a lot of extra work in fast-isel, and any work we add needs to balance the compile-time impact against the runtime performance it would bring.
It's just the bare minimum to get the asm to show as neg. It's not really performance-related, as the improvement is at best one instruction fewer.
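For readers skimming the thread, the special case in the diff can be sketched as a toy model. This is only an illustration, not LLVM code: `ToyOperand`, `toyLowerSub`, `toySelfCheck`, and the scratch register `w8`/`x8` are hypothetical names chosen for the example. It assumes the right-hand operand is already in a register, and it relies on the fact that on AArch64 `neg Wd, Wm` is an alias of `sub Wd, wzr, Wm`.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Toy model only: none of these names are LLVM APIs.
struct ToyOperand {
    std::optional<std::int64_t> constant; // known constant value, if any
    std::string reg;                      // register holding the value otherwise
};

// Lower `dst = lhs - rhs` to a list of assembly-like strings.
// Assumption: rhs already lives in a register named rhsReg.
std::vector<std::string> toyLowerSub(const ToyOperand &lhs,
                                     const std::string &rhsReg,
                                     const std::string &dst, bool is32Bit) {
    const std::string zeroReg = is32Bit ? "wzr" : "xzr";
    const std::string scratch = is32Bit ? "w8" : "x8"; // hypothetical scratch register

    // Special case mirrored from the PR: sub 0, x reads the zero register
    // directly, so no zero has to be materialized first.
    if (lhs.constant && *lhs.constant == 0)
        return {"sub " + dst + ", " + zeroReg + ", " + rhsReg};

    // Any other constant is first moved into the scratch register.
    if (lhs.constant)
        return {"mov " + scratch + ", #" + std::to_string(*lhs.constant),
                "sub " + dst + ", " + scratch + ", " + rhsReg};

    // Plain register-register subtraction.
    return {"sub " + dst + ", " + lhs.reg + ", " + rhsReg};
}

// Small usage check: a zero left-hand side yields a single instruction.
void toySelfCheck() {
    auto out = toyLowerSub(ToyOperand{std::int64_t{0}, ""}, "w0", "w0", true);
    assert(out.size() == 1);
}
```

The single `sub w0, wzr, w0` produced for a zero left-hand side is the encoding that an assembly printer can display as `neg w0, w0`, which is what the new test checks for.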