-
Notifications
You must be signed in to change notification settings - Fork 14.9k
[X86] Use pseudo instructions to zero registers in buildClearRegister
#163358
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Ensure buildClearRegister uses the correct opcode for each register class: - For YMM (VR256), use VPXORYrr - For ZMM (VR512), use VPXORDZrr This avoids illegal register/opcode pairings and MachineVerifier errors when clearing call-used registers under -fzero-call-used-regs. Fixes: llvm#163053
@llvm/pr-subscribers-backend-x86 Author: Abhishek Kaushik (abhishek-kaushik22) ChangesEnsure
This avoids illegal register/opcode pairings and machine verifier errors when clearing call-used registers under Fixes: #163053 Full diff: https://github.com/llvm/llvm-project/pull/163358.diff 2 Files Affected:
diff --git a/llvm/lib/Target/X86/X86InstrInfo.cpp b/llvm/lib/Target/X86/X86InstrInfo.cpp
index 1d2cd39951bf4..671363d87db1b 100644
--- a/llvm/lib/Target/X86/X86InstrInfo.cpp
+++ b/llvm/lib/Target/X86/X86InstrInfo.cpp
@@ -10818,8 +10818,8 @@ void X86InstrInfo::buildClearRegister(Register Reg, MachineBasicBlock &MBB,
if (!ST.hasAVX())
return;
- // VPXOR is safe to use because it doesn't affect flags.
- BuildMI(MBB, Iter, DL, get(X86::VPXORrr), Reg)
+ // VPXORY is safe to use because it doesn't affect flags.
+ BuildMI(MBB, Iter, DL, get(X86::VPXORYrr), Reg)
.addReg(Reg, RegState::Undef)
.addReg(Reg, RegState::Undef);
} else if (X86::VR512RegClass.contains(Reg)) {
@@ -10827,8 +10827,8 @@ void X86InstrInfo::buildClearRegister(Register Reg, MachineBasicBlock &MBB,
if (!ST.hasAVX512())
return;
- // VPXORY is safe to use because it doesn't affect flags.
- BuildMI(MBB, Iter, DL, get(X86::VPXORYrr), Reg)
+ // VPXORDZ is safe to use because it doesn't affect flags.
+ BuildMI(MBB, Iter, DL, get(X86::VPXORDZrr), Reg)
.addReg(Reg, RegState::Undef)
.addReg(Reg, RegState::Undef);
} else if (X86::VK1RegClass.contains(Reg) || X86::VK2RegClass.contains(Reg) ||
diff --git a/llvm/test/CodeGen/X86/pr163053.ll b/llvm/test/CodeGen/X86/pr163053.ll
new file mode 100644
index 0000000000000..0fdb4d4be3896
--- /dev/null
+++ b/llvm/test/CodeGen/X86/pr163053.ll
@@ -0,0 +1,22 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 6
+; RUN: llc < %s -mtriple=x86_64-- -mcpu=x86-64-v4 -verify-machineinstrs | FileCheck %s
+
+define void @pr163053(ptr %arg, ptr %arg1) #0 {
+; CHECK-LABEL: pr163053:
+; CHECK: # %bb.0:
+; CHECK-NEXT: vxorps %xmm0, %xmm0, %xmm0
+; CHECK-NEXT: vmovaps %ymm0, (%rdi)
+; CHECK-NEXT: vxorps %xmm0, %xmm0, %xmm0
+; CHECK-NEXT: vmovaps %zmm0, (%rsi)
+; CHECK-NEXT: xorl %edi, %edi
+; CHECK-NEXT: xorl %esi, %esi
+; CHECK-NEXT: vxorps %ymm0, %ymm0, %ymm0
+; CHECK-NEXT: vxorps %zmm0, %zmm0, %zmm0
+; CHECK-NEXT: vzeroupper
+; CHECK-NEXT: retq
+ store <16 x i16> zeroinitializer, ptr %arg, align 32
+ store <32 x i16> zeroinitializer, ptr %arg1, align 64
+ ret void
+}
+
+attributes #0 = { "zero-call-used-regs"="used" }
|
@phoebewang @e-kud @RKSimon should this function also handle the VR128X and VR256X register classes? I feel that might also be a verifier failure, but I couldn't write a test that triggers that. |
Thanks @phoebewang for the suggestion! |
Yes please.
Yes, we need better coverage - please can you try adding x86-64 and x86-64-v3 RUNs to pr163053.ll? Maybe a AVX1 target such as sandybridge or btver2 as well. |
buildClearRegister
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM - cheers
@abhishek-kaushik22 the CI failures are unrelated if you want to skip auto-merge and just push |
Can this be cherry-picked to the release/21.x branch? |
In
buildClearRegister
use the correct pseudo-opcode for each register class:VR128
, useV_SET0
VR256
, useAVX_SET0
VR512
, useAVX512_512_SET0
VK*
, useKSET0Q/KSET0W
This avoids illegal register/opcode pairings and machine verifier errors when clearing call-used registers under
-fzero-call-used-regs=used
.Fixes: #163053