release/20.x: [X86][SSE] Don't emit SSE2 load instructions in SSE1-only mode (#134547) #135191
@RKSimon What do you think about merging this PR to the release branch?
@llvm/pr-subscribers-backend-x86
Author: None (llvmbot)
Changes
Backport 08e080e
Requested by: @RKSimon
Full diff: https://github.com/llvm/llvm-project/pull/135191.diff
2 Files Affected:
diff --git a/llvm/lib/Target/X86/X86FixupVectorConstants.cpp b/llvm/lib/Target/X86/X86FixupVectorConstants.cpp
index 453898e132ca4..9dc392d6e9626 100644
--- a/llvm/lib/Target/X86/X86FixupVectorConstants.cpp
+++ b/llvm/lib/Target/X86/X86FixupVectorConstants.cpp
@@ -333,6 +333,7 @@ bool X86FixupVectorConstantsPass::processInstruction(MachineFunction &MF,
MachineInstr &MI) {
unsigned Opc = MI.getOpcode();
MachineConstantPool *CP = MI.getParent()->getParent()->getConstantPool();
+ bool HasSSE2 = ST->hasSSE2();
bool HasSSE41 = ST->hasSSE41();
bool HasAVX2 = ST->hasAVX2();
bool HasDQI = ST->hasDQI();
@@ -394,11 +395,13 @@ bool X86FixupVectorConstantsPass::processInstruction(MachineFunction &MF,
case X86::MOVAPDrm:
case X86::MOVAPSrm:
case X86::MOVUPDrm:
- case X86::MOVUPSrm:
+ case X86::MOVUPSrm: {
// TODO: SSE3 MOVDDUP Handling
- return FixupConstant({{X86::MOVSSrm, 1, 32, rebuildZeroUpperCst},
- {X86::MOVSDrm, 1, 64, rebuildZeroUpperCst}},
- 128, 1);
+ FixupEntry Fixups[] = {
+ {X86::MOVSSrm, 1, 32, rebuildZeroUpperCst},
+ {HasSSE2 ? X86::MOVSDrm : 0, 1, 64, rebuildZeroUpperCst}};
+ return FixupConstant(Fixups, 128, 1);
+ }
case X86::VMOVAPDrm:
case X86::VMOVAPSrm:
case X86::VMOVUPDrm:
diff --git a/llvm/test/CodeGen/X86/pr134607.ll b/llvm/test/CodeGen/X86/pr134607.ll
new file mode 100644
index 0000000000000..5e824c22e5a22
--- /dev/null
+++ b/llvm/test/CodeGen/X86/pr134607.ll
@@ -0,0 +1,20 @@
+; RUN: llc < %s -mtriple=i386-unknown-unknown -mattr=+sse -O3 | FileCheck %s --check-prefixes=X86
+; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=-sse2,+sse -O3 | FileCheck %s --check-prefixes=X64-SSE1
+; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=+sse2,+sse -O3 | FileCheck %s --check-prefixes=X64-SSE2
+
+define void @store_v2f32_constant(ptr %v) {
+; X86-LABEL: store_v2f32_constant:
+; X86: # %bb.0:
+; X86-NEXT: movl 4(%esp), %eax
+; X86-NEXT: movaps {{\.?LCPI[0-9]+_[0-9]+}}, %xmm0
+
+; X64-SSE1-LABEL: store_v2f32_constant:
+; X64-SSE1: # %bb.0:
+; X64-SSE1-NEXT: movaps {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0
+
+; X64-SSE2-LABEL: store_v2f32_constant:
+; X64-SSE2: # %bb.0:
+; X64-SSE2-NEXT: movsd {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0
+ store <2 x float> <float 2.560000e+02, float 5.120000e+02>, ptr %v, align 4
+ ret void
+}
RKSimon
left a comment
LGTM
…134547)
This fixes a regression I traced back to llvm@8b43c1b / llvm#79000.
The regression caused an SSE2 instruction, `movsd`, to be emitted as a replacement for an SSE instruction, `movaps`, even though the target may not support it, for example when building with clang using `-march=pentium3`.
Fixes llvm#134607
(cherry picked from commit 08e080e)
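For readers skimming the diff, here is a rough, standalone sketch of the gating idea behind the fix: the `movsd` candidate is only offered when SSE2 is available, and a zero opcode means "skip this entry". This is not the LLVM code. The `Opcode` enum, the `Candidate` struct, and `pickLoad` are hypothetical names made up for illustration, and the width check is a simplification of what the real pass does when it rebuilds the constant.

```cpp
#include <array>
#include <cassert>

// Hypothetical stand-ins for the real opcode enumerators.
// 0 is reserved to mean "no candidate", mirroring the `HasSSE2 ? ... : 0` in the diff.
enum Opcode : unsigned { NoOpcode = 0, MOVAPS = 1, MOVSS = 2, MOVSD = 3 };

// Hypothetical, simplified candidate entry (not LLVM's FixupEntry).
struct Candidate {
  unsigned Op;      // replacement load opcode, or NoOpcode to disable the entry
  unsigned MemBits; // number of bits the narrower load reads from memory
};

// Returns the first enabled candidate that is wide enough to cover the
// meaningful low bits of the constant; otherwise keeps the original opcode.
unsigned pickLoad(unsigned Original, unsigned MeaningfulLowBits, bool HasSSE2) {
  assert(MeaningfulLowBits > 0 && "expected a non-empty constant");
  const std::array<Candidate, 2> Candidates = {{
      {MOVSS, 32},                      // available with plain SSE
      {HasSSE2 ? MOVSD : NoOpcode, 64}, // only offered when SSE2 is present
  }};
  for (const Candidate &C : Candidates) {
    if (C.Op == NoOpcode)
      continue; // entry disabled for this subtarget
    if (MeaningfulLowBits <= C.MemBits)
      return C.Op;
  }
  return Original; // nothing suitable: leave the full-width load alone
}
```

Under these assumptions, a 64-bit constant such as the `<2 x float>` in the new test maps to `MOVSD` when `HasSSE2` is true and stays `MOVAPS` when it is false, which lines up with the X64-SSE2 and X64-SSE1 check lines.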
@RKSimon (or anyone else): if you would like to add a note about this fix to the release notes (completely optional), please reply to this comment with a one- or two-sentence description of the fix. When you are done, please add the release:note label to this PR.