release/20.x: [X86] Ignore NSW when DstSVT is i32 (#131755) #147034

llvmbot · 2025-07-04T10:27:11Z

Backport 3d63191

Requested by: @nikic

llvmbot · 2025-07-04T10:27:19Z

@RKSimon What do you think about merging this PR to the release branch?

llvmbot · 2025-07-04T10:27:42Z

@llvm/pr-subscribers-backend-x86

Author: None (llvmbot)

Changes

Backport 3d63191

Requested by: @nikic

Full diff: https://github.com/llvm/llvm-project/pull/147034.diff

2 Files Affected:

(modified) llvm/lib/Target/X86/X86ISelLowering.cpp (+2-1)
(modified) llvm/test/CodeGen/X86/vector-trunc-nowrap.ll (+86)

diff --git a/llvm/lib/Target/X86/X86ISelLowering.cpp b/llvm/lib/Target/X86/X86ISelLowering.cpp
index 4413fbb77f415..12c40b501f627 100644
--- a/llvm/lib/Target/X86/X86ISelLowering.cpp
+++ b/llvm/lib/Target/X86/X86ISelLowering.cpp
@@ -20889,7 +20889,8 @@ static SDValue matchTruncateWithPACK(unsigned &PackOpcode, EVT DstVT,
     return SDValue();
 
   unsigned MinSignBits = NumSrcEltBits - NumPackedSignBits;
-  if (Flags.hasNoSignedWrap() || MinSignBits < NumSignBits) {
+  if ((Flags.hasNoSignedWrap() && DstSVT != MVT::i32) ||
+      MinSignBits < NumSignBits) {
     PackOpcode = X86ISD::PACKSS;
     return In;
   }
diff --git a/llvm/test/CodeGen/X86/vector-trunc-nowrap.ll b/llvm/test/CodeGen/X86/vector-trunc-nowrap.ll
index 2b8eedfbbdc9c..85cca4f6f9a57 100644
--- a/llvm/test/CodeGen/X86/vector-trunc-nowrap.ll
+++ b/llvm/test/CodeGen/X86/vector-trunc-nowrap.ll
@@ -1592,3 +1592,89 @@ entry:
   %1 = bitcast <8 x i8> %0 to i64
   ret i64 %1
 }
+
+define void @foo(<4 x i64> %a, <4 x i64> %b, ptr %p) "min-legal-vector-width"="256" "prefer-vector-width"="256" {
+; SSE-LABEL: foo:
+; SSE:       # %bb.0: # %entry
+; SSE-NEXT:    shufps {{.*#+}} xmm0 = xmm0[0,2],xmm1[0,2]
+; SSE-NEXT:    shufps {{.*#+}} xmm2 = xmm2[0,2],xmm3[0,2]
+; SSE-NEXT:    movaps %xmm2, 16(%rdi)
+; SSE-NEXT:    movaps %xmm0, (%rdi)
+; SSE-NEXT:    retq
+;
+; AVX1-LABEL: foo:
+; AVX1:       # %bb.0: # %entry
+; AVX1-NEXT:    vperm2f128 {{.*#+}} ymm2 = ymm0[2,3],ymm1[2,3]
+; AVX1-NEXT:    vinsertf128 $1, %xmm1, %ymm0, %ymm0
+; AVX1-NEXT:    vshufps {{.*#+}} ymm0 = ymm0[0,2],ymm2[0,2],ymm0[4,6],ymm2[4,6]
+; AVX1-NEXT:    vmovups %ymm0, (%rdi)
+; AVX1-NEXT:    vzeroupper
+; AVX1-NEXT:    retq
+;
+; AVX2-SLOW-LABEL: foo:
+; AVX2-SLOW:       # %bb.0: # %entry
+; AVX2-SLOW-NEXT:    vextractf128 $1, %ymm0, %xmm2
+; AVX2-SLOW-NEXT:    vshufps {{.*#+}} xmm0 = xmm0[0,2],xmm2[0,2]
+; AVX2-SLOW-NEXT:    vextractf128 $1, %ymm1, %xmm2
+; AVX2-SLOW-NEXT:    vshufps {{.*#+}} xmm1 = xmm1[0,2],xmm2[0,2]
+; AVX2-SLOW-NEXT:    vmovaps %xmm1, 16(%rdi)
+; AVX2-SLOW-NEXT:    vmovaps %xmm0, (%rdi)
+; AVX2-SLOW-NEXT:    vzeroupper
+; AVX2-SLOW-NEXT:    retq
+;
+; AVX2-FAST-ALL-LABEL: foo:
+; AVX2-FAST-ALL:       # %bb.0: # %entry
+; AVX2-FAST-ALL-NEXT:    vmovaps {{.*#+}} ymm2 = [0,2,4,6,4,6,6,7]
+; AVX2-FAST-ALL-NEXT:    vpermps %ymm0, %ymm2, %ymm0
+; AVX2-FAST-ALL-NEXT:    vpermps %ymm1, %ymm2, %ymm1
+; AVX2-FAST-ALL-NEXT:    vmovaps %xmm1, 16(%rdi)
+; AVX2-FAST-ALL-NEXT:    vmovaps %xmm0, (%rdi)
+; AVX2-FAST-ALL-NEXT:    vzeroupper
+; AVX2-FAST-ALL-NEXT:    retq
+;
+; AVX2-FAST-PERLANE-LABEL: foo:
+; AVX2-FAST-PERLANE:       # %bb.0: # %entry
+; AVX2-FAST-PERLANE-NEXT:    vextractf128 $1, %ymm0, %xmm2
+; AVX2-FAST-PERLANE-NEXT:    vshufps {{.*#+}} xmm0 = xmm0[0,2],xmm2[0,2]
+; AVX2-FAST-PERLANE-NEXT:    vextractf128 $1, %ymm1, %xmm2
+; AVX2-FAST-PERLANE-NEXT:    vshufps {{.*#+}} xmm1 = xmm1[0,2],xmm2[0,2]
+; AVX2-FAST-PERLANE-NEXT:    vmovaps %xmm1, 16(%rdi)
+; AVX2-FAST-PERLANE-NEXT:    vmovaps %xmm0, (%rdi)
+; AVX2-FAST-PERLANE-NEXT:    vzeroupper
+; AVX2-FAST-PERLANE-NEXT:    retq
+;
+; AVX512F-LABEL: foo:
+; AVX512F:       # %bb.0: # %entry
+; AVX512F-NEXT:    # kill: def $ymm0 killed $ymm0 def $zmm0
+; AVX512F-NEXT:    vinserti64x4 $1, %ymm1, %zmm0, %zmm0
+; AVX512F-NEXT:    vpmovqd %zmm0, (%rdi)
+; AVX512F-NEXT:    vzeroupper
+; AVX512F-NEXT:    retq
+;
+; AVX512VL-LABEL: foo:
+; AVX512VL:       # %bb.0: # %entry
+; AVX512VL-NEXT:    vpmovqd %ymm1, 16(%rdi)
+; AVX512VL-NEXT:    vpmovqd %ymm0, (%rdi)
+; AVX512VL-NEXT:    vzeroupper
+; AVX512VL-NEXT:    retq
+;
+; AVX512BW-LABEL: foo:
+; AVX512BW:       # %bb.0: # %entry
+; AVX512BW-NEXT:    # kill: def $ymm0 killed $ymm0 def $zmm0
+; AVX512BW-NEXT:    vinserti64x4 $1, %ymm1, %zmm0, %zmm0
+; AVX512BW-NEXT:    vpmovqd %zmm0, (%rdi)
+; AVX512BW-NEXT:    vzeroupper
+; AVX512BW-NEXT:    retq
+;
+; AVX512BWVL-LABEL: foo:
+; AVX512BWVL:       # %bb.0: # %entry
+; AVX512BWVL-NEXT:    vpmovqd %ymm1, 16(%rdi)
+; AVX512BWVL-NEXT:    vpmovqd %ymm0, (%rdi)
+; AVX512BWVL-NEXT:    vzeroupper
+; AVX512BWVL-NEXT:    retq
+entry:
+  %0 = shufflevector <4 x i64> %a, <4 x i64> %b, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>
+  %1 = trunc nsw <8 x i64> %0 to <8 x i32>
+  store <8 x i32> %1, ptr %p, align 16
+  ret void
+}

We don't have PACKSS for i64->i32. Fixes: https://godbolt.org/z/qb8nxnPbK, which was introduced by ddd2f57 (cherry picked from commit 3d63191)

github-actions · 2025-07-08T00:28:28Z

@nikic (or anyone else). If you would like to add a note about this fix in the release notes (completely optional). Please reply to this comment with a one or two sentence description of the fix. When you are done, please add the release:note label to this PR.

llvmbot added this to the LLVM 20.X Release milestone Jul 4, 2025

github-project-automation bot moved this to Needs Triage in LLVM Release Status Jul 4, 2025

github-project-automation bot added this to LLVM Release Status Jul 4, 2025

llvmbot requested a review from RKSimon July 4, 2025 10:27

llvmbot added the backend:X86 label Jul 4, 2025

llvmbot mentioned this pull request Jul 4, 2025

Rust compilation error on some targets #146957

Closed

nikic force-pushed the issue146957 branch from d21a842 to c50b409 Compare July 4, 2025 12:17

nikic mentioned this pull request Jul 4, 2025

Backport #145690, #146068 and #144058 rust-lang/llvm-project#182

Merged

RKSimon approved these changes Jul 4, 2025

View reviewed changes

github-project-automation bot moved this from Needs Triage to Needs Merge in LLVM Release Status Jul 4, 2025

[X86] Ignore NSW when DstSVT is i32 (llvm#131755)

0de59a2

We don't have PACKSS for i64->i32. Fixes: https://godbolt.org/z/qb8nxnPbK, which was introduced by ddd2f57 (cherry picked from commit 3d63191)

llvmbot force-pushed the issue146957 branch from c50b409 to 0de59a2 Compare July 8, 2025 00:27

tstellar merged commit 0de59a2 into llvm:release/20.x Jul 8, 2025
7 of 9 checks passed

github-project-automation bot moved this from Needs Merge to Done in LLVM Release Status Jul 8, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

release/20.x: [X86] Ignore NSW when DstSVT is i32 (#131755) #147034

release/20.x: [X86] Ignore NSW when DstSVT is i32 (#131755) #147034

Uh oh!

llvmbot commented Jul 4, 2025

Uh oh!

llvmbot commented Jul 4, 2025

Uh oh!

llvmbot commented Jul 4, 2025

Uh oh!

Uh oh!

github-actions bot commented Jul 8, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

release/20.x: [X86] Ignore NSW when DstSVT is i32 (#131755) #147034

release/20.x: [X86] Ignore NSW when DstSVT is i32 (#131755) #147034

Uh oh!

Conversation

llvmbot commented Jul 4, 2025

Uh oh!

llvmbot commented Jul 4, 2025

Uh oh!

llvmbot commented Jul 4, 2025

Uh oh!

Uh oh!

github-actions bot commented Jul 8, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants