@lialan
Member

@lialan lialan commented May 13, 2025

Fixes issue: #139752

When the mask of a G_SHUFFLE_VECTOR has only one element, the result type can decay from a vector to a scalar. In that case the combine now emits a COPY instead of a G_BUILD_VECTOR.

@lialan lialan force-pushed the lialan/fix_g_shuffle_vector branch from f98edcf to 1704d82 Compare May 13, 2025 17:27
@lialan lialan marked this pull request as ready for review May 13, 2025 20:04
@lialan lialan requested a review from jayfoad May 13, 2025 20:05
@llvmbot
Member

llvmbot commented May 13, 2025

@llvm/pr-subscribers-llvm-globalisel

@llvm/pr-subscribers-backend-amdgpu

Author: Alan Li (lialan)

Changes

Fixes issue: #139752

When the mask of a G_SHUFFLE_VECTOR has only one element, the result type can decay from a vector to a scalar. In that case the combine now emits a COPY instead of a G_BUILD_VECTOR.


Full diff: https://github.com/llvm/llvm-project/pull/139769.diff

2 Files Affected:

  • (modified) llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp (+6-2)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/prelegalizer-combiner-shuffle.mir (+24)
diff --git a/llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp b/llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
index 5191360c7718a..3abed4d062bfa 100644
--- a/llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
+++ b/llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
@@ -420,8 +420,12 @@ void CombinerHelper::applyCombineShuffleToBuildVector(MachineInstr &MI) const {
     else
       Extracts.push_back(Unmerge2.getReg(Val - Width));
   }
-
-  Builder.buildBuildVector(MI.getOperand(0).getReg(), Extracts);
+  assert(Extracts.size() > 0 && "Expected at least one element in the shuffle");
+  if (Extracts.size() == 1) {
+    Builder.buildCopy(MI.getOperand(0).getReg(), Extracts[0]);
+  } else {
+    Builder.buildBuildVector(MI.getOperand(0).getReg(), Extracts);
+  }
   MI.eraseFromParent();
 }
 
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/prelegalizer-combiner-shuffle.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/prelegalizer-combiner-shuffle.mir
index bba608cceee19..e500cfe085110 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/prelegalizer-combiner-shuffle.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/prelegalizer-combiner-shuffle.mir
@@ -135,3 +135,27 @@ body:             |
     SI_RETURN
 ...
 
+
+---
+name: shuffle_vector_to_copy
+tracksRegLiveness: true
+body:             |
+  bb.0:
+    liveins: $vgpr0, $vgpr1
+    ; CHECK-LABEL: name: shuffle_vector_to_copy
+    ; CHECK: liveins: $vgpr0, $vgpr1
+    ; CHECK-NEXT: {{  $}}
+    ; CHECK-NEXT: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0
+    ; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(p3) = COPY $vgpr1
+    ; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(<8 x s16>) = G_LOAD [[COPY]](p3) :: (load (<8 x s16>), align 8, addrspace 3)
+    ; CHECK-NEXT: [[UV:%[0-9]+]]:_(s16), [[UV1:%[0-9]+]]:_(s16), [[UV2:%[0-9]+]]:_(s16), [[UV3:%[0-9]+]]:_(s16), [[UV4:%[0-9]+]]:_(s16), [[UV5:%[0-9]+]]:_(s16), [[UV6:%[0-9]+]]:_(s16), [[UV7:%[0-9]+]]:_(s16) = G_UNMERGE_VALUES [[LOAD]](<8 x s16>)
+    ; CHECK-NEXT: G_STORE [[UV4]](s16), [[COPY1]](p3) :: (store (s16), addrspace 3)
+    ; CHECK-NEXT: SI_RETURN
+    %0:_(p3) = COPY $vgpr0
+    %1:_(p3) = COPY $vgpr1
+    %12:_(<8 x s16>) = G_IMPLICIT_DEF
+    %10:_(<8 x s16>) = G_LOAD %0(p3) :: (load (<8 x s16>), align 8, addrspace 3)
+    %11:_(s16) = G_SHUFFLE_VECTOR %10(<8 x s16>), %12, shufflemask(4)
+    G_STORE %11(s16), %1(p3) :: (store (s16), addrspace 3)
+    SI_RETURN
+...
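
The change in the diff above can be illustrated with a small standalone model. This is only a sketch under stated assumptions: `Reg`, `Emitted`, and `lowerShuffle` are hypothetical names invented for illustration and are not LLVM APIs, and undefined mask entries are not modeled. It mirrors the lane selection visible in the hunk (mask values below the source width pick from the first source, the rest pick from the second at `Val - Width`) and the new single-element case that produces a copy.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a virtual register; not an LLVM type.
using Reg = std::string;

// Hypothetical record of what the combine would emit.
struct Emitted {
  enum Kind { Copy, BuildVector } kind;
  std::vector<Reg> sources; // one source for Copy, several for BuildVector
};

// Toy model of the patched logic. lanes1 and lanes2 play the role of the
// unmerged lanes of the two shuffle sources.
Emitted lowerShuffle(const std::vector<int> &mask,
                     const std::vector<Reg> &lanes1,
                     const std::vector<Reg> &lanes2) {
  assert(lanes1.size() == lanes2.size() && "sources must have equal width");
  const int width = static_cast<int>(lanes1.size());

  std::vector<Reg> extracts;
  for (int val : mask) {
    // Assumption of this sketch: every mask entry is defined and in range.
    assert(val >= 0 && val < 2 * width && "mask index out of range");
    extracts.push_back(val < width ? lanes1[val] : lanes2[val - width]);
  }

  assert(!extracts.empty() && "Expected at least one element in the shuffle");
  // A single selected lane is a scalar result, so a copy is enough.
  if (extracts.size() == 1)
    return {Emitted::Copy, extracts};
  return {Emitted::BuildVector, extracts};
}
```

With eight-lane sources, a mask of `{4}` (as in the new MIR test) yields a `Copy` of lane 4 of the first source, while a mask such as `{1, 9}` still yields a `BuildVector` of lane 1 of the first source and lane 1 of the second.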

@lialan lialan linked an issue May 13, 2025 that may be closed by this pull request
Contributor

@arsenm arsenm left a comment

Should also include an end-to-end IR test.

%1:_(p3) = COPY $vgpr1
%12:_(<8 x s16>) = G_IMPLICIT_DEF
%10:_(<8 x s16>) = G_LOAD %0(p3) :: (load (<8 x s16>), align 8, addrspace 3)
%11:_(s16) = G_SHUFFLE_VECTOR %10(<8 x s16>), %12, shufflemask(4)
Contributor

Should we even permit this in the MIR verifier? What produced this shuffle?

Member Author

It is actually emitted by this GISel rule: combine_shuffle_disjoint_mask.

Contributor

That should probably turn this directly into an extract element.

@lialan lialan requested review from arsenm, krzysz00 and shiltian May 15, 2025 13:29
Comment on lines 424 to 428
Contributor

Suggested change
-  if (Extracts.size() == 1) {
-    Builder.buildCopy(MI.getOperand(0).getReg(), Extracts[0]);
-  } else {
-    Builder.buildBuildVector(MI.getOperand(0).getReg(), Extracts);
-  }
+  if (Extracts.size() == 1)
+    Builder.buildCopy(MI.getOperand(0).getReg(), Extracts[0]);
+  else
+    Builder.buildBuildVector(MI.getOperand(0).getReg(), Extracts);

Contributor

@krzysz00 krzysz00 left a comment

Seems reasonable to me, though I can't definitively say this is fine

lialan added 3 commits May 20, 2025 19:39
Fixes issue: llvm#139752

When G_SHUFFLE_VECTOR has only 1 element then it is possible the vector
is decayed into a scalar.
@lialan lialan force-pushed the lialan/fix_g_shuffle_vector branch from 3ed3648 to 3fbc4e5 Compare May 20, 2025 23:39
@lialan lialan merged commit ada2fbf into llvm:main May 21, 2025
9 of 10 checks passed
@lialan lialan deleted the lialan/fix_g_shuffle_vector branch May 21, 2025 01:25
@llvm-ci
Collaborator

llvm-ci commented May 21, 2025

LLVM Buildbot has detected a new failure on builder openmp-s390x-linux running on systemz-1 while building llvm at step 6 "test-openmp".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/88/builds/11852

Here is the relevant piece of the build log for the reference
Step 6 (test-openmp) failure: test (failure)
******************** TEST 'libomp :: tasking/issue-94260-2.c' FAILED ********************
Exit Code: -11

Command Output (stdout):
--
# RUN: at line 1
/home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.build/./bin/clang -fopenmp   -I /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -I /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.src/openmp/runtime/test -L /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.build/runtimes/runtimes-bins/openmp/runtime/src  -fno-omit-frame-pointer -mbackchain -I /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.src/openmp/runtime/test/ompt /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.src/openmp/runtime/test/tasking/issue-94260-2.c -o /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.build/runtimes/runtimes-bins/openmp/runtime/test/tasking/Output/issue-94260-2.c.tmp -lm -latomic && /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.build/runtimes/runtimes-bins/openmp/runtime/test/tasking/Output/issue-94260-2.c.tmp
# executed command: /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.build/./bin/clang -fopenmp -I /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -I /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.src/openmp/runtime/test -L /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -fno-omit-frame-pointer -mbackchain -I /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.src/openmp/runtime/test/ompt /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.src/openmp/runtime/test/tasking/issue-94260-2.c -o /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.build/runtimes/runtimes-bins/openmp/runtime/test/tasking/Output/issue-94260-2.c.tmp -lm -latomic
# executed command: /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.build/runtimes/runtimes-bins/openmp/runtime/test/tasking/Output/issue-94260-2.c.tmp
# note: command had no output on stdout or stderr
# error: command failed with exit status: -11

--

********************


Development

Successfully merging this pull request may close these issues.

[AMDGPU][GlobalISel] Assertion failure combining shuffle_vector

6 participants