Commit baeb4c3

dev-util/Tensile: fix compilation of sci-libs/rocBLAS on gfx906
Clang-20 disallowed op_sel in some VOP3P dot instructions. See: llvm/llvm-project#100485

As ROCm maintains a fork of Clang, these changes did not reach official ROCm releases. However, Gentoo uses upstream Clang-20, which includes these incompatible changes.

Luckily, in Tensile these op_sel modifiers do nothing. In general they allow shuffling vector elements before multiplication, but with the values 0,0/1,1 shuffling is disabled, so op_sel can be removed.

Closes: https://bugs.gentoo.org/949817
Signed-off-by: Sv. Lockal <[email protected]>
1 parent 3991ad9 commit baeb4c3

1 file changed: 4 additions, 1 deletion
dev-util/Tensile/Tensile-6.4.1.ebuild renamed to dev-util/Tensile/Tensile-6.4.1-r1.ebuild

Lines changed: 4 additions & 1 deletion
@@ -81,10 +81,13 @@ src_prepare() {
 
 	sed -e "s|os\.path\.dirname.*$|\"${EPREFIX}/usr/share/Tensile/Source\", end='')|" -i __init__.py || die
 
+	# bug 949817: fix v_dot4_i32_i8 syntax for clang-20
+	sed 's/ op_sel:\[0,0\] op_sel_hi:\[1,1\]//' -i Components/MAC_I8X4.py || die
+
 	popd || die
 
 	sed -e "/package_data/d" -e "/data_files/d" -i setup.py || die
-	use client && PATCHES= cmake_src_prepare # do not apply patches again in cmake_src_prepare
+	use client && PATCHES='' cmake_src_prepare # do not apply patches again in cmake_src_prepare
 }
 
 src_configure() {
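
The sed expression added to src_prepare can be tried on its own before touching the ebuild. The sketch below is only an illustration: the sample instruction line and its register operands are hypothetical, since the actual contents of Components/MAC_I8X4.py are not shown in this commit, and it assumes GNU sed as used in the ebuild.

```shell
# Hypothetical assembly line, standing in for what MAC_I8X4.py generates.
line='v_dot4_i32_i8 v0, v1, v2, v0 op_sel:[0,0] op_sel_hi:[1,1]'

# Same expression as in the ebuild, applied to stdin instead of in place (-i).
stripped=$(printf '%s\n' "$line" | sed 's/ op_sel:\[0,0\] op_sel_hi:\[1,1\]//')

printf '%s\n' "$stripped"
```

The pattern starts with a space, so the separator before op_sel is removed together with both modifiers and no trailing whitespace is left behind. It matches only the exact values [0,0] and [1,1], which are the ones the commit message describes as disabling shuffling.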