Commit 0826d68

Author: Rajalakshmi Srinivasaraghavan
POWER10: Change the packing format for bfloat16
The new MMA instructions need bfloat16 inputs in 4x2 order, so this commit changes the format produced by the copy/packing code. This avoids permute instructions in the gemm kernel inner loop.
1 parent 602a0c7 commit 0826d68
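
To make the commit message concrete, here is a minimal sketch of packing into 4x2 order. It is not the code added by this commit. It assumes that "4x2 order" means each group of eight bfloat16 values holds four rows with two consecutive k values per row, and that the source panel is column-major. The names `bf16_t` and `pack_panel_4x2` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint16_t bf16_t; /* raw bfloat16 bits; hypothetical alias */

/* Pack a 4-row by k-column panel of a column-major matrix (leading
 * dimension lda) so that each group of 8 outputs forms one 4x2 tile:
 * row 0 (p, p+1), row 1 (p, p+1), row 2 (p, p+1), row 3 (p, p+1).
 * Assumed layout for illustration only. */
static void pack_panel_4x2(size_t k, const bf16_t *a, size_t lda, bf16_t *out)
{
    assert(k % 2 == 0); /* sketch handles even k only; odd k needs extra tail handling */
    for (size_t p = 0; p < k; p += 2) {
        for (size_t i = 0; i < 4; i++) {
            *out++ = a[i + p * lda];       /* row i, column p     */
            *out++ = a[i + (p + 1) * lda]; /* row i, column p + 1 */
        }
    }
}
```

With the data already interleaved this way at packing time, a kernel could load each 8-element group directly as one vector operand, which is the kind of inner-loop permute the commit message says is avoided.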

6 files changed: 1923 additions and 285 deletions

kernel/power/KERNEL.POWER10

Lines changed: 4 additions & 4 deletions
@@ -9,10 +9,10 @@ else
 
 SBGEMM_BETA = ../generic/gemm_beta.c
 SBGEMMKERNEL = sbgemm_kernel_power10.c
-SBGEMMINCOPY = ../generic/gemm_ncopy_16.c
-SBGEMMITCOPY = ../generic/gemm_tcopy_16.c
-SBGEMMONCOPY = ../generic/gemm_ncopy_8.c
-SBGEMMOTCOPY = ../generic/gemm_tcopy_8.c
+SBGEMMINCOPY = sbgemm_ncopy_16_power10.c
+SBGEMMITCOPY = sbgemm_tcopy_16_power10.c
+SBGEMMONCOPY = sbgemm_ncopy_8_power10.c
+SBGEMMOTCOPY = sbgemm_tcopy_8_power10.c
 SBGEMMINCOPYOBJ = sbgemm_incopy$(TSUFFIX).$(SUFFIX)
 SBGEMMITCOPYOBJ = sbgemm_itcopy$(TSUFFIX).$(SUFFIX)
 SBGEMMONCOPYOBJ = sbgemm_oncopy$(TSUFFIX).$(SUFFIX)
