Commit 364e5d8

Refactor: Move matrix packing outside GEMM kernels
In class tinyBLAS_PPC, packing of the input matrices A and B was previously performed on the fly within each GEMM microkernel. This patch decouples packing from the kernels by introducing a preprocessing step that packs the matrices once, before any kernel is invoked.

Benefits:
- Enables better memory locality and data reuse
- Simplifies the kernel logic by focusing purely on computation
- Improves overall GEMM performance, especially for large matrix sizes

Signed-off-by: Shalini Salomi Bodapati <[email protected]>
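As a rough illustration of the "pack once, then compute" structure described in the commit message, here is a minimal scalar sketch. It is not the code from this commit: the names `pack_a`, `pack_b`, `kernel`, and `gemm_packed`, the 4x4 tile sizes `MR` and `NR`, and the row-major float layout are all assumptions made for the example, and the real tinyBLAS_PPC kernels use POWER vector instructions rather than plain loops.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical tile sizes, chosen only for illustration.
constexpr std::size_t MR = 4;  // rows of A handled per kernel call
constexpr std::size_t NR = 4;  // columns of B handled per kernel call

// Pack row-major A (m x k) into panels of MR rows. Inside a panel, the MR
// values for one k index are contiguous. Edge panels are zero-padded.
std::vector<float> pack_a(const float* A, std::size_t m, std::size_t k) {
    const std::size_t panels = (m + MR - 1) / MR;
    std::vector<float> out(panels * MR * k, 0.0f);
    for (std::size_t ip = 0; ip < panels; ++ip)
        for (std::size_t p = 0; p < k; ++p)
            for (std::size_t i = 0; i < MR; ++i) {
                const std::size_t row = ip * MR + i;
                if (row < m) out[(ip * k + p) * MR + i] = A[row * k + p];
            }
    return out;
}

// Pack row-major B (k x n) into panels of NR columns, laid out the same way.
std::vector<float> pack_b(const float* B, std::size_t k, std::size_t n) {
    const std::size_t panels = (n + NR - 1) / NR;
    std::vector<float> out(panels * NR * k, 0.0f);
    for (std::size_t jp = 0; jp < panels; ++jp)
        for (std::size_t p = 0; p < k; ++p)
            for (std::size_t j = 0; j < NR; ++j) {
                const std::size_t col = jp * NR + j;
                if (col < n) out[(jp * k + p) * NR + j] = B[p * n + col];
            }
    return out;
}

// Microkernel: pure computation on already-packed panels, no packing logic.
void kernel(const float* pa, const float* pb, std::size_t k, float acc[MR][NR]) {
    for (std::size_t i = 0; i < MR; ++i)
        for (std::size_t j = 0; j < NR; ++j) acc[i][j] = 0.0f;
    for (std::size_t p = 0; p < k; ++p)
        for (std::size_t i = 0; i < MR; ++i)
            for (std::size_t j = 0; j < NR; ++j)
                acc[i][j] += pa[p * MR + i] * pb[p * NR + j];
}

// C (m x n, row-major) = A * B. Both inputs are packed once up front, then
// every kernel call reuses the packed buffers.
void gemm_packed(const float* A, const float* B, float* C,
                 std::size_t m, std::size_t n, std::size_t k) {
    const std::vector<float> pa = pack_a(A, m, k);
    const std::vector<float> pb = pack_b(B, k, n);
    const std::size_t mp = (m + MR - 1) / MR;
    const std::size_t np = (n + NR - 1) / NR;
    for (std::size_t ip = 0; ip < mp; ++ip)
        for (std::size_t jp = 0; jp < np; ++jp) {
            float acc[MR][NR];
            kernel(pa.data() + ip * k * MR, pb.data() + jp * k * NR, k, acc);
            // Store only the in-bounds part of the tile.
            for (std::size_t i = 0; i < MR && ip * MR + i < m; ++i)
                for (std::size_t j = 0; j < NR && jp * NR + j < n; ++j)
                    C[(ip * MR + i) * n + jp * NR + j] = acc[i][j];
        }
}
```

In this sketch each panel of A is packed once but read by every column tile, and each panel of B is read by every row tile, which is where the data reuse comes from. With on-the-fly packing the same rearrangement would be repeated inside each kernel invocation.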
1 parent e298d2f commit 364e5d8

1 file changed: +359 -400 lines changed

