Commit 364e5d8

Refactor: Move matrix packing outside GEMM kernels

In class tinyBLAS_PPC, packing of input matrices A and B was previously performed on-the-fly within each GEMM microkernel. This patch refactors the code to decouple packing from the kernel by introducing a preprocessing step that packs the matrices once before any kernel is invoked.

Benefits:
- Enables better memory locality and data reuse
- Simplifies the kernel logic by focusing purely on computation
- Improves overall GEMM performance, especially for large matrix sizes

Signed-off-by: Shalini Salomi Bodapati <[email protected]>
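The idea of packing once ahead of the kernels can be sketched in portable C++ as follows. This is only an illustration under stated assumptions, not the code from this patch: the names `pack_a`, `pack_b`, `kernel`, `gemm_packed` and the tile sizes `MR` and `NR` are hypothetical, the matrices are assumed to be row-major `float`, and the real tinyBLAS_PPC kernels use POWER-specific vector instructions and their own layouts.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical tile sizes chosen for illustration only.
constexpr std::size_t MR = 4;  // rows of A handled per kernel call
constexpr std::size_t NR = 4;  // columns of B handled per kernel call

// Pack row-major A (m x k) into panels of MR rows. Inside a panel, the MR
// values for one k step are contiguous. Rows beyond m are zero-padded.
std::vector<float> pack_a(const float* A, std::size_t m, std::size_t k) {
    const std::size_t panels = (m + MR - 1) / MR;
    std::vector<float> out(panels * MR * k, 0.0f);
    for (std::size_t p = 0; p < panels; ++p)
        for (std::size_t l = 0; l < k; ++l)
            for (std::size_t i = 0; i < MR; ++i) {
                const std::size_t row = p * MR + i;
                if (row < m) out[(p * k + l) * MR + i] = A[row * k + l];
            }
    return out;
}

// Pack row-major B (k x n) into panels of NR columns, zero-padding past n.
std::vector<float> pack_b(const float* B, std::size_t k, std::size_t n) {
    const std::size_t panels = (n + NR - 1) / NR;
    std::vector<float> out(panels * NR * k, 0.0f);
    for (std::size_t p = 0; p < panels; ++p)
        for (std::size_t l = 0; l < k; ++l)
            for (std::size_t j = 0; j < NR; ++j) {
                const std::size_t col = p * NR + j;
                if (col < n) out[(p * k + l) * NR + j] = B[l * n + col];
            }
    return out;
}

// Microkernel: pure computation on already-packed panels, no packing logic.
void kernel(const float* a_panel, const float* b_panel, std::size_t k,
            float acc[MR][NR]) {
    for (std::size_t i = 0; i < MR; ++i)
        for (std::size_t j = 0; j < NR; ++j) acc[i][j] = 0.0f;
    for (std::size_t l = 0; l < k; ++l)
        for (std::size_t i = 0; i < MR; ++i)
            for (std::size_t j = 0; j < NR; ++j)
                acc[i][j] += a_panel[l * MR + i] * b_panel[l * NR + j];
}

// C (m x n, row-major) = A * B. Both inputs are packed once up front, then
// every tile reuses the packed panels.
void gemm_packed(const float* A, const float* B, float* C,
                 std::size_t m, std::size_t n, std::size_t k) {
    const std::vector<float> pa = pack_a(A, m, k);
    const std::vector<float> pb = pack_b(B, k, n);
    for (std::size_t ip = 0; ip * MR < m; ++ip)
        for (std::size_t jp = 0; jp * NR < n; ++jp) {
            float acc[MR][NR];
            kernel(&pa[ip * k * MR], &pb[jp * k * NR], k, acc);
            for (std::size_t i = 0; i < MR; ++i)
                for (std::size_t j = 0; j < NR; ++j) {
                    const std::size_t row = ip * MR + i;
                    const std::size_t col = jp * NR + j;
                    if (row < m && col < n) C[row * n + col] = acc[i][j];
                }
        }
}
```

In this sketch each packed panel of A is read by every column tile and each packed panel of B by every row tile, so the packing cost is paid once per matrix rather than once per kernel invocation.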

1 parent e298d2f, commit 364e5d8

1 file changed: +359 -400 lines changed

ggml/src/ggml-cpu/llamafile/sgemm.cpp

