PPC MMA implementation #1

amritahs-ibm · 2024-09-23T04:25:38Z

PPC MMA implementation for llamafile_sgemm API

Signed-off-by: Amrita H S <[email protected]>

ChipKerchner · 2024-10-29T12:49:46Z

ggml/src/llamafile/sgemm.cpp

+   __builtin_mma_disassemble_acc(vec_C, ACC); \
+   for (int I = 0; I < 4; I++) { \
+      for (int J = 0; J < 4; J++) { \
+         *((float*)(C+ii+((jj+J)*ldc)+I)) = *((float*)&vec_C[I]+J); \


It's probably better to do a 4-vector transpose here, or to invert the MMA inputs. That way you can write whole vectors at a time instead of scalar elements.
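A minimal, portable sketch of the transpose-then-store idea, for illustration only. `Vec4` is a hypothetical stand-in for `vector float`, and both function names are made up for this example; they are not part of the PR. On POWER the transpose would presumably be built from merge intrinsics such as `vec_mergeh`/`vec_mergel` and the store done with `vec_xst`, but that part is an assumption rather than something shown in the hunk.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for `vector float`, so the sketch builds anywhere.
using Vec4 = std::array<float, 4>;

// Element-by-element store, equivalent to the quoted hunk:
// C[ii + I + (jj + J) * ldc] = acc[I][J]
void store_tile_scalar(float* C, std::size_t ldc, std::size_t ii,
                       std::size_t jj, const Vec4 (&acc)[4]) {
    for (int I = 0; I < 4; I++)
        for (int J = 0; J < 4; J++)
            C[ii + (jj + J) * ldc + I] = acc[I][J];
}

// Transpose the 4x4 tile first, so that each transposed row is one
// contiguous 4-float run of C and can be written with a single store.
void store_tile_transposed(float* C, std::size_t ldc, std::size_t ii,
                           std::size_t jj, const Vec4 (&acc)[4]) {
    assert(ii + 4 <= ldc);  // the four rows must fit inside one column of C
    Vec4 t[4];
    for (int I = 0; I < 4; I++)
        for (int J = 0; J < 4; J++)
            t[J][I] = acc[I][J];
    for (int J = 0; J < 4; J++)
        std::memcpy(C + ii + (jj + J) * ldc, t[J].data(), 4 * sizeof(float));
}
```

Both functions leave `C` in the same state; the second needs four contiguous stores instead of sixteen scalar ones. Inverting the MMA inputs would aim for the same result by having the accumulator come out already in the transposed layout.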

ChipKerchner · 2024-10-29T14:30:36Z

ggml/src/llamafile/sgemm.cpp

+		 aoffset1 +=  8*lda;
+		 aoffset2 +=  8*lda;
+		 aoffset3 +=  8*lda;
+		 aoffset4 +=  8*lda;


How come aoffset5 through aoffset8 are not updated here? Could this be the reason it only works for multiples of 8?
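To make the concern concrete, here is a hedged sketch of an eight-row packing loop. It assumes all eight row offsets stay live across iterations, which the hunk does not confirm. `pack_rows_by_8` and its parameters are hypothetical names, and the packed layout is simplified to whole rows copied back to back. Keeping the offsets in an array means one loop advances all eight, so none can be missed.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: copies `rows` x `cols` floats from `a` (row stride
// `lda`) into a packed buffer, eight rows per step.
std::vector<float> pack_rows_by_8(const float* a, std::size_t lda,
                                  std::size_t rows, std::size_t cols) {
    assert(rows % 8 == 0);  // this sketch handles full blocks of 8 only
    std::vector<float> out;
    out.reserve(rows * cols);

    // Offsets of the eight rows in the current block (aoffset1..aoffset8).
    std::array<std::size_t, 8> aoffset;
    for (std::size_t r = 0; r < 8; ++r) aoffset[r] = r * lda;

    for (std::size_t block = 0; block < rows / 8; ++block) {
        for (std::size_t off : aoffset)
            out.insert(out.end(), a + off, a + off + cols);
        // Advance every offset, not only the first four; otherwise rows
        // 5..8 of the first block would be re-read on each later step.
        for (std::size_t& off : aoffset) off += 8 * lda;
    }
    return out;
}
```

If only the first four offsets were advanced, the second block would mix rows 8..11 with stale rows 4..7, which is the kind of mismatch the question above is probing for.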

ChipKerchner · 2024-10-30T14:02:57Z

ggml/src/CMakeLists.txt

+    string(FIND ${POWER10_M} "POWER10" substring_index)
+    if(${substring_index} GREATER_EQUAL 0)
+       list(APPEND ARCH_FLAGS -mcpu=power10)
+    elseif (${CMAKE_SYSTEM_PROCESSOR} MATCHES "ppc64le")


Is it possible for CMAKE_SYSTEM_PROCESSOR to match both ppc64 and ppc64le?
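For context, CMake's `MATCHES` operator performs an unanchored regular-expression search, so a pattern of `ppc64` also matches the string `ppc64le`. The sketch below reproduces that behaviour in C++ with `std::regex`. It is an analogy only, not CMake code, and both helper names are hypothetical.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Unanchored search: true if `pattern` matches anywhere inside `value`.
// This mirrors how an unanchored pattern behaves under CMake's MATCHES.
bool matches_unanchored(const std::string& value, const std::string& pattern) {
    assert(!pattern.empty());
    return std::regex_search(value, std::regex(pattern));
}

// Whole-string match: true only if `pattern` matches all of `value`,
// comparable to writing "^ppc64$" in the CMake condition.
bool matches_anchored(const std::string& value, const std::string& pattern) {
    assert(!pattern.empty());
    return std::regex_match(value, std::regex(pattern));
}
```

With these helpers, `matches_unanchored("ppc64le", "ppc64")` is true while `matches_anchored("ppc64le", "ppc64")` is false. In the CMake file, testing the more specific `ppc64le` pattern first, or anchoring the patterns with `^` and `$`, would keep the two branches from overlapping.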

ChipKerchner · 2024-10-31T13:25:00Z

ggml/src/llamafile/sgemm.cpp

+	    vector float t1, t2, t3, t4;
+	    c1 = vec_xl(0, aoffset1);
+	    c2 = vec_xl(0, aoffset2);
+	    c3 = vec_xl(0, aoffset3);


Where is c4 loaded here?

PPC MMA implementation

248dc0d

Signed-off-by: Amrita H S <[email protected]>

3 participants
