Bgemm for arm64 by taoye9 · Pull Request #5287 · OpenMathLib/OpenBLAS

taoye9 · 2025-05-28T13:21:58Z

No description provided.

annop-w · 2025-05-28T14:43:18Z

common.h

 #define SIZE	8
 #define  BASE_SHIFT 3
 #define ZBASE_SHIFT 4
+#elif defined(BFLOAT16_ONLY)


I do not think we need to introduce a BFLOAT16_ONLY build flag.
The type of FLOAT is the only difference. Can we simply use XFLOAT instead of FLOAT for the C matrix, for example in kernel/arm64/bgemm_beta.c?
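
To make the suggestion above concrete, here is a minimal sketch of selecting only the C-matrix element type through a separate macro instead of a new build flag. All `SK_`/`sk_` names (`SK_IFLOAT`, `SK_XFLOAT`, `SK_BGEMM`, `sk_bfloat16`, `sk_c_matrix_bytes`) are hypothetical stand-ins for illustration and do not reproduce the actual definitions in common.h.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a bfloat16 storage type: 16 raw bits. */
typedef uint16_t sk_bfloat16;

/* Assumption: the inputs A and B are bfloat16 in both variants. */
#define SK_IFLOAT sk_bfloat16

/* Only the element type of the C matrix differs between the variants. */
#if defined(SK_BGEMM)
#define SK_XFLOAT sk_bfloat16 /* BGEMM-like: bfloat16 output */
#else
#define SK_XFLOAT float       /* SBGEMM-like: float output */
#endif

/* A routine written against SK_XFLOAT needs no per-variant copy. */
size_t sk_c_matrix_bytes(size_t m, size_t n) {
  return m * n * sizeof(SK_XFLOAT);
}
```

Compiling the same file with `-DSK_BGEMM` would switch the output type without touching the routine itself, which is the effect the review comment is after.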

martin-frbg · 2025-05-28T14:43:44Z

Thanks. Failures on CirrusCI are from running out of compute credits (as we're close to the end of the month), I'll rectify that in a minute. Not sure how DGEMM got hit by your PR on at least RISCV and LOONGARCH64 platforms, haven't looked closely yet but possibly some unintentional shifting of GEMM_UNROLL parameters ? (Also not sure why this touches SBGEMM settings ?)

annop-w · 2025-05-28T14:46:13Z

driver/level3/level3.c

 #define STOP_RPCC(COUNTER)
 #endif

+#if defined(HALF)


If we remove BUILD_BFLOAT16_ONLY flag, this is not needed I suppose.

annop-w · 2025-05-28T14:46:45Z

driver/level3/level3_thread.c

 #define STOP_RPCC(COUNTER)
 #endif

+#if defined(HALF)


Ditto. See comment in level3.c.

annop-w · 2025-05-28T14:53:01Z

interface/gemm.c

 #elif defined(BFLOAT16)
 #define ERROR_NAME "SBGEMM "
 #define GEMV BLASFUNC(sbgemv)
+#elif defined(BFLOAT16_ONLY)


Maybe a different naming ? Perhaps change to SBFLOAT16 for SBGEMM and BFLOAT16 for BGEMM. What do you think ?

Mousius · 2025-05-28

Is there a need to split it? BUILD_BFLOAT16 could cover both cases and enable all bfloat16 ops.

martin-frbg · 2025-05-28

@Mousius there's a slight difference between the global BUILD_BFLOAT16 and this BFLOAT16 (which IIRC is local to the interface and gets added by the (C)Makefile). Maybe we could have a "BGEMM" define that gets set by the Makefile specifically when the function is bgemm and that is processed inside the "elif defined(BFLOAT16)"? I haven't looked into the assumed requirement for a BUILD_BFLOAT16_ONLY in the level3 driver yet, but my gut feeling is that a local variable in interface/Makefile (and eventually CMakeLists.txt) to modify gemm.c build behaviour for bgemm.o should be sufficient.
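
A minimal sketch of the nested-define idea described above, assuming a per-object `BGEMM`-style switch is handled inside the existing bfloat16 branch. The `SK_` macros, the `"BGEMM "` string, and `sk_error_name` are hypothetical; in a real build the switches would arrive as `-D` flags on the compile line for the relevant object file rather than being defined in the source.

```c
/* Hypothetical per-object switches, hard-coded here so the sketch
 * compiles on its own. */
#define SK_BFLOAT16 1
#define SK_BGEMM 1

#if defined(SK_BFLOAT16)
#  if defined(SK_BGEMM)
#    define SK_ERROR_NAME "BGEMM "  /* bfloat16 in, bfloat16 out */
#  else
#    define SK_ERROR_NAME "SBGEMM " /* bfloat16 in, float out */
#  endif
#else
#  define SK_ERROR_NAME "SGEMM "
#endif

/* Returns the name selected at compile time. */
const char *sk_error_name(void) { return SK_ERROR_NAME; }
```

With this layout the outer branch keeps a single bfloat16 switch, and only the object built for bgemm sees the inner one.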

annop-w · 2025-05-28T14:54:36Z

kernel/Makefile.L3

 	$(CC) $(CFLAGS) -c -DBFLOAT16 -UDOUBLE -UCOMPLEX $< -o $@

-ifneq ($(SBGEMM_UNROLL_M), $(SBGEMM_UNROLL_N))
+#ifneq ($(SBGEMM_UNROLL_M), $(SBGEMM_UNROLL_N))


Why is this commented out ?

taoye9 · 2025-07-01

My changes introduced a bug that raises an error when the kernels are reused. This line needs to be uncommented once the bug is fixed.

annop-w · 2025-05-28T14:55:32Z

kernel/arm64/bgemm_beta.c

+
+int CNAME(BLASLONG m, BLASLONG n, BLASLONG dummy1, FLOAT beta, IFLOAT *dummy2,
+          BLASLONG dummy3, IFLOAT *dummy4, BLASLONG dummy5, FLOAT *c,
+          BLASLONG ldc) {


Change FLOAT to IFLOAT or XFLOAT ?

Setting up all the infrastructure for BGEMM support in OpenBLAS, hopefully I found all the right places. Derived mostly from the previous work done in OpenMathLib#5287 Co-authored-by: Ye Tao <ye.tao@arm.com>

martin-frbg · 2025-07-09T12:49:47Z

Closing as superseded by #5357 ; thank you very much to all involved !

This also improves the testing and generic kernel by re-using the BF16 conversion functions. Built on top of OpenMathLib#5357 and derived from OpenMathLib#5287 Co-authored-by: Ye Tao <ye.tao@arm.com>

This fixes an issue originally introduced with the BGEMM kernel when I was tweaking it. OpenMathLib#5287 didn't suffer from this bug. I've updated the tests to run with `beta=1.0` so as to test loading and updating from C. Alongside this, the tests now return sensible return values to reduce the risk of them being ignored. Co-authored-by: Ye Tao <ye.tao@arm.com>
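
The point of testing with `beta=1.0` can be illustrated with a small reference routine: with `beta=0` the old contents of C never influence the result, so a kernel that fails to load C would pass unnoticed. The sketch below is an assumption-laden illustration, not the OpenBLAS test or kernel. All `sk_` names are hypothetical, the layout is column-major without transposition, alpha and beta are fp32 (as in the later commit in this PR), and bfloat16 is modelled as the upper 16 bits of an IEEE-754 float with round-to-nearest-even.

```c
#include <stdint.h>
#include <string.h>

/* bfloat16 -> float: place the 16 bits in the upper half of a float. */
float sk_bf16_to_f32(uint16_t h) {
  uint32_t bits = (uint32_t)h << 16;
  float f;
  memcpy(&f, &bits, sizeof f);
  return f;
}

/* float -> bfloat16 with round-to-nearest-even; NaN stays a quiet NaN. */
uint16_t sk_f32_to_bf16(float f) {
  uint32_t bits;
  memcpy(&bits, &f, sizeof bits);
  if ((bits & 0x7fffffffu) > 0x7f800000u)
    return (uint16_t)((bits >> 16) | 0x0040u);
  bits += 0x7fffu + ((bits >> 16) & 1u);
  return (uint16_t)(bits >> 16);
}

/* Reference C = alpha*A*B + beta*C with fp32 accumulation. */
void sk_bgemm_ref(int m, int n, int k, float alpha,
                  const uint16_t *a, int lda,
                  const uint16_t *b, int ldb,
                  float beta, uint16_t *c, int ldc) {
  for (int j = 0; j < n; j++) {
    for (int i = 0; i < m; i++) {
      float acc = 0.0f;
      for (int p = 0; p < k; p++)
        acc += sk_bf16_to_f32(a[i + p * lda]) * sk_bf16_to_f32(b[p + j * ldb]);
      float out = alpha * acc;
      /* C is only read when beta is non-zero. */
      if (beta != 0.0f)
        out += beta * sk_bf16_to_f32(c[i + j * ldc]);
      c[i + j * ldc] = sk_f32_to_bf16(out);
    }
  }
}

/* Returns the number of mismatches so a caller can fail loudly. */
int sk_check_beta_one(void) {
  const float av[4] = {1, 2, 3, 4};         /* A, column-major */
  const float bv[4] = {1, 0, 0, 1};         /* B = identity */
  const float cv[4] = {10, 20, 30, 40};     /* initial C */
  const float expect[4] = {11, 22, 33, 44}; /* A*B + C */
  uint16_t a[4], b[4], c[4];
  for (int i = 0; i < 4; i++) {
    a[i] = sk_f32_to_bf16(av[i]);
    b[i] = sk_f32_to_bf16(bv[i]);
    c[i] = sk_f32_to_bf16(cv[i]);
  }
  sk_bgemm_ref(2, 2, 2, 1.0f, a, 2, b, 2, 1.0f, c, 2);
  int failures = 0;
  for (int i = 0; i < 4; i++)
    if (sk_bf16_to_f32(c[i]) != expect[i]) failures++;
  return failures;
}
```

A test program built around this could return a non-zero exit status whenever `sk_check_beta_one()` reports failures, so that a broken kernel is not silently ignored by the build.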

taoye9 added 6 commits May 28, 2025 13:19

add .c and .h files for bgemm interface

2ef36a1

add generic bgemm kernel and its test file

abe9d38

support mutithreaded bgemm interface

1eb0815

support dynamic arch of bgemm interface

4d0fd12

fix generic gemm_beta for bgemm

59d0cf4

Resolve symbol conflicts when building sbgemm and bgemm together

082a9d2

taoye9 marked this pull request as draft May 28, 2025 13:27

taoye9 added 3 commits May 29, 2025 14:45

change data type of bgemm alpha and beta from bfloat16 to fp32 and add makefiles changes for bgemm interface

63ce52e

add neoversev1 bgemm kernels

5d16517

update init value of bgemm testcase

45aa27b

taoye9 mentioned this pull request Jun 19, 2025

RFC: Introduction of BGEMM and BGEMV for BFloat16 Matrix Operations in OpenBLAS #5155

Open

Mousius mentioned this pull request Jul 3, 2025

Add infrastructure for BGEMM #5357

Merged

martin-frbg closed this Jul 9, 2025

Mousius mentioned this pull request Jul 10, 2025

Add optimized BGEMM kernel for NEOVERSEV1 target #5373

Merged

Mousius mentioned this pull request Oct 6, 2025

Fix bf16->f32 conversion for NEOVERSEV1 and NEOVERSEN2 targets #5483

Merged

taoye9 wants to merge 9 commits into OpenMathLib:develop from taoye9:bgemm_for_arm64

4 participants
