Conversation
| #define SIZE 8 | ||
| #define BASE_SHIFT 3 | ||
| #define ZBASE_SHIFT 4 | ||
| #elif defined(BFLOAT16_ONLY) |
There was a problem hiding this comment.
I do not think we need to introduce BFLOAT16_ONLY build flag.
The type of FLOAT is the only difference. Can we simplyl use XFLOAT instead of FLOAT for C matrix, for example, in kernel/arm64/bgemm_beta.c ?
|
Thanks. Failures on CirrusCI are from running out of compute credits (as we're close to the end of the month), I'll rectify that in a minute. Not sure how DGEMM got hit by your PR on at least RISCV and LOONGARCH64 platforms, haven't looked closely yet but possibly some unintentional shifting of GEMM_UNROLL parameters ? (Also not sure why this touches SBGEMM settings ?) |
| #define STOP_RPCC(COUNTER) | ||
| #endif | ||
|
|
||
| #if defined(HALF) |
There was a problem hiding this comment.
If we remove BUILD_BFLOAT16_ONLY flag, this is not needed I suppose.
| #define STOP_RPCC(COUNTER) | ||
| #endif | ||
|
|
||
| #if defined(HALF) |
There was a problem hiding this comment.
Ditto. See comment in level3.c.
| #elif defined(BFLOAT16) | ||
| #define ERROR_NAME "SBGEMM " | ||
| #define GEMV BLASFUNC(sbgemv) | ||
| #elif defined(BFLOAT16_ONLY) |
There was a problem hiding this comment.
Maybe a different naming ? Perhaps change to SBFLOAT16 for SBGEMM and BFLOAT16 for BGEMM. What do you think ?
There was a problem hiding this comment.
Is there a need to split it? BUILD_BFLOAT16 could cover both cases and enable all bfloat16 ops?
There was a problem hiding this comment.
@Mousius there's a slight difference between the global BUILD_BFLOAT16 and this BFLOAT16 (that IIRC is local to the interface and gets added by the (C)Makefile ). Maybe we could have a "BGEMM" define that gets set by the Makefile specifically when the function is bgemm and that is processed inside the "elif defined(BFLOAT16)" ? I haven't looked into the assumed requirement for a BUILD_BFLOAT16_ONLY in the level3 driver yet, but my gut feeling is that a local variable in interface/Makefile (and eventually CMakeLists.txt) to modify gemm.c build behaviour for bgemm.o should be sufficient.
| $(CC) $(CFLAGS) -c -DBFLOAT16 -UDOUBLE -UCOMPLEX $< -o $@ | ||
|
|
||
| ifneq ($(SBGEMM_UNROLL_M), $(SBGEMM_UNROLL_N)) | ||
| #ifneq ($(SBGEMM_UNROLL_M), $(SBGEMM_UNROLL_N)) |
There was a problem hiding this comment.
Why is this commented out ?
There was a problem hiding this comment.
My changes introduced a bug that raises an error when the kernels are reused. this line need to be uncomment when the fix is fixed.
kernel/arm64/bgemm_beta.c
Outdated
|
|
||
| int CNAME(BLASLONG m, BLASLONG n, BLASLONG dummy1, FLOAT beta, IFLOAT *dummy2, | ||
| BLASLONG dummy3, IFLOAT *dummy4, BLASLONG dummy5, FLOAT *c, | ||
| BLASLONG ldc) { |
There was a problem hiding this comment.
Change FLOAT to IFLOAT or XFLOAT ?
…d makefiles changes for bgemm interface
Setting up all the infrastructure for BGEMM support in OpenBLAS, hopefully I found all the right places. Derived mostly from the previous work done in OpenMathLib#5287 Co-authored-by: Ye Tao <ye.tao@arm.com>
Setting up all the infrastructure for BGEMM support in OpenBLAS, hopefully I found all the right places. Derived mostly from the previous work done in OpenMathLib#5287 Co-authored-by: Ye Tao <ye.tao@arm.com>
Setting up all the infrastructure for BGEMM support in OpenBLAS, hopefully I found all the right places. Derived mostly from the previous work done in OpenMathLib#5287 Co-authored-by: Ye Tao <ye.tao@arm.com>
|
Closing as superseded by #5357 ; thank you very much to all involved ! |
This also improves the testing and generic kernel by re-using the BF16 conversion functions. Built on top of OpenMathLib#5357 and derived from OpenMathLib#5287 Co-authored-by: Ye Tao <ye.tao@arm.com>
This also improves the testing and generic kernel by re-using the BF16 conversion functions. Built on top of OpenMathLib#5357 and derived from OpenMathLib#5287 Co-authored-by: Ye Tao <ye.tao@arm.com>
This fixes an issue originally introduced with the BGEMM kernel when I was tweaking it. OpenMathLib#5287 didn't suffer from this bug. I've updated the tests to run with `beta=1.0` so as to test loading and updating from C. Alongside this, the tests now return sensible return values to reduce the risk of them being ignored. Co-authored-by: Ye Tao <ye.tao@arm.com>
No description provided.