disable fp16 flags on RISC-V unless BUILD_HFLOAT16=1 #5431

markdryan · 2025-08-28T08:57:21Z

The compiler options that enable 16 bit floating point instructions
should not be enabled by default when building the RISCV64_ZVL128B
and RISCV64_ZVL256B targets. The zfh and zvfh extensions are not part
of the 'V' extension and are not required by any of the RVA profiles.
There's no guarantee that kernels built with zfh and zvfh will work
correctly on fully compliant RVA23U64 devices.

To fix the issue we only build the RISCV64_ZVL128B and RISCV64_ZVL256B
kernels with the half float flags if BUILD_HFLOAT16=1. We also update
the RISC-V dynamic detection code to disable the RISCV64_ZVL128B and
RISCV64_ZVL256B kernels at runtime if we've built with DYNAMIC_ARCH=1
and BUILD_HFLOAT16=1 and are running on a device that does not support
both Zfh and Zvfh.

Fixes: #5428

markdryan · 2025-08-28T08:57:58Z

Makefile.prebuild

 TARGET_FLAGS = -march=rv64imafdcv_zba_zbb_zfh -mabi=lp64d
 endif

-ifeq ($(TARGET), RISCV64_ZVL256B)


Does anyone see any issue with removing this?

I don't think you want to remove these entirely, just revert them to their state before the addition of the HFLOAT16 PR (i.e. rv64imafdcv), unless we can be absolutely certain that the getarch utility and everything downstream of c_check still works on any affected RISCV64 platform (I'm especially worried about the -mabi=lp64d part) ?

I was removing these to fix

#5424

but perhaps this is not the way to do this. Perhaps it's best to remove

05c8654

from this PR and fix the issue separately.

I've dropped the commit that removes these lines entirely. The remaining commit just removes the half float flags.

markdryan · 2025-08-28T09:08:36Z

driver/others/dynamic_riscv64.c

 	} else {
+#if defined(BUILD_HFLOAT16)
+		return NULL;
+#endif


I think I should probably put a #else here.

It's fine as is, no? If BUILD_HFLOAT16 isn't defined, we just get the existing pre-your-changes behaviour

Hmm. Aren't we dropping out of the entire "vector extension supported" case with this change, when "only" the fp16 shgemm kernel is not supported by the detected hardware ? (Though I don't have an easy solution for how to keep using the others regardless, and switch just shgemm to the generic C implementation. Perhaps "rvv+fp16" would need to be a separate TARGET even)

For the case where DYNAMIC_ARCH=1, yes. The approach this patch takes is if we've built with DYNAMIC_ARCH=1 and BUILD_HFLOAT16=1 the vector kernels will only be used if we can detect Zfh, Zvfh and V at runtime, which seems reasonable as this is how they've been compiled. Not ideal perhaps, but better than the current code on develop where a DYNAMIC_ARCH=1 build will happily execute the vector kernels on machines without half float support and probably crash.

agree for shgemm but not sure why you'd expect the vector kernels in general to crash if Z(v)fh isn't available ?

but these would crash only when actually executing SHGEMM - while your change makes them inexplicably slow for any other BLAS operation if only the build host happened to have these extensions. This might create a problem with third-party packagers (like Linux distributions) depending on how likely they are to active the BUILD_HFLOAT16 option

Currently all the kernels in RISCV64_ZVL128B and RISCV64_ZVL256B are compiled with Zfh, Zvfh and V but at runtime we only check for V before executing them. In general I don't think it's a good idea to execute code compiled with optional extensions like Zfh and Zvfh without first checking to see whether the extensions are supported at runtime. Even if the other full float kernels aren't explicitly using half float instructions, by compiling them with Zfh and Zvfh, we give the compiler permission to use the half float instructions in those kernels, should it find some clever reason to do so.

This might create a problem with third-party packagers (like Linux distributions) depending on how likely they are to active the BUILD_HFLOAT16 option

I think it's very unlikely that Linux distributions will be able to activate BUILD_HFLOAT16=1 in its current form for riscv64 any time soon as Zvfh and Zfh are not mandatory in any of the existing RVA profiles, even RVA23U64. They could only safely build with BUILD_HFLOAT16=1 if runtime detection of zfh and zvfh were performed.

I'm happy to explore another solution, e.g., introducing new targets as you suggested above.

Incidentally, what is done on X86? How are bf16 and fp16 handled?

On x86_64 (and power), bf16 capability is tied to TARGET, and fp16 is as yet unimplemented - it made its debut in the riscv64 PR.
Perhaps what could be done here is issue a warning about lack of Zfh (and consequentially SHGEMM) support before activating the RISCV64_ZVL128B or RISCV64_ZVL256B kernel regardless ?

Actually I think I'll just add a FALLBACK_VERBOSE type message (as a followup commit) so that the user has a chance to find out why performance is lower than expected

Yes, that sounds like a good idea.

Apologies for not replying to your previous comment. I got a little distracted.

The compiler options that enable 16 bit floating point instructions should not be enabled by default when building the RISCV64_ZVL128B and RISCV64_ZVL256B targets. The zfh and zvfh extensions are not part of the 'V' extension and are not required by any of the RVA profiles. There's no guarantee that kernels built with zfh and zvfh will work correctly on fully compliant RVA23U64 devices. To fix the issue we only build the RISCV64_ZVL128B and RISCV64_ZVL256B kernels with the half float flags if BUILD_HFLOAT16=1. We also update the RISC-V dynamic detection code to disable the RISCV64_ZVL128B and RISCV64_ZVL256B kernels at runtime if we've built with DYNAMIC_ARCH=1 and BUILD_HFLOAT16=1 and are running on a device that does not support both Zfh and Zvfh. Fixes: OpenMathLib#5428

ChipKerchner · 2025-09-23T14:25:34Z

Does this also need to be done for BUILD_BFLOAT16 since it now has vector support for SBGEMM?

markdryan · 2025-09-23T14:40:02Z

Does this also need to be done for BUILD_BFLOAT16 since it now has vector support for SBGEMM?

By this do you mean update the dynamic_riscv64.c code to check for the bfloat16 extensions? I'd say we do need to do that, particularly as the BF16 extensions are not mandatory in RVA23.

martin-frbg · 2025-09-23T14:54:14Z

The runtime check for extension support in the DYNAMIC_ARCH case ? I guess so, if we cannot be sure that these are going to be present in any relevant -zvl128b or -zvl256b cpu.
That is unless we accept the risk that SBGEMM calls will just SIGILL on underequipped hardware - I still think any other BLAS calls should be fine, and I'm still not sure if I like the tradeoff of falling back to generic kernels for everything just because the optimized ones were built with either flavor of half-precision support - which the user may know to avoid calling. There is the risk of the compiler using half-precision tricks in optimizing regular precision code if we tell it they're available, but I assume it is largely theoretical ?

ChipKerchner · 2025-09-23T16:49:45Z

driver/others/dynamic_riscv64.c

 	if (ret == 0) {
-		if (!(pairs[0].value & RISCV_HWPROBE_IMA_V))
+#if defined(BUILD_HFLOAT16)
+		vector_mask = (RISCV_HWPROBE_IMA_V | RISCV_HWPROBE_EXT_ZFH | RISCV_HWPROBE_EXT_ZVFH);


Don't you need to check for these also?

#define RISCV_HWPROBE_EXT_ZFHMIN (1 << 28)
#define RISCV_HWPROBE_EXT_ZVFHMIN (1 << 31)

Are these extensions being used?

They don't appear to be enabled in the compiler flags

__riscv_vle16_v_f16m1 (and other variations) are listed under the Zvfhmin extension. I would think the scalar form is also being used in situations like this _Float16 B0 = B[bi+0];

The ones that you currently have are for the mult and madd instructions.

Ok, I looked it up. Minimum extension is a subset of the regular extension. So what we and you have seem correct.

markdryan commented Aug 28, 2025

View reviewed changes

markdryan force-pushed the markdryan/riscv-hf16-fix branch 2 times, most recently from 05c8654 to ce79fe1 Compare August 28, 2025 13:41

markdryan changed the title ~~Fix hfloat and fortran compiler detection issues with the RISC-V makefiles~~ Fix hfloat16 issues with the RISC-V makefiles and dynamic kernel selection Aug 28, 2025

markdryan changed the title ~~Fix hfloat16 issues with the RISC-V makefiles and dynamic kernel selection~~ disable fp16 flags on RISC-V unless BUILD_HFLOAT16=1 Aug 28, 2025

markdryan mentioned this pull request Aug 28, 2025

fix RVV 1.0 detection code #5432

Merged

martin-frbg added this to the 0.3.31 milestone Sep 9, 2025

martin-frbg merged commit 5ab1437 into OpenMathLib:develop Sep 9, 2025
83 of 88 checks passed

ChipKerchner mentioned this pull request Sep 23, 2025

Disable bf16 flags on RISC-V unless BUILD_BFLOAT16=1 #5456

Merged

ChipKerchner reviewed Sep 23, 2025

View reviewed changes

disable fp16 flags on RISC-V unless BUILD_HFLOAT16=1 #5431

disable fp16 flags on RISC-V unless BUILD_HFLOAT16=1 #5431

Uh oh!

Conversation

markdryan commented Aug 28, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

markdryan Aug 28, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

markdryan Aug 28, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

ChipKerchner commented Sep 23, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

markdryan commented Sep 23, 2025

Uh oh!

martin-frbg commented Sep 23, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ChipKerchner Sep 23, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

markdryan commented Aug 28, 2025 •

edited

Loading

markdryan Aug 28, 2025 •

edited

Loading

markdryan Aug 28, 2025 •

edited

Loading

ChipKerchner commented Sep 23, 2025 •

edited

Loading

ChipKerchner Sep 23, 2025 •

edited

Loading