Add floating point matrix multiply-add widening intrinsics
Adds intrinsic support for the FMMLA matrix multiply instructions
introduced by the 2024 dpISA.
FEAT_F8F32MM: Neon FP8 to single-precision
FEAT_F8F16MM: Neon FP8 to half-precision
FEAT_SVE_F16F32MM: SVE half-precision to single-precision
FEAT_SSVE_F8F32MM: SVE FP8 to single-precision
FEAT_SSVE_F8F16MM: SVE FP8 to half-precision
| [`__ARM_FEATURE_SVE_MATMUL_INT8`](#multiplication-of-8-bit-integer-matrices) | SVE support for the integer matrix multiply extension (FEAT_I8MM) | 1 |
| [`__ARM_FEATURE_SVE_PREDICATE_OPERATORS`](#scalable-vector-extension-sve) | Level of support for C and C++ operators on SVE vector types | 1 |
| [`__ARM_FEATURE_SVE_VECTOR_OPERATORS`](#scalable-vector-extension-sve) | Level of support for C and C++ operators on SVE predicate types | 1 |
@@ -13676,6 +13704,30 @@ Single-precision convert, narrow, and interleave to 8-bit floating-point (top an
uint64_t imm0_15, fpm_t fpm);
```
#### FMMLA (widening, FP8 to FP16)
8-bit floating-point matrix multiply-add to half-precision.
```c
// Only if (__ARM_FEATURE_SVE2 && __ARM_FEATURE_F8F16MM) || __ARM_FEATURE_SSVE_F8F16MM
svfloat16_t svmmla[_f16_mf8](svfloat16_t zda, svmfloat8_t zn, svmfloat8_t zm);
```
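The guard comment above can be turned into a compile-time check. The following is a minimal sketch, not part of the specification: `HAS_FP8_TO_FP16_MMLA` and `has_fp8_to_fp16_mmla` are hypothetical names introduced here for illustration, and only the three feature macros come from the guard comment.

```c
#include <assert.h>

/* Hypothetical helper macro (not defined by ACLE): 1 when the guard quoted
   in the comment above holds for this translation unit, 0 otherwise. */
#if (defined(__ARM_FEATURE_SVE2) && defined(__ARM_FEATURE_F8F16MM)) || \
    defined(__ARM_FEATURE_SSVE_F8F16MM)
#define HAS_FP8_TO_FP16_MMLA 1
#else
#define HAS_FP8_TO_FP16_MMLA 0
#endif

static_assert(HAS_FP8_TO_FP16_MMLA == 0 || HAS_FP8_TO_FP16_MMLA == 1,
              "guard macro must evaluate to 0 or 1");

/* Hypothetical accessor: exposes the compile-time decision to callers that
   want to choose between an intrinsic path and a scalar fallback. */
int has_fp8_to_fp16_mmla(void)
{
    return HAS_FP8_TO_FP16_MMLA;
}
```

On a toolchain that does not target these features the helper returns 0, so code that calls the intrinsic would sit behind the same `#if` and never be compiled.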
#### FMMLA (widening, FP8 to FP32)
8-bit floating-point matrix multiply-add to single-precision.
```c
// Only if (__ARM_FEATURE_SVE2 && __ARM_FEATURE_F8F32MM) || __ARM_FEATURE_SSVE_F8F32MM
svfloat32_t svmmla[_f32_mf8](svfloat32_t zda, svmfloat8_t zn, svmfloat8_t zm);
```
#### FMMLA (widening, FP16 to FP32)
16-bit floating-point matrix multiply-add to single-precision.
```c
// Only if (__ARM_FEATURE_SVE2 && __ARM_FEATURE_SVE_F16F32MM) || __ARM_FEATURE_SME_FA64
svfloat32_t svmmla[_f32_f16](svfloat32_t zda, svfloat16_t zn, svfloat16_t zm);
```
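As a rough illustration of what "matrix multiply-add to single-precision" computes, here is a scalar sketch of a single 128-bit segment. It ASSUMES the same segment layout as the existing BFMMLA instruction (a row-major 2x4 left operand, a right operand stored as the transposed 2x4 form of a 4x2 matrix, and a row-major 2x2 accumulator); the text above does not state the layout, so it should be checked against the Arm Architecture Reference Manual. `mmla_segment_ref` is a hypothetical name, and `float` stands in for the half-precision inputs, so input rounding is not modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scalar model of one 128-bit segment, ASSUMING a BFMMLA-style
   layout: acc is a row-major 2x2 matrix, a and b are row-major 2x4 matrices,
   and b holds the transpose of the 4x2 right-hand operand.
   Computes acc[i][j] += sum over k of a[i][k] * b[j][k]. */
void mmla_segment_ref(float acc[4], const float a[8], const float b[8])
{
    assert(acc != NULL && a != NULL && b != NULL);
    for (size_t i = 0; i < 2; ++i) {
        for (size_t j = 0; j < 2; ++j) {
            float sum = acc[2 * i + j];      /* start from the accumulator */
            for (size_t k = 0; k < 4; ++k) {
                sum += a[4 * i + k] * b[4 * j + k];
            }
            acc[2 * i + j] = sum;            /* write back the widened sum */
        }
    }
}
```

A model like this can serve as a reference when testing a vectorized kernel: run both on the same inputs and compare each 2x2 accumulator tile within a tolerance.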
### SME2 modal 8-bit floating-point intrinsics
The intrinsics in this section are defined by the header file