Commit 4dbaa35

Add intrinsics for the new FP conversions introduced by the 2024 dpISA
FEAT_FPRCVT adds 4 new variants for each of the FCVTAS, FCVTAU, FCVTMS, FCVTMU, FCVTNS, FCVTNU, FCVTPS, FCVTPU, FCVTZS, and FCVTZU instructions:

1) Half Precision to 32-bit
2) Half Precision to 64-bit
3) Single Precision to 64-bit
4) Double Precision to 32-bit

For the Single Precision to 64-bit and Double Precision to 32-bit variants, this patch adds two new intrinsics that reduce to

- Single Precision to 64-bit : <INST> Dd,Sn
- Double Precision to 32-bit : <INST> Sd,Dn

The intrinsics for conversions from Half Precision are already defined. However, they are documented as reducing to an incorrect instruction format, <INST> Hd,Hn, so this patch fixes them to be

- Half Precision to 32-bit : <INST> Sd,Hn
- Half Precision to 64-bit : <INST> Dd,Hn
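The instruction families listed above differ only in rounding mode (and in signedness for the `S`/`U` suffix). As a rough illustration of what the signed Single Precision to 64-bit variants compute, here is a minimal sketch in plain C. The `model_*` names are hypothetical reference models, not ACLE intrinsics, and they assume finite inputs that fit in `int64_t`; the saturation and NaN behaviour of the real instructions is deliberately not modelled.

```c
#include <stdint.h>

/* Hypothetical reference models, NOT ACLE intrinsics.
   Assumption: x is finite and within int64_t range. The real
   instructions saturate and handle NaN; that is not modelled here. */

/* FCVTZS: round toward zero (the behaviour of a C cast). */
int64_t model_fcvtzs(float x) { return (int64_t)x; }

/* FCVTMS: round toward minus infinity. */
int64_t model_fcvtms(float x) {
    int64_t t = (int64_t)x;
    return ((float)t > x) ? t - 1 : t;
}

/* FCVTPS: round toward plus infinity. */
int64_t model_fcvtps(float x) {
    int64_t t = (int64_t)x;
    return ((float)t < x) ? t + 1 : t;
}

/* FCVTAS: round to nearest, ties away from zero. */
int64_t model_fcvtas(float x) {
    int64_t t = (int64_t)x;
    float frac = x - (float)t; /* fractional part, in (-1, 1) */
    if (frac >= 0.5f) return t + 1;
    if (frac <= -0.5f) return t - 1;
    return t;
}

/* FCVTNS: round to nearest, ties to even. */
int64_t model_fcvtns(float x) {
    int64_t t = (int64_t)x;
    float frac = x - (float)t; /* fractional part, in (-1, 1) */
    if (frac > 0.5f || (frac == 0.5f && (t & 1))) return t + 1;
    if (frac < -0.5f || (frac == -0.5f && (t & 1))) return t - 1;
    return t;
}
```

For an input of `2.5f`, the FCVTAS model gives 3 while the FCVTNS model gives 2. The ties are the only place where those two families disagree.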
1 parent d294acf commit 4dbaa35


5 files changed: +130 -45 lines changed


main/acle.md

Lines changed: 11 additions & 0 deletions
@@ -461,6 +461,9 @@ Armv8.4-A [[ARMARMv84]](#ARMARMv84). Support is added for the Dot Product intrin
 * Upgrade to [**Beta**](#current-status-and-anticipated-changes)
 support for modal 8-bit floating point intrinsics.
 
+#### Changes between ACLE Q2 2025 and ACLE Q3 2025
+* Added support for FEAT_FPRCVT intrinsics and `__ARM_FEATURE_FPRCVT`.
+
 #### Changes for next release
 
 * Added feature test macro for FEAT_SSVE_FEXPA.
@@ -2207,6 +2210,13 @@ ACLE intrinsics are available. This implies that `__ARM_FEATURE_SM4` and
 floating-point absolute minimum and maximum instructions (FEAT_FAMINMAX)
 and if the associated ACLE intrinsics are available.
 
+### FPRCVT extension
+
+`__ARM_FEATURE_FPRCVT` is defined to `1` if there is hardware
+support for floating-point to/from integer conversion instructions
+with only scalar SIMD&FP register operands and results having
+different input and output register sizes.
+
 ### Lookup table extensions
 
 `__ARM_FEATURE_LUT` is defined to 1 if there is hardware support for
@@ -2590,6 +2600,7 @@ be found in [[BA]](#BA).
 | [`__ARM_FEATURE_FP8DOT2`](#modal-8-bit-floating-point-extensions) | Modal 8-bit floating-point extensions | 1 |
 | [`__ARM_FEATURE_FP8DOT4`](#modal-8-bit-floating-point-extensions) | Modal 8-bit floating-point extensions | 1 |
 | [`__ARM_FEATURE_FP8FMA`](#modal-8-bit-floating-point-extensions) | Modal 8-bit floating-point extensions | 1 |
+| [`__ARM_FEATURE_FPRCVT`](#fprcvt-extension) | FPRCVT extension | 1 |
 | [`__ARM_FEATURE_FRINT`](#availability-of-armv8.5-a-floating-point-rounding-intrinsics) | Floating-point rounding extension (Arm v8.5-A) | 1 |
 | [`__ARM_FEATURE_GCS`](#guarded-control-stack) | Guarded Control Stack | 1 |
 | [`__ARM_FEATURE_GCS_DEFAULT`](#guarded-control-stack) | Guarded Control Stack protection can be enabled | 1 |
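
As a usage illustration for the `__ARM_FEATURE_FPRCVT` macro documented in this diff, the sketch below shows how user code might test for the feature at compile time. The function names `have_fprcvt` and `f64_to_s32_toward_zero` are hypothetical. The conversion is written as a plain C cast rather than a call to one of the new intrinsics, and whether a given compiler then selects the `FCVTZS Sd,Dn` form on an FPRCVT-capable target is an assumption, not something this commit guarantees.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 when the compiler advertises
   FEAT_FPRCVT via the feature test macro, 0 otherwise. */
int have_fprcvt(void) {
#if defined(__ARM_FEATURE_FPRCVT)
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: Double Precision to signed 32-bit, rounding
   toward zero. Assumption: x is finite and within int32_t range.
   With FEAT_FPRCVT a compiler may (not guaranteed) keep the result in
   a SIMD&FP register by using the Sd,Dn form of FCVTZS. */
int32_t f64_to_s32_toward_zero(double x) {
    return (int32_t)x;
}
```

Code that needs a different path on targets without the feature can branch on `have_fprcvt()` or, more usually, place the alternative code directly under the `#if`.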
