From f080ec795d8400de33461217a6222da2130173ab Mon Sep 17 00:00:00 2001
From: Amilendra Kodithuwakku
Date: Wed, 20 Aug 2025 15:42:46 +0100
Subject: [PATCH 1/2] Add intrinsics for the new FP conversions introduced by the 2024 dpISA

FEAT_FPRCVT adds 4 new variants for each FCVTAS, FCVTAU, FCVTMS,
FCVTMU, FCVTNS, FCVTNU, FCVTPS, FCVTPU, FCVTZS, and FCVTZU instruction.
1) Half Precision to 32-bit
2) Half Precision to 64-bit
3) Single Precision to 64-bit
4) Double Precision to 32-bit

For the Single Precision to 64-bit and Double Precision to 32-bit
variants, this patch adds two new intrinsics, that reduce to
- Single Precision to 64-bit : Dd,Sn
- Double Precision to 32-bit : Sd,Dn

The intrinsics for conversions from Half Precision are already defined.
However they are documented as reducing to the incorrect instruction
format; Hd,Hn, so this patch fixes them to be
- Half Precision to 32-bit : Sd,Hn
- Half Precision to 64-bit : Dd,Hn
---
 main/acle.md                                  |   9 ++
 neon_intrinsics/advsimd.md                    | 131 +++++++++++-------
 neon_intrinsics/advsimd.template.md           |  11 +-
 tools/intrinsic_db/advsimd.csv                |  62 ++++++---
 tools/intrinsic_db/advsimd_classification.csv |  20 +++
 5 files changed, 158 insertions(+), 75 deletions(-)

diff --git a/main/acle.md b/main/acle.md
index 3b066e93..5cdf4aaa 100644
--- a/main/acle.md
+++ b/main/acle.md
@@ -465,6 +465,7 @@ Armv8.4-A [[ARMARMv84]](#ARMARMv84). Support is added for the Dot Product intrin

 * Added feature test macro for FEAT_SSVE_FEXPA.
 * Added feature test macro for FEAT_CSSC.
+* Added support for FEAT_FPRCVT intrinsics and `__ARM_FEATURE_FPRCVT`.

 ### References

@@ -2207,6 +2208,13 @@ ACLE intrinsics are available. This implies that `__ARM_FEATURE_SM4` and
 floating-point absolute minimum and maximum instructions (FEAT_FAMINMAX)
 and if the associated ACLE intrinsics are available.

+### FPRCVT extension
+
+`__ARM_FEATURE_FPRCVT` is defined to `1` if there is hardware
+support for floating-point to/from integer conversion instructions
+with only scalar SIMD&FP register operands and results, and with
+different input and output register sizes.
+
 ### Lookup table extensions

 `__ARM_FEATURE_LUT` is defined to 1 if there is hardware support for
@@ -2590,6 +2598,7 @@ be found in [[BA]](#BA).
 | [`__ARM_FEATURE_FP8DOT2`](#modal-8-bit-floating-point-extensions) | Modal 8-bit floating-point extensions | 1 |
 | [`__ARM_FEATURE_FP8DOT4`](#modal-8-bit-floating-point-extensions) | Modal 8-bit floating-point extensions | 1 |
 | [`__ARM_FEATURE_FP8FMA`](#modal-8-bit-floating-point-extensions) | Modal 8-bit floating-point extensions | 1 |
+| [`__ARM_FEATURE_FPRCVT`](#fprcvt-extension) | FPRCVT extension | 1 |
 | [`__ARM_FEATURE_FRINT`](#availability-of-armv8.5-a-floating-point-rounding-intrinsics) | Floating-point rounding extension (Arm v8.5-A) | 1 |
 | [`__ARM_FEATURE_GCS`](#guarded-control-stack) | Guarded Control Stack | 1 |
 | [`__ARM_FEATURE_GCS_DEFAULT`](#guarded-control-stack) | Guarded Control Stack protection can be enabled | 1 |
diff --git a/neon_intrinsics/advsimd.md b/neon_intrinsics/advsimd.md
index a87ad725..be1ef9f6 100644
--- a/neon_intrinsics/advsimd.md
+++ b/neon_intrinsics/advsimd.md
@@ -12,7 +12,7 @@ toc: true
 ---