@@ -470,6 +470,10 @@ Armv8.4-A [[ARMARMv84]](#ARMARMv84). Support is added for the Dot Product intrin
470470* Added support for 16-bit floating point matrix multiply-accumulate widening intrinsics.
471471* Added support for Brain 16-bit floating-point vector multiplication intrinsics.
472472* Added support for FEAT_SVE_AES2, FEAT_SSVE_AES intrinsics.
473+ * Added [**Alpha**](#current-status-and-anticipated-changes)
474+ support for SVE2.2 (FEAT_SVE2p2).
475+ * Added [**Alpha**](#current-status-and-anticipated-changes)
476+ support for SME2.2 (FEAT_SME2p2).
473477
474478### References
475479
@@ -1983,6 +1987,10 @@ are available. This implies that `__ARM_FEATURE_SVE` is nonzero.
19831987 are available and if the associated [ACLE features]
19841988(#sme-language-extensions-and-intrinsics) are supported.
19851989
1990+ `__ARM_FEATURE_SVE2p2` is defined to 1 if the FEAT_SVE2p2 instructions
1991+ are available and if the associated [ACLE features]
1992+ (#sme-language-extensions-and-intrinsics) are supported.
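
As an illustration of how user code might test this macro at compile time, here is a minimal sketch. The helper name `have_sve2p2` is hypothetical and is not part of ACLE; only the macro name comes from the text above.

```c
/* Hypothetical helper (not defined by ACLE): reports whether the
   translation unit was compiled with __ARM_FEATURE_SVE2p2 defined
   to a nonzero value. */
static int have_sve2p2(void) {
#if defined(__ARM_FEATURE_SVE2p2) && __ARM_FEATURE_SVE2p2
    return 1;   /* SVE2.2 instructions and ACLE features are available */
#else
    return 0;   /* fall back to a path that does not need SVE2.2 */
#endif
}
```

The check is resolved entirely by the preprocessor, so the same source compiles for targets with and without FEAT_SVE2p2.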
1993+
19861994#### NEON-SVE Bridge macro
19871995
19881996`__ARM_NEON_SVE_BRIDGE` is defined to 1 if the [`<arm_neon_sve_bridge.h>`](#arm_neon_sve_bridge.h)
@@ -2005,6 +2013,7 @@ of SME has an associated preprocessor macro, given in the table below:
20052013| FEAT_SME | __ARM_FEATURE_SME |
20062014| FEAT_SME2 | __ARM_FEATURE_SME2 |
20072015| FEAT_SME2p1 | __ARM_FEATURE_SME2p1 |
2016+ | FEAT_SME2p2 | __ARM_FEATURE_SME2p2 |
20082017
20092018Each macro is defined if there is hardware support for the associated
20102019architecture feature and if all of the [ACLE
@@ -2707,6 +2716,7 @@ be found in [[BA]](#BA).
27072716| [`__ARM_FEATURE_SVE2_SM3`](#sm3-extension) | SVE2 support for the SM3 cryptographic extension (FEAT_SVE_SM3) | 1 |
27082717| [`__ARM_FEATURE_SVE2_SM4`](#sm4-extension) | SVE2 support for the SM4 cryptographic extension (FEAT_SVE_SM4) | 1 |
27092718| [`__ARM_FEATURE_SVE2p1`](#sve2) | SVE version 2.1 (FEAT_SVE2p1) | 1 |
2719+ | [`__ARM_FEATURE_SVE2p2`](#sve2) | SVE version 2.2 (FEAT_SVE2p2) | 1 |
27102720| [`__ARM_FEATURE_SYSREG128`](#bit-system-registers) | Support for 128-bit system registers (FEAT_SYSREG128) | 1 |
27112721| [`__ARM_FEATURE_UNALIGNED`](#unaligned-access-supported-in-hardware) | Hardware support for unaligned access | 1 |
27122722| [`__ARM_FP`](#hardware-floating-point) | Hardware floating-point | 1 |
@@ -13007,6 +13017,33 @@ Zero ZA vector groups
1300713017 __arm_streaming __arm_inout("za");
1300813018```
1300913019
13020+ ### SME2.2 instruction intrinsics
13021+
13022+ The intrinsics in this section are defined by the header file
13023+ [`<arm_sme.h>`](#arm_sme.h) when `__ARM_FEATURE_SME2p2` is defined.
13024+
13025+ #### FMUL
13026+
13027+ Multi-vector floating-point multiply
13028+
13029+ ``` c
13030+ // Variants are also available for:
13031+ // [_single_f32_x2]
13032+ // [_single_f64_x2]
13033+ // [_single_f16_x4]
13034+ // [_single_f32_x4]
13035+ // [_single_f64_x4]
13036+ svfloat16x2_t svmul[_single_f16_x2](svfloat16x2_t zd, svfloat16_t zm) __arm_streaming;
13037+
13038+ // Variants are also available for:
13039+ // [_f32_x2]
13040+ // [_f64_x2]
13041+ // [_f16_x4]
13042+ // [_f32_x4]
13043+ // [_f64_x4]
13044+ svfloat16x2_t svmul[_f16_x2](svfloat16x2_t zd, svfloat16x2_t zm) __arm_streaming;
13045+ ```
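
The difference between the `_single` form and the multi-vector form can be sketched with a portable scalar model. This is not the intrinsic itself: the function names, the flat `float` array layout, and the explicit `lanes` parameter (standing in for the implementation-defined vector length) are assumptions made for illustration, and the element-wise behaviour reflects a plain reading of "multi-vector floating-point multiply".

```c
#include <stddef.h>

/* Scalar model of the "single" form (illustrative only): every vector in
   the group zd is multiplied lane by lane by the one vector zm.
   nvec is 2 or 4, matching the _x2 and _x4 forms. */
static void model_mul_single(float *out, const float *zd, const float *zm,
                             size_t nvec, size_t lanes) {
    for (size_t v = 0; v < nvec; ++v)
        for (size_t i = 0; i < lanes; ++i)
            out[v * lanes + i] = zd[v * lanes + i] * zm[i];
}

/* Scalar model of the multi-vector form (illustrative only): vector v of
   zd is multiplied lane by lane by vector v of zm. */
static void model_mul_multi(float *out, const float *zd, const float *zm,
                            size_t nvec, size_t lanes) {
    for (size_t i = 0; i < nvec * lanes; ++i)
        out[i] = zd[i] * zm[i];
}
```

In the model, the only difference between the two forms is whether `zm` is indexed per group member or shared across the whole group.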
13046+
1301013047### Streaming-compatible versions of standard routines
1301113048
1301213049ACLE provides the following streaming-compatible functions,
@@ -13556,6 +13593,56 @@ While (resulting in predicate tuple)
1355613593 svboolx2_t svwhilelt_b8[_s64]_x2(int64_t rn, int64_t rm);
1355713594```
1355813595
13596+ ### SVE2.2 and SME2.2 instruction intrinsics
13597+
13598+ The intrinsics in this section are defined by the header file
13599+ [`<arm_sve.h>`](#arm_sve.h) when `__ARM_FEATURE_SVE2p2` is defined, or by
13600+ [`<arm_sme.h>`](#arm_sme.h) when `__ARM_FEATURE_SME2p2` is defined.
13601+
13602+ #### COMPACT, EXPAND
13603+
13604+ Copy active vector elements to/from lower-numbered elements.
13605+
13606+ These intrinsics can be called from streaming code only if the
13607+ `__ARM_FEATURE_SME2p2` feature macro is defined.
13608+
13609+ They can be called from non-streaming code if the `__ARM_FEATURE_SVE2p2` feature
13610+ macro is defined, or if both the `__ARM_FEATURE_SVE` and `__ARM_FEATURE_SME2p2`
13611+ feature macros are defined.
13612+
13613+ ``` c
13614+ // Variants are available for:
13615+ // _s8, _s16, _u16, _mf8, _bf16, _f16
13616+ svuint8_t svcompact[_u8](svbool_t pg, svuint8_t zn);
13617+
13618+ // Variants are available for:
13619+ // _s8, _s16, _u16, _s32, _u32, _s64, _u64
13620+ // _mf8, _bf16, _f16, _f32, _f64
13621+ svuint8_t svexpand[_u8](svbool_t pg, svuint8_t zn);
13622+
13623+ ```
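
The "to/from lower-numbered elements" wording can be made concrete with a portable scalar model. This is a sketch, not the intrinsics: the function names, the `bool` array standing in for `svbool_t`, and the explicit `lanes` parameter are assumptions for illustration, and the model assumes that result elements which receive no data are set to zero.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* COMPACT model (illustrative only): active elements of zn are packed, in
   order, into the lowest-numbered result elements. The remaining result
   elements are assumed to be zero. */
static void model_compact(uint8_t *out, const bool *pg,
                          const uint8_t *zn, size_t lanes) {
    size_t next = 0;
    for (size_t i = 0; i < lanes; ++i)
        if (pg[i])
            out[next++] = zn[i];
    while (next < lanes)
        out[next++] = 0;
}

/* EXPAND model (illustrative only): the lowest-numbered elements of zn are
   distributed, in order, to the active result elements. Inactive result
   elements are assumed to be zero. */
static void model_expand(uint8_t *out, const bool *pg,
                         const uint8_t *zn, size_t lanes) {
    size_t next = 0;
    for (size_t i = 0; i < lanes; ++i)
        out[i] = pg[i] ? zn[next++] : 0;
}
```

Under these assumptions, expanding a compacted vector with the same predicate restores the original active elements and zeroes the inactive ones.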
13624+
13625+ #### FIRSTP, LASTP
13626+
13627+ Scalar index of first/last true predicate element (predicated).
13628+
13629+ These intrinsics can be called from streaming code if either of the feature
13630+ macros `__ARM_FEATURE_SVE` or `__ARM_FEATURE_SME` is defined.
13631+
13632+ They can be called from non-streaming code only if the `__ARM_FEATURE_SVE`
13633+ feature macro is defined.
13634+
13635+ ``` c
13636+ // Variants are available for:
13637+ // _b16, _b32, _b64
13638+ int64_t svfirstp_b8(svbool_t pg, svbool_t op);
13639+
13640+ // Variants are available for:
13641+ // _b16, _b32, _b64
13642+ int64_t svlastp_b8(svbool_t pg, svbool_t op);
13643+
13644+ ```
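
A portable scalar model may help clarify what "predicated" means here: an element counts only if it is both active in `pg` and true in `op`. This is a sketch, not the intrinsics. The function names, the `bool` arrays standing in for `svbool_t`, and the `lanes` parameter are assumptions for illustration, and the result of -1 when no element qualifies is also an assumption, since the text above does not state it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* FIRSTP model (illustrative only): index of the lowest-numbered element
   that is active in pg and true in op. Returns -1 if there is none
   (assumed behaviour, not confirmed by the text above). */
static int64_t model_firstp(const bool *pg, const bool *op, size_t lanes) {
    for (size_t i = 0; i < lanes; ++i)
        if (pg[i] && op[i])
            return (int64_t)i;
    return -1;
}

/* LASTP model (illustrative only): index of the highest-numbered element
   that is active in pg and true in op. Returns -1 if there is none
   (assumed behaviour, not confirmed by the text above). */
static int64_t model_lastp(const bool *pg, const bool *op, size_t lanes) {
    for (size_t i = lanes; i-- > 0;)
        if (pg[i] && op[i])
            return (int64_t)i;
    return -1;
}
```

In the model, `lanes` corresponds to the number of predicate elements selected by the `_b8`, `_b16`, `_b32` or `_b64` suffix.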
13645+
1355913646
1356013647### SME2 maximum and minimum absolute value
1356113648