Commit b421a82

Add support for COMPACT/EXPAND and FIRSTP/LASTP intrinsics

All these instructions are available under features FEAT_SVE2p2 or FEAT_SME2p2.

* COMPACT: Copy Active vector elements to lower-numbered elements (Byte/Halfword variants)
* EXPAND: Copy lower-numbered vector elements to Active elements (Byte/Halfword/Word/Doubleword variants)
* FIRSTP: Scalar index of first true predicate element (predicated) (Byte/Halfword/Word/Doubleword variants)
* LASTP: Scalar index of last true predicate element (predicated) (Byte/Halfword/Word/Doubleword variants)
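
To illustrate the data movement that the COMPACT and EXPAND descriptions imply, here is a scalar sketch in plain C. It is a reference model for reading purposes only, not the ACLE intrinsics and not code from this commit. The function names `compact_u8_model` and `expand_u8_model` are hypothetical, the predicate is modelled as one byte per element, and the zero-filling of unused (COMPACT) or inactive (EXPAND) destination elements is an assumption based on the architectural descriptions of these instructions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of COMPACT for byte elements.
 * pg[i] != 0 marks element i as active. Active elements of zn are packed
 * into the lowest-numbered elements of zd; the rest of zd is zeroed
 * (assumed behaviour). zd must not alias zn. */
void compact_u8_model(const uint8_t *pg, const uint8_t *zn,
                      uint8_t *zd, size_t n)
{
    size_t out = 0;
    for (size_t i = 0; i < n; ++i) {
        if (pg[i]) {
            zd[out++] = zn[i];
        }
    }
    while (out < n) {
        zd[out++] = 0;
    }
}

/* Hypothetical scalar model of EXPAND for byte elements.
 * The lowest-numbered elements of zn are copied, in order, to the active
 * elements of zd; inactive elements of zd are zeroed (assumed behaviour).
 * zd must not alias zn. */
void expand_u8_model(const uint8_t *pg, const uint8_t *zn,
                     uint8_t *zd, size_t n)
{
    size_t in = 0;
    for (size_t i = 0; i < n; ++i) {
        zd[i] = pg[i] ? zn[in++] : 0;
    }
}
```

Under these assumptions the two operations are inverses on the active lanes: expanding a compacted vector with the same predicate restores the original active elements in place.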
1 parent a23e878 commit b421a82
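
The FIRSTP and LASTP descriptions in the commit message amount to a governed search over a predicate. The sketch below is a scalar model in plain C under stated assumptions, not the ACLE intrinsics. The names `firstp_model` and `lastp_model` are hypothetical, predicates are modelled as one byte per element, and returning -1 when no active element is true is an assumption. The model returns a scalar index, following the wording "Scalar index of first/last true predicate element", whereas the prototypes in this diff are declared with an `svbool_t` return type.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of FIRSTP: index of the lowest-numbered element
 * that is both active in pg and true in op, or -1 if there is none
 * (the -1 result is an assumption). */
int64_t firstp_model(const uint8_t *pg, const uint8_t *op, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        if (pg[i] && op[i]) {
            return (int64_t)i;
        }
    }
    return -1;
}

/* Hypothetical scalar model of LASTP: index of the highest-numbered element
 * that is both active in pg and true in op, or -1 if there is none
 * (the -1 result is an assumption). */
int64_t lastp_model(const uint8_t *pg, const uint8_t *op, size_t n)
{
    for (size_t i = n; i-- > 0;) {
        if (pg[i] && op[i]) {
            return (int64_t)i;
        }
    }
    return -1;
}
```

The element count `n` stands in for the number of predicate elements at the chosen element size, which on real hardware depends on the vector length.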

main/acle.md

Lines changed: 67 additions & 0 deletions
@@ -470,6 +470,10 @@ Armv8.4-A [[ARMARMv84]](#ARMARMv84). Support is added for the Dot Product intrin
 * Added support for 16-bit floating point matrix multiply-add widening intrinsics.
 * Added support for Brain 16-bit floating-point vector multiplication intrinsics.
 * Added support for FEAT_SVE_AES2, FEAT_SSVE_AES intrinsics.
+* Added [**Beta**](#current-status-and-anticipated-changes)
+support for SVE2.2 (FEAT_SVE2p2)
+* Added [**Beta**](#current-status-and-anticipated-changes)
+support for SME2.2 (FEAT_SME2p2).

 ### References

@@ -1983,6 +1987,10 @@ are available. This implies that `__ARM_FEATURE_SVE` is nonzero.
 are available and if the associated [ACLE features]
 (#sme-language-extensions-and-intrinsics) are supported.

+`__ARM_FEATURE_SVE2p2` is defined to 1 if the FEAT_SVE2p2 instructions
+are available and if the associated [ACLE features]
+(#sme-language-extensions-and-intrinsics) are supported.
+
 #### NEON-SVE Bridge macro

 `__ARM_NEON_SVE_BRIDGE` is defined to 1 if the [`<arm_neon_sve_bridge.h>`](#arm_neon_sve_bridge.h)
@@ -2005,6 +2013,7 @@ of SME has an associated preprocessor macro, given in the table below:
 | FEAT_SME | __ARM_FEATURE_SME |
 | FEAT_SME2 | __ARM_FEATURE_SME2 |
 | FEAT_SME2p1 | __ARM_FEATURE_SME2p1 |
+| FEAT_SME2p2 | __ARM_FEATURE_SME2p2 |

 Each macro is defined if there is hardware support for the associated
 architecture feature and if all of the [ACLE
@@ -2710,6 +2719,7 @@ be found in [[BA]](#BA).
 | [`__ARM_FEATURE_SVE2_SM3`](#sm3-extension) | SVE2 support for the SM3 cryptographic extension (FEAT_SVE_SM3) | 1 |
 | [`__ARM_FEATURE_SVE2_SM4`](#sm4-extension) | SVE2 support for the SM4 cryptographic extension (FEAT_SVE_SM4) | 1 |
 | [`__ARM_FEATURE_SVE2p1`](#sve2) | SVE version 2.1 (FEAT_SVE2p1)
+| [`__ARM_FEATURE_SVE2p2`](#sve2) | SVE version 2.2 (FEAT_SVE2p2)
 | [`__ARM_FEATURE_SYSREG128`](#bit-system-registers) | Support for 128-bit system registers (FEAT_SYSREG128) | 1 |
 | [`__ARM_FEATURE_UNALIGNED`](#unaligned-access-supported-in-hardware) | Hardware support for unaligned access | 1 |
 | [`__ARM_FP`](#hardware-floating-point) | Hardware floating-point | 1 |
@@ -13549,6 +13559,63 @@ While (resulting in predicate tuple)
 svboolx2_t svwhilelt_b8[_s64]_x2(int64_t rn, int64_t rm);
 ```

+### SVE2.2 and SME2.2 instruction intrinsics
+
+The functions in this section are defined by either the header file
+[`<arm_sve.h>`](#arm_sve.h) or [`<arm_sme.h>`](#arm_sme.h)
+when `__ARM_FEATURE_SVE2p2` or `__ARM_FEATURE_SME2p2` is defined, respectively.
+
+These intrinsics can only be called from non-streaming code if
+`__ARM_FEATURE_SVE2p2` is defined. They can only be called from streaming code
+if the appropriate SME feature macro is defined (see previous paragraph).
+They can only be called from streaming-compatible code if they could be called
+from both non-streaming code and streaming code.
+
+#### COMPACT
+
+Copy active vector elements to lower-numbered elements.
+
+``` c
+// Variants are available for:
+// _s8, _s16, _u16, _mf8, _bf16, _f16
+svuint8_t svcompact[_u8](svbool_t pg, svuint8_t zn);
+
+```
+
+#### EXPAND
+
+Copy lower-numbered vector elements to Active elements.
+
+``` c
+// Variants are available for:
+// _s8, _s16, _u16, _s32, _u32, _s64, _u64
+// _mf8, _bf16, _f16, _f32, _f64
+svuint8_t svexpand[_u8](svbool_t pg, svuint8_t zn);
+
+```
+
+#### FIRSTP
+
+Scalar index of first true predicate element (predicated).
+
+``` c
+// Variants are available for:
+// _b16, _b32, _b64
+svbool_t svfirstp[_b8](svbool_t pg, svbool_t op);
+
+```
+
+#### LASTP
+
+Scalar index of last true predicate element (predicated)
+
+``` c
+// Variants are available for:
+// _b16, _b32, _b64
+svbool_t svlastp[_b8](svbool_t pg, svbool_t op);
+
+```
+

 ### SME2 maximum and minimum absolute value
