Commit dca98df

Add intrinsics for the FEAT_SVE_AES2 feature introduced by the 2024 dpISA
FEAT_SVE_AES2 adds
1) SVE multi-vector Advanced Encryption Standard (AES) instructions
   Instructions added: AESE, AESD, AESEMC and AESDIMC
   For each instruction there are two variants
   a) Two registers variant
   b) Four registers variant
2) SVE multi-vector 128-bit polynomial multiply long instructions
   Instructions added: PMULL and PMLAL

FEAT_SSVE_AES implements the same instructions but when in streaming mode.
1 parent 577ba57 commit dca98df

1 file changed, +36 -0 lines changed

main/acle.md

Lines changed: 36 additions & 0 deletions
@@ -469,6 +469,7 @@ Armv8.4-A [[ARMARMv84]](#ARMARMv84). Support is added for the Dot Product intrin
 * Added support for modal 8-bit floating point matrix multiply-accumulate widening intrinsics.
 * Added support for 16-bit floating point matrix multiply-accumulate widening intrinsics.
 * Added support for Brain 16-bit floating-point vector multiplication intrinsics.
+* Added support for FEAT_SVE_AES2, FEAT_SSVE_AES intrinsics.

 ### References

@@ -2162,6 +2163,15 @@ support for the SVE2 AES (FEAT_SVE_AES) instructions and if the associated
 ACLE intrinsics are available. This implies that `__ARM_FEATURE_AES`
 and `__ARM_FEATURE_SVE2` are both nonzero.

+In addition, `__ARM_FEATURE_SVE2_AES2` is defined to `1` if there is hardware
+support for the SVE2 AES2 (FEAT_SVE_AES2) instructions and if the associated
+ACLE intrinsics are available. This implies that `__ARM_FEATURE_AES`
+and `__ARM_FEATURE_SVE2` are both nonzero.
+
+`__ARM_FEATURE_SSVE_AES2` is defined to 1 if there is hardware support for
+SVE2 AES2 (FEAT_SVE_AES2) instructions in Streaming SVE mode (FEAT_SSVE_AES)
+and if the associated ACLE intrinsics are available.
+
 #### SHA2 extension

 `__ARM_FEATURE_SHA2` is defined to 1 if the SHA1 & SHA2-256 Crypto
@@ -2689,6 +2699,8 @@ be found in [[BA]](#BA).
 | [`__ARM_FEATURE_SVE_VECTOR_OPERATORS`](#scalable-vector-extension-sve) | Level of support for C and C++ operators on SVE predicate types | 1 |
 | [`__ARM_FEATURE_SVE2`](#sve2) | SVE version 2 (FEAT_SVE2) | 1 |
 | [`__ARM_FEATURE_SVE2_AES`](#aes-extension) | SVE2 support for the AES cryptographic extension (FEAT_SVE_AES) | 1 |
+| [`__ARM_FEATURE_SVE2_AES2`](#aes-extension) | SVE2 support for the SVE multi-vector AES cryptographic extension (FEAT_SVE_AES2) | 1 |
+| [`__ARM_FEATURE_SSVE_AES2`](#aes-extension) | SVE2 support for the SVE multi-vector AES cryptographic extension (FEAT_SSVE_AES) | 1 |
 | [`__ARM_FEATURE_SVE2_BITPERM`](#bit-permute-extension) | SVE2 bit permute extension | 1 |
 | [`__ARM_FEATURE_SSVE_BITPERM`](#bit-permute-extension) | SVE2 bit permute extension | 1 |
 | [`__ARM_FEATURE_SSVE_FEXPA`](#streaming-sve-fexpa-extension) | Streaming SVE FEXPA extension | 1 |
@@ -9477,6 +9489,30 @@ to work with `svboolx2_t` and `svboolx4_t`. For example:
 svboolx2_t svundef2_b();
 ```

+#### AESE, AESD, AESEMC, AESDIMC
+
+Multi-vector Advanced Encryption Standard instructions
+
+svuint8x2_t svaese[_u8_x2] (svuint8x2_t op1, svuint64_t op2, uint64_t index);
+svuint8x4_t svaese[_u8_x4] (svuint8x4_t op1, svuint64_t op2, uint64_t index);
+svuint8x2_t svaesd[_u8_x2] (svuint8x2_t op1, svuint64_t op2, uint64_t index);
+svuint8x4_t svaesd[_u8_x4] (svuint8x4_t op1, svuint64_t op2, uint64_t index);
+svuint8x2_t svaesemc[_u8_x2] (svuint8x2_t op1, svuint64_t op2, uint64_t index);
+svuint8x4_t svaesemc[_u8_x4] (svuint8x4_t op1, svuint64_t op2, uint64_t index);
+svuint8x2_t svaesdimc[_u8_x2] (svuint8x2_t op1, svuint64_t op2, uint64_t index);
+svuint8x4_t svaesdimc[_u8_x4] (svuint8x4_t op1, svuint64_t op2, uint64_t index);
+
+#### PMULL, PMLAL
+
+Multi-vector 128-bit polynomial multiply long instructions
+
+``` c
+// Variants are also available for:
+// _s64x2, _f64x2
+svuint64x2_t svpmull[_u64x2](svuint64_t zn, svuint64_t zm);
+svuint64x2_t svpmlal[_u64x2](svuint64_t zn, svuint64_t zm);
+```
+
 #### ADDQV, FADDQV

 Unsigned/FP add reduction of quadword vector segments.
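
The two feature macros added in this diff are meant to be tested by user code before it uses the new intrinsics. The following is a minimal sketch of such a check, not text from the commit. The helper names (`have_sve_aes2`, `have_ssve_aes2`, `sve_aes2_implications_hold`, `report_aes2_support`) are hypothetical. Only the macro names and the stated implication (`__ARM_FEATURE_SVE2_AES2` implies that `__ARM_FEATURE_AES` and `__ARM_FEATURE_SVE2` are nonzero) come from the diff. The code compiles on any target, because an undefined macro evaluates to 0 in `#if`.

```c
#include <stdio.h>

/* Sketch only: helper names are hypothetical and not part of ACLE. */

/* Returns 1 if the compiler defines __ARM_FEATURE_SVE2_AES2 to a nonzero
   value (non-streaming multi-vector AES intrinsics), else 0. */
int have_sve_aes2(void) {
#if defined(__ARM_FEATURE_SVE2_AES2) && __ARM_FEATURE_SVE2_AES2
    return 1;
#else
    return 0;
#endif
}

/* Returns 1 if the compiler defines __ARM_FEATURE_SSVE_AES2 to a nonzero
   value (same intrinsics in Streaming SVE mode), else 0. */
int have_ssve_aes2(void) {
#if defined(__ARM_FEATURE_SSVE_AES2) && __ARM_FEATURE_SSVE_AES2
    return 1;
#else
    return 0;
#endif
}

/* Checks the implication stated in the diff: if __ARM_FEATURE_SVE2_AES2 is
   set, __ARM_FEATURE_AES and __ARM_FEATURE_SVE2 must both be nonzero.
   Returns 1 when the implication holds (vacuously so when the macro is unset). */
int sve_aes2_implications_hold(void) {
#if defined(__ARM_FEATURE_SVE2_AES2) && __ARM_FEATURE_SVE2_AES2
#if defined(__ARM_FEATURE_AES) && __ARM_FEATURE_AES && \
    defined(__ARM_FEATURE_SVE2) && __ARM_FEATURE_SVE2
    return 1;
#else
    return 0;
#endif
#else
    return 1;
#endif
}

/* Prints a one-line summary of what the current compiler target advertises. */
void report_aes2_support(void) {
    printf("SVE2_AES2=%d SSVE_AES2=%d implications_ok=%d\n",
           have_sve_aes2(), have_ssve_aes2(), sve_aes2_implications_hold());
}
```

Calling `report_aes2_support()` from `main` prints which of the two macros the current target defines. Calls to the new intrinsics would go inside the matching `#if` branch, which keeps the rest of the translation unit portable.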