src/rationale.adoc (19 additions, 0 deletions)
@@ -43,3 +43,22 @@ Two conditional-zero instructions are included: one that writes zero if the
 comparand is zero, and one that does so if the comparand is nonzero.
 Variants that perform magnitude comparisons with zero were considered but
 ultimately excluded for insufficient quantitative justification.
+
+=== "Zvbc32e" Extension for Vector Carryless Multiplication for `SEW <= 32`
+
+
+<<Zvbc>> defines vector carryless multiplication instructions for SEW=64 only.
+It is therefore not suitable for implementations with a small ELEN (32), and it incurs inefficiencies for algorithms where at least one of the multiplication operands is limited to 32 bits or fewer.
+Such algorithms include the carryless-multiplication-based folding algorithm used to compute the widespread 32-bit CRCs (e.g., the Ethernet CRC).
+With `Zvbc`, such algorithms exploit only half of each 64-bit element multiplication.
+This is because CRC acceleration based on carryless multiplication often relies on a product term that is a polynomial reduced modulo the CRC polynomial.
+That reduction limits the size of the term to the output size of the CRC.
+
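The point about the product term can be illustrated with a minimal Python sketch. It models polynomials over GF(2) as plain integers and shows that a folding constant of the form `x^n mod P` always fits in 32 bits for a 32-bit CRC. The helper names (`clmul`, `poly_mod`, `fold_constant`, `fold`) are hypothetical and not part of any specification, and the sketch deliberately ignores the bit reflection, initial value, and final XOR that the real Ethernet CRC applies.

```python
# Hypothetical helper names; plain-integer model of GF(2)[x] arithmetic.
CRC32_POLY = 0x104C11DB7  # Ethernet CRC-32 generator, including the x^32 term


def clmul(a, b):
    """Carryless (XOR-accumulating) product of two non-negative integers."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def poly_mod(value, poly=CRC32_POLY):
    """Remainder of `value` divided by `poly` over GF(2)."""
    degree = poly.bit_length() - 1
    while value.bit_length() > degree:
        value ^= poly << (value.bit_length() - 1 - degree)
    return value


def fold_constant(shift):
    """x^shift mod CRC32_POLY; the result always fits in 32 bits."""
    return poly_mod(1 << shift)


def fold(hi, lo, shift):
    """Fold `hi`, which sits `shift` bits above `lo`, without changing the remainder."""
    return clmul(hi, fold_constant(shift)) ^ lo
```

Because `fold_constant` never exceeds 32 bits, one operand of every folding multiplication leaves the upper half of a 64-bit element unused, which is the inefficiency described above.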
+Zvbc32e defines the same vector carryless multiplication operations as Zvbc, but for smaller SEW values (32, 16, and 8 bits).
+It can be leveraged by implementations with any ELEN value >= 32.
+For implementations with a small ELEN (32), Zvbc32e provides ISA support for vector carryless multiplication, which Zvbc alone cannot provide.
+
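As a rough illustration of what "the same operations on smaller SEW values" means, the sketch below models the element-wise behaviour in Python. It assumes the two Zvbc operations are a low-half and a high-half carryless multiply (`vclmul` and `vclmulh`); the function names `vclmul_model` and `vclmulh_model` are hypothetical, and masking, `vl`, and tail handling are left out.

```python
def clmul(a, b):
    """Carryless (XOR-accumulating) product of two non-negative integers."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def vclmul_model(vs2, vs1, sew):
    """Low `sew` bits of each element-wise carryless product (hypothetical model)."""
    mask = (1 << sew) - 1
    return [clmul(a & mask, b & mask) & mask for a, b in zip(vs2, vs1)]


def vclmulh_model(vs2, vs1, sew):
    """High `sew` bits of each 2*sew-bit carryless product (hypothetical model)."""
    mask = (1 << sew) - 1
    return [(clmul(a & mask, b & mask) >> sew) & mask for a, b in zip(vs2, vs1)]
```

Calling the same two functions with `sew=8`, `sew=16`, or `sew=32` only changes the element width; the operation itself is unchanged.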
+Zvbc32e is also useful for implementations with ELEN >= 64, as it allows more efficient implementations of algorithms relying on carryless multiplications of 32 bits or fewer.
+Implementing only `Zvbc32e` (without `Zvbc`) allows implementations to save area while providing identical performance on those algorithms.
+
+For all implementations, `Zvbc32e` allows better implementations (fewer instructions and more targeted use of hardware resources) of algorithms relying on 8-bit and 16-bit carryless multiplications (e.g. erasure coding).
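To make the erasure-coding case concrete, here is a hedged Python sketch of a GF(2^8) multiplication built from an 8-bit carryless product followed by a reduction. The reduction polynomial `0x11D` is an assumption (it is a common choice in Reed-Solomon style codes, but the text above does not name one), and `clmul8` and `gf256_mul` are hypothetical names.

```python
GF256_POLY = 0x11D  # assumed reduction polynomial; not specified by the source text


def clmul8(a, b):
    """Carryless product of two 8-bit values; the result has at most 15 bits."""
    result = 0
    for bit in range(8):
        if (b >> bit) & 1:
            result ^= a << bit
    return result


def gf256_mul(a, b, poly=GF256_POLY):
    """Multiply two GF(2^8) elements: 8-bit carryless product, then reduce by `poly`."""
    product = clmul8(a & 0xFF, b & 0xFF)
    # Clear bits 14 down to 8 by XOR-ing in shifted copies of the polynomial.
    for bit in range(14, 7, -1):
        if (product >> bit) & 1:
            product ^= poly << (bit - 8)
    return product
```

The first step, `clmul8`, is the part that an SEW=8 carryless multiply maps onto directly, one byte per element.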