Adding rationale for Zvkgs

nibrunie · nibrunie · commit 6d2a88200790 · 2025-11-09T11:00:55.000-08:00
diff --git a/src/rationale.adoc b/src/rationale.adoc
@@ -61,4 +61,15 @@ For implementations with small ELEN (32), supporting Zvbc32e brings ISA support
 Zvbc32e is also useful for implementations with ELEN >= 64, as it allows more efficient implementations of algorithms relying on 32-bit (or less) carryless multiplications.
 Selecting only `Zvbc32e` allows implementations to save area while providing identical performance on those algorithms.
 
-For all implementations, `Zvbc32e` allows better implementations (less instructions and more targeted use of hardware resources) of algorithms relying on 8-bit and 16-bit carryless multiplications (e.g. erasure coding).
+For all implementations, `Zvbc32e` allows better implementations (less instructions and more targeted use of hardware resources) of algorithms relying on 8-bit and 16-bit carryless multiplications (e.g. erasure coding).
+
+
+=== "Zvkgs" Extension for Vector-Scalar GCM/GHASH
+
+One of the key use cases for the vector instructions `vghsh.vv` and `vgmul.vv` defined in <<Zvkg>> is to speed up Galois/Counter Mode (GCM) for a single encryption/decryption stream by computing the GHASH algorithm for multiple blocks of the same message in parallel (using the same symmetric key).
+The parallel processing accumulates multiple blocks of the message and multiplies each accumulator by the same power of `H` (`H` is the encryption of the all-zero block by the cipher key).
+The exponent of that power is equal to the number of blocks processed in parallel.
+The processing completes by reducing the parallel accumulators into a single output tag.
+With `Zvkg` only, a full vector register group is required to hold the multiple copies of the power of `H`.
+`Zvkgs` reduces the size of the vector register group needed for the power of `H`: it only needs to contain a single 128-bit element group, which frees some vector registers (the exact number of freed registers depends on VLEN and LMUL).
+This exploits the same scalar element group broadcast mechanism used in other instructions defined in the vector crypto extensions (e.g. `vaesem.vs` from <<Zvkned>>).