pittma commented on Sep 29, 2025

This PR adds a 4x multi-buffer implementation of Keccakf1600 for AVX-512-capable systems.
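To make "4x multi-buffer" concrete, here is a minimal pure-Python sketch of the general technique: four independent Keccak states are permuted in lockstep, with the four copies of each 64-bit lane packed side by side in one wide value, the way a vector register would hold them. This is only an illustration under stated assumptions and is not the PR's assembly. The names `splat`, `rol`, `keccak_f1600_x4`, and `sha3_256_x4` are hypothetical, and the 4 x 64-bit packing is an assumed layout. The round function follows the structure of FIPS 202, and the last two lines check the result against `hashlib`.

```python
import hashlib

LANES = 4                        # number of independent Keccak states (assumed layout)
W = 64                           # bits per Keccak lane
ONES = (1 << W) - 1
FULL = (1 << (W * LANES)) - 1    # every bit of one packed 4 x 64-bit "vector"

def splat(v):
    """Replicate one 64-bit value into all four slots of a packed vector."""
    return sum(v << (W * i) for i in range(LANES))

def rol(x, n):
    """Rotate each 64-bit slot of the packed vector x left by n bits."""
    n %= W
    if n == 0:
        return x
    hi = splat((ONES << n) & ONES)   # bits that stay in their slot after the shift
    lo = splat(ONES >> (W - n))      # bits that wrap around to the bottom of the slot
    return ((x << n) & hi) | ((x >> (W - n)) & lo)

def keccak_f1600_x4(A):
    """Permute four states at once; A[x + 5*y] packs lane (x, y) of all four states."""
    A = list(A)
    R = 1                            # LFSR state used to derive the round constants
    for _ in range(24):
        # theta
        C = [A[x] ^ A[x + 5] ^ A[x + 10] ^ A[x + 15] ^ A[x + 20] for x in range(5)]
        D = [C[(x + 4) % 5] ^ rol(C[(x + 1) % 5], 1) for x in range(5)]
        A = [A[i] ^ D[i % 5] for i in range(25)]
        # rho and pi
        x, y = 1, 0
        cur = A[1]
        for t in range(24):
            x, y = y, (2 * x + 3 * y) % 5
            cur, A[x + 5 * y] = A[x + 5 * y], rol(cur, (t + 1) * (t + 2) // 2)
        # chi
        for row in range(0, 25, 5):
            T = A[row:row + 5]
            for x in range(5):
                A[row + x] = T[x] ^ (~T[(x + 1) % 5] & FULL & T[(x + 2) % 5])
        # iota: the same round constant goes into all four states
        for j in range(7):
            R = ((R << 1) ^ ((R >> 7) * 0x71)) % 256
            if R & 2:
                A[0] ^= splat(1 << ((1 << j) - 1))
    return A

def sha3_256_x4(msgs):
    """SHA3-256 of four short messages (< 136 bytes each) with one batched permutation."""
    rate = 136
    assert len(msgs) == LANES
    A = [0] * 25
    for k, m in enumerate(msgs):
        assert len(m) < rate
        block = bytearray(m) + b"\x06" + bytes(rate - len(m) - 1)
        block[-1] |= 0x80            # final bit of the SHA-3 padding
        for i in range(rate // 8):
            A[i] |= int.from_bytes(block[8 * i:8 * i + 8], "little") << (W * k)
    A = keccak_f1600_x4(A)
    return [b"".join(((A[i] >> (W * k)) & ONES).to_bytes(8, "little") for i in range(4))
            for k in range(LANES)]

msgs = [b"", b"abc", b"x" * 100, b"keccak"]
assert sha3_256_x4(msgs) == [hashlib.sha3_256(m).digest() for m in msgs]
```

The point of the layout is that every XOR, AND, NOT, and rotate in the round function touches all four states at once, which is consistent with the x4 benchmarks gaining far more than the single-buffer ones.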

Results

Run on an Intel(R) Xeon(R) 6787P system with the command:

$ taskset -c 4 ./tool/bssl speed -filter ML-KEM-512,ML-KEM-768,ML-KEM-1024,SHA3-512,SHA3-384,SHA3-256,SHA3-224,SHAKE-128,SHAKE-256,SHAKE256-x4,MLDSA44,MLDSA65,MLDSA87

(EDIT: the numbers below were re-measured on a release build with TurboBoost off.)

SHA3-224

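| Input size (bytes) | keccak1600-avx512vl | main | delta |
|---:|---:|---:|---:|
| 16 | 34.8 | 21.6 | 61.11% |
| 256 | 295.9 | 254.5 | 16.27% |
| 1350 | 335.2 | 280.3 | 19.59% |
| 8192 | 362.4 | 301.7 | 20.12% |
| 16384 | 362.9 | 302.2 | 20.09% |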
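The delta column in these tables appears to be the relative change of the branch over `main`, that is, (branch - main) / main. A small sketch for re-deriving it from the two measured columns follows; the helper name `delta_percent` is hypothetical and not part of the PR or of `bssl`.

```python
def delta_percent(branch: float, main: float) -> float:
    """Relative change of `branch` over `main`, in percent, rounded to two decimals."""
    return round((branch - main) / main * 100, 2)

# First rows of the SHA3-224 and SHA3-256 results.
print(delta_percent(34.8, 21.6))
print(delta_percent(35.1, 12.7))
```

Running this against any row is a quick way to confirm that a delta matches its two measurements.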

SHA3-256

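| Input size (bytes) | keccak1600-avx512vl | main | delta |
|---:|---:|---:|---:|
| 16 | 35.1 | 12.7 | 176.38% |
| 256 | 297.1 | 168.6 | 76.22% |
| 1350 | 335.8 | 254.2 | 32.10% |
| 8192 | 339 | 278 | 21.94% |
| 16384 | 342.2 | 282.9 | 20.96% |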

SHA3-384

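| Input size (bytes) | keccak1600-avx512vl | main | delta |
|---:|---:|---:|---:|
| 16 | 35 | 21.6 | 62.04% |
| 256 | 205 | 173.1 | 18.43% |
| 1350 | 260.2 | 217.3 | 19.74% |
| 8192 | 262.7 | 219.2 | 19.84% |
| 16384 | 262.9 | 218.2 | 20.49% |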

SHA3-512

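| Input size (bytes) | keccak1600-avx512vl | main | delta |
|---:|---:|---:|---:|
| 16 | 34.5 | 21.6 | 59.72% |
| 256 | 155.9 | 131.6 | 18.47% |
| 1350 | 179.2 | 149.8 | 19.63% |
| 8192 | 182.6 | 152.9 | 19.42% |
| 16384 | 182.7 | 152.9 | 19.49% |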

SHAKE-128

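| Input size (bytes) | keccak1600-avx512vl | main | delta |
|---:|---:|---:|---:|
| 16 | 33.4 | 21 | 59.05% |
| 256 | 290.8 | 249.6 | 16.51% |
| 1350 | 369.6 | 308.7 | 19.73% |
| 8192 | 419.9 | 349.5 | 20.14% |
| 16384 | 421.4 | 350.4 | 20.26% |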

SHAKE-256

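| Input size (bytes) | keccak1600-avx512vl | main | delta |
|---:|---:|---:|---:|
| 16 | 33.5 | 21 | 59.52% |
| 256 | 289.3 | 250.7 | 15.40% |
| 1350 | 334 | 279.9 | 19.33% |
| 8192 | 338.6 | 282.5 | 19.86% |
| 16384 | 341.9 | 285.1 | 19.92% |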

SHAKE-256-x4

Absorb

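| Input size (bytes) | keccak1600-avx512vl | main | delta |
|---:|---:|---:|---:|
| 16 | 23.5 | 7.5 | 213.33% |
| 256 | 231.5 | 63.5 | 264.57% |
| 1350 | 291.9 | 70.8 | 312.29% |
| 8192 | 303.4 | 71.4 | 324.93% |
| 16384 | 304.7 | 72 | 323.19% |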

Squeeze

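| Input size (bytes) | keccak1600-avx512vl | main | delta |
|---:|---:|---:|---:|
| 16 | 23.6 | 7.5 | 214.67% |
| 256 | 232.3 | 63.2 | 267.56% |
| 1350 | 302.9 | 69.9 | 333.33% |
| 8192 | 317.8 | 70.5 | 350.78% |
| 16384 | 321.7 | 71.2 | 351.83% |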

ML-KEM

Key Generation

| Parameter set | keccak1600-avx512vl | main | delta |
|---:|---:|---:|---:|
| 512 | 100788.1 | 55436 | 81.81% |
| 768 | 66051.9 | 35364.6 | 86.77% |
| 1024 | 50816.2 | 23842 | 113.14% |

Encap

| Parameter set | keccak1600-avx512vl | main | delta |
|---:|---:|---:|---:|
| 512 | 85870.7 | 50311.3 | 70.68% |
| 768 | 59054.7 | 33292.2 | 77.38% |
| 1024 | 44367.8 | 22340.2 | 98.60% |

Decap

| Parameter set | keccak1600-avx512vl | main | delta |
|---:|---:|---:|---:|
| 512 | 66953.9 | 42214.6 | 58.60% |
| 768 | 45659.9 | 28065.4 | 62.69% |
| 1024 | 34363.3 | 19108.1 | 79.84% |

ML-DSA

Key Generation

| Parameter set | keccak1600-avx512vl | main | delta |
|---:|---:|---:|---:|
| 44 | 10747.3 | 9682.2 | 11.00% |
| 65 | 5724.3 | 5207.3 | 9.93% |
| 87 | 4090.9 | 3644.9 | 12.24% |

Signing

| Parameter set | keccak1600-avx512vl | main | delta |
|---:|---:|---:|---:|
| 44 | 2442.8 | 2363.3 | 3.36% |
| 65 | 1542.9 | 1469.8 | 4.97% |
| 87 | 1264.5 | 1194.6 | 5.85% |

Verify

| Parameter set | keccak1600-avx512vl | main | delta |
|---:|---:|---:|---:|
| 44 | 9809.8 | 8978.4 | 9.26% |
| 65 | 6248 | 5707.7 | 9.47% |
| 87 | 3907.3 | 3526.3 | 10.80% |

Putting in draft mode while I work out FIPS build issues.

pittma changed the title from "Keccak1600 avx512vl" to "AVX-512 4x multi-buffer implementation of Keccakf1600" on Sep 29, 2025
pittma force-pushed the keccak1600-avx512vl branch from 11f64d3 to 470dfa5 on October 1, 2025 17:05
pittma marked this pull request as ready for review on October 1, 2025 17:33
pittma requested a review from a team as a code owner on October 1, 2025 17:33