@hanno-becker
Contributor

@hanno-becker hanno-becker commented Oct 31, 2025

  • Change mlk_polyvec back to struct { mlk_poly vec[MLKEM_K]; }
  • Change mlk_polymat to struct { mlk_polyvec vec[MLKEM_K]; }
  • Update all function signatures to use pointer style
  • Fix all implementations to use struct member access
  • Update tests, benchmarks, and CBMC harnesses
  • Add consistent const annotations
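
As a rough illustration of the list above, the following sketch shows what struct-wrapped vector and matrix types with a pointer-style, const-annotated function might look like. The layout of `mlk_poly`, the values of `MLKEM_N` and `MLKEM_K`, and the helper `sketch_polyvec_add` are assumptions made for this example and are not taken from the PR.

```c
#include <stdint.h>

/* Assumed parameters for illustration only. */
#define MLKEM_N 256
#define MLKEM_K 3

/* Assumed layout of a single polynomial; not shown in this PR. */
typedef struct
{
  int16_t coeffs[MLKEM_N];
} mlk_poly;

/* Struct wrappers as described in the PR summary. */
typedef struct
{
  mlk_poly vec[MLKEM_K];
} mlk_polyvec;

typedef struct
{
  mlk_polyvec vec[MLKEM_K];
} mlk_polymat;

/* Hypothetical helper: r += b. Arguments are passed as pointers to the
 * structs, the read-only input is const, and elements are reached via
 * struct member access. */
static void sketch_polyvec_add(mlk_polyvec *r, const mlk_polyvec *b)
{
  unsigned i, j;
  for (i = 0; i < MLKEM_K; i++)
  {
    for (j = 0; j < MLKEM_N; j++)
    {
      r->vec[i].coeffs[j] =
          (int16_t)(r->vec[i].coeffs[j] + b->vec[i].coeffs[j]);
    }
  }
}
```

With struct types, the element count is part of the type, so a caller writes `sketch_polyvec_add(&a, &b)` and cannot accidentally pass an array of the wrong length.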

@hanno-becker hanno-becker force-pushed the structured branch 3 times, most recently from f300f02 to f18ce9f, November 2, 2025 05:36
@hanno-becker hanno-becker changed the title from "[WIP] Reintroduce struct definitions for mlk_poly{mat,vec}" to "Reintroduce struct definitions for mlk_poly{mat,vec}" Nov 2, 2025
@hanno-becker hanno-becker marked this pull request as ready for review November 2, 2025 05:36
@hanno-becker hanno-becker requested a review from a team as a code owner November 2, 2025 05:36
@hanno-becker hanno-becker force-pushed the structured branch 5 times, most recently from 32c1493 to 3dc4b2c, November 5, 2025 19:53
@hanno-becker
Contributor Author

The runtime of polyvec_add further degrades here to >6min. That should certainly be addressed before this PR can be merged.

@hanno-becker hanno-becker force-pushed the structured branch 2 times, most recently from e9256dd to 478f245, November 7, 2025 12:12
@hanno-becker hanno-becker added the benchmark label (this PR should be benchmarked in CI) and removed the needs-work label Nov 7, 2025
@oqs-bot oqs-bot left a comment

Mac Mini (M1, 2020) benchmarks

| Benchmark suite | Current: 500213c | Previous: 403daac | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 0 cycles | 0 cycles | 1 |
| ML-KEM-512 encaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-512 decaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-768 keypair | 0 cycles | 0 cycles | 1 |
| ML-KEM-768 encaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-768 decaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-1024 keypair | 0 cycles | 0 cycles | 1 |
| ML-KEM-1024 encaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-1024 decaps | 0 cycles | 0 cycles | 1 |

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

Intel Xeon 4th gen (c7i)

| Benchmark suite | Current: 500213c | Previous: 403daac | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 0 cycles | 0 cycles | 1 |
| ML-KEM-512 encaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-512 decaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-768 keypair | 0 cycles | 0 cycles | 1 |
| ML-KEM-768 encaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-768 decaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-1024 keypair | 0 cycles | 0 cycles | 1 |
| ML-KEM-1024 encaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-1024 decaps | 0 cycles | 0 cycles | 1 |

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

Intel Xeon 4th gen (c7i) (no-opt)

| Benchmark suite | Current: 500213c | Previous: 403daac | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 0 cycles | 0 cycles | 1 |
| ML-KEM-512 encaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-512 decaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-768 keypair | 0 cycles | 0 cycles | 1 |
| ML-KEM-768 encaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-768 decaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-1024 keypair | 0 cycles | 0 cycles | 1 |
| ML-KEM-1024 encaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-1024 decaps | 0 cycles | 0 cycles | 1 |

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'Intel Xeon 4th gen (c7i) (no-opt)'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.03.

| Benchmark suite | Current: 478f245 | Previous: 90fed62 | Ratio |
|---|---|---|---|
| ML-KEM-512 encaps | 36417 cycles | 35123 cycles | 1.04 |
| ML-KEM-1024 encaps | 85851 cycles | 83296 cycles | 1.03 |

This comment was automatically generated by workflow using github-action-benchmark.
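
The ML-KEM-1024 encaps row shows a ratio of 1.03 yet is still flagged against a threshold of 1.03, because the displayed ratio is rounded. A minimal sketch of the check, assuming the alert fires when current/previous is strictly greater than the threshold (the helper names are hypothetical and this is not the benchmark action's own code):

```c
/* Hypothetical helper: ratio of current to previous cycle counts. */
static double sketch_ratio(unsigned long current, unsigned long previous)
{
  return (double)current / (double)previous;
}

/* Hypothetical helper: returns 1 if current/previous exceeds the threshold.
 * Written as a multiplication so that previous == 0 does not divide by 0. */
static int sketch_exceeds_threshold(unsigned long current,
                                    unsigned long previous, double threshold)
{
  return (double)current > threshold * (double)previous;
}
```

For the two rows above, 36417 / 35123 is roughly 1.037 and 85851 / 83296 is roughly 1.031, so both exceed 1.03 before rounding to two decimals.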

@oqs-bot oqs-bot left a comment

Arm Cortex-A72 (Raspberry Pi 4) benchmarks

| Benchmark suite | Current: 500213c | Previous: 403daac | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 0 cycles | 0 cycles | 1 |
| ML-KEM-512 encaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-512 decaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-768 keypair | 0 cycles | 0 cycles | 1 |
| ML-KEM-768 encaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-768 decaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-1024 keypair | 0 cycles | 0 cycles | 1 |
| ML-KEM-1024 encaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-1024 decaps | 0 cycles | 0 cycles | 1 |

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

AMD EPYC 4th gen (c7a)

| Benchmark suite | Current: 500213c | Previous: 403daac | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 0 cycles | 0 cycles | 1 |
| ML-KEM-512 encaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-512 decaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-768 keypair | 0 cycles | 0 cycles | 1 |
| ML-KEM-768 encaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-768 decaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-1024 keypair | 0 cycles | 0 cycles | 1 |
| ML-KEM-1024 encaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-1024 decaps | 0 cycles | 0 cycles | 1 |

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

AMD EPYC 3rd gen (c6a)

| Benchmark suite | Current: 500213c | Previous: 403daac | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 0 cycles | 0 cycles | 1 |
| ML-KEM-512 encaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-512 decaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-768 keypair | 0 cycles | 0 cycles | 1 |
| ML-KEM-768 encaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-768 decaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-1024 keypair | 0 cycles | 0 cycles | 1 |
| ML-KEM-1024 encaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-1024 decaps | 0 cycles | 0 cycles | 1 |

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

Graviton2

| Benchmark suite | Current: 500213c | Previous: 403daac | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 0 cycles | 0 cycles | 1 |
| ML-KEM-512 encaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-512 decaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-768 keypair | 0 cycles | 0 cycles | 1 |
| ML-KEM-768 encaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-768 decaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-1024 keypair | 0 cycles | 0 cycles | 1 |
| ML-KEM-1024 encaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-1024 decaps | 0 cycles | 0 cycles | 1 |

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

Graviton4

| Benchmark suite | Current: 500213c | Previous: 403daac | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 0 cycles | 0 cycles | 1 |
| ML-KEM-512 encaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-512 decaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-768 keypair | 0 cycles | 0 cycles | 1 |
| ML-KEM-768 encaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-768 decaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-1024 keypair | 0 cycles | 0 cycles | 1 |
| ML-KEM-1024 encaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-1024 decaps | 0 cycles | 0 cycles | 1 |

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

AMD EPYC 4th gen (c7a) (no-opt)

| Benchmark suite | Current: 500213c | Previous: 403daac | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 0 cycles | 0 cycles | 1 |
| ML-KEM-512 encaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-512 decaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-768 keypair | 0 cycles | 0 cycles | 1 |
| ML-KEM-768 encaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-768 decaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-1024 keypair | 0 cycles | 0 cycles | 1 |
| ML-KEM-1024 encaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-1024 decaps | 0 cycles | 0 cycles | 1 |

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

AMD EPYC 3rd gen (c6a) (no-opt)

| Benchmark suite | Current: 500213c | Previous: 403daac | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 0 cycles | 0 cycles | 1 |
| ML-KEM-512 encaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-512 decaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-768 keypair | 0 cycles | 0 cycles | 1 |
| ML-KEM-768 encaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-768 decaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-1024 keypair | 0 cycles | 0 cycles | 1 |
| ML-KEM-1024 encaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-1024 decaps | 0 cycles | 0 cycles | 1 |

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'AMD EPYC 3rd gen (c6a) (no-opt)'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.03.

| Benchmark suite | Current: 478f245 | Previous: 90fed62 | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 39857 cycles | 38475 cycles | 1.04 |

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

Intel Xeon 3rd gen (c6i)

| Benchmark suite | Current: 500213c | Previous: 403daac | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 0 cycles | 0 cycles | 1 |
| ML-KEM-512 encaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-512 decaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-768 keypair | 0 cycles | 0 cycles | 1 |
| ML-KEM-768 encaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-768 decaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-1024 keypair | 0 cycles | 0 cycles | 1 |
| ML-KEM-1024 encaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-1024 decaps | 0 cycles | 0 cycles | 1 |

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'Intel Xeon 3rd gen (c6i)'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.03.

| Benchmark suite | Current: 478f245 | Previous: 90fed62 | Ratio |
|---|---|---|---|
| ML-KEM-768 decaps | 41443 cycles | 39363 cycles | 1.05 |

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

Graviton4 (no-opt)

| Benchmark suite | Current: 500213c | Previous: 403daac | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 0 cycles | 0 cycles | 1 |
| ML-KEM-512 encaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-512 decaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-768 keypair | 0 cycles | 0 cycles | 1 |
| ML-KEM-768 encaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-768 decaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-1024 keypair | 0 cycles | 0 cycles | 1 |
| ML-KEM-1024 encaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-1024 decaps | 0 cycles | 0 cycles | 1 |

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

Graviton3

| Benchmark suite | Current: 500213c | Previous: 403daac | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 0 cycles | 0 cycles | 1 |
| ML-KEM-512 encaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-512 decaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-768 keypair | 0 cycles | 0 cycles | 1 |
| ML-KEM-768 encaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-768 decaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-1024 keypair | 0 cycles | 0 cycles | 1 |
| ML-KEM-1024 encaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-1024 decaps | 0 cycles | 0 cycles | 1 |

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

Graviton2 (no-opt)

| Benchmark suite | Current: 500213c | Previous: 403daac | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 0 cycles | 0 cycles | 1 |
| ML-KEM-512 encaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-512 decaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-768 keypair | 0 cycles | 0 cycles | 1 |
| ML-KEM-768 encaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-768 decaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-1024 keypair | 0 cycles | 0 cycles | 1 |
| ML-KEM-1024 encaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-1024 decaps | 0 cycles | 0 cycles | 1 |

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

Intel Xeon 3rd gen (c6i) (no-opt)

| Benchmark suite | Current: 500213c | Previous: 403daac | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 0 cycles | 0 cycles | 1 |
| ML-KEM-512 encaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-512 decaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-768 keypair | 0 cycles | 0 cycles | 1 |
| ML-KEM-768 encaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-768 decaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-1024 keypair | 0 cycles | 0 cycles | 1 |
| ML-KEM-1024 encaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-1024 decaps | 0 cycles | 0 cycles | 1 |

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

Graviton3 (no-opt)

| Benchmark suite | Current: 500213c | Previous: 403daac | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 0 cycles | 0 cycles | 1 |
| ML-KEM-512 encaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-512 decaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-768 keypair | 0 cycles | 0 cycles | 1 |
| ML-KEM-768 encaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-768 decaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-1024 keypair | 0 cycles | 0 cycles | 1 |
| ML-KEM-1024 encaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-1024 decaps | 0 cycles | 0 cycles | 1 |

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

Arm Cortex-A55 (Snapdragon 888) benchmarks

| Benchmark suite | Current: 500213c | Previous: 403daac | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 0 cycles | 0 cycles | 1 |
| ML-KEM-512 encaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-512 decaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-768 keypair | 0 cycles | 0 cycles | 1 |
| ML-KEM-768 encaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-768 decaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-1024 keypair | 0 cycles | 0 cycles | 1 |
| ML-KEM-1024 encaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-1024 decaps | 0 cycles | 0 cycles | 1 |

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

Arm Cortex-A76 (Raspberry Pi 5) benchmarks

| Benchmark suite | Current: 500213c | Previous: 403daac | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 0 cycles | 0 cycles | 1 |
| ML-KEM-512 encaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-512 decaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-768 keypair | 0 cycles | 0 cycles | 1 |
| ML-KEM-768 encaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-768 decaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-1024 keypair | 0 cycles | 0 cycles | 1 |
| ML-KEM-1024 encaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-1024 decaps | 0 cycles | 0 cycles | 1 |

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot oqs-bot left a comment

SpacemiT K1 8 (Banana Pi F3) benchmarks

| Benchmark suite | Current: 500213c | Previous: 403daac | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 0 cycles | 0 cycles | 1 |
| ML-KEM-512 encaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-512 decaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-768 keypair | 0 cycles | 0 cycles | 1 |
| ML-KEM-768 encaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-768 decaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-1024 keypair | 0 cycles | 0 cycles | 1 |
| ML-KEM-1024 encaps | 0 cycles | 0 cycles | 1 |
| ML-KEM-1024 decaps | 0 cycles | 0 cycles | 1 |

This comment was automatically generated by workflow using github-action-benchmark.

- Change mlk_polyvec back to struct `{ mlk_poly vec[MLKEM_K]; }`
- Change mlk_polymat to struct `{ mlk_polyvec vec[MLKEM_K]; }`
- Update all function signatures to use pointer style
- Fix all implementations to use struct member access
- Update tests, benchmarks, and CBMC harnesses
- Add consistent const annotations

Somewhat surprisingly and unsatisfyingly, I could not salvage
the CBMC proof for the 'monolithic' polymat_permute_bitrev_to_custom_native
and had to split it into two functions. It would be good to resolve
this, as the split adds a lot of code overhead for an entirely
trivial function.

Signed-off-by: Hanno Becker <[email protected]>
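
One plausible shape of the split mentioned in the commit message, shown only as a sketch: a matrix-level function with a single loop over rows, delegating to a vector-level function with a single loop over polynomials. All `sketch_` names, the sizes, and the stand-in permutation (a coefficient reversal) are assumptions for illustration; the actual permutation and the CBMC contracts are not shown in this conversation.

```c
#include <stdint.h>

#define SKETCH_N 256 /* assumed number of coefficients per polynomial */
#define SKETCH_K 3   /* assumed module dimension */

typedef struct { int16_t coeffs[SKETCH_N]; } sketch_poly;
typedef struct { sketch_poly vec[SKETCH_K]; } sketch_polyvec;
typedef struct { sketch_polyvec vec[SKETCH_K]; } sketch_polymat;

/* Stand-in per-polynomial permutation: reverses the coefficient order.
 * This is NOT the permutation used by the real function. */
static void sketch_poly_permute(sketch_poly *p)
{
  unsigned i;
  for (i = 0; i < SKETCH_N / 2; i++)
  {
    int16_t t = p->coeffs[i];
    p->coeffs[i] = p->coeffs[SKETCH_N - 1 - i];
    p->coeffs[SKETCH_N - 1 - i] = t;
  }
}

/* Inner function: one loop over the polynomials of a single row. */
static void sketch_polyvec_permute(sketch_polyvec *v)
{
  unsigned j;
  for (j = 0; j < SKETCH_K; j++)
  {
    sketch_poly_permute(&v->vec[j]);
  }
}

/* Outer function: one loop over the rows of the matrix. */
static void sketch_polymat_permute(sketch_polymat *a)
{
  unsigned i;
  for (i = 0; i < SKETCH_K; i++)
  {
    sketch_polyvec_permute(&a->vec[i]);
  }
}
```

Splitting this way gives each function a single loop, at the cost of an extra function and, in a verified code base, presumably an extra contract and proof harness, which is the overhead the commit message refers to.
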
@hanno-becker hanno-becker added and removed the benchmark label (this PR should be benchmarked in CI) Nov 29, 2025
Labels

benchmark (this PR should be benchmarked in CI), CBMC, enhancement (New feature or request)

3 participants