[Perf] Only calculate the hash of circuit commitments once for the VK #2964

Draft

ljedrz wants to merge 2 commits into ProvableHQ:staging from ljedrz:perf/vk_circuit_commitments_hash

Conversation

ljedrz (Collaborator) commented Oct 16, 2025

This addresses the 1st part of this comment on the linked issue.

CC #2871.

…cuitVerifyingKey

Signed-off-by: ljedrz <ljedrz@users.noreply.github.com>
ljedrz requested a review from vicsn October 16, 2025 14:13

let mut pks = Vec::with_capacity(circuit_batch_size);
let mut all_circuits = Vec::with_capacity(circuit_batch_size);
#[allow(clippy::mutable_key_type)]
ljedrz (Collaborator, Author) commented Oct 16, 2025

note: this is needed because CircuitVerifyingKey now has interior mutability; however, in this case it is perfectly fine, as the new member has no impact on the Ord impl (which only considers the id), and the fact that the key implements Ord is why the warning is raised; see the corresponding clippy lint (mutable_key_type).

/// Commitments to the indexed polynomials.
pub circuit_commitments: Vec<sonic_pc::Commitment<E>>,
pub id: CircuitId,
pub circuit_commitments_hash: OnceLock<E::Fq>,
Collaborator:

I think this is expensive enough that we should store it to disk - and it would be great if we could get rid of the OnceLock (which obfuscates when initialization happens, making performance analysis harder). O:)

You correctly observed that we don't have to transmit it over the wire though.

ljedrz (Collaborator, Author) commented Oct 16, 2025

The issue was that at the moment of creation, fs_params is not available, and at later stages the VKs are immutable - hence the OnceLock.

Storing to disk would probably work around this, but retrieving it would be expensive, perhaps to the point of offsetting any performance gains from caching it, unless the hashing is really computationally expensive.

Collaborator:

Storing to disk would probably work around this, but retrieving it would be expensive, perhaps to the point of offsetting any performance gains from caching it, unless the hashing is really computationally expensive.

To be clear, we would retrieve it from disk only when we retrieve the VK from disk. And the hashes are very expensive.

However, a big downside I do see is the sheer amount of work to adjust the database logic. So for the first version you can also compute it during construction.

The issue was that at the moment of creation, the fs_params is not available

Looks to me like it's always available in N::varuna_fs_parameters()? Or is there a scoping issue? Have fun with that :")

ljedrz (Collaborator, Author):

Looks to me it's always available in N::varuna_fs_parameters()?

That's correct; however, there is no notion of the Network - or even snarkvm-console - in the algorithms crate. Would it be acceptable to alter SNARK::circuit_setup to require FSParameters, like the other VarunaSNARK functions do?

Collaborator:

However, a big downside I do see is the sheer amount of work to adjust the database logic. So for the first version you can also compute it during construction.

For my future self reading this: we should just store to disk, but we can write out that logic when we have a definite timeline for landing this feature.

vicsn (Collaborator) commented Oct 16, 2025

Can you report the performance improvements for the existing snark_batch_verify benchmark with:

  • circuit_batch_size:1, instance_size:5
  • circuit_batch_size:1, instance_size:1000

fs_parameters: &FS::Parameters,
inputs_and_batch_sizes: &BTreeMap<CircuitId, (usize, &[Vec<E::Fr>])>,
circuit_commitments: impl Iterator<Item = &'a [crate::polycommit::sonic_pc::Commitment<E>]>,
circuit_commitments_hashes: Vec<E::Fq>,
Collaborator:

Can you add a third VarunaVersion and guard the changes behind it to preserve backwards compatibility? You can peek at where we use VarunaVersion::V2 for inspiration.

vicsn marked this pull request as draft October 16, 2025 14:40
ljedrz (Collaborator, Author) commented Oct 17, 2025

circuit_batch_size:1, instance_size:5

snark_batch_verify      time:   [4.6137 ms 4.6239 ms 4.6343 ms]
                        change: [−5.4575% −5.1305% −4.8178%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

circuit_batch_size:1, instance_size:1000

snark_batch_verify      time:   [149.98 ms 150.09 ms 150.22 ms]
                        change: [−0.1857% −0.0640% +0.0545%] (p = 0.32 > 0.05)
                        No change in performance detected.
Found 4 outliers among 100 measurements (4.00%)
  2 (2.00%) high mild
  2 (2.00%) high severe

vicsn (Collaborator) commented Oct 17, 2025

Thank you for the benchmarks, looks like I made a mistake...

This PR is only consequential when we have a very large amount of different circuits - but in the current system and even with dynamic dispatch we're never planning to have more than 32 circuits to batch - and the most common case is just 3 circuits.

Can you do one more benchmark with circuit_batch_size: 3, instance_size: 10, before and after this PR? If the difference is inconsequential we can close this PR and stop this line of work.

ljedrz (Collaborator, Author) commented Oct 17, 2025

sure:

snark_batch_verify      time:   [10.654 ms 10.669 ms 10.683 ms]
                        change: [−6.3521% −6.1469% −5.9447%] (p = 0.00 < 0.05)
                        Performance has improved.

vicsn (Collaborator) commented Oct 23, 2025

circuit_batch_size:1, instance_size:1000

Could you maybe run this comparison again? Perhaps even try instance_size 4000?

If it is indeed a negligible difference, could you reproduce the single-threaded flamegraph I made to revisit whether init_sponge hashing is really a bottleneck? staging...perf/verify_batch#diff-9ae02456754b316d993bd81a0e311bff5e432156b69d28b867d5c98027d0b79bR137

ljedrz (Collaborator, Author) commented Oct 23, 2025

Perhaps even try instance_size 4000?

snark_batch_verify      time:   [586.57 ms 586.99 ms 587.43 ms]
                        change: [+0.2159% +0.3150% +0.4199%] (p = 0.00 < 0.05)
                        Change within noise threshold.

ljedrz (Collaborator, Author) commented Oct 23, 2025

If it is indeed a negligible difference, could you reproduce the single-threaded flamegraph I made to revisit whether init_sponge hashing is really a bottleneck?

Done; it does show that init_sponge takes ~43%, but this PR doesn't seem to impact it (only ~0.4% decrease in that value).

@vicsn vicsn closed this Oct 24, 2025
@vicsn vicsn reopened this Oct 29, 2025
vicsn (Collaborator) commented Oct 29, 2025

TIL again that dynamic dispatch can indeed create a potentially large number of circuits; we can keep this open as a draft.

vicsn (Collaborator) commented Jan 12, 2026

@ljedrz do you want to build and benchmark another feature in this draft? Currently we hash inputs_and_batch_sizes inside fn init_sponge using the expensive Poseidon hash. Instead, the prover and verifier can hash those to a single field element using hash_sha3_256, and then pass that single value into fn {prove,verify}_batch and into fn init_sponge to be hashed by Poseidon.

Signed-off-by: ljedrz <ljedrz@users.noreply.github.com>
ljedrz (Collaborator, Author) commented Jan 14, 2026

The most recent commit provides the following benchmark wins compared with the previous one:

circuit_batch_size:1, instance_size:5

snark_batch_verify      time:   [7.2229 ms 7.2641 ms 7.3056 ms]
                        change: [−5.8052% −5.1281% −4.4218%] (p = 0.00 < 0.05)
                        Performance has improved.

circuit_batch_size:1, instance_size:1000

snark_batch_verify      time:   [162.82 ms 163.37 ms 163.95 ms]
                        change: [−27.720% −27.366% −27.010%] (p = 0.00 < 0.05)
                        Performance has improved.
