Conversation
@adr1anh I just saw your upstream Plonky3/Plonky3#1452 PR; overall the savings seem much more interesting when focusing on heap allocation, so perhaps it is not worth considering this local PR, and we should instead apply a similar approach to the upstream one.

One of the goals of Plonky3/Plonky3#1452 was indeed to improve memory pressure by avoiding the Vec allocation for each row. However, I think @Al-Kindi-0 came up with this when he started testing the Miden VM constraints. I'll let him chime in with more details, but I think it was either due to the interleaving of extension field constraints, or maybe more because the initial optimized
huitseeker
left a comment
RE: @adr1anh 's comment: this was part of https://github.com/0xMiden/p3-miden/compare/al-optimize-constraints-eval, specifically e990a00, which AFAICT is unmerged.
So while this LGTM, seeing both of you @Nashtare and @Al-Kindi-0 working on it makes me interpret this PR's perf delta as evidence that the Miden VM workload stresses the older design in a bad way, not necessarily as proof that "fold on the fly" is the only right long-term fix.
    base_alpha_powers: &base_alpha_powers,
    ext_alpha_powers: &ext_alpha_powers,
    constraint_index: 0,
    base_acc: Default::default(),
Nit: I think this could start from PE::ZERO instead of Default::default(). PE already comes from Algebra, so zero is part of the contract here, while Default is only a convention on the current packed types.
I would favor the solution in this PR for its simplicity unless the more involved solution with buffers provides a clear and consistent advantage.
Al-Kindi-0
left a comment
LGTM
As mentioned in the other comment, unless we are gaining consistently I would just go with the current simplification.
huitseeker
left a comment
Works for me, thanks @Nashtare
Agreed, I'm still investigating this on the Plonky3 side, and if there's any meaningful change it would be easier to implement here. I would add back this override that was lost along the way, which should improve extension field constraints a bit.
934fb35 to 7329a02
@adr1anh I've added the
    for (j, x) in array.into_iter().enumerate() {
        let val: P = x.into();
        let term = PE::from_basis_coefficients_fn(|d| val * self.base_alpha_powers[d][idx + j]);
        delta += term;
    }
    self.base_acc += delta;
Can we do a packed linear combination here?
Ah right, good point!
I do really like the changes, but I was wondering if we could try benchmarking the VM against this branch. It might be a bit annoying due to the breaking changes to the lmcs, but such a setup would also help us figure out if Plonky3/Plonky3#1452 can be of any use on our side. I was also wondering if we would be able to get some benefit from the current (pre-PR) approach by caching the constraint accumulation vectors, as was recently done in Plonky3. What would you think about keeping this one on ice until we get some kind of benchmarking setup going?

Sure, let's hold off!
Summary
Tweak constraint folding to perform folding on the fly. The `finalize_constraints()` method then basically just adds both accumulators, one for base constraints (lifted to `PE`) and one for extension constraints. This seems to yield a ~4% improvement on my M4 Pro when running the `lifted_miden` example for the `eval_instance` debug span.