perf: fold constraints on the fly #55

Open
Nashtare wants to merge 6 commits into 0xMiden:main from Nashtare:robin/miden/constraint_eval

Conversation

@Nashtare
Contributor

Summary

Tweak constraint folding so that it happens on the fly. The finalize_constraints() method then essentially just adds the two accumulators: one for base constraints (lifted to PE) and one for extension constraints.

This seems to yield roughly a 4% improvement on my M4 Pro when running the lifted_miden example, measured on the eval_instance debug span.
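To make the idea in the summary concrete, here is a minimal sketch of buffered folding versus folding on the fly. It is not the code from this PR: the toy modulus, the single shared power table, and the names `OnTheFlyFolder`, `fold_buffered`, and `alpha_powers` are assumptions made for illustration, and plain `u64` values stand in for both the packed base type and PE.

```rust
/// Toy prime modulus (2^31 - 1). An assumption standing in for the real field.
const P: u64 = (1 << 31) - 1;

fn add(a: u64, b: u64) -> u64 {
    (a % P + b % P) % P
}

fn mul(a: u64, b: u64) -> u64 {
    (a % P) * (b % P) % P
}

/// Precompute alpha^0, alpha^1, ..., alpha^(n-1).
pub fn alpha_powers(alpha: u64, n: usize) -> Vec<u64> {
    let mut powers = Vec::with_capacity(n);
    let mut cur = 1u64;
    for _ in 0..n {
        powers.push(cur);
        cur = mul(cur, alpha);
    }
    powers
}

/// Buffered approach: collect the row's constraint values, then fold them.
pub fn fold_buffered(constraints: &[u64], powers: &[u64]) -> u64 {
    let buffer: Vec<u64> = constraints.to_vec(); // models the per-row allocation
    buffer.iter().zip(powers).fold(0, |acc, (&c, &p)| add(acc, mul(c, p)))
}

/// On-the-fly approach: each asserted constraint is folded immediately into
/// one of two accumulators (hypothetical type, simplified to scalars).
pub struct OnTheFlyFolder<'a> {
    powers: &'a [u64],
    constraint_index: usize,
    base_acc: u64,
    ext_acc: u64,
}

impl<'a> OnTheFlyFolder<'a> {
    pub fn new(powers: &'a [u64]) -> Self {
        Self { powers, constraint_index: 0, base_acc: 0, ext_acc: 0 }
    }

    /// Both constraint kinds share one running index into the power table.
    fn next_power(&mut self) -> u64 {
        let p = self.powers[self.constraint_index];
        self.constraint_index += 1;
        p
    }

    pub fn assert_zero(&mut self, c: u64) {
        let p = self.next_power();
        self.base_acc = add(self.base_acc, mul(c, p));
    }

    pub fn assert_zero_ext(&mut self, c: u64) {
        let p = self.next_power();
        self.ext_acc = add(self.ext_acc, mul(c, p));
    }

    /// Finalizing only adds the two accumulators.
    pub fn finalize_constraints(&self) -> u64 {
        add(self.base_acc, self.ext_acc)
    }
}
```

In this toy setting both strategies give the same folded value for the same constraints and powers, regardless of how base and extension constraints interleave; the only difference is whether an intermediate buffer exists.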

@Nashtare
Contributor Author

@adr1anh I just saw your upstream PR, Plonky3/Plonky3#1452. Its overall savings seem much more interesting, since it focuses on heap allocation, so it is perhaps not worth considering this local PR; we could instead apply an approach similar to the upstream one.

@adr1anh
Contributor

adr1anh commented Mar 19, 2026

One of the goals of Plonky3/Plonky3#1452 was indeed to reduce memory pressure by avoiding the Vec allocation for each row. However, I think @Al-Kindi-0 came up with this when he started testing the Miden VM constraints. I'll let him chime in with more details, but I think it was partly due to the interleaving of extension field constraints, and perhaps more because the initial optimized assert_zeros implementation performed slightly worse when a non-optimal number of constraints was passed in.

Contributor

@huitseeker huitseeker left a comment

Re: @adr1anh's comment: this was part of https://github.com/0xMiden/p3-miden/compare/al-optimize-constraints-eval, specifically e990a00, which AFAICT is unmerged.

So while this LGTM, the fact that both of you, @Nashtare and @Al-Kindi-0, have been working on this makes me interpret this PR's perf delta as evidence that the Miden VM workload stresses the older design in a bad way, not necessarily as proof that “fold on the fly” is the only right long-term fix.

base_alpha_powers: &base_alpha_powers,
ext_alpha_powers: &ext_alpha_powers,
constraint_index: 0,
base_acc: Default::default(),
Contributor

Nit: I think this could start from PE::ZERO instead of Default::default(). PE already comes from Algebra, so zero is part of the contract here, while Default is only a convention on the current packed types.
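A small sketch of the distinction being drawn here, under stated assumptions: `HasZero` and `Packed` are hypothetical stand-ins, not Plonky3's actual traits or packed types. The point is only that an accumulator seeded from a `ZERO` constant needs no `Default` bound and makes the additive identity explicit.

```rust
use core::ops::Add;

/// Hypothetical stand-in for an algebra trait whose contract includes an
/// additive identity. Not the real Plonky3 trait.
pub trait HasZero: Copy + Add<Output = Self> {
    const ZERO: Self;
}

/// Toy "packed" value with four lanes. Deliberately does not implement Default.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Packed(pub [u32; 4]);

impl Add for Packed {
    type Output = Self;

    fn add(self, rhs: Self) -> Self {
        // Lane-wise wrapping addition, as a placeholder for field addition.
        Packed(core::array::from_fn(|i| self.0[i].wrapping_add(rhs.0[i])))
    }
}

impl HasZero for Packed {
    const ZERO: Self = Packed([0; 4]);
}

/// The accumulator starts from the identity guaranteed by the trait itself,
/// so the function compiles without requiring `T: Default`.
pub fn sum_terms<T: HasZero>(terms: &[T]) -> T {
    terms.iter().fold(T::ZERO, |acc, &t| acc + t)
}
```

Because `Packed` has no `Default` impl in this sketch, a version of `sum_terms` seeded with `Default::default()` would need an extra bound and would rely on the default value happening to be zero.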

Collaborator

I would favor the solution in this PR for its simplicity unless the more involved solution with buffers provides a clear and consistent advantage.

@huitseeker huitseeker requested a review from Al-Kindi-0 March 23, 2026 13:19
Collaborator

@Al-Kindi-0 Al-Kindi-0 left a comment

LGTM
As mentioned in the other comment, unless we are gaining consistently, I would just go with the current simplification

Contributor

@huitseeker huitseeker left a comment

Works for me, thanks @Nashtare

@adr1anh
Contributor

adr1anh commented Mar 24, 2026

Agreed, I'm still investigating this on the Plonky3 side, and if there's any meaningful change it would be easier to implement here.

I would add this override, which was lost along the way; it should improve extension field constraints a bit.

@Nashtare Nashtare force-pushed the robin/miden/constraint_eval branch from 934fb35 to 7329a02 Compare March 26, 2026 14:07
@Nashtare
Contributor Author

@adr1anh I've added the assert_zeros override. assert_zeros_ext is not part of Plonky3's ExtensionBuilder API yet (related PR: Plonky3/Plonky3#1493).

Comment on lines +154 to +159
for (j, x) in array.into_iter().enumerate() {
    let val: P = x.into();
    let term = PE::from_basis_coefficients_fn(|d| val * self.base_alpha_powers[d][idx + j]);
    delta += term;
}
self.base_acc += delta;
Contributor

Can we do a packed linear combination here?

Contributor Author

Ah right good point!
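One way to read the suggestion, sketched with toy types: rather than building a temporary extension element per constraint and adding it to `delta`, compute each basis coefficient of `delta` as a single dot product over the batch. Everything below is an assumption for illustration (a toy modulus, extension degree 2, `u64` lanes, and the names `fold_per_term` and `fold_linear_combination`); it does not use or claim any particular Plonky3 API.

```rust
/// Toy prime modulus (2^31 - 1) and extension degree, both assumptions.
const P: u64 = (1 << 31) - 1;
const D: usize = 2;

/// Dot product over the toy field.
fn dot(a: &[u64], b: &[u64]) -> u64 {
    a.iter()
        .zip(b)
        .fold(0, |acc, (&x, &y)| (acc + (x % P) * (y % P)) % P)
}

/// Shape of the reviewed loop: one temporary term per constraint.
/// `powers[d][i]` holds coefficient `d` of the i-th alpha power.
pub fn fold_per_term(vals: &[u64], powers: &[Vec<u64>; D], idx: usize) -> [u64; D] {
    let mut delta = [0u64; D];
    for (j, &val) in vals.iter().enumerate() {
        let term: [u64; D] =
            core::array::from_fn(|d| (val % P) * (powers[d][idx + j] % P) % P);
        for d in 0..D {
            delta[d] = (delta[d] + term[d]) % P;
        }
    }
    delta
}

/// Linear-combination form: each coefficient of the result is one dot product
/// of the whole batch against the matching slice of powers.
pub fn fold_linear_combination(
    vals: &[u64],
    powers: &[Vec<u64>; D],
    idx: usize,
) -> [u64; D] {
    core::array::from_fn(|d| dot(vals, &powers[d][idx..idx + vals.len()]))
}
```

Both functions return the same coefficients; the second form exposes the batch as D independent dot products, which is the shape a packed linear-combination routine could consume.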

@adr1anh
Contributor

adr1anh commented Mar 26, 2026

I do really like the changes, but I was wondering if we could try benchmarking the VM against this branch. It might be a bit annoying due to the breaking changes to the lmcs, but such a setup would also help us figure out if Plonky3/Plonky3#1452 can be of any use on our side.

I was also wondering whether we could get some benefit from the current (pre-PR) approach by caching the constraint accumulation vectors, as was recently done in Plonky3.

What would you think about keeping this one on ice until we get some kind of benchmarking setup going?

@Nashtare
Contributor Author

Sure let's hold off!

@Nashtare Nashtare self-assigned this Mar 30, 2026
