Skip to content

Conversation

@georgwiese
Copy link
Collaborator

@georgwiese georgwiese commented Nov 15, 2025

Based on #3461
Issue: #3366

This PR adds empirical constraints (as detected in #3461) to the precompile. This yields a smaller precompile which is not guaranteed to work in all circumstances. Because of this, constraints might fail. In a future PR, we need to adjust execution to revert to running software in these cases.

As execution doesn't work yet, the feature is currently hidden behind the --optimistic-precompiles flag:
cargo run --bin powdr_openvm -r prove guest-keccak --input 100 --autoprecompiles 1 --apc-candidates-dir keccak100 --mock --optimistic-precompiles

This yields this effectiveness plot. Effectiveness is significantly increased: e.g. the memcpy block (0x201ecc) goes from effectiveness of 3.39 in the last nightly to 5.06 with this PR. There are a two caveats though:

  • This does not account for the fact that we might have to run software sometimes. If the empirical constraints are based on a good sample, this should only be the case in at most 2% of the runs. So, for example, an effectiveness of 5.0 would turn into 1 / (0.98 * 1 / 5 + 0.02) = 4.63.
  • The quality of the sample is important. In this particular case, the sample runs are the same as the actual runs, so this might be over-estimating the increase in effectiveness.

@georgwiese georgwiese force-pushed the pgo-range-constraints branch from 760466e to f1a72e2 Compare November 18, 2025 15:55
@georgwiese georgwiese changed the title [WIP] Pass PGO range constraints [WIP] Optimistic Precompiles Nov 24, 2025
@georgwiese georgwiese mentioned this pull request Nov 24, 2025
@georgwiese georgwiese force-pushed the pgo-range-constraints branch 2 times, most recently from 9f81ce0 to 7851df5 Compare November 27, 2025 20:39
@georgwiese georgwiese changed the base branch from main to extract-do-with-trace November 27, 2025 20:39
@georgwiese georgwiese force-pushed the pgo-range-constraints branch from 95a081c to cd73886 Compare November 27, 2025 22:08
@georgwiese georgwiese changed the base branch from extract-do-with-trace to collect-empirical-constraints November 27, 2025 22:08
@georgwiese georgwiese force-pushed the collect-empirical-constraints branch from 3b5275b to 0921d90 Compare November 28, 2025 01:36
@georgwiese georgwiese force-pushed the pgo-range-constraints branch from 9c49cd3 to 1834e9c Compare November 28, 2025 01:38
@georgwiese georgwiese force-pushed the collect-empirical-constraints branch from 0921d90 to 1322a28 Compare November 28, 2025 02:01
@georgwiese georgwiese force-pushed the pgo-range-constraints branch from 1834e9c to 2b58629 Compare November 28, 2025 02:02
Comment on lines 53 to 151
// Mapping (instruction index, column index) -> AlgebraicReference
let reverse_subs = subs
.iter()
.enumerate()
.flat_map(|(instr_index, subs)| {
subs.iter()
.enumerate()
.map(move |(col_index, &poly_id)| (poly_id, (instr_index, col_index)))
})
.collect::<BTreeMap<_, _>>();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically the key (instruction_index, column_index) are those of the original instructions, and this is the Id of the equivalence classes?

Comment on lines 63 to 154
let algebraic_references = columns
.map(|r| (reverse_subs.get(&r.id).unwrap().clone(), r.clone()))
.collect::<BTreeMap<_, _>>();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Map from original instruction cell to AlgebraicReference.

Comment on lines 69 to 160
let Some(range_constraints) = range_constraints.get(&pc) else {
continue;
};
Copy link
Collaborator

@qwang98 qwang98 Nov 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean redundancy can be good, but will this ever happen in practice because we pretty much traced this for all PC?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or if we want to assert this, we can unwrap() here.

Comment on lines 75 to 169
let Some(reference) = algebraic_references.get(&(i, col_index)).cloned() else {
panic!(
"Missing reference for (i: {}, col_index: {}, block_id: {})",
i, col_index, block.start_pc
);
};
Copy link
Collaborator

@qwang98 qwang98 Nov 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't it panic! here for all columns optimized away, because algebraic_reference is over all machine.main_columns()?

Or is it fine here because this whole add_empirical_constraint() is called before any optimization?

Comment on lines 92 to 190
let Some(first_ref) = algebraic_references.get(first).cloned() else {
// TODO: This fails in some blocks. For now, just return no extra constraints.
println!(
"Missing reference for (i: {}, col_index: {}, block_id: {})",
first.0, first.1, block.start_pc
);
return (range_analyzer_constraints, vec![]);
};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you mean it fails in some blocks on the get()?

This is indeed quite weird, as algebraic_references, if done before the optimizations, should really contain all possible equivalence classes Id.

Do you think this can be caused by is_valid? It allocates a new AlgebraicReference, but equivalence class collection should be over execution of original instructions, so the Id we call get on shouldn't contain the is_valid column?

Maybe helps to look around build on what could be already optmized away from original instructions by the time main_columns() is called.

Comment on lines 109 to 204
let constraint = AlgebraicExpression::Reference(first_ref.clone())
- AlgebraicExpression::Reference(other_ref.clone());
equivalence_analyzer_constraints.push(SymbolicConstraint { expr: constraint });
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, so looks like we are constraining all references in the equivalence class have the same value, not class_idx, so if my comment in #3461 is correct, I think this might present a future bug that hasn't arised yet (because we fail earlier at the get())?

column_allocator,
)
.unwrap();
let dumb_precompile = machine.render(&vm_config.bus_map);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haha, like the name :)

Comment on lines 452 to 486
let mut baseline = machine;

let (machine, column_allocator) = optimizer::optimize::<A>(
baseline.clone(),
vm_config.bus_interaction_handler.clone(),
degree_bound,
&vm_config.bus_map,
column_allocator,
)
.unwrap();
let dumb_precompile = machine.render(&vm_config.bus_map);

baseline.constraints.extend(range_analyzer_constraints);
// TODO: Appears to be buggy
// baseline
// .constraints
// .extend(equivalence_analyzer_constraints);

let (machine, column_allocator) = optimizer::optimize::<A>(
machine,
baseline,
vm_config.bus_interaction_handler,
degree_bound,
&vm_config.bus_map,
column_allocator,
)?;
)
.unwrap();
let ai_precompile = machine.render(&vm_config.bus_map);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we plan to do two optimization passes (one on dumb, then add empirical constraints, and then optimize again) or just one optimization pass after adding empirical constraints?

Might be just that we are creating two separate paths here.

Comment on lines 559 to 609
assert_eq!(
pre_degree,
machine.degree(),
"Degree should not change after adding guards"
);
// TODO: Why do we need this?
if pre_degree != 0 {
assert_eq!(
pre_degree,
machine.degree(),
"Degree should not change after adding guards, but changed from {} to {}",
pre_degree,
machine.degree(),
);
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably an add_guard specific thing so not super related to this PR, but I wonder why degree should not change here, because we literally multiplied is_valid to all constraints?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is_valid is applied to the constraints that cannot be zero when we set all the witness to 0. for a constraint like (a - 2)( b - 3) = 0, after adding guard, it becomes (a - 2 * is_valid)(b - 3) = 0, then the degree won't change.

Comment on lines +190 to +202
if existing_values.len() != mem_int.data().len() {
log::error!(
"Memory interaction data length mismatch: existing values = {}, new values = {}. Resetting memory.",
existing_values.iter().map(ToString::to_string).join(", "),
mem_int.data().iter().map(ToString::to_string).join(", ")
);
memory_contents.clear();
continue;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this needed here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This appears because some of OpenVM's 256-Bit branch instructions can appear in a basic block. These instruction read larger words.

I think we should fix this by not including these instructions inside a basic block, like we do for other precompiles (= non RISC-V instructions). Doing this properly would really complicate the memory optimizer and I don't think it's worth it.

Copy link
Collaborator

@qwang98 qwang98 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have another comment, maybe going a bit over higher-level basics, but it seems that we are only adding constraints, so I wonder how concretely we optimize the APCs:

  • For range constraints, are we relying on the optimzer, which happens after adding the empirical constraints, to remove "looser" range constraints after adding a tigher one? Will this potentially remove some columns? How would this concretely speed up the proving process?
  • For equivalence constraints, are we also relying on the optmizer, to remove more constraints than the ones we added? Or are we more relying on the optimizer removing some columns?
  • How we obtained the effectiveness plots? Are the estimates mostly based off columns removed from "improved" range constraints?

Sorry if dummy questions!

@georgwiese georgwiese force-pushed the collect-empirical-constraints branch 4 times, most recently from d40352e to a235a5a Compare November 28, 2025 15:09
@georgwiese georgwiese force-pushed the collect-empirical-constraints branch 6 times, most recently from bddc435 to 589374e Compare November 28, 2025 20:46
@georgwiese georgwiese force-pushed the pgo-range-constraints branch 6 times, most recently from 6c915c8 to cc83372 Compare November 28, 2025 21:53
// Apply execution count threshold to avoid overfitting on rarely executed code
let empirical_constraints = empirical_constraints
.clone()
.with_thresholded_pc_count(EXECUTION_COUNT_THRESHOLD);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should avoid above-average effectiveness for the "other" category in the effectiveness plot. In fact, I would expect them to be below-average now, because they don't get any empirical constraints if they aren't executed at least 100 times.

@georgwiese georgwiese force-pushed the pgo-range-constraints branch from cc83372 to 1913d1c Compare November 29, 2025 14:49
@georgwiese georgwiese force-pushed the pgo-range-constraints branch from 1913d1c to 316ea10 Compare November 29, 2025 18:04
// In that case, we can replace both bus interactions with equality constraints
// between the data that would have been sent and received.
if let Some((previous_send, existing_values)) = memory_contents.remove(&addr) {
// TODO: This can happen which optimistic precompiles, because basic block can include
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be fixed before merging. I suggest to black-list big-word instructions.

@leonardoalt
Copy link
Member

leonardoalt commented Dec 4, 2025

I updated my local openvm-reth-benchmark to use this branch, and currently get

thread '<unnamed>' (1606659) panicked at /home/leo/devel/powdr/openvm/src/memory_bus_interaction.rs:100:22:
Register address must be a concrete number
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

Note that this is different from the bigint branching instructions case with 32 limbs, since that's already patched in this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants