Skip to content

Commit 5d23fe0

Browse files
authored
feat(avm)!: finer grained target lookup selectors (#16916)
This PR gives us a way to handle duplicated cost between lookups that share the destination selector. It applies the framework to some lookups and makes `log_derivative_inverse_commitments_round` ~30% faster. ## The Problem Given a lookup instance, the inverses are computed (and then committed to) for any row which has either the source or destination selector on. This means that if you have 2 lookups (or 100+ in case of sha, etc) into, say, `bitwise.sel`, and you add 1 row to it because you did one AND in your execution, you will pay for 2 (or 100+) and not just 1 new inverse. This can be mitigated by adding new selectors like `bitwise.sel_sha` but... ah, we don't want to have to provide that information in simulation! In this PR I use the same idea I used in multipermutations, to do this almost transparently. ## The solution I introduce the concept of a "coarse" (or global selector) like `bitwise.sel` and a "fine grained" selector like `bitwise.sel_sha256`. You add the new selectors columns in PIL, e.g., `bitwise.sel_sha256`. You use that selector in your lookup in PIL. And in tracegen you specify the coarse selector in the lookup definition, e.g., `.add<lookup_sha256_w_s_0_xor_0_settings, InteractionType::LookupGeneric>(Column::bitwise_sel)`. That's it in terms of usage. ## The implementation Lookup builders * Dummy inverses are not added anymore. It was previously only done to force the correct size, so that later when computing the polynomials we would not need to think about this. See later how this is solved. * I added the concept of "outer" selector (the coarse grained one). By default, it is set the same as the fine grained one (which is taken from the settings). There is some new code where we say "ok, if I'm incrementing a count and it's the first time, set the fine grained selector as well". This is what allows us to avoid passing information in simulation. Computation of logderivative inverses: if you are familiar with this, you know that as opposed to the lookup builders, the inverses are computed _after_ wire commitment, because the challenges depend on that, and the inverses depend on the challenges. This didn't change. What changes is that now there is a call to resize the inverse polynomials to the right size, at this point. It's actually the right thing to do, the tracegen option was a bit of a hack. ## Results These are the results with just a few changes. I think we still have a lot of room for improvement by just adding a few columns. Especially for precomputed tables. I might do a few more in subsequent PRs. ``` Incoming lookups* (a bit outdated) 117 bitwise_start 54 range_check_sel 51 gt_sel 28 public_inputs_sel 25 poseidon2_hash_end 24 memory_sel 11 precomputed_sel_range_8 9 precomputed_sel_range_16 8 merkle_check_start ... ``` Before ``` Sum of all column rows: 31751512 ------- STATS ------- prove/log_derivative_inverse_commitments_round_ms: 2666 <-- prove/log_derivative_inverse_round_ms: 379 prove/pcs_rounds_ms: 3278 prove/public_inputs_round_ms: 0 prove/sumcheck_ms: 10427 prove/wire_commitments_round_ms: 3322 proving/all_ms: 21626 ``` After ``` Sum of all column rows: 22541631 ------- STATS ------- prove/log_derivative_inverse_commitments_round_ms: 1826 <-- prove/log_derivative_inverse_round_ms: 376 prove/pcs_rounds_ms: 3145 prove/public_inputs_round_ms: 0 prove/sumcheck_ms: 9902 prove/wire_commitments_round_ms: 3330 proving/all_ms: 20217 ``` *`% grep -r "DST_SELECTOR = Column::" . 2>/dev/null | sed -n 's/.*DST_SELECTOR = Column::\([^;]*\);.*/\1/p' | sort | uniq -c | sort -nr`
1 parent ee91a6d commit 5d23fe0

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

67 files changed

+959
-802
lines changed

barretenberg/cpp/pil/vm2/alu.pil

Lines changed: 8 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -185,16 +185,16 @@ a_lo_bits - sel_mul_div_u128 * constant_64 - sel_shift_ops * shift_lo_bits = 0;
185185
a_hi_bits - sel_mul_div_u128 * constant_64 - sel_shift_ops * SHIFT_HI_BITS = 0;
186186

187187
#[RANGE_CHECK_DECOMPOSITION_A_LO]
188-
sel_decompose_a { a_lo, a_lo_bits } in range_check.sel { range_check.value, range_check.rng_chk_bits };
188+
sel_decompose_a { a_lo, a_lo_bits } in range_check.sel_alu { range_check.value, range_check.rng_chk_bits };
189189

190190
#[RANGE_CHECK_DECOMPOSITION_A_HI]
191-
sel_decompose_a { a_hi, a_hi_bits } in range_check.sel { range_check.value, range_check.rng_chk_bits };
191+
sel_decompose_a { a_hi, a_hi_bits } in range_check.sel_alu { range_check.value, range_check.rng_chk_bits };
192192

193193
#[RANGE_CHECK_DECOMPOSITION_B_LO]
194-
sel_mul_div_u128 { b_lo, constant_64 } in range_check.sel { range_check.value, range_check.rng_chk_bits };
194+
sel_mul_div_u128 { b_lo, constant_64 } in range_check.sel_alu { range_check.value, range_check.rng_chk_bits };
195195

196196
#[RANGE_CHECK_DECOMPOSITION_B_HI]
197-
sel_mul_div_u128 { b_hi, constant_64 } in range_check.sel { range_check.value, range_check.rng_chk_bits };
197+
sel_mul_div_u128 { b_hi, constant_64 } in range_check.sel_alu { range_check.value, range_check.rng_chk_bits };
198198

199199

200200
// ADD
@@ -252,7 +252,7 @@ sel_mul_u128 * (1 - sel_tag_err)
252252
// up max_value. i.e. we cannot provide a malicious c, c_hi such that a + b - c_hi * 2^n = c passes for n < 128.
253253
// No need to range_check c_lo = ic because the memory write will ensure ic <= max_value.
254254
#[RANGE_CHECK_MUL_U128_C_HI]
255-
sel_mul_u128 { c_hi, constant_64 } in range_check.sel { range_check.value, range_check.rng_chk_bits };
255+
sel_mul_u128 { c_hi, constant_64 } in range_check.sel_alu { range_check.value, range_check.rng_chk_bits };
256256

257257
// DIV
258258

@@ -264,7 +264,7 @@ sel_op_div * (1 - sel_op_div) = 0;
264264
pol commit sel_div_no_0_err;
265265
sel_div_no_0_err = sel_op_div * (1 - sel_div_0_err);
266266
#[GT_DIV_REMAINDER]
267-
sel_div_no_0_err { ib, helper1, sel_op_div } in gt.sel { gt.input_a, gt.input_b, gt.res };
267+
sel_div_no_0_err { ib, helper1, sel_op_div } in gt.sel_alu { gt.input_a, gt.input_b, gt.res };
268268

269269
// DIV - u128
270270

@@ -395,7 +395,7 @@ in ff_gt.sel_gt { ff_gt.a, ff_gt.b, ff_gt.result };
395395

396396
#[INT_GT]
397397
sel_int_lt_ops { lt_ops_input_a, lt_ops_input_b, lt_ops_result_c }
398-
in gt.sel { gt.input_a, gt.input_b, gt.res };
398+
in gt.sel_alu { gt.input_a, gt.input_b, gt.res };
399399

400400
// NOT
401401
// Input is sent to ia, ia_tag and output is sent to ib, ib_tag.
@@ -598,4 +598,4 @@ mid_bits = sel_trunc_non_trivial * (128 - max_bits);
598598
// is supported by our range_check gadget.
599599
// No need to range_check ic because the memory write will ensure ic <= max_value.
600600
#[RANGE_CHECK_TRUNC_MID]
601-
sel_trunc_non_trivial { mid, mid_bits } in range_check.sel { range_check.value, range_check.rng_chk_bits };
601+
sel_trunc_non_trivial { mid, mid_bits } in range_check.sel_alu { range_check.value, range_check.rng_chk_bits };

barretenberg/cpp/pil/vm2/bitwise.pil

Lines changed: 7 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -54,6 +54,13 @@ pol commit start; // Identifies when we want to capture the output to the main t
5454
// Must be constrained as a boolean as any selector used in a lookup/permutation.
5555
start * (1 - start) = 0;
5656

57+
// This is used to decouple generation of inverses of lookups into this trace.
58+
pol commit start_keccak;
59+
pol commit start_sha256;
60+
start_keccak * (1 - start_keccak) = 0;
61+
// If any of the above selectors is 1, then start must be 1.
62+
(start_keccak + start_sha256) * (1 - start) = 0;
63+
5764
// To support dynamically sized memory operands we use a counter against a lookup
5865
// This decrementing counter goes from [TAG_LEN, 0] where TAG_LEN is the number of bytes in the
5966
// corresponding integer. i.e. TAG_LEN is between 1 (U1/U8) and 16 (U128).

barretenberg/cpp/pil/vm2/bytecode/update_check.pil

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -145,7 +145,7 @@ namespace update_check;
145145
// Equivalent to "timestamp_of_change > timestamp"
146146
#[TIMESTAMP_IS_LT_TIMESTAMP_OF_CHANGE]
147147
hash_not_zero { timestamp_of_change, timestamp, timestamp_is_lt_timestamp_of_change }
148-
in gt.sel { gt.input_a, gt.input_b, gt.res };
148+
in gt.sel_others { gt.input_a, gt.input_b, gt.res };
149149

150150
pol commit update_pre_class_id_is_zero;
151151
update_pre_class_id_is_zero * (1 - update_pre_class_id_is_zero) = 0;

barretenberg/cpp/pil/vm2/data_copy.pil

Lines changed: 4 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -173,7 +173,7 @@ namespace data_copy;
173173
#[MAX_READ_INDEX_GT]
174174
sel_start { offset_plus_size, src_data_size, offset_plus_size_is_gt }
175175
in
176-
gt.sel { gt.input_a, gt.input_b, gt.res };
176+
gt.sel_others { gt.input_a, gt.input_b, gt.res };
177177

178178
// Set max_read_index based on the conditions (1) or (2) from above
179179
pol commit max_read_index;
@@ -197,15 +197,15 @@ namespace data_copy;
197197
#[CHECK_SRC_ADDR_IN_RANGE]
198198
sel_start { max_read_addr, max_mem_addr, src_out_of_range_err }
199199
in
200-
gt.sel { gt.input_a, gt.input_b, gt.res };
200+
gt.sel_others { gt.input_a, gt.input_b, gt.res };
201201

202202
pol MAX_WRITE_ADDR = dst_addr + copy_size;
203203
pol commit max_write_addr;
204204
max_write_addr = sel_start * MAX_WRITE_ADDR;
205205
#[CHECK_DST_ADDR_IN_RANGE]
206206
sel_start { max_write_addr, max_mem_addr, dst_out_of_range_err }
207207
in
208-
gt.sel { gt.input_a, gt.input_b, gt.res };
208+
gt.sel_others { gt.input_a, gt.input_b, gt.res };
209209

210210
// Consolidate the errors
211211
pol commit err;
@@ -249,7 +249,7 @@ namespace data_copy;
249249
#[OFFSET_GT_MAX_READ_INDEX]
250250
sel_start_no_err { offset, max_read_index, offset_gt_max_read_index }
251251
in
252-
gt.sel { gt.input_a, gt.input_b, gt.res };
252+
gt.sel_others { gt.input_a, gt.input_b, gt.res };
253253

254254
// If sel_offset_gt_max_read = 1 (i.e. when offset > MAX_READ_INDEX, reads_left = 0)
255255
// otherwise, reads_left = MAX_READ_INDEX - offset

barretenberg/cpp/pil/vm2/ecc_mem.pil

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -84,7 +84,7 @@ namespace ecc_add_mem;
8484
#[CHECK_DST_ADDR_IN_RANGE]
8585
sel { dst_addr[2], max_mem_addr, sel_dst_out_of_range_err }
8686
in
87-
gt.sel { gt.input_a, gt.input_b, gt.res };
87+
gt.sel_others { gt.input_a, gt.input_b, gt.res };
8888

8989
////////////////////////////////////////////////
9090
// Error Handling - Check Points are on curve

barretenberg/cpp/pil/vm2/execution.pil

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -331,7 +331,7 @@ pol commit sel_radix_gt_256; // sel_radix_gt_256 = 1 iff radix > 256
331331
#[CHECK_RADIX_GT_256]
332332
sel_gas_to_radix { /*radix=*/register[1], two_five_six, sel_radix_gt_256 }
333333
in
334-
gt.sel { gt.input_a, gt.input_b, gt.res };
334+
gt.sel_others { gt.input_a, gt.input_b, gt.res };
335335

336336
// Boolean for if we should look up the num_p_limbs value
337337
pol commit sel_lookup_num_p_limbs;
@@ -356,7 +356,7 @@ pol commit sel_use_num_limbs;
356356
#[GET_MAX_LIMBS]
357357
sel_gas_to_radix { /*num_limbs=*/register[2], num_p_limbs, sel_use_num_limbs }
358358
in
359-
gt.sel { gt.input_a, gt.input_b, gt.res };
359+
gt.sel_others { gt.input_a, gt.input_b, gt.res };
360360

361361
#[DYN_L2_FACTOR_TO_RADIX_BE] // num_limbs > num_p_limbs ? num_limbs : num_p_limbs
362362
sel_gas_to_radix * ((/*num_limbs=*/register[2] - num_p_limbs) * sel_use_num_limbs + num_p_limbs - dynamic_l2_gas_factor) = 0;

barretenberg/cpp/pil/vm2/execution/addressing.pil

Lines changed: 7 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -200,19 +200,19 @@ pol commit highest_address;
200200
SEL_SHOULD_RESOLVE_ADDRESS * (highest_address - constants.AVM_HIGHEST_MEM_ADDRESS) = 0;
201201
// sel_relative_overflow[i] == 1 iff op_after_relative[i] > highest_address.
202202
#[RELATIVE_OVERFLOW_RESULT_0]
203-
sel_op_is_relative_effective[0] { op_after_relative[0], highest_address, sel_relative_overflow[0] } in gt.sel { gt.input_a, gt.input_b, gt.res };
203+
sel_op_is_relative_effective[0] { op_after_relative[0], highest_address, sel_relative_overflow[0] } in gt.sel_addressing { gt.input_a, gt.input_b, gt.res };
204204
#[RELATIVE_OVERFLOW_RESULT_1]
205-
sel_op_is_relative_effective[1] { op_after_relative[1], highest_address, sel_relative_overflow[1] } in gt.sel { gt.input_a, gt.input_b, gt.res };
205+
sel_op_is_relative_effective[1] { op_after_relative[1], highest_address, sel_relative_overflow[1] } in gt.sel_addressing { gt.input_a, gt.input_b, gt.res };
206206
#[RELATIVE_OVERFLOW_RESULT_2]
207-
sel_op_is_relative_effective[2] { op_after_relative[2], highest_address, sel_relative_overflow[2] } in gt.sel { gt.input_a, gt.input_b, gt.res };
207+
sel_op_is_relative_effective[2] { op_after_relative[2], highest_address, sel_relative_overflow[2] } in gt.sel_addressing { gt.input_a, gt.input_b, gt.res };
208208
#[RELATIVE_OVERFLOW_RESULT_3]
209-
sel_op_is_relative_effective[3] { op_after_relative[3], highest_address, sel_relative_overflow[3] } in gt.sel { gt.input_a, gt.input_b, gt.res };
209+
sel_op_is_relative_effective[3] { op_after_relative[3], highest_address, sel_relative_overflow[3] } in gt.sel_addressing { gt.input_a, gt.input_b, gt.res };
210210
#[RELATIVE_OVERFLOW_RESULT_4]
211-
sel_op_is_relative_effective[4] { op_after_relative[4], highest_address, sel_relative_overflow[4] } in gt.sel { gt.input_a, gt.input_b, gt.res };
211+
sel_op_is_relative_effective[4] { op_after_relative[4], highest_address, sel_relative_overflow[4] } in gt.sel_addressing { gt.input_a, gt.input_b, gt.res };
212212
#[RELATIVE_OVERFLOW_RESULT_5]
213-
sel_op_is_relative_effective[5] { op_after_relative[5], highest_address, sel_relative_overflow[5] } in gt.sel { gt.input_a, gt.input_b, gt.res };
213+
sel_op_is_relative_effective[5] { op_after_relative[5], highest_address, sel_relative_overflow[5] } in gt.sel_addressing { gt.input_a, gt.input_b, gt.res };
214214
#[RELATIVE_OVERFLOW_RESULT_6]
215-
sel_op_is_relative_effective[6] { op_after_relative[6], highest_address, sel_relative_overflow[6] } in gt.sel { gt.input_a, gt.input_b, gt.res };
215+
sel_op_is_relative_effective[6] { op_after_relative[6], highest_address, sel_relative_overflow[6] } in gt.sel_addressing { gt.input_a, gt.input_b, gt.res };
216216

217217
/**************************************************************************************************
218218
* Indirection Resolution

barretenberg/cpp/pil/vm2/gt.pil

Lines changed: 13 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -17,6 +17,18 @@ namespace gt;
1717
pol commit sel;
1818
sel * (1 - sel) = 0;
1919

20+
// This is used to decouple generation of inverses of lookups into this trace.
21+
pol commit sel_sha256;
22+
pol commit sel_addressing;
23+
pol commit sel_alu;
24+
pol commit sel_others; // Any other lookup into this trace.
25+
sel_sha256 * (1 - sel_sha256) = 0;
26+
sel_addressing * (1 - sel_addressing) = 0;
27+
sel_alu * (1 - sel_alu) = 0;
28+
sel_others * (1 - sel_others) = 0;
29+
// If any of the above selectors is 1, then sel must be 1.
30+
(sel_sha256 + sel_addressing + sel_alu + sel_others) * (1 - sel) = 0;
31+
2032
#[skippable_if]
2133
sel = 0;
2234

@@ -45,4 +57,4 @@ namespace gt;
4557
#[GT_RESULT]
4658
sel * ( (A_GT_B - A_LTE_B) * res + A_LTE_B - abs_diff ) = 0;
4759
#[GT_RANGE]
48-
sel { abs_diff, num_bits } in range_check.sel { range_check.value, range_check.rng_chk_bits };
60+
sel { abs_diff, num_bits } in range_check.sel_gt { range_check.value, range_check.rng_chk_bits };

0 commit comments

Comments
 (0)