ecmult_multi: Replace scratch space with malloc, use abcd cost model #1789
fjahr wants to merge 2 commits into bitcoin-core:master from
Conversation
Force-pushed from 68dd612 to 8b62524
    return 1;
}


int secp256k1_musig_pubkey_agg(const secp256k1_context* ctx, secp256k1_xonly_pubkey *agg_pk, secp256k1_musig_keyagg_cache *keyagg_cache, const secp256k1_pubkey * const* pubkeys, size_t n_pubkeys) {
It might be better to pass mem_limit as an input argument rather than defining it internally.
Hm, I see the point that this would align with the batch validation API that has been discussed, but I am not sure that this justifies changing the existing API of musig. It would probably be better to have this in a separate follow-up PR, but I will wait a bit to see what other reviewers think about this.
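To make the suggestion concrete, here is a hedged sketch: the same shape as the existing secp256k1_musig_pubkey_agg shown in the diff context above, but with the memory budget passed in by the caller instead of being defined internally. The name musig_pubkey_agg_with_limit and the extra parameter are illustrative only, not the PR's actual API.

#include <stddef.h>
#include <secp256k1.h>
#include <secp256k1_extrakeys.h>
#include <secp256k1_musig.h>

/* Hypothetical variant: the caller supplies the budget that the aggregation
 * internals may spend on batch multiplication, similar to what a batch
 * validation API might expose. Changing the shipped musig API like this is
 * exactly the concern raised in the reply above. */
int musig_pubkey_agg_with_limit(const secp256k1_context *ctx, secp256k1_xonly_pubkey *agg_pk,
                                secp256k1_musig_keyagg_cache *keyagg_cache,
                                const secp256k1_pubkey *const *pubkeys, size_t n_pubkeys,
                                size_t mem_limit);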
src/bench_ecmult.c (outdated)
 * values from the output.
 */
static void run_ecmult_multi_calib(bench_data* data) {
    static const size_t batch_sizes[] = {10, 20, 30, 50, 75, 100, 150, 200, 300, 500, 750, 1000, 1500, 2000, 3000, 5000, 7500, 10000, 15000, 20000, 30000};
I think it would be nice to have more batch sizes in the crossover region of Strauss and Pippenger (~80-150 points). So, something like this:
static const size_t batch_sizes[] = {
    /* Small (Strauss region) */
    5, 10, 15, 20, 30, 50, 70,
    /* Crossover region */
    85, 88, 90, 100, 120, 150, 175,
    /* Medium (Pippenger small windows, w=6..8) */
    200, 300, 500, 750, 1000, 1200,
    /* Large (Pippenger large windows, w=9..12) */
    1500, 2000, 3000, 5000, 7500,
    10000, 15000, 20000, 30000
};
We should probably cap the batch size for Strauss calibration (7,500 or less?). At higher sizes, the benchmark results curve and become outliers, which will skew the linear regression model. Since ecmult_multi_select won't choose Strauss for large batches anyway, we can avoid these sizes. I've attached a visualization below.
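For illustration, a sketch of how such a cap might be applied in the measurement pass; STRAUSS_CALIB_MAX_BATCH, calib_measure_all and the loop shape are hypothetical, not code from the PR or the benchmark:

/* Hypothetical cap: skip Strauss timings above this batch size so the
 * nonlinear tail does not skew the linear regression for its C/D values. */
#define STRAUSS_CALIB_MAX_BATCH 7500

static void calib_measure_all(const size_t *batch_sizes, size_t num_sizes) {
    size_t i;
    for (i = 0; i < num_sizes; i++) {
        size_t n = batch_sizes[i];
        if (n <= STRAUSS_CALIB_MAX_BATCH) {
            /* time the Strauss path at batch size n */
        }
        /* time each Pippenger window configuration at batch size n */
        (void)n; /* placeholder for the actual timing calls */
    }
}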
I ran the calibration tool, and I'm getting the following results. It's very close to the existing results.

My ABCD values

I've visualized the benchmarks for all algorithms (see siv2r@3119386 for the diagrams). I'm planning to do a deeper dive into the regression model fit to ensure they align with these results. Will share any useful findings here.
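As context for the regression step, here is a minimal self-contained sketch of the kind of ordinary least-squares fit that per-algorithm C/D values imply (time ≈ C·n + D over the measured batch sizes). This is an assumption about the methodology, not the actual calibration script:

#include <stddef.h>

/* Ordinary least-squares fit of y ~= c*x + d over num samples.
 * Minimal sketch: requires num >= 2 distinct x values; no weighting
 * or outlier handling. */
static void fit_linear(const double *x, const double *y, size_t num, double *c, double *d) {
    double sx = 0.0, sy = 0.0, sxx = 0.0, sxy = 0.0;
    size_t i;
    for (i = 0; i < num; i++) {
        sx += x[i];
        sy += y[i];
        sxx += x[i] * x[i];
        sxy += x[i] * y[i];
    }
    *c = (num * sxy - sx * sy) / (num * sxx - sx * sx);
    *d = (sy - *c * sx) / num;
}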
I did some analysis on the linear regression model for the CD values (see siv2r@b5985a6). I basically ran …

ABCD Values

static const struct secp256k1_ecmult_multi_abcd secp256k1_ecmult_multi_abcds[SECP256K1_ECMULT_MULTI_NUM_ALGOS] = {
{0, 0, 1000, 0 },
{SECP256K1_STRAUSS_POINT_SIZE, 0, 112, 173 },
{SECP256K1_PIPPENGER_POINT_SIZE(1), SECP256K1_PIPPENGER_FIXED_SIZE(1), 197, 285 },
{SECP256K1_PIPPENGER_POINT_SIZE(2), SECP256K1_PIPPENGER_FIXED_SIZE(2), 152, 480 },
{SECP256K1_PIPPENGER_POINT_SIZE(3), SECP256K1_PIPPENGER_FIXED_SIZE(3), 117, 767 },
{SECP256K1_PIPPENGER_POINT_SIZE(4), SECP256K1_PIPPENGER_FIXED_SIZE(4), 100, 1167 },
{SECP256K1_PIPPENGER_POINT_SIZE(5), SECP256K1_PIPPENGER_FIXED_SIZE(5), 86, 1887 },
{SECP256K1_PIPPENGER_POINT_SIZE(6), SECP256K1_PIPPENGER_FIXED_SIZE(6), 78, 3023 },
{SECP256K1_PIPPENGER_POINT_SIZE(7), SECP256K1_PIPPENGER_FIXED_SIZE(7), 73, 4906 },
{SECP256K1_PIPPENGER_POINT_SIZE(8), SECP256K1_PIPPENGER_FIXED_SIZE(8), 70, 8889 },
{SECP256K1_PIPPENGER_POINT_SIZE(9), SECP256K1_PIPPENGER_FIXED_SIZE(9), 74, 14544},
{SECP256K1_PIPPENGER_POINT_SIZE(10), SECP256K1_PIPPENGER_FIXED_SIZE(10), 79, 26764},
{SECP256K1_PIPPENGER_POINT_SIZE(11), SECP256K1_PIPPENGER_FIXED_SIZE(11), 84, 49179},
{SECP256K1_PIPPENGER_POINT_SIZE(12), SECP256K1_PIPPENGER_FIXED_SIZE(12), 106, 89518},
};

Statistical analysis

Metric Definitions
Per-Algorithm Metrics
The linear fit (…
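Reading the table above: each row pairs a per-point memory size (A), a fixed memory size (B), a per-point time cost (C) and a fixed time cost (D). A minimal sketch of how such a table could drive algorithm selection under a memory limit follows; this is my reading of the abcd model, not the PR's actual ecmult_multi selection code, and struct abcd / select_algo are illustrative names:

#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of one table row. */
struct abcd {
    size_t point_size;    /* A: memory per point */
    size_t fixed_size;    /* B: fixed memory */
    uint64_t cost_mul;    /* C: time per point */
    uint64_t cost_fixed;  /* D: fixed time */
};

/* Return the index of the cheapest algorithm whose memory footprint
 * A*n + B fits within mem_limit; the time cost is modeled as C*n + D.
 * Returns -1 if nothing fits (overflow checks omitted in this sketch). */
static int select_algo(const struct abcd *rows, size_t num_rows, size_t n, size_t mem_limit) {
    int best = -1;
    uint64_t best_cost = 0;
    size_t i;
    for (i = 0; i < num_rows; i++) {
        size_t mem = rows[i].point_size * n + rows[i].fixed_size;
        uint64_t cost = rows[i].cost_mul * (uint64_t)n + rows[i].cost_fixed;
        if (mem > mem_limit) {
            continue;
        }
        if (best == -1 || cost < best_cost) {
            best = (int)i;
            best_cost = cost;
        }
    }
    return best;
}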
Force-pushed from 72fd3e8 to c2c9bc2
Addressed the comments on the implementation code and rebased, thanks a lot for the review @siv2r! I need a little more time to think about the calibration again, but will comment on that asap.
Force-pushed from c2c9bc2 to 4aa160c
@siv2r Thanks again for the review! The hint about using narrower ranges for each of the algos in the calibration was awesome; I hadn't thought of this before and it pretty much fixed all the problems I ran into with the calibration results. I took these suggestions with some smaller additional tweaks, specifically for Strauss, which still had quite a bit of variance. I now see pretty stable calibration results between multiple runs, and the measurements across the different benchmarks run more smoothly than in any of my previous iterations. These improvements have given me a lot more confidence in the changes, so I'm taking this out of draft status too.
Force-pushed from 4aa160c to ef4754b
Randomly found #638 when looking at older PRs and, comparing the overlapping code, noticed a small bug in my Pippenger size calculation.
Force-pushed from ef4754b to b2cc6d1
This is an implementation of the changes discussed at an in-person meeting in October. It removes the usage of scratch space in batch validation and replaces it with internal malloc usage. It also adds an ABCD cost model for algorithm selection.
The API and internals follow the drafted spec from the meeting very closely: https://gist.github.com/fjahr/c2a009487dffe7a1fbf17ca1821976ca There are a few minor changes that should not affect the intended behavior. The test coverage is currently probably a bit lower than it was previously. I am guessing an adapted form of the test_ecmult_multi test should be added back, and there are some TODOs left in the test code which I am planning to address after a first round of conceptual feedback.

The second commit demonstrates the calibration tooling that I have been using, though it's not exactly what produced the results that are in the PR. I have still been struggling with the calibration code and never seem to get a result that just works without manual tweaking. The calibration itself, as well as the code added for it, is thus rather a work in progress. I am assuming some version of the calibration code should be added to the repo, and I haven't thought much about what the best place to add it is. Putting it into the benchmark and combining it with the python script was just a convenient way to experiment. I am very open to suggestions on how to change this.
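To illustrate the direction described above (no caller-provided scratch space, internal allocation bounded by a memory limit), a minimal sketch: the function name ecmult_multi_alloc and the exact sizing are my assumptions based on the description, not the PR's internals.

#include <stdlib.h>

/* Hypothetical sketch: size the working memory from the selected abcd row
 * (A = per-point size, B = fixed size) and allocate it internally instead of
 * drawing it from a caller-provided secp256k1_scratch object.
 * Overflow checks omitted for brevity. */
static void *ecmult_multi_alloc(size_t point_size, size_t fixed_size, size_t n_points, size_t mem_limit) {
    size_t need = point_size * n_points + fixed_size;
    if (need > mem_limit) {
        return NULL; /* the abcd selection should already have excluded this configuration */
    }
    return malloc(need);
}

On allocation failure the batch code would presumably fall back to an allocation-free single-point path, but that is a guess based on the description rather than on the diff.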