loyd commented Dec 29, 2024

Replace the custom i256/u256 implementation with the i256 crate.
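
For context, a rounded fixed-point multiply needs an intermediate roughly twice as wide as the storage type, which is presumably why the 128-bit layout depends on 256-bit integers. The sketch below shows the idea one size down: a 64-bit value with 9 fractional digits, using the built-in i128 as the wide type. `COEF`, `Rounding`, and this `rmul` are hypothetical names for illustration, not the crate's actual API, and the tie handling in `Nearest` is an assumption.

```rust
use std::convert::TryFrom;

/// Scale factor for a hypothetical 64-bit fixed-point value with
/// 9 fractional decimal digits (an assumption, not the crate's type).
const COEF: i128 = 1_000_000_000;

/// Hypothetical rounding modes mirroring the benchmark names.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Rounding {
    Floor,
    Ceil,
    Nearest,
}

/// Multiplies two scaled values and rescales the result.
/// The raw product is computed in i128 so it cannot overflow;
/// `None` is returned if the rescaled result does not fit in i64.
fn rmul(a: i64, b: i64, mode: Rounding) -> Option<i64> {
    let wide = (a as i128) * (b as i128);

    // With a positive divisor, div_euclid rounds toward negative infinity
    // and rem_euclid is always in 0..COEF.
    let q = wide.div_euclid(COEF);
    let r = wide.rem_euclid(COEF);

    let rounded = match mode {
        Rounding::Floor => q,
        Rounding::Ceil => if r != 0 { q + 1 } else { q },
        // Assumption: exact ties round toward positive infinity.
        Rounding::Nearest => if 2 * r >= COEF { q + 1 } else { q },
    };

    // Narrow back to the storage type, reporting overflow as None.
    i64::try_from(rounded).ok()
}
```

With 1.5 and 2.5 encoded as 1_500_000_000 and 2_500_000_000, `rmul` returns `Some(3_750_000_000)` in every mode. The same structure applied to i128 storage would need a 256-bit product, which no built-in Rust integer type provides.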

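Benchmarks on an i9-14900K (new is this branch, master is the baseline; the first number in each column is the time relative to the faster of the two):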

group                                      new                                    master
-----                                      ---                                    ------
F128p18/rdiv (~1e5/~1e4, Ceil)             1.00     10.7±0.15ns        ? ?/sec    6.02     64.3±0.15ns        ? ?/sec
F128p18/rdiv (~1e5/~1e4, Floor)            1.00     10.4±0.15ns        ? ?/sec    6.17     64.2±0.22ns        ? ?/sec
F128p18/rdiv (~1e5/~1e4, Nearest)          1.00     11.2±0.16ns        ? ?/sec    5.75     64.6±0.16ns        ? ?/sec
F128p18/rmul (~1e4, Ceil)                  1.00      7.0±0.04ns        ? ?/sec    7.71     54.4±0.36ns        ? ?/sec
F128p18/rmul (~1e4, Floor)                 1.00      7.0±0.02ns        ? ?/sec    7.70     54.3±0.34ns        ? ?/sec
F128p18/rmul (~1e4, Nearest)               1.00      7.2±0.06ns        ? ?/sec    7.60     54.4±0.19ns        ? ?/sec
F128p18/rsqrt (MAX, Ceil)                  1.00     40.2±0.28ns        ? ?/sec    13.43   540.0±9.43ns        ? ?/sec
F128p18/rsqrt (MAX, Floor)                 1.00     39.3±0.27ns        ? ?/sec    13.35   524.8±1.47ns        ? ?/sec
F128p18/rsqrt (MAX, Nearest)               1.00     41.4±0.38ns        ? ?/sec    13.17  545.4±10.95ns        ? ?/sec
F128p18/rsqrt (adaptive, Ceil)             1.00     50.0±0.42ns        ? ?/sec    10.61   530.8±2.11ns        ? ?/sec
F128p18/rsqrt (adaptive, Floor)            1.00     49.2±0.42ns        ? ?/sec    10.55   519.0±1.83ns        ? ?/sec
F128p18/rsqrt (adaptive, Nearest)          1.00     50.6±0.38ns        ? ?/sec    10.83   547.8±2.17ns        ? ?/sec
F128p18/rsqrt (~1e4, Ceil)                 1.00     40.0±0.24ns        ? ?/sec    5.91    236.3±0.59ns        ? ?/sec
F128p18/rsqrt (~1e4, Floor)                1.00     39.4±0.28ns        ? ?/sec    5.68    223.9±3.09ns        ? ?/sec
F128p18/rsqrt (~1e4, Nearest)              1.00     41.2±0.28ns        ? ?/sec    5.84    240.5±0.77ns        ? ?/sec
F64p9/rsqrt (MAX, Ceil)                    1.00      1.0±0.01ns        ? ?/sec    31.48    31.6±0.09ns        ? ?/sec
F64p9/rsqrt (MAX, Floor)                   1.00      1.0±0.01ns        ? ?/sec    30.12    30.2±0.29ns        ? ?/sec
F64p9/rsqrt (MAX, Nearest)                 1.00      1.0±0.01ns        ? ?/sec    33.35    33.6±0.24ns        ? ?/sec
F64p9/rsqrt (adaptive, Ceil)               1.00      5.4±0.02ns        ? ?/sec    5.69     30.5±0.10ns        ? ?/sec
F64p9/rsqrt (adaptive, Floor)              1.00      4.9±0.01ns        ? ?/sec    5.96     29.1±0.67ns        ? ?/sec
F64p9/rsqrt (adaptive, Nearest)            1.00      5.5±0.02ns        ? ?/sec    5.92     32.6±0.58ns        ? ?/sec
F64p9/rsqrt (~1e4, Ceil)                   1.00      1.0±0.02ns        ? ?/sec    13.55    13.8±0.05ns        ? ?/sec
F64p9/rsqrt (~1e4, Floor)                  1.00      1.0±0.02ns        ? ?/sec    12.47    12.7±0.03ns        ? ?/sec
F64p9/rsqrt (~1e4, Nearest)                1.00      1.0±0.03ns        ? ?/sec    15.04    15.5±0.05ns        ? ?/sec
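
To read the table, the relative figure for master can be approximately recovered from the two printed timings. The helpers below are a hypothetical sketch (the names `parse_mean_ns` and `slowdown` are not from any tool) that parses cells such as `10.7±0.15ns` and divides the means. Recomputed values may differ slightly from the printed ratios, presumably because the benchmark tool works from unrounded measurements.

```rust
/// Extracts the mean, in nanoseconds, from a cell such as "10.7±0.15ns".
/// Returns None for cells that do not match that shape (e.g. "? ?/sec").
fn parse_mean_ns(cell: &str) -> Option<f64> {
    let (mean, rest) = cell.trim().split_once('±')?;
    if !rest.ends_with("ns") {
        return None;
    }
    mean.trim().parse().ok()
}

/// Ratio of the master timing to the new timing for one table row.
/// A value of 6.0 means master takes about six times as long.
fn slowdown(new_cell: &str, master_cell: &str) -> Option<f64> {
    let new_ns = parse_mean_ns(new_cell)?;
    let master_ns = parse_mean_ns(master_cell)?;
    if new_ns <= 0.0 {
        return None;
    }
    Some(master_ns / new_ns)
}
```

For the first row, `slowdown("10.7±0.15ns", "64.3±0.15ns")` gives roughly 6.01, against the printed 6.02.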

loyd commented Dec 29, 2024

Need to update and re-check this after Alexhuszagh/i256-rs#40.

loyd force-pushed the perf/i256 branch 4 times, most recently from cbc690c to ca9ea48, on December 30, 2024 16:52
loyd merged commit 2c4ab7c into master on Dec 30, 2024
5 checks passed
loyd deleted the perf/i256 branch on December 30, 2024 17:06