
Commit 2d785b4

im-0 and geo-ant authored
Make benchmarks measure an actual computation (#1549)
* test: `codegen-units = 1` for benchmarks

During benchmarking I found that leaving `codegen-units` at its default value leads to inconsistent results across recompilations (clean vs. incremental). Sometimes it also causes a significant performance degradation in benchmarks that are unrelated to the code changes.

Also see https://github.com/rust-lang/rust/issues/146497

* chore: Bump criterion to version 0.7

* chore: Remove unused bench_binop_fn!(), bench_unop_na!() and bench_construction!()

* fix: Use iter_batched() and iter_batched_ref() in bench macros

Criterion generates a `Vec` of arguments and passes them through `black_box()` to guarantee that the benchmark closure is never optimized out of the benchmarking loop.

This fixes https://github.com/dimforge/nalgebra/issues/1547 for benchmarks that use `bench_*!()` macros.

* feat: Add macros to benchmark Single x N Values binary ops

This simulates real-world use cases such as multiplying many vectors by a single matrix. There is a ~2x performance difference between the case where both arguments are random on each iteration and the case where one argument is static and the second is random on each iteration:

mat2_mul_v time: [778.33 ps 785.41 ps 797.70 ps]
Found 14 outliers among 100 measurements (14.00%)
  5 (5.00%) low severe
  4 (4.00%) high mild
  5 (5.00%) high severe
mat3_mul_v time: [1.7001 ns 1.7051 ns 1.7111 ns]
Found 11 outliers among 100 measurements (11.00%)
  1 (1.00%) low severe
  1 (1.00%) low mild
  8 (8.00%) high mild
  1 (1.00%) high severe
mat4_mul_v time: [2.6101 ns 2.6223 ns 2.6374 ns]
Found 8 outliers among 100 measurements (8.00%)
  1 (1.00%) low mild
  3 (3.00%) high mild
  4 (4.00%) high severe
single_mat2_mul_v time: [402.65 ps 403.62 ps 404.75 ps]
Found 11 outliers among 100 measurements (11.00%)
  3 (3.00%) low mild
  5 (5.00%) high mild
  3 (3.00%) high severe
single_mat3_mul_v time: [651.30 ps 654.06 ps 657.15 ps]
Found 15 outliers among 100 measurements (15.00%)
  3 (3.00%) low mild
  8 (8.00%) high mild
  4 (4.00%) high severe
single_mat4_mul_v time: [1.0628 ns 1.0645 ns 1.0666 ns]
Found 8 outliers among 100 measurements (8.00%)
  1 (1.00%) low mild
  5 (5.00%) high mild
  2 (2.00%) high severe
mat2_tr_mul_v time: [719.81 ps 721.99 ps 724.59 ps]
Found 8 outliers among 100 measurements (8.00%)
  3 (3.00%) low mild
  5 (5.00%) high mild
mat3_tr_mul_v time: [1.6685 ns 1.6758 ns 1.6841 ns]
Found 13 outliers among 100 measurements (13.00%)
  4 (4.00%) low severe
  1 (1.00%) low mild
  4 (4.00%) high mild
  4 (4.00%) high severe
mat4_tr_mul_v time: [2.6739 ns 2.6897 ns 2.7080 ns]
Found 16 outliers among 100 measurements (16.00%)
  2 (2.00%) low severe
  2 (2.00%) low mild
  4 (4.00%) high mild
  8 (8.00%) high severe
single_mat2_tr_mul_v time: [353.36 ps 354.56 ps 356.03 ps]
Found 6 outliers among 100 measurements (6.00%)
  2 (2.00%) low mild
  1 (1.00%) high mild
  3 (3.00%) high severe
single_mat3_tr_mul_v time: [779.82 ps 782.84 ps 786.37 ps]
Found 10 outliers among 100 measurements (10.00%)
  1 (1.00%) low severe
  1 (1.00%) low mild
  6 (6.00%) high mild
  2 (2.00%) high severe
single_mat4_tr_mul_v time: [1.1918 ns 1.1946 ns 1.1977 ns]
Found 6 outliers among 100 measurements (6.00%)
  3 (3.00%) low mild
  1 (1.00%) high mild
  2 (2.00%) high severe
unit_quaternion_mul_v time: [1.5002 ns 1.5088 ns 1.5183 ns]
  change: [−0.0578% +0.3775% +0.8498%] (p = 0.10 > 0.05)
  No change in performance detected.
Found 6 outliers among 100 measurements (6.00%)
  3 (3.00%) high mild
  3 (3.00%) high severe
single_unit_quaternion_mul_v time: [1.0489 ns 1.0531 ns 1.0584 ns]
Found 14 outliers among 100 measurements (14.00%)
  2 (2.00%) low severe
  1 (1.00%) low mild
  4 (4.00%) high mild
  7 (7.00%) high severe

* chore: Uncomment quaternion benchmarks

I do not know why those benchmarks were commented out.

* fix: Use iter_batched() and iter_batched_ref() for the remaining benchmarks

The bulk of the changes was done by Claude Sonnet 4.
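
To make the batching idea from the bullets above concrete, here is a minimal, std-only sketch. It is not Criterion's implementation and not the code of this commit: `time_batched` and the array-based `mat2_mul_v` are hypothetical stand-ins, and the tiny generator replaces a real RNG. It only illustrates the three ingredients the message relies on: inputs are built before the clock starts, they pass through `black_box()`, and outputs are dropped after the clock stops.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Hypothetical helper (not Criterion's API): times `routine` over a batch
/// of inputs produced by `setup`. Returns the elapsed time and the number
/// of outputs produced.
fn time_batched<I, O>(
    batch_size: usize,
    mut setup: impl FnMut() -> I,
    mut routine: impl FnMut(I) -> O,
) -> (Duration, usize) {
    // Build all inputs up front and hide them from the optimizer.
    let inputs: Vec<I> = black_box((0..batch_size).map(|_| setup()).collect());
    let mut outputs: Vec<O> = Vec::with_capacity(batch_size);

    // Only the routine itself runs inside the timed region.
    let start = Instant::now();
    outputs.extend(inputs.into_iter().map(&mut routine));
    let elapsed = start.elapsed();

    // Keep the results observable; they are dropped after timing has ended.
    let produced = black_box(outputs).len();
    (elapsed, produced)
}

/// 2x2 matrix times 2-vector on plain arrays (stand-in for nalgebra types).
fn mat2_mul_v(m: [[f64; 2]; 2], v: [f64; 2]) -> [f64; 2] {
    [
        m[0][0] * v[0] + m[0][1] * v[1],
        m[1][0] * v[0] + m[1][1] * v[1],
    ]
}

fn main() {
    // Cheap deterministic value generator standing in for a real RNG.
    let mut seed = 1u64;
    let mut next = move || {
        seed = seed
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        (seed >> 11) as f64 / (1u64 << 53) as f64
    };

    // Both operands are fresh on every iteration.
    let (t_pair, n_pair) = time_batched(
        1000,
        || ([[next(), next()], [next(), next()]], [next(), next()]),
        |(m, v)| mat2_mul_v(m, v),
    );

    // "Single x N values": one fixed matrix, a fresh vector per iteration.
    let m = black_box([[next(), next()], [next(), next()]]);
    let (t_single, n_single) = time_batched(1000, || [next(), next()], |v| mat2_mul_v(m, v));

    println!("pairwise: {n_pair} iterations in {t_pair:?}");
    println!("single x N: {n_single} iterations in {t_single:?}");
}
```

A single wall-clock measurement like this is far too noisy to reproduce the numbers quoted in this message; the statistical sampling is what Criterion adds on top of the same structure.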
Additionally, I moved `DVector` allocations outside of the benchmark and added everything that is allocated but not consumed to the return tuple of the benchmark closure, so that the implicit drop/free is not included in the measured time.

This fixes https://github.com/dimforge/nalgebra/issues/1547 for the remaining benchmarks.

Benchmark results before vs. after all changes:

mat2_mul_m time: [1.1043 ns 1.1058 ns 1.1077 ns]
  change: [+49.306% +49.651% +50.045%] (p = 0.00 < 0.05)
  Performance has regressed.
Found 12 outliers among 100 measurements (12.00%)
  4 (4.00%) low severe
  2 (2.00%) high mild
  6 (6.00%) high severe
mat3_mul_m time: [3.1885 ns 3.1945 ns 3.2038 ns]
  change: [+102.62% +103.63% +104.86%] (p = 0.00 < 0.05)
  Performance has regressed.
Found 7 outliers among 100 measurements (7.00%)
  2 (2.00%) low mild
  2 (2.00%) high mild
  3 (3.00%) high severe
mat4_mul_m time: [6.7759 ns 6.7840 ns 6.7929 ns]
  change: [+130.65% +131.50% +132.59%] (p = 0.00 < 0.05)
  Performance has regressed.
Found 11 outliers among 100 measurements (11.00%)
  4 (4.00%) low severe
  3 (3.00%) high mild
  4 (4.00%) high severe
mat2_tr_mul_m time: [1.2882 ns 1.2901 ns 1.2926 ns]
  change: [+75.005% +75.472% +75.928%] (p = 0.00 < 0.05)
  Performance has regressed.
Found 7 outliers among 100 measurements (7.00%)
  3 (3.00%) low severe
  1 (1.00%) high mild
  3 (3.00%) high severe
mat3_tr_mul_m time: [3.1688 ns 3.1725 ns 3.1770 ns]
  change: [+101.61% +102.10% +102.66%] (p = 0.00 < 0.05)
  Performance has regressed.
Found 10 outliers among 100 measurements (10.00%)
  2 (2.00%) low severe
  4 (4.00%) high mild
  4 (4.00%) high severe
mat4_tr_mul_m time: [6.5406 ns 6.5453 ns 6.5508 ns]
  change: [+121.95% +122.66% +123.42%] (p = 0.00 < 0.05)
  Performance has regressed.
Found 15 outliers among 100 measurements (15.00%)
  3 (3.00%) low severe
  1 (1.00%) low mild
  5 (5.00%) high mild
  6 (6.00%) high severe
mat2_add_m time: [644.68 ps 645.88 ps 647.24 ps]
  change: [−13.049% −12.530% −11.972%] (p = 0.00 < 0.05)
  Performance has improved.
Found 8 outliers among 100 measurements (8.00%) 4 (4.00%) low severe 1 (1.00%) low mild 1 (1.00%) high mild 2 (2.00%) high severe mat3_add_m time: [1.3543 ns 1.3572 ns 1.3607 ns] change: [−14.707% −13.705% −12.403%] (p = 0.00 < 0.05) Performance has improved. Found 15 outliers among 100 measurements (15.00%) 6 (6.00%) low severe 5 (5.00%) high mild 4 (4.00%) high severe mat4_add_m time: [2.3987 ns 2.4015 ns 2.4044 ns] change: [−20.676% −19.615% −18.453%] (p = 0.00 < 0.05) Performance has improved. Found 14 outliers among 100 measurements (14.00%) 6 (6.00%) low severe 5 (5.00%) high mild 3 (3.00%) high severe mat2_sub_m time: [637.47 ps 638.88 ps 640.62 ps] change: [−13.604% −13.020% −12.333%] (p = 0.00 < 0.05) Performance has improved. Found 13 outliers among 100 measurements (13.00%) 4 (4.00%) low severe 2 (2.00%) low mild 2 (2.00%) high mild 5 (5.00%) high severe mat3_sub_m time: [1.3531 ns 1.3546 ns 1.3562 ns] change: [−15.139% −14.610% −14.084%] (p = 0.00 < 0.05) Performance has improved. Found 16 outliers among 100 measurements (16.00%) 5 (5.00%) low severe 1 (1.00%) low mild 6 (6.00%) high mild 4 (4.00%) high severe mat4_sub_m time: [2.3972 ns 2.3996 ns 2.4021 ns] change: [−20.412% −19.249% −18.330%] (p = 0.00 < 0.05) Performance has improved. Found 10 outliers among 100 measurements (10.00%) 6 (6.00%) low severe 1 (1.00%) high mild 3 (3.00%) high severe mat2_mul_v time: [774.43 ps 775.48 ps 776.73 ps] change: [+144.90% +145.51% +146.12%] (p = 0.00 < 0.05) Performance has regressed. Found 10 outliers among 100 measurements (10.00%) 2 (2.00%) low severe 5 (5.00%) high mild 3 (3.00%) high severe mat3_mul_v time: [1.6843 ns 1.6858 ns 1.6874 ns] change: [+284.57% +285.82% +287.43%] (p = 0.00 < 0.05) Performance has regressed. Found 7 outliers among 100 measurements (7.00%) 3 (3.00%) low severe 1 (1.00%) high mild 3 (3.00%) high severe mat4_mul_v time: [2.6029 ns 2.6196 ns 2.6485 ns] change: [+255.34% +257.62% +261.68%] (p = 0.00 < 0.05) Performance has regressed. 
Found 10 outliers among 100 measurements (10.00%) 2 (2.00%) low severe 1 (1.00%) low mild 2 (2.00%) high mild 5 (5.00%) high severe single_mat2_mul_v time: [392.29 ps 393.45 ps 394.87 ps] Found 8 outliers among 100 measurements (8.00%) 6 (6.00%) high mild 2 (2.00%) high severe single_mat3_mul_v time: [650.16 ps 651.47 ps 653.07 ps] Found 9 outliers among 100 measurements (9.00%) 2 (2.00%) low severe 3 (3.00%) high mild 4 (4.00%) high severe single_mat4_mul_v time: [1.0665 ns 1.0690 ns 1.0722 ns] Found 10 outliers among 100 measurements (10.00%) 2 (2.00%) low mild 4 (4.00%) high mild 4 (4.00%) high severe mat2_tr_mul_v time: [719.95 ps 720.92 ps 722.16 ps] change: [+127.86% +128.34% +128.98%] (p = 0.00 < 0.05) Performance has regressed. Found 14 outliers among 100 measurements (14.00%) 1 (1.00%) low severe 2 (2.00%) low mild 7 (7.00%) high mild 4 (4.00%) high severe mat3_tr_mul_v time: [1.6551 ns 1.6564 ns 1.6577 ns] change: [+277.57% +278.32% +279.16%] (p = 0.00 < 0.05) Performance has regressed. Found 10 outliers among 100 measurements (10.00%) 2 (2.00%) low severe 1 (1.00%) low mild 5 (5.00%) high mild 2 (2.00%) high severe mat4_tr_mul_v time: [2.6477 ns 2.6546 ns 2.6666 ns] change: [+259.47% +260.55% +261.67%] (p = 0.00 < 0.05) Performance has regressed. 
Found 9 outliers among 100 measurements (9.00%) 3 (3.00%) low severe 3 (3.00%) high mild 3 (3.00%) high severe single_mat2_tr_mul_v time: [353.60 ps 355.50 ps 358.48 ps] Found 10 outliers among 100 measurements (10.00%) 3 (3.00%) low mild 4 (4.00%) high mild 3 (3.00%) high severe single_mat3_tr_mul_v time: [778.13 ps 779.43 ps 781.25 ps] Found 10 outliers among 100 measurements (10.00%) 2 (2.00%) low severe 3 (3.00%) high mild 5 (5.00%) high severe single_mat4_tr_mul_v time: [1.1887 ns 1.1906 ns 1.1930 ns] Found 8 outliers among 100 measurements (8.00%) 3 (3.00%) low mild 2 (2.00%) high mild 3 (3.00%) high severe mat2_mul_s time: [774.44 ps 775.33 ps 776.37 ps] change: [+6.0947% +6.3308% +6.5936%] (p = 0.00 < 0.05) Performance has regressed. Found 12 outliers among 100 measurements (12.00%) 2 (2.00%) low severe 2 (2.00%) low mild 4 (4.00%) high mild 4 (4.00%) high severe mat3_mul_s time: [962.59 ps 964.98 ps 967.43 ps] change: [−38.097% −37.694% −37.145%] (p = 0.00 < 0.05) Performance has improved. Found 10 outliers among 100 measurements (10.00%) 1 (1.00%) low severe 3 (3.00%) low mild 2 (2.00%) high mild 4 (4.00%) high severe mat4_mul_s time: [1.6589 ns 1.6640 ns 1.6684 ns] change: [−43.668% −43.130% −42.518%] (p = 0.00 < 0.05) Performance has improved. Found 18 outliers among 100 measurements (18.00%) 8 (8.00%) low severe 3 (3.00%) low mild 1 (1.00%) high mild 6 (6.00%) high severe mat2_div_s time: [803.09 ps 804.70 ps 806.56 ps] change: [+10.272% +10.596% +10.960%] (p = 0.00 < 0.05) Performance has regressed. Found 10 outliers among 100 measurements (10.00%) 3 (3.00%) low severe 1 (1.00%) low mild 3 (3.00%) high mild 3 (3.00%) high severe mat3_div_s time: [2.4929 ns 2.4947 ns 2.4967 ns] change: [+58.793% +59.185% +59.709%] (p = 0.00 < 0.05) Performance has regressed. 
Found 12 outliers among 100 measurements (12.00%) 3 (3.00%) low severe 5 (5.00%) high mild 4 (4.00%) high severe mat4_div_s time: [5.1650 ns 5.1688 ns 5.1735 ns] change: [+76.816% +77.215% +77.629%] (p = 0.00 < 0.05) Performance has regressed. Found 9 outliers among 100 measurements (9.00%) 2 (2.00%) low severe 1 (1.00%) low mild 4 (4.00%) high mild 2 (2.00%) high severe mat2_inv time: [1.1514 ns 1.1523 ns 1.1533 ns] change: [−41.682% −41.556% −41.439%] (p = 0.00 < 0.05) Performance has improved. Found 11 outliers among 100 measurements (11.00%) 3 (3.00%) low severe 1 (1.00%) low mild 5 (5.00%) high mild 2 (2.00%) high severe mat3_inv time: [3.3641 ns 3.3707 ns 3.3826 ns] change: [−37.473% −37.358% −37.214%] (p = 0.00 < 0.05) Performance has improved. Found 12 outliers among 100 measurements (12.00%) 1 (1.00%) low severe 1 (1.00%) low mild 5 (5.00%) high mild 5 (5.00%) high severe mat4_inv time: [25.970 ns 26.006 ns 26.062 ns] change: [−9.0865% −8.9013% −8.6986%] (p = 0.00 < 0.05) Performance has improved. Found 14 outliers among 100 measurements (14.00%) 3 (3.00%) low severe 2 (2.00%) low mild 3 (3.00%) high mild 6 (6.00%) high severe mat2_transpose time: [409.94 ps 410.77 ps 411.75 ps] change: [−62.889% −62.624% −62.331%] (p = 0.00 < 0.05) Performance has improved. Found 17 outliers among 100 measurements (17.00%) 4 (4.00%) low severe 2 (2.00%) low mild 4 (4.00%) high mild 7 (7.00%) high severe mat3_transpose time: [947.42 ps 953.20 ps 961.97 ps] change: [−61.273% −60.195% −58.616%] (p = 0.00 < 0.05) Performance has improved. Found 11 outliers among 100 measurements (11.00%) 1 (1.00%) low mild 7 (7.00%) high mild 3 (3.00%) high severe mat4_transpose time: [1.6510 ns 1.6551 ns 1.6612 ns] change: [−65.877% −65.592% −65.225%] (p = 0.00 < 0.05) Performance has improved. 
Found 13 outliers among 100 measurements (13.00%) 5 (5.00%) low severe 1 (1.00%) low mild 2 (2.00%) high mild 5 (5.00%) high severe mat_div_scalar time: [480.25 µs 480.55 µs 480.99 µs] change: [−22.235% −22.169% −22.095%] (p = 0.00 < 0.05) Performance has improved. Found 6 outliers among 100 measurements (6.00%) 3 (3.00%) high mild 3 (3.00%) high severe mat100_add_mat100 time: [3.0426 µs 3.0910 µs 3.1351 µs] change: [+81.145% +84.392% +88.112%] (p = 0.00 < 0.05) Performance has regressed. Found 13 outliers among 100 measurements (13.00%) 2 (2.00%) low severe 3 (3.00%) low mild 7 (7.00%) high mild 1 (1.00%) high severe mat4_mul_mat4 time: [36.836 ns 36.859 ns 36.886 ns] change: [+24.966% +25.568% +26.171%] (p = 0.00 < 0.05) Performance has regressed. Found 13 outliers among 100 measurements (13.00%) 7 (7.00%) low severe 4 (4.00%) high mild 2 (2.00%) high severe mat5_mul_mat5 time: [56.715 ns 56.876 ns 57.015 ns] change: [+10.239% +10.666% +11.091%] (p = 0.00 < 0.05) Performance has regressed. Found 8 outliers among 100 measurements (8.00%) 1 (1.00%) low severe 1 (1.00%) low mild 6 (6.00%) high mild mat6_mul_mat6 time: [83.817 ns 83.999 ns 84.156 ns] change: [+10.675% +10.890% +11.065%] (p = 0.00 < 0.05) Performance has regressed. Found 1 outliers among 100 measurements (1.00%) 1 (1.00%) low mild mat7_mul_mat7 time: [93.211 ns 93.386 ns 93.534 ns] change: [+10.654% +10.892% +11.129%] (p = 0.00 < 0.05) Performance has regressed. Found 3 outliers among 100 measurements (3.00%) 1 (1.00%) low severe 2 (2.00%) low mild mat8_mul_mat8 time: [88.919 ns 89.410 ns 89.884 ns] change: [+22.808% +23.376% +23.888%] (p = 0.00 < 0.05) Performance has regressed. Found 2 outliers among 100 measurements (2.00%) 1 (1.00%) low mild 1 (1.00%) high mild mat9_mul_mat9 time: [207.12 ns 209.04 ns 211.17 ns] change: [+14.053% +14.646% +15.258%] (p = 0.00 < 0.05) Performance has regressed. 
Found 10 outliers among 100 measurements (10.00%) 9 (9.00%) low mild 1 (1.00%) high mild mat10_mul_mat10 time: [236.75 ns 237.11 ns 237.47 ns] change: [+20.055% +20.366% +20.651%] (p = 0.00 < 0.05) Performance has regressed. Found 13 outliers among 100 measurements (13.00%) 5 (5.00%) low severe 7 (7.00%) low mild 1 (1.00%) high mild mat10_mul_mat10_static time: [116.68 ns 117.15 ns 117.62 ns] change: [+11.160% +11.617% +12.049%] (p = 0.00 < 0.05) Performance has regressed. mat100_mul_mat100 time: [40.188 µs 40.327 µs 40.459 µs] change: [+3.2490% +3.4765% +3.7130%] (p = 0.00 < 0.05) Performance has regressed. Found 15 outliers among 100 measurements (15.00%) 7 (7.00%) high mild 8 (8.00%) high severe mat500_mul_mat500 time: [4.3909 ms 4.3944 ms 4.3978 ms] change: [+0.8556% +0.9519% +1.0448%] (p = 0.00 < 0.05) Change within noise threshold. Found 9 outliers among 100 measurements (9.00%) 6 (6.00%) low severe 2 (2.00%) high mild 1 (1.00%) high severe iter time: [840.01 µs 840.39 µs 840.81 µs] change: [+10.527% +10.726% +10.915%] (p = 0.00 < 0.05) Performance has regressed. Found 13 outliers among 100 measurements (13.00%) 2 (2.00%) high mild 11 (11.00%) high severe iter_rev time: [210.14 µs 211.10 µs 212.84 µs] change: [+0.2455% +0.7119% +1.7846%] (p = 0.02 < 0.05) Change within noise threshold. Found 8 outliers among 100 measurements (8.00%) 2 (2.00%) high mild 6 (6.00%) high severe copy_from time: [199.77 µs 200.80 µs 202.55 µs] change: [+41.195% +41.962% +43.287%] (p = 0.00 < 0.05) Performance has regressed. Found 9 outliers among 100 measurements (9.00%) 8 (8.00%) low mild 1 (1.00%) high severe axpy time: [31.301 µs 33.301 µs 34.957 µs] change: [+40.726% +52.001% +63.112%] (p = 0.00 < 0.05) Performance has regressed. tr_mul_to time: [126.46 µs 127.12 µs 128.09 µs] change: [−4.0124% −3.5145% −2.7708%] (p = 0.00 < 0.05) Performance has improved. 
Found 2 outliers among 100 measurements (2.00%) 2 (2.00%) high severe mat_mul_mat time: [39.252 µs 39.443 µs 39.626 µs] change: [−0.7084% −0.3800% −0.0130%] (p = 0.02 < 0.05) Change within noise threshold. Found 11 outliers among 100 measurements (11.00%) 1 (1.00%) low mild 8 (8.00%) high mild 2 (2.00%) high severe mat100_from_fn time: [6.8398 µs 6.8418 µs 6.8446 µs] change: [+519.35% +522.43% +524.76%] (p = 0.00 < 0.05) Performance has regressed. Found 13 outliers among 100 measurements (13.00%) 4 (4.00%) high mild 9 (9.00%) high severe mat500_from_fn time: [172.11 µs 172.14 µs 172.18 µs] change: [+498.70% +499.32% +499.93%] (p = 0.00 < 0.05) Performance has regressed. Found 13 outliers among 100 measurements (13.00%) 1 (1.00%) low mild 5 (5.00%) high mild 7 (7.00%) high severe vec2_add_v_f32 time: [303.98 ps 304.76 ps 305.65 ps] change: [−5.1499% −4.3536% −3.5996%] (p = 0.00 < 0.05) Performance has improved. Found 15 outliers among 100 measurements (15.00%) 4 (4.00%) low severe 5 (5.00%) high mild 6 (6.00%) high severe vec3_add_v_f32 time: [586.36 ps 587.93 ps 589.92 ps] change: [+34.275% +34.886% +35.631%] (p = 0.00 < 0.05) Performance has regressed. Found 12 outliers among 100 measurements (12.00%) 1 (1.00%) low mild 5 (5.00%) high mild 6 (6.00%) high severe vec4_add_v_f32 time: [603.45 ps 604.44 ps 605.59 ps] change: [−18.949% −18.215% −17.623%] (p = 0.00 < 0.05) Performance has improved. Found 14 outliers among 100 measurements (14.00%) 5 (5.00%) low severe 2 (2.00%) low mild 2 (2.00%) high mild 5 (5.00%) high severe vec2_add_v_f64 time: [602.08 ps 602.83 ps 603.64 ps] change: [+89.139% +90.573% +91.808%] (p = 0.00 < 0.05) Performance has regressed. Found 13 outliers among 100 measurements (13.00%) 4 (4.00%) low severe 1 (1.00%) low mild 3 (3.00%) high mild 5 (5.00%) high severe vec3_add_v_f64 time: [910.94 ps 912.60 ps 914.56 ps] change: [+107.10% +108.18% +109.41%] (p = 0.00 < 0.05) Performance has regressed. 
Found 12 outliers among 100 measurements (12.00%) 3 (3.00%) low severe 6 (6.00%) high mild 3 (3.00%) high severe vec4_add_v_f64 time: [1.1894 ns 1.1933 ns 1.1963 ns] change: [+82.607% +85.023% +86.911%] (p = 0.00 < 0.05) Performance has regressed. Found 13 outliers among 100 measurements (13.00%) 9 (9.00%) low severe 2 (2.00%) low mild 2 (2.00%) high severe vec2_sub_v time: [303.45 ps 304.42 ps 305.37 ps] change: [−5.3598% −4.4578% −3.6738%] (p = 0.00 < 0.05) Performance has improved. Found 15 outliers among 100 measurements (15.00%) 8 (8.00%) low severe 1 (1.00%) low mild 3 (3.00%) high mild 3 (3.00%) high severe vec3_sub_v time: [672.95 ps 674.82 ps 676.51 ps] change: [+51.463% +52.336% +53.346%] (p = 0.00 < 0.05) Performance has regressed. Found 4 outliers among 100 measurements (4.00%) 1 (1.00%) low mild 2 (2.00%) high mild 1 (1.00%) high severe vec4_sub_v time: [602.84 ps 604.65 ps 607.70 ps] change: [−19.744% −18.754% −17.881%] (p = 0.00 < 0.05) Performance has improved. Found 13 outliers among 100 measurements (13.00%) 6 (6.00%) low severe 1 (1.00%) low mild 2 (2.00%) high mild 4 (4.00%) high severe vec2_mul_s time: [666.49 ps 667.29 ps 668.31 ps] change: [+111.37% +111.81% +112.32%] (p = 0.00 < 0.05) Performance has regressed. Found 16 outliers among 100 measurements (16.00%) 4 (4.00%) low severe 6 (6.00%) high mild 6 (6.00%) high severe vec3_mul_s time: [511.42 ps 513.44 ps 515.86 ps] change: [+15.556% +16.273% +17.049%] (p = 0.00 < 0.05) Performance has regressed. Found 6 outliers among 100 measurements (6.00%) 5 (5.00%) high mild 1 (1.00%) high severe vec4_mul_s time: [774.13 ps 775.22 ps 776.52 ps] change: [+5.1602% +5.5545% +6.0225%] (p = 0.00 < 0.05) Performance has regressed. Found 13 outliers among 100 measurements (13.00%) 1 (1.00%) low severe 2 (2.00%) low mild 3 (3.00%) high mild 7 (7.00%) high severe vec2_div_s time: [1.3658 ns 1.3694 ns 1.3726 ns] change: [+328.67% +329.83% +331.09%] (p = 0.00 < 0.05) Performance has regressed. 
Found 1 outliers among 100 measurements (1.00%) 1 (1.00%) high severe vec3_div_s time: [607.73 ps 608.63 ps 609.66 ps] change: [+37.642% +38.017% +38.440%] (p = 0.00 < 0.05) Performance has regressed. Found 16 outliers among 100 measurements (16.00%) 2 (2.00%) low severe 8 (8.00%) high mild 6 (6.00%) high severe vec4_div_s time: [802.59 ps 803.62 ps 804.82 ps] change: [+8.9451% +9.3240% +9.7149%] (p = 0.00 < 0.05) Performance has regressed. Found 11 outliers among 100 measurements (11.00%) 3 (3.00%) low severe 6 (6.00%) high mild 2 (2.00%) high severe vec2_dot_f32 time: [461.20 ps 461.73 ps 462.30 ps] change: [+117.88% +119.27% +120.79%] (p = 0.00 < 0.05) Performance has regressed. Found 16 outliers among 100 measurements (16.00%) 2 (2.00%) low severe 2 (2.00%) low mild 3 (3.00%) high mild 9 (9.00%) high severe vec3_dot_f32 time: [688.24 ps 689.05 ps 689.95 ps] change: [+225.49% +227.19% +229.16%] (p = 0.00 < 0.05) Performance has regressed. Found 10 outliers among 100 measurements (10.00%) 1 (1.00%) low mild 4 (4.00%) high mild 5 (5.00%) high severe vec4_dot_f32 time: [917.20 ps 921.23 ps 928.57 ps] change: [+338.59% +341.30% +344.17%] (p = 0.00 < 0.05) Performance has regressed. Found 13 outliers among 100 measurements (13.00%) 8 (8.00%) high mild 5 (5.00%) high severe vec2_dot_f64 time: [596.11 ps 597.51 ps 598.79 ps] change: [+177.79% +179.60% +182.13%] (p = 0.00 < 0.05) Performance has regressed. Found 3 outliers among 100 measurements (3.00%) 2 (2.00%) high mild 1 (1.00%) high severe vec3_dot_f64 time: [749.32 ps 751.02 ps 752.81 ps] change: [+253.48% +257.12% +262.11%] (p = 0.00 < 0.05) Performance has regressed. Found 10 outliers among 100 measurements (10.00%) 3 (3.00%) high mild 7 (7.00%) high severe vec4_dot_f64 time: [1.0145 ns 1.0185 ns 1.0230 ns] change: [+376.34% +379.47% +383.46%] (p = 0.00 < 0.05) Performance has regressed. 
Found 5 outliers among 100 measurements (5.00%) 3 (3.00%) high mild 2 (2.00%) high severe vec3_cross time: [971.01 ps 971.87 ps 972.73 ps] change: [+122.34% +122.74% +123.17%] (p = 0.00 < 0.05) Performance has regressed. Found 10 outliers among 100 measurements (10.00%) 2 (2.00%) low severe 1 (1.00%) low mild 3 (3.00%) high mild 4 (4.00%) high severe vec2_norm time: [1.0612 ns 1.0623 ns 1.0637 ns] change: [−0.0722% +0.0499% +0.1765%] (p = 0.44 > 0.05) No change in performance detected. Found 6 outliers among 100 measurements (6.00%) 4 (4.00%) low mild 2 (2.00%) high severe vec3_norm time: [1.0649 ns 1.0665 ns 1.0694 ns] change: [−4.3787% −4.1856% −3.8679%] (p = 0.00 < 0.05) Performance has improved. Found 4 outliers among 100 measurements (4.00%) 2 (2.00%) high mild 2 (2.00%) high severe vec4_norm time: [1.0733 ns 1.0739 ns 1.0746 ns] change: [−4.5616% −3.9738% −2.9157%] (p = 0.00 < 0.05) Performance has improved. Found 19 outliers among 100 measurements (19.00%) 2 (2.00%) low severe 7 (7.00%) low mild 5 (5.00%) high mild 5 (5.00%) high severe vec2_normalize time: [2.5310 ns 2.5326 ns 2.5345 ns] change: [+3.5769% +3.6696% +3.7678%] (p = 0.00 < 0.05) Performance has regressed. Found 2 outliers among 100 measurements (2.00%) 1 (1.00%) high mild 1 (1.00%) high severe vec3_normalize time: [2.5389 ns 2.5409 ns 2.5424 ns] change: [+1.1411% +1.2860% +1.4910%] (p = 0.00 < 0.05) Performance has regressed. Found 2 outliers among 100 measurements (2.00%) 1 (1.00%) high mild 1 (1.00%) high severe vec4_normalize time: [1.8154 ns 1.8164 ns 1.8173 ns] change: [−1.1191% −0.9926% −0.8485%] (p = 0.00 < 0.05) Change within noise threshold. Found 8 outliers among 100 measurements (8.00%) 3 (3.00%) low severe 1 (1.00%) low mild 1 (1.00%) high mild 3 (3.00%) high severe vec10000_dot_f64 time: [2.0296 µs 2.0337 µs 2.0383 µs] change: [+71.107% +72.619% +74.228%] (p = 0.00 < 0.05) Performance has regressed. 
Found 11 outliers among 100 measurements (11.00%) 4 (4.00%) low severe 3 (3.00%) high mild 4 (4.00%) high severe vec10000_dot_f32 time: [1.1891 µs 1.1926 µs 1.1962 µs] change: [+6.3585% +7.1059% +7.9357%] (p = 0.00 < 0.05) Performance has regressed. Found 12 outliers among 100 measurements (12.00%) 1 (1.00%) low severe 1 (1.00%) low mild 4 (4.00%) high mild 6 (6.00%) high severe vec10000_axpy_f64 time: [2.0702 µs 2.0739 µs 2.0777 µs] change: [+39.373% +40.227% +41.210%] (p = 0.00 < 0.05) Performance has regressed. Found 10 outliers among 100 measurements (10.00%) 3 (3.00%) low severe 1 (1.00%) low mild 4 (4.00%) high mild 2 (2.00%) high severe vec10000_axpy_beta_f64 time: [2.0914 µs 2.0962 µs 2.1012 µs] change: [+31.958% +32.843% +33.467%] (p = 0.00 < 0.05) Performance has regressed. Found 11 outliers among 100 measurements (11.00%) 4 (4.00%) low severe 5 (5.00%) high mild 2 (2.00%) high severe vec10000_axpy_f64_slice time: [2.0272 µs 2.0303 µs 2.0335 µs] change: [+35.880% +36.621% +37.307%] (p = 0.00 < 0.05) Performance has regressed. Found 6 outliers among 100 measurements (6.00%) 3 (3.00%) low severe 2 (2.00%) high mild 1 (1.00%) high severe vec10000_axpy_f64_static time: [13.917 µs 13.965 µs 14.005 µs] change: [+859.61% +869.73% +879.35%] (p = 0.00 < 0.05) Performance has regressed. Found 6 outliers among 100 measurements (6.00%) 1 (1.00%) low severe 3 (3.00%) high mild 2 (2.00%) high severe vec10000_axpy_f32 time: [1.0402 µs 1.0421 µs 1.0437 µs] change: [+38.710% +39.603% +40.363%] (p = 0.00 < 0.05) Performance has regressed. Found 9 outliers among 100 measurements (9.00%) 5 (5.00%) low severe 1 (1.00%) low mild 2 (2.00%) high mild 1 (1.00%) high severe vec10000_axpy_beta_f32 time: [1.0329 µs 1.0346 µs 1.0364 µs] change: [+30.705% +31.490% +32.040%] (p = 0.00 < 0.05) Performance has regressed. 
Found 8 outliers among 100 measurements (8.00%) 4 (4.00%) low severe 1 (1.00%) low mild 2 (2.00%) high mild 1 (1.00%) high severe quaternion_add_q time: [642.58 ps 650.39 ps 662.45 ps] change: [−11.788% −10.934% −9.9463%] (p = 0.00 < 0.05) Performance has improved. Found 14 outliers among 100 measurements (14.00%) 2 (2.00%) low severe 2 (2.00%) low mild 4 (4.00%) high mild 6 (6.00%) high severe quaternion_sub_q time: [641.16 ps 643.22 ps 645.88 ps] change: [−12.654% −11.822% −10.943%] (p = 0.00 < 0.05) Performance has improved. Found 15 outliers among 100 measurements (15.00%) 5 (5.00%) low severe 1 (1.00%) low mild 5 (5.00%) high mild 4 (4.00%) high severe quaternion_mul_q time: [1.4252 ns 1.4271 ns 1.4294 ns] change: [+94.545% +95.022% +95.499%] (p = 0.00 < 0.05) Performance has regressed. Found 12 outliers among 100 measurements (12.00%) 1 (1.00%) low severe 2 (2.00%) low mild 4 (4.00%) high mild 5 (5.00%) high severe unit_quaternion_mul_v time: [1.4859 ns 1.4874 ns 1.4890 ns] change: [+242.77% +243.56% +244.31%] (p = 0.00 < 0.05) Performance has regressed. Found 3 outliers among 100 measurements (3.00%) 3 (3.00%) high mild single_unit_quaternion_mul_v time: [1.0422 ns 1.0457 ns 1.0504 ns] Found 9 outliers among 100 measurements (9.00%) 1 (1.00%) low severe 4 (4.00%) high mild 4 (4.00%) high severe quaternion_mul_s time: [771.17 ps 772.18 ps 773.37 ps] change: [+6.1278% +6.4276% +6.7583%] (p = 0.00 < 0.05) Performance has regressed. Found 9 outliers among 100 measurements (9.00%) 3 (3.00%) low mild 3 (3.00%) high mild 3 (3.00%) high severe quaternion_div_s time: [798.54 ps 799.82 ps 801.43 ps] change: [+9.2123% +9.7287% +10.338%] (p = 0.00 < 0.05) Performance has regressed. Found 13 outliers among 100 measurements (13.00%) 2 (2.00%) low severe 2 (2.00%) low mild 4 (4.00%) high mild 5 (5.00%) high severe quaternion_inv time: [1.2401 ns 1.2408 ns 1.2417 ns] change: [−43.660% −43.521% −43.317%] (p = 0.00 < 0.05) Performance has improved. 
Found 13 outliers among 100 measurements (13.00%) 2 (2.00%) low severe 5 (5.00%) high mild 6 (6.00%) high severe unit_quaternion_inv time: [596.01 ps 598.93 ps 602.66 ps] change: [−49.707% −49.184% −48.445%] (p = 0.00 < 0.05) Performance has improved. Found 15 outliers among 100 measurements (15.00%) 6 (6.00%) high mild 9 (9.00%) high severe quaternion_conjugate time: [604.36 ps 608.60 ps 613.48 ps] Found 12 outliers among 100 measurements (12.00%) 3 (3.00%) high mild 9 (9.00%) high severe quaternion_normalize time: [1.8268 ns 1.8274 ns 1.8281 ns] Found 18 outliers among 100 measurements (18.00%) 4 (4.00%) low severe 4 (4.00%) low mild 7 (7.00%) high mild 3 (3.00%) high severe bidiagonalize_100x100 time: [265.91 µs 266.00 µs 266.11 µs] change: [+0.7553% +0.8363% +0.9114%] (p = 0.00 < 0.05) Change within noise threshold. Found 8 outliers among 100 measurements (8.00%) 5 (5.00%) high mild 3 (3.00%) high severe bidiagonalize_100x500 time: [2.0053 ms 2.0060 ms 2.0065 ms] change: [+4.0325% +4.2372% +4.3938%] (p = 0.00 < 0.05) Performance has regressed. Found 12 outliers among 100 measurements (12.00%) 5 (5.00%) low severe 2 (2.00%) high mild 5 (5.00%) high severe bidiagonalize_4x4 time: [266.92 ns 267.24 ns 267.62 ns] change: [+7.1063% +7.2057% +7.3231%] (p = 0.00 < 0.05) Performance has regressed. Found 23 outliers among 100 measurements (23.00%) 1 (1.00%) low severe 5 (5.00%) low mild 13 (13.00%) high mild 4 (4.00%) high severe Benchmarking bidiagonalize_500x100: Warming up for 3.0000 s Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 9.1s, enable flat sampling, or reduce sample count to 50. bidiagonalize_500x100 time: [1.6781 ms 1.6793 ms 1.6804 ms] change: [+1.3944% +1.5312% +1.6400%] (p = 0.00 < 0.05) Performance has regressed. bidiagonalize_unpack_100x100 time: [522.13 µs 522.36 µs 522.63 µs] change: [−0.5318% −0.4044% −0.2627%] (p = 0.00 < 0.05) Change within noise threshold. 
Found 12 outliers among 100 measurements (12.00%) 1 (1.00%) low mild 4 (4.00%) high mild 7 (7.00%) high severe bidiagonalize_unpack_100x500 time: [2.9858 ms 2.9916 ms 2.9976 ms] change: [−0.7824% −0.3995% −0.0370%] (p = 0.04 < 0.05) Change within noise threshold. bidiagonalize_unpack_500x100 time: [2.5884 ms 2.5896 ms 2.5910 ms] change: [+0.0767% +0.1539% +0.2316%] (p = 0.00 < 0.05) Change within noise threshold. cholesky_100x100 time: [31.084 µs 31.101 µs 31.122 µs] change: [−5.0365% −4.7949% −4.4205%] (p = 0.00 < 0.05) Performance has improved. Found 16 outliers among 100 measurements (16.00%) 2 (2.00%) low severe 4 (4.00%) low mild 1 (1.00%) high mild 9 (9.00%) high severe cholesky_500x500 time: [4.4799 ms 4.4849 ms 4.4903 ms] change: [−0.5985% −0.3685% −0.1374%] (p = 0.00 < 0.05) Change within noise threshold. Found 3 outliers among 100 measurements (3.00%) 2 (2.00%) high mild 1 (1.00%) high severe cholesky_decompose_unpack_100x100 time: [31.659 µs 31.685 µs 31.727 µs] change: [−4.9712% −4.7445% −4.3325%] (p = 0.00 < 0.05) Performance has improved. Found 15 outliers among 100 measurements (15.00%) 4 (4.00%) low severe 4 (4.00%) low mild 2 (2.00%) high mild 5 (5.00%) high severe cholesky_decompose_unpack_500x500 time: [4.4795 ms 4.4845 ms 4.4910 ms] change: [−1.9595% −1.7121% −1.4978%] (p = 0.00 < 0.05) Performance has improved. Found 14 outliers among 100 measurements (14.00%) 3 (3.00%) low severe 1 (1.00%) low mild 3 (3.00%) high mild 7 (7.00%) high severe cholesky_solve_10x10 time: [170.70 ns 170.76 ns 170.82 ns] change: [+8.0936% +8.1777% +8.2764%] (p = 0.00 < 0.05) Performance has regressed. Found 10 outliers among 100 measurements (10.00%) 3 (3.00%) low mild 5 (5.00%) high mild 2 (2.00%) high severe cholesky_solve_100x100 time: [2.9071 µs 2.9117 µs 2.9174 µs] change: [+8.4770% +8.9956% +9.6254%] (p = 0.00 < 0.05) Performance has regressed. 
Found 7 outliers among 100 measurements (7.00%) 1 (1.00%) low severe 3 (3.00%) low mild 2 (2.00%) high mild 1 (1.00%) high severe cholesky_solve_500x500 time: [54.193 µs 54.303 µs 54.417 µs] change: [+3.9332% +4.1755% +4.4477%] (p = 0.00 < 0.05) Performance has regressed. Found 1 outliers among 100 measurements (1.00%) 1 (1.00%) high mild cholesky_inverse_10x10 time: [1.3189 µs 1.3195 µs 1.3201 µs] change: [+2.5360% +2.6238% +2.7131%] (p = 0.00 < 0.05) Performance has regressed. Found 7 outliers among 100 measurements (7.00%) 2 (2.00%) high mild 5 (5.00%) high severe cholesky_inverse_100x100 time: [270.85 µs 270.88 µs 270.92 µs] change: [−0.9726% −0.8524% −0.7319%] (p = 0.00 < 0.05) Change within noise threshold. Found 9 outliers among 100 measurements (9.00%) 1 (1.00%) low severe 4 (4.00%) low mild 2 (2.00%) high mild 2 (2.00%) high severe cholesky_inverse_500x500 time: [26.673 ms 26.694 ms 26.714 ms] change: [+1.0784% +1.1816% +1.2794%] (p = 0.00 < 0.05) Performance has regressed. Found 23 outliers among 100 measurements (23.00%) 19 (19.00%) low severe 2 (2.00%) low mild 2 (2.00%) high severe full_piv_lu_decompose_10x10 time: [582.31 ns 582.48 ns 582.67 ns] change: [+19.583% +19.702% +19.795%] (p = 0.00 < 0.05) Performance has regressed. Found 10 outliers among 100 measurements (10.00%) 2 (2.00%) low severe 6 (6.00%) high mild 2 (2.00%) high severe full_piv_lu_decompose_100x100 time: [218.73 µs 218.78 µs 218.84 µs] change: [+5.8729% +5.9828% +6.0904%] (p = 0.00 < 0.05) Performance has regressed. Found 8 outliers among 100 measurements (8.00%) 2 (2.00%) low severe 5 (5.00%) low mild 1 (1.00%) high severe full_piv_lu_solve_10x10 time: [124.88 ns 124.94 ns 125.02 ns] change: [+7.4724% +7.6252% +7.7787%] (p = 0.00 < 0.05) Performance has regressed. 
Found 13 outliers among 100 measurements (13.00%) 3 (3.00%) low severe 6 (6.00%) high mild 4 (4.00%) high severe full_piv_lu_solve_100x100 time: [2.5202 µs 2.5244 µs 2.5289 µs] change: [+11.226% +11.847% +12.518%] (p = 0.00 < 0.05) Performance has regressed. Found 17 outliers among 100 measurements (17.00%) 14 (14.00%) low severe 2 (2.00%) low mild 1 (1.00%) high mild full_piv_lu_inverse_10x10 time: [869.61 ns 870.27 ns 871.19 ns] change: [+4.7996% +4.9224% +5.0608%] (p = 0.00 < 0.05) Performance has regressed. Found 7 outliers among 100 measurements (7.00%) 2 (2.00%) low severe 1 (1.00%) high mild 4 (4.00%) high severe full_piv_lu_inverse_100x100 time: [212.68 µs 212.83 µs 213.05 µs] change: [−0.2835% −0.0351% +0.1310%] (p = 0.80 > 0.05) No change in performance detected. Found 13 outliers among 100 measurements (13.00%) 1 (1.00%) low severe 4 (4.00%) low mild 3 (3.00%) high mild 5 (5.00%) high severe full_piv_lu_determinant_10x10 time: [15.320 ns 15.338 ns 15.357 ns] change: [+410.70% +421.41% +430.41%] (p = 0.00 < 0.05) Performance has regressed. Found 13 outliers among 100 measurements (13.00%) 9 (9.00%) low severe 1 (1.00%) low mild 3 (3.00%) high mild full_piv_lu_determinant_100x100 time: [137.44 ns 139.37 ns 141.00 ns] change: [+213.54% +227.75% +241.42%] (p = 0.00 < 0.05) Performance has regressed. hessenberg_decompose_4x4 time: [82.510 ns 82.538 ns 82.564 ns] change: [−27.950% −27.887% −27.830%] (p = 0.00 < 0.05) Performance has improved. Found 1 outliers among 100 measurements (1.00%) 1 (1.00%) high mild hessenberg_decompose_100x100 time: [295.98 µs 296.16 µs 296.44 µs] change: [+3.3234% +3.5705% +3.7986%] (p = 0.00 < 0.05) Performance has regressed. Found 8 outliers among 100 measurements (8.00%) 2 (2.00%) low mild 2 (2.00%) high mild 4 (4.00%) high severe hessenberg_decompose_200x200 time: [2.2647 ms 2.2681 ms 2.2714 ms] change: [+4.8426% +4.9983% +5.1646%] (p = 0.00 < 0.05) Performance has regressed. 
hessenberg_decompose_unpack_100x100 time: [435.30 µs 435.75 µs 436.12 µs] change: [+2.7479% +2.8420% +2.9424%] (p = 0.00 < 0.05) Performance has regressed. Found 1 outliers among 100 measurements (1.00%) 1 (1.00%) high severe hessenberg_decompose_unpack_200x200 time: [3.2667 ms 3.2678 ms 3.2690 ms] change: [+3.9624% +4.0021% +4.0423%] (p = 0.00 < 0.05) Performance has regressed. Found 22 outliers among 100 measurements (22.00%) 13 (13.00%) low severe 1 (1.00%) low mild 3 (3.00%) high mild 5 (5.00%) high severe lu_decompose_10x10 time: [353.04 ns 353.16 ns 353.31 ns] change: [−5.0408% −4.9435% −4.8487%] (p = 0.00 < 0.05) Performance has improved. Found 19 outliers among 100 measurements (19.00%) 4 (4.00%) low severe 4 (4.00%) low mild 6 (6.00%) high mild 5 (5.00%) high severe lu_decompose_100x100 time: [71.544 µs 71.560 µs 71.579 µs] change: [−1.7176% −1.6430% −1.5721%] (p = 0.00 < 0.05) Performance has improved. Found 9 outliers among 100 measurements (9.00%) 2 (2.00%) low severe 2 (2.00%) low mild 2 (2.00%) high mild 3 (3.00%) high severe lu_solve_10x10 time: [115.42 ns 115.52 ns 115.61 ns] change: [+3.9363% +4.1024% +4.2557%] (p = 0.00 < 0.05) Performance has regressed. Found 15 outliers among 100 measurements (15.00%) 4 (4.00%) low severe 8 (8.00%) low mild 2 (2.00%) high mild 1 (1.00%) high severe lu_solve_100x100 time: [2.5152 µs 2.5190 µs 2.5225 µs] change: [+15.120% +15.625% +16.088%] (p = 0.00 < 0.05) Performance has regressed. Found 7 outliers among 100 measurements (7.00%) 4 (4.00%) low severe 2 (2.00%) low mild 1 (1.00%) high mild lu_inverse_10x10 time: [902.55 ns 903.32 ns 903.97 ns] change: [+0.7407% +0.8734% +1.0263%] (p = 0.00 < 0.05) Change within noise threshold. Found 2 outliers among 100 measurements (2.00%) 1 (1.00%) low mild 1 (1.00%) high severe lu_inverse_100x100 time: [216.21 µs 216.47 µs 216.80 µs] change: [−0.6663% −0.5584% −0.4316%] (p = 0.00 < 0.05) Change within noise threshold. 
Found 18 outliers among 100 measurements (18.00%) 2 (2.00%) low severe 4 (4.00%) low mild 5 (5.00%) high mild 7 (7.00%) high severe lu_determinant_10x10 time: [13.394 ns 13.481 ns 13.665 ns] change: [+508.98% +524.96% +543.53%] (p = 0.00 < 0.05) Performance has regressed. Found 14 outliers among 100 measurements (14.00%) 6 (6.00%) low severe 1 (1.00%) low mild 5 (5.00%) high mild 2 (2.00%) high severe lu_determinant_100x100 time: [149.12 ns 150.16 ns 151.08 ns] change: [+265.69% +281.86% +296.23%] (p = 0.00 < 0.05) Performance has regressed. Found 14 outliers among 100 measurements (14.00%) 10 (10.00%) low severe 4 (4.00%) low mild qr_decompose_100x100 time: [141.62 µs 141.65 µs 141.69 µs] change: [+0.6391% +0.8447% +0.9784%] (p = 0.00 < 0.05) Change within noise threshold. Found 9 outliers among 100 measurements (9.00%) 5 (5.00%) low mild 1 (1.00%) high mild 3 (3.00%) high severe Benchmarking qr_decompose_100x500: Warming up for 3.0000 s Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 5.7s, enable flat sampling, or reduce sample count to 60. qr_decompose_100x500 time: [1.0071 ms 1.0082 ms 1.0097 ms] change: [+0.9031% +1.2358% +1.6126%] (p = 0.00 < 0.05) Change within noise threshold. Found 16 outliers among 100 measurements (16.00%) 12 (12.00%) low mild 2 (2.00%) high mild 2 (2.00%) high severe qr_decompose_4x4 time: [100.40 ns 100.43 ns 100.45 ns] change: [−19.315% −19.268% −19.224%] (p = 0.00 < 0.05) Performance has improved. Found 7 outliers among 100 measurements (7.00%) 2 (2.00%) low mild 1 (1.00%) high mild 4 (4.00%) high severe Benchmarking qr_decompose_500x100: Warming up for 3.0000 s Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 5.2s, enable flat sampling, or reduce sample count to 60. qr_decompose_500x100 time: [847.17 µs 847.68 µs 848.21 µs] change: [+2.1441% +2.3425% +2.5069%] (p = 0.00 < 0.05) Performance has regressed. 
Found 4 outliers among 100 measurements (4.00%) 1 (1.00%) high mild 3 (3.00%) high severe qr_decompose_unpack_100x100 time: [283.22 µs 283.26 µs 283.30 µs] change: [−0.3591% −0.2383% −0.1147%] (p = 0.00 < 0.05) Change within noise threshold. Found 23 outliers among 100 measurements (23.00%) 21 (21.00%) low severe 1 (1.00%) low mild 1 (1.00%) high severe Benchmarking qr_decompose_unpack_100x500: Warming up for 3.0000 s Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 6.8s, enable flat sampling, or reduce sample count to 60. qr_decompose_unpack_100x500 time: [1.1399 ms 1.1429 ms 1.1457 ms] change: [−1.9555% −1.8085% −1.6312%] (p = 0.00 < 0.05) Performance has improved. Found 1 outliers among 100 measurements (1.00%) 1 (1.00%) high mild Benchmarking qr_decompose_unpack_500x100: Warming up for 3.0000 s Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 9.6s, enable flat sampling, or reduce sample count to 50. qr_decompose_unpack_500x100 time: [1.6633 ms 1.6640 ms 1.6648 ms] change: [+1.4516% +1.5245% +1.5969%] (p = 0.00 < 0.05) Performance has regressed. Found 11 outliers among 100 measurements (11.00%) 2 (2.00%) low severe 5 (5.00%) low mild 4 (4.00%) high severe qr_solve_10x10 time: [156.51 ns 156.56 ns 156.61 ns] change: [+3.7415% +3.8709% +3.9947%] (p = 0.00 < 0.05) Performance has regressed. Found 12 outliers among 100 measurements (12.00%) 6 (6.00%) low severe 5 (5.00%) low mild 1 (1.00%) high mild qr_solve_100x100 time: [3.5393 µs 3.5454 µs 3.5511 µs] change: [+6.0908% +6.5747% +6.9798%] (p = 0.00 < 0.05) Performance has regressed. Found 6 outliers among 100 measurements (6.00%) 6 (6.00%) low mild qr_inverse_10x10 time: [806.75 ns 807.99 ns 809.61 ns] change: [+0.6973% +0.8242% +0.9558%] (p = 0.00 < 0.05) Change within noise threshold. 
Found 1 outliers among 100 measurements (1.00%) 1 (1.00%) high severe qr_inverse_100x100 time: [330.65 µs 330.74 µs 330.85 µs] change: [+1.2238% +1.3244% +1.4518%] (p = 0.00 < 0.05) Performance has regressed. Found 12 outliers among 100 measurements (12.00%) 3 (3.00%) low mild 4 (4.00%) high mild 5 (5.00%) high severe schur_decompose_4x4 time: [969.14 ns 969.71 ns 970.18 ns] change: [−12.293% −12.223% −12.149%] (p = 0.00 < 0.05) Performance has improved. Found 10 outliers among 100 measurements (10.00%) 3 (3.00%) low severe 1 (1.00%) low mild 2 (2.00%) high mild 4 (4.00%) high severe schur_decompose_10x10 time: [7.3226 µs 7.3237 µs 7.3247 µs] change: [+0.3785% +0.4095% +0.4394%] (p = 0.00 < 0.05) Change within noise threshold. Found 9 outliers among 100 measurements (9.00%) 2 (2.00%) low mild 4 (4.00%) high mild 3 (3.00%) high severe schur_decompose_100x100 time: [2.5760 ms 2.5763 ms 2.5768 ms] change: [+0.7992% +0.8504% +0.8935%] (p = 0.00 < 0.05) Change within noise threshold. Found 4 outliers among 100 measurements (4.00%) 3 (3.00%) high mild 1 (1.00%) high severe schur_decompose_200x200 time: [18.285 ms 18.296 ms 18.308 ms] change: [+1.9360% +2.0941% +2.2427%] (p = 0.00 < 0.05) Performance has regressed. Found 6 outliers among 100 measurements (6.00%) 1 (1.00%) low mild 3 (3.00%) high mild 2 (2.00%) high severe eigenvalues_4x4 time: [937.94 ns 938.15 ns 938.38 ns] change: [+25.764% +25.898% +26.023%] (p = 0.00 < 0.05) Performance has regressed. Found 6 outliers among 100 measurements (6.00%) 2 (2.00%) low severe 2 (2.00%) low mild 2 (2.00%) high mild eigenvalues_10x10 time: [5.9066 µs 5.9088 µs 5.9117 µs] change: [+0.1208% +0.1938% +0.2740%] (p = 0.00 < 0.05) Change within noise threshold. Found 8 outliers among 100 measurements (8.00%) 1 (1.00%) low mild 3 (3.00%) high mild 4 (4.00%) high severe Benchmarking eigenvalues_100x100: Warming up for 3.0000 s Warning: Unable to complete 100 samples in 5.0s. 
You may wish to increase target time to 8.2s, enable flat sampling, or reduce sample count to 50. eigenvalues_100x100 time: [1.5870 ms 1.5873 ms 1.5876 ms] change: [−0.8569% −0.8247% −0.7914%] (p = 0.00 < 0.05) Change within noise threshold. Found 5 outliers among 100 measurements (5.00%) 3 (3.00%) high mild 2 (2.00%) high severe eigenvalues_200x200 time: [11.081 ms 11.088 ms 11.102 ms] change: [+0.0054% +0.2956% +0.4946%] (p = 0.00 < 0.05) Change within noise threshold. Found 4 outliers among 100 measurements (4.00%) 1 (1.00%) low mild 1 (1.00%) high mild 2 (2.00%) high severe solve_l_triangular_100x100 time: [1.3250 µs 1.3651 µs 1.4012 µs] change: [+22.932% +24.999% +27.087%] (p = 0.00 < 0.05) Performance has regressed. Found 12 outliers among 100 measurements (12.00%) 10 (10.00%) high mild 2 (2.00%) high severe solve_l_triangular_1000x1000 time: [101.52 µs 102.04 µs 102.85 µs] change: [+1.5784% +2.0953% +2.8471%] (p = 0.00 < 0.05) Performance has regressed. Found 15 outliers among 100 measurements (15.00%) 9 (9.00%) high mild 6 (6.00%) high severe tr_solve_l_triangular_100x100 time: [2.0144 µs 2.0537 µs 2.0902 µs] change: [+13.600% +14.669% +15.998%] (p = 0.00 < 0.05) Performance has regressed. Found 16 outliers among 100 measurements (16.00%) 5 (5.00%) high mild 11 (11.00%) high severe tr_solve_l_triangular_1000x1000 time: [93.569 µs 94.056 µs 94.857 µs] change: [+1.2474% +1.7955% +2.5979%] (p = 0.00 < 0.05) Performance has regressed. Found 7 outliers among 100 measurements (7.00%) 3 (3.00%) high mild 4 (4.00%) high severe solve_u_triangular_100x100 time: [1.5878 µs 1.6615 µs 1.7405 µs] change: [+31.200% +34.370% +38.132%] (p = 0.00 < 0.05) Performance has regressed. Found 13 outliers among 100 measurements (13.00%) 10 (10.00%) high mild 3 (3.00%) high severe solve_u_triangular_1000x1000 time: [105.07 µs 105.46 µs 106.12 µs] change: [+6.6559% +7.0936% +7.8401%] (p = 0.00 < 0.05) Performance has regressed. 
Found 2 outliers among 100 measurements (2.00%) 2 (2.00%) high severe tr_solve_u_triangular_100x100 time: [1.4369 µs 1.4697 µs 1.4986 µs] change: [+17.195% +18.687% +20.307%] (p = 0.00 < 0.05) Performance has regressed. Found 13 outliers among 100 measurements (13.00%) 11 (11.00%) high mild 2 (2.00%) high severe tr_solve_u_triangular_1000x1000 time: [88.868 µs 89.303 µs 90.014 µs] change: [+4.2489% +4.7933% +5.6045%] (p = 0.00 < 0.05) Performance has regressed. Found 11 outliers among 100 measurements (11.00%) 4 (4.00%) high mild 7 (7.00%) high severe svd_decompose_2x2 time: [22.913 ns 22.958 ns 23.017 ns] change: [+9.3648% +9.7443% +10.253%] (p = 0.00 < 0.05) Performance has regressed. Found 7 outliers among 100 measurements (7.00%) 2 (2.00%) high mild 5 (5.00%) high severe svd_decompose_3x3 time: [359.30 ns 359.72 ns 360.20 ns] change: [+9.0123% +9.1174% +9.2394%] (p = 0.00 < 0.05) Performance has regressed. Found 1 outliers among 100 measurements (1.00%) 1 (1.00%) high mild svd_decompose_4x4 time: [896.28 ns 896.55 ns 896.85 ns] change: [−7.1192% −7.0496% −6.9853%] (p = 0.00 < 0.05) Performance has improved. Found 10 outliers among 100 measurements (10.00%) 2 (2.00%) low severe 3 (3.00%) low mild 3 (3.00%) high mild 2 (2.00%) high severe svd_decompose_10x10 time: [5.7680 µs 5.7708 µs 5.7739 µs] change: [+1.1933% +1.4155% +1.6347%] (p = 0.00 < 0.05) Performance has regressed. Found 3 outliers among 100 measurements (3.00%) 1 (1.00%) high mild 2 (2.00%) high severe Benchmarking svd_decompose_100x100: Warming up for 3.0000 s Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 8.2s, enable flat sampling, or reduce sample count to 50. svd_decompose_100x100 time: [1.5704 ms 1.5709 ms 1.5715 ms] change: [+1.4465% +1.4891% +1.5357%] (p = 0.00 < 0.05) Performance has regressed. 
Found 3 outliers among 100 measurements (3.00%) 2 (2.00%) high mild 1 (1.00%) high severe svd_decompose_200x200 time: [11.845 ms 11.847 ms 11.850 ms] change: [+1.4378% +1.4794% +1.5225%] (p = 0.00 < 0.05) Performance has regressed. Found 4 outliers among 100 measurements (4.00%) 4 (4.00%) high severe rank_4x4 time: [716.49 ns 716.62 ns 716.74 ns] change: [+4.9084% +4.9678% +5.0237%] (p = 0.00 < 0.05) Performance has regressed. Found 1 outliers among 100 measurements (1.00%) 1 (1.00%) low mild rank_10x10 time: [4.2304 µs 4.2341 µs 4.2377 µs] change: [+0.4993% +0.6056% +0.7271%] (p = 0.00 < 0.05) Change within noise threshold. Found 1 outliers among 100 measurements (1.00%) 1 (1.00%) high mild rank_100x100 time: [522.74 µs 522.85 µs 522.97 µs] change: [+0.2822% +0.3170% +0.3535%] (p = 0.00 < 0.05) Change within noise threshold. Found 3 outliers among 100 measurements (3.00%) 1 (1.00%) low mild 2 (2.00%) high severe rank_200x200 time: [3.0167 ms 3.0217 ms 3.0267 ms] change: [+0.3924% +0.5333% +0.6946%] (p = 0.00 < 0.05) Change within noise threshold. singular_values_4x4 time: [735.97 ns 736.08 ns 736.21 ns] change: [−7.6736% −7.6163% −7.5596%] (p = 0.00 < 0.05) Performance has improved. Found 5 outliers among 100 measurements (5.00%) 1 (1.00%) low severe 2 (2.00%) low mild 2 (2.00%) high severe singular_values_10x10 time: [4.2987 µs 4.2997 µs 4.3010 µs] change: [+1.6193% +1.7215% +1.8186%] (p = 0.00 < 0.05) Performance has regressed. Found 8 outliers among 100 measurements (8.00%) 4 (4.00%) high mild 4 (4.00%) high severe singular_values_100x100 time: [525.20 µs 525.36 µs 525.54 µs] change: [+0.4054% +0.4526% +0.4982%] (p = 0.00 < 0.05) Change within noise threshold. Found 9 outliers among 100 measurements (9.00%) 6 (6.00%) low mild 1 (1.00%) high mild 2 (2.00%) high severe singular_values_200x200 time: [3.0712 ms 3.0729 ms 3.0750 ms] change: [+2.1769% +2.2358% +2.3112%] (p = 0.00 < 0.05) Performance has regressed. 
Found 3 outliers among 100 measurements (3.00%) 1 (1.00%) low mild 1 (1.00%) high mild 1 (1.00%) high severe pseudo_inverse_4x4 time: [877.64 ns 878.38 ns 879.12 ns] change: [−8.2828% −8.2216% −8.1662%] (p = 0.00 < 0.05) Performance has improved. Found 13 outliers among 100 measurements (13.00%) 1 (1.00%) low severe 3 (3.00%) low mild 2 (2.00%) high mild 7 (7.00%) high severe pseudo_inverse_10x10 time: [6.0008 µs 6.0034 µs 6.0064 µs] change: [+0.2665% +0.3678% +0.4766%] (p = 0.00 < 0.05) Change within noise threshold. Found 8 outliers among 100 measurements (8.00%) 4 (4.00%) high mild 4 (4.00%) high severe Benchmarking pseudo_inverse_100x100: Warming up for 3.0000 s Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 8.4s, enable flat sampling, or reduce sample count to 50. pseudo_inverse_100x100 time: [1.6088 ms 1.6091 ms 1.6094 ms] change: [+0.1161% +0.2007% +0.2937%] (p = 0.00 < 0.05) Change within noise threshold. Found 12 outliers among 100 measurements (12.00%) 2 (2.00%) high mild 10 (10.00%) high severe pseudo_inverse_200x200 time: [12.038 ms 12.042 ms 12.047 ms] change: [−0.4351% −0.2531% −0.0699%] (p = 0.01 < 0.05) Change within noise threshold. Found 22 outliers among 100 measurements (22.00%) 16 (16.00%) low severe 2 (2.00%) low mild 1 (1.00%) high mild 3 (3.00%) high severe symmetric_eigen_decompose_4x4 time: [518.00 ns 518.07 ns 518.15 ns] change: [+4.7008% +4.7492% +4.8006%] (p = 0.00 < 0.05) Performance has regressed. Found 8 outliers among 100 measurements (8.00%) 2 (2.00%) low mild 2 (2.00%) high mild 4 (4.00%) high severe symmetric_eigen_decompose_10x10 time: [3.6417 µs 3.6428 µs 3.6440 µs] change: [−0.1549% −0.0998% −0.0483%] (p = 0.00 < 0.05) Change within noise threshold. 
Found 12 outliers among 100 measurements (12.00%)
    6 (6.00%) high mild
    6 (6.00%) high severe
symmetric_eigen_decompose_100x100
    time:   [761.64 µs 762.66 µs 763.80 µs]
    change: [−5.8109% −5.7178% −5.6284%] (p = 0.00 < 0.05)
    Performance has improved.
Found 19 outliers among 100 measurements (19.00%)
    9 (9.00%) low severe
    9 (9.00%) low mild
    1 (1.00%) high severe
symmetric_eigen_decompose_200x200
    time:   [5.1304 ms 5.1337 ms 5.1372 ms]
    change: [−9.4434% −9.3646% −9.2959%] (p = 0.00 < 0.05)
    Performance has improved.

Total run time of the full benchmark suite on my machine (AMD 5950X) has not changed and is still around 30 minutes.

* fix: Add reproducible_smatrix()

Some algorithms may not converge when used on completely random values with the default value of epsilon and unlimited iterations. `reproducible_dmatrix()` already exists to circumvent this for `DMatrix`, so I implemented the same for `SMatrix`. In my tests this problem manifested itself only on `schur_decompose_4x4`, but I decided to apply a similar fix to all benchmarks that also use `reproducible_dmatrix()` for `DMatrix`.

* fix: Use reproducible_dmatrix() for Cholesky benches

Random matrices may not be positive-definite, and the Cholesky decomposition benchmarks panic because of that:

    Benchmarking cholesky_decompose_unpack_100x100: Warming up for 3.0000 s
    thread 'main' panicked at benches/linalg/cholesky.rs:38:45:
    called `Option::unwrap()` on a `None` value

* don't require reproducible matrix for cholesky and make it use randomly generated positive definite matrix

* remove constant elements where useful, remove reproducible matrix calls and replace with random

* fix wrong test name

* update changelog

---------

Co-authored-by: geo-ant <54497890+geo-ant@users.noreply.github.com>
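The commit message says the Cholesky benchmarks ended up using randomly generated positive-definite matrices, but it does not say how they are built. A common construction, assumed here purely for illustration, is M = AᵀA + n·I for a random square A. The sketch below uses only the standard library: `XorShift`, `random_spd` and `cholesky` are hypothetical names, not nalgebra APIs.

```rust
/// Tiny deterministic generator (xorshift64*), used only to avoid external crates.
/// The seed must be non-zero.
struct XorShift(u64);

impl XorShift {
    /// Returns a value in [0, 1).
    fn next_f64(&mut self) -> f64 {
        self.0 ^= self.0 >> 12;
        self.0 ^= self.0 << 25;
        self.0 ^= self.0 >> 27;
        let bits = self.0.wrapping_mul(0x2545_F491_4F6C_DD1D);
        (bits >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Builds an n x n symmetric positive-definite matrix as A^T * A + n * I (row-major).
/// Assumption: this is one standard recipe, not necessarily the one used in the benches.
fn random_spd(n: usize, rng: &mut XorShift) -> Vec<f64> {
    let a: Vec<f64> = (0..n * n).map(|_| rng.next_f64() - 0.5).collect();
    let mut m = vec![0.0_f64; n * n];
    for i in 0..n {
        for j in 0..n {
            // (A^T A)[i][j] = sum over k of A[k][i] * A[k][j]
            let dot: f64 = (0..n).map(|k| a[k * n + i] * a[k * n + j]).sum();
            m[i * n + j] = dot + if i == j { n as f64 } else { 0.0 };
        }
    }
    m
}

/// Textbook Cholesky factorization; returns None when a pivot is not positive,
/// which is the situation that made the original benchmark `unwrap()` panic.
fn cholesky(m: &[f64], n: usize) -> Option<Vec<f64>> {
    let mut l = vec![0.0_f64; n * n];
    for i in 0..n {
        for j in 0..=i {
            let s: f64 = (0..j).map(|k| l[i * n + k] * l[j * n + k]).sum();
            if i == j {
                let d = m[i * n + i] - s;
                if d <= 0.0 {
                    return None;
                }
                l[i * n + j] = d.sqrt();
            } else {
                l[i * n + j] = (m[i * n + j] - s) / l[j * n + j];
            }
        }
    }
    Some(l)
}
```

With this construction `cholesky(&random_spd(n, &mut rng), n)` succeeds, whereas an indefinite input such as `[1.0, 2.0, 2.0, 1.0]` (a 2x2 matrix) yields `None`.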
1 parent 27696b4 commit 2d785b4

16 files changed, +1136 -630 lines changed

Cargo.toml

Lines changed: 2 additions & 1 deletion
@@ -163,7 +163,7 @@ encase = { version = "0.12", optional = true }
 serde_json = "1.0"
 rand_xorshift = "0.4"
 rand_isaac = "0.4"
-criterion = { version = "0.4", features = ["html_reports"] }
+criterion = { version = "0.7", features = ["html_reports"] }
 nalgebra = { path = ".", features = ["debug", "compare", "rand", "macros"] }
 
 # For matrix comparison macro
@@ -195,6 +195,7 @@ required-features = ["rand"]
 
 [profile.bench]
 lto = true
+codegen-units = 1
 
 [package.metadata.docs.rs]
 # Enable all the features when building the docs on docs.rs

benches/common/macros.rs

Lines changed: 58 additions & 64 deletions
@@ -1,16 +1,24 @@
 #![macro_use]
 
+/// This will call `Drop::drop()` within the benchmarking loop for all arguments that were not consumed
+/// by the binary operation.
+///
+/// Do not use this macro for types with non-trivial `Drop` implementation unless you want to include it
+/// into the measurement.
 macro_rules! bench_binop(
     ($name: ident, $t1: ty, $t2: ty, $binop: ident) => {
         fn $name(bh: &mut criterion::Criterion) {
             use rand::SeedableRng;
+
             let mut rng = IsaacRng::seed_from_u64(0);
-            let a = rng.random::<$t1>();
-            let b = rng.random::<$t2>();
 
-            bh.bench_function(stringify!($name), move |bh| bh.iter(|| {
-                a.$binop(b)
-            }));
+            bh.bench_function(stringify!($name), |bh| bh.iter_batched(
+                || (rng.random::<$t1>(), rng.random::<$t2>()),
+                |args| {
+                    args.0.$binop(args.1)
+                },
+                criterion::BatchSize::SmallInput),
+            );
         }
     }
 );
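The doc comment on `bench_binop!` warns that by-value arguments which the operation does not consume are dropped inside the measured loop. The following standard-library sketch shows why. `time_batched` is a hypothetical stand-in with the same general shape as a batched benchmark loop (inputs built before the clock starts, only the routine timed); it is not Criterion's implementation, and `Tracked`, `add_by_value` and `demo` are illustrative names.

```rust
use std::cell::Cell;
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Hypothetical batched loop: `setup` runs untimed, only `routine` is timed.
fn time_batched<I, O>(
    batch: usize,
    mut setup: impl FnMut() -> I,
    mut routine: impl FnMut(I) -> O,
) -> Duration {
    // Untimed: prepare one fresh input per iteration.
    let inputs: Vec<I> = (0..batch).map(|_| setup()).collect();
    let mut outputs: Vec<O> = Vec::with_capacity(batch);

    let start = Instant::now();
    for input in inputs {
        // black_box hides the input from the optimizer so the call is not folded away.
        outputs.push(routine(black_box(input)));
    }
    let elapsed = start.elapsed();

    // Outputs are released only after the clock has stopped.
    drop(black_box(outputs));
    elapsed
}

/// Input type whose destructor records that it ran.
struct Tracked<'a> {
    value: u64,
    drops: &'a Cell<usize>,
}

impl Drop for Tracked<'_> {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

/// Takes both arguments by value but only reads them, so both are dropped
/// when this function returns, that is, inside the timed routine.
fn add_by_value(a: Tracked<'_>, b: Tracked<'_>) -> u64 {
    a.value.wrapping_add(b.value)
}

/// Returns (drops counted by the end of the last timed call, total drops).
fn demo() -> (usize, usize) {
    let drops = Cell::new(0usize);
    let mut seen_in_routine = 0usize;
    let _elapsed = time_batched(
        4,
        || {
            (
                Tracked { value: 1, drops: &drops },
                Tracked { value: 2, drops: &drops },
            )
        },
        |(a, b)| {
            let sum = add_by_value(a, b);
            seen_in_routine = drops.get();
            sum
        },
    );
    (seen_in_routine, drops.get())
}
```

In this sketch `demo()` returns equal counts, meaning every destructor ran while the clock was running. For plain numeric matrix and vector types the destructor is trivial, which is presumably why the by-value form is acceptable for them.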
@@ -19,95 +27,81 @@ macro_rules! bench_binop_ref(
     ($name: ident, $t1: ty, $t2: ty, $binop: ident) => {
         fn $name(bh: &mut criterion::Criterion) {
             use rand::SeedableRng;
+
             let mut rng = IsaacRng::seed_from_u64(0);
-            let a = rng.random::<$t1>();
-            let b = rng.random::<$t2>();
 
-            bh.bench_function(stringify!($name), move |bh| bh.iter(|| {
-                a.$binop(&b)
-            }));
+            bh.bench_function(stringify!($name), |bh| bh.iter_batched_ref(
+                || (rng.random::<$t1>(), rng.random::<$t2>()),
+                |args| {
+                    args.0.$binop(&args.1)
+                },
+                criterion::BatchSize::SmallInput),
+            );
         }
     }
 );
 
-macro_rules! bench_binop_fn(
-    ($name: ident, $t1: ty, $t2: ty, $binop: path) => {
+/// This will call `Drop::drop()` within the benchmarking loop for all arguments that were not consumed
+/// by the binary operation.
+///
+/// Do not use this macro for types with non-trivial `Drop` implementation unless you want to include it
+/// into the measurement.
+macro_rules! bench_binop_single_1st(
+    ($name: ident, $t1: ty, $t2: ty, $binop: ident) => {
         fn $name(bh: &mut criterion::Criterion) {
             use rand::SeedableRng;
+            use std::hint::black_box;
+
             let mut rng = IsaacRng::seed_from_u64(0);
-            let a = rng.random::<$t1>();
-            let b = rng.random::<$t2>();
 
-            bh.bench_function(stringify!($name), move |bh| bh.iter(|| {
-                $binop(&a, &b)
-            }));
+            let first = black_box(rng.random::<$t1>());
+
+            bh.bench_function(stringify!($name), |bh| bh.iter_batched(
+                || rng.random::<$t2>(),
+                |second| {
+                    first.$binop(second)
+                },
+                criterion::BatchSize::SmallInput),
+            );
         }
     }
 );
 
-macro_rules! bench_unop_na(
-    ($name: ident, $t: ty, $unop: ident) => {
+macro_rules! bench_binop_single_1st_ref(
+    ($name: ident, $t1: ty, $t2: ty, $binop: ident) => {
         fn $name(bh: &mut criterion::Criterion) {
-            const LEN: usize = 1 << 13;
-
             use rand::SeedableRng;
-            let mut rng = IsaacRng::seed_from_u64(0);
+            use std::hint::black_box;
 
-            let elems: Vec<$t> = (0usize .. LEN).map(|_| rng.random::<$t>()).collect();
-            let mut i = 0;
+            let mut rng = IsaacRng::seed_from_u64(0);
 
-            bh.bench_function(stringify!($name), move |bh| bh.iter(|| {
-                i = (i + 1) & (LEN - 1);
+            let first = black_box(rng.random::<$t1>());
 
-                unsafe {
-                    std::hint::black_box(na::$unop(elems.get_unchecked(i)))
-                }
-            }));
+            bh.bench_function(stringify!($name), |bh| bh.iter_batched_ref(
+                || rng.random::<$t2>(),
+                |second| {
+                    first.$binop(second)
+                },
+                criterion::BatchSize::SmallInput),
+            );
         }
     }
 );
 
 macro_rules! bench_unop(
     ($name: ident, $t: ty, $unop: ident) => {
         fn $name(bh: &mut criterion::Criterion) {
-            const LEN: usize = 1 << 13;
-
             use rand::SeedableRng;
-            let mut rng = IsaacRng::seed_from_u64(0);
 
-            let mut elems: Vec<$t> = (0usize .. LEN).map(|_| rng.random::<$t>()).collect();
-            let mut i = 0;
-
-            bh.bench_function(stringify!($name), move |bh| bh.iter(|| {
-                i = (i + 1) & (LEN - 1);
-
-                unsafe {
-                    std::hint::black_box(elems.get_unchecked_mut(i).$unop())
-                }
-            }));
-        }
-    }
-);
-
-macro_rules! bench_construction(
-    ($name: ident, $constructor: path, $( $args: ident: $types: ty),*) => {
-        fn $name(bh: &mut criterion::Criterion) {
-            const LEN: usize = 1 << 13;
-
-            use rand::SeedableRng;
             let mut rng = IsaacRng::seed_from_u64(0);
 
-            $(let $args: Vec<$types> = (0usize .. LEN).map(|_| rng.random::<$types>()).collect();)*
-            let mut i = 0;
-
-            bh.bench_function(stringify!($name), move |bh| bh.iter(|| {
-                i = (i + 1) & (LEN - 1);
-
-                unsafe {
-                    let res = $constructor($(*$args.get_unchecked(i),)*);
-                    std::hint::black_box(res)
-                }
-            }));
+            bh.bench_function(stringify!($name), |bh| bh.iter_batched_ref(
+                || rng.random::<$t>(),
+                |arg| {
+                    arg.$unop()
+                },
+                criterion::BatchSize::SmallInput),
+            );
         }
     }
 );
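The `bench_binop_single_1st*` macros fix the first operand once, pass it through `black_box`, and draw only the second operand fresh for each iteration, whereas `bench_binop!` draws both. The difference between the two access patterns can be sketched without Criterion or nalgebra. `Mat2`, `Vec2` and the three functions below are hypothetical stand-ins, and the sketch makes no claim about the resulting timings.

```rust
use std::hint::black_box;

type Mat2 = [[f32; 2]; 2];
type Vec2 = [f32; 2];

/// Plain 2x2 matrix times 2-vector product.
fn mat2_mul_v(m: &Mat2, v: Vec2) -> Vec2 {
    [
        m[0][0] * v[0] + m[0][1] * v[1],
        m[1][0] * v[0] + m[1][1] * v[1],
    ]
}

/// "Single x N": one matrix, made opaque to the optimizer once,
/// then applied to a fresh vector on every iteration.
fn single_mat_times_many(m: Mat2, vectors: Vec<Vec2>) -> Vec<Vec2> {
    // Hiding the value prevents constant folding of the matrix entries,
    // but the same matrix is still reused across the whole loop.
    let first = black_box(m);
    vectors
        .into_iter()
        .map(|second| black_box(mat2_mul_v(&first, black_box(second))))
        .collect()
}

/// "N x N": both operands change on every iteration.
fn many_mat_times_many(pairs: Vec<(Mat2, Vec2)>) -> Vec<Vec2> {
    pairs
        .into_iter()
        .map(|(m, v)| black_box(mat2_mul_v(&black_box(m), black_box(v))))
        .collect()
}
```

Both functions compute the same products when every pair carries the same matrix. What differs is how much input data each iteration has to load, which is the effect the separate `single_*` benchmarks are meant to expose.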
