Commit 2d785b4
Make benchmarks measure an actual computation (#1549)
* test: `codegen-units = 1` for benchmarks
During benchmarking I found that leaving `codegen-units` at its default
value leads to inconsistent results across recompilations (clean vs.
incremental). It also sometimes causes significant performance
degradation in benchmarks unrelated to the code changes.
Also see https://github.com/rust-lang/rust/issues/146497
* chore: Bump criterion to version 0.7
* chore: Remove unused bench_binop_fn!(), bench_unop_na!() and bench_construction!()
* fix: Use iter_batched() and iter_batched_ref() in bench macros
Criterion generates a `Vec` of arguments and passes them through
`black_box()` to guarantee that the benchmark closure is never
optimized out of the benchmarking loop.
This fixes https://github.com/dimforge/nalgebra/issues/1547 for
benchmarks that use `bench_*!()` macros.
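The batching idea described above can be sketched with the standard
library alone. This is only an illustration, not Criterion's
implementation: `time_batched` is a hypothetical helper, and the real
`iter_batched()` also takes a batch-size argument and handles sampling.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Hypothetical helper: times `routine` over `batch` pre-generated inputs.
/// `setup` runs before the clock starts, so only `routine` is measured.
pub fn time_batched<I, O>(
    batch: usize,
    mut setup: impl FnMut() -> I,
    mut routine: impl FnMut(I) -> O,
) -> Duration {
    // Generate every input up front; black_box hides the values from the
    // optimizer so the routine cannot be constant-folded away.
    let inputs: Vec<I> = (0..batch).map(|_| black_box(setup())).collect();
    let mut outputs: Vec<O> = Vec::with_capacity(batch);

    let start = Instant::now();
    for input in inputs {
        outputs.push(black_box(routine(input)));
    }
    let elapsed = start.elapsed();

    // Results are dropped only after the clock has stopped.
    drop(outputs);
    elapsed
}
```

Calling `time_batched(1000, make_input, |x| op(x))` with any input
generator and operation returns the time spent in the routine loop only.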
* feat: Add macros to benchmark Single x N Values binary ops
This simulates real-world use cases like multiplication of
many vectors by a single matrix.
There is a ~2x performance difference between the case where both
arguments are random on each iteration and the case where one argument
is static and the other is random on each iteration:
mat2_mul_v time: [778.33 ps 785.41 ps 797.70 ps]
Found 14 outliers among 100 measurements (14.00%)
5 (5.00%) low severe
4 (4.00%) high mild
5 (5.00%) high severe
mat3_mul_v time: [1.7001 ns 1.7051 ns 1.7111 ns]
Found 11 outliers among 100 measurements (11.00%)
1 (1.00%) low severe
1 (1.00%) low mild
8 (8.00%) high mild
1 (1.00%) high severe
mat4_mul_v time: [2.6101 ns 2.6223 ns 2.6374 ns]
Found 8 outliers among 100 measurements (8.00%)
1 (1.00%) low mild
3 (3.00%) high mild
4 (4.00%) high severe
single_mat2_mul_v time: [402.65 ps 403.62 ps 404.75 ps]
Found 11 outliers among 100 measurements (11.00%)
3 (3.00%) low mild
5 (5.00%) high mild
3 (3.00%) high severe
single_mat3_mul_v time: [651.30 ps 654.06 ps 657.15 ps]
Found 15 outliers among 100 measurements (15.00%)
3 (3.00%) low mild
8 (8.00%) high mild
4 (4.00%) high severe
single_mat4_mul_v time: [1.0628 ns 1.0645 ns 1.0666 ns]
Found 8 outliers among 100 measurements (8.00%)
1 (1.00%) low mild
5 (5.00%) high mild
2 (2.00%) high severe
mat2_tr_mul_v time: [719.81 ps 721.99 ps 724.59 ps]
Found 8 outliers among 100 measurements (8.00%)
3 (3.00%) low mild
5 (5.00%) high mild
mat3_tr_mul_v time: [1.6685 ns 1.6758 ns 1.6841 ns]
Found 13 outliers among 100 measurements (13.00%)
4 (4.00%) low severe
1 (1.00%) low mild
4 (4.00%) high mild
4 (4.00%) high severe
mat4_tr_mul_v time: [2.6739 ns 2.6897 ns 2.7080 ns]
Found 16 outliers among 100 measurements (16.00%)
2 (2.00%) low severe
2 (2.00%) low mild
4 (4.00%) high mild
8 (8.00%) high severe
single_mat2_tr_mul_v time: [353.36 ps 354.56 ps 356.03 ps]
Found 6 outliers among 100 measurements (6.00%)
2 (2.00%) low mild
1 (1.00%) high mild
3 (3.00%) high severe
single_mat3_tr_mul_v time: [779.82 ps 782.84 ps 786.37 ps]
Found 10 outliers among 100 measurements (10.00%)
1 (1.00%) low severe
1 (1.00%) low mild
6 (6.00%) high mild
2 (2.00%) high severe
single_mat4_tr_mul_v time: [1.1918 ns 1.1946 ns 1.1977 ns]
Found 6 outliers among 100 measurements (6.00%)
3 (3.00%) low mild
1 (1.00%) high mild
2 (2.00%) high severe
unit_quaternion_mul_v time: [1.5002 ns 1.5088 ns 1.5183 ns]
change: [−0.0578% +0.3775% +0.8498%] (p = 0.10 > 0.05)
No change in performance detected.
Found 6 outliers among 100 measurements (6.00%)
3 (3.00%) high mild
3 (3.00%) high severe
single_unit_quaternion_mul_v
time: [1.0489 ns 1.0531 ns 1.0584 ns]
Found 14 outliers among 100 measurements (14.00%)
2 (2.00%) low severe
1 (1.00%) low mild
4 (4.00%) high mild
7 (7.00%) high severe
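The two benchmark shapes compared in the "Single x N Values" item above
can be sketched as follows. This is a std-only illustration with
hypothetical names (`Mat2`, `Vec2`, `pairwise`, `single_times_many`); it
does not use nalgebra types or the actual bench macros, and it makes no
claim about why the single-operand case measures faster.

```rust
use std::hint::black_box;

/// Hypothetical 2x2 matrix (row-major) and 2-vector, standing in for
/// nalgebra types.
pub type Mat2 = [[f64; 2]; 2];
pub type Vec2 = [f64; 2];

/// Plain matrix-vector product.
pub fn mul(m: &Mat2, v: &Vec2) -> Vec2 {
    [
        m[0][0] * v[0] + m[0][1] * v[1],
        m[1][0] * v[0] + m[1][1] * v[1],
    ]
}

/// "Both random" shape: a different matrix and vector on each iteration.
pub fn pairwise(ms: &[Mat2], vs: &[Vec2]) -> Vec<Vec2> {
    ms.iter()
        .zip(vs)
        .map(|(m, v)| black_box(mul(black_box(m), black_box(v))))
        .collect()
}

/// "Single x N" shape: one fixed matrix applied to many vectors.
pub fn single_times_many(m: &Mat2, vs: &[Vec2]) -> Vec<Vec2> {
    let m = black_box(m);
    vs.iter().map(|v| black_box(mul(m, black_box(v)))).collect()
}
```

Both functions compute the same products; only the way the left operand
is supplied differs, which is the distinction between the `mat2_mul_v`
and `single_mat2_mul_v` style of benchmark.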
* chore: Uncomment quaternion benchmarks
I do not know why those benchmarks were commented out.
* fix: Use iter_batched() and iter_batched_ref() for the remaining benchmarks
The bulk of the changes was done by Claude Sonnet 4. Additionally, I
moved `DVector` allocations outside of the benchmark and added anything
allocated but not consumed to the return tuple of the benchmark closure,
to ensure that the implicit drop/free is not included in the measured
time.
This fixes https://github.com/dimforge/nalgebra/issues/1547 for the
remaining benchmarks.
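The effect of returning unconsumed allocations from the benchmark
closure can be shown with a small std-only sketch. Everything here is
hypothetical (`Clock`, `Tracked`, `count_timed_drops`): a flag simulates
the timed region, and `Tracked` stands in for an allocated input such as
a `DVector`.

```rust
use std::cell::Cell;
use std::hint::black_box;
use std::rc::Rc;

/// Simulated clock: records how many buffers are freed while it runs.
#[derive(Default)]
pub struct Clock {
    running: Cell<bool>,
    timed_drops: Cell<usize>,
}

/// Hypothetical allocated input whose `Drop` reports to the clock.
pub struct Tracked {
    pub data: Vec<f64>,
    clock: Rc<Clock>,
}

impl Drop for Tracked {
    fn drop(&mut self) {
        if self.clock.running.get() {
            self.clock.timed_drops.set(self.clock.timed_drops.get() + 1);
        }
    }
}

/// Runs `routine` once inside the simulated timed region and returns how
/// many `Tracked` buffers were dropped while the clock was running.
pub fn count_timed_drops<O>(routine: impl FnOnce(Tracked) -> O) -> usize {
    let clock = Rc::new(Clock::default());
    let input = Tracked { data: vec![1.0; 1024], clock: Rc::clone(&clock) };

    clock.running.set(true); // timed region begins
    let output = black_box(routine(input));
    clock.running.set(false); // timed region ends

    drop(output); // anything the routine returned is freed here, untimed
    clock.timed_drops.get()
}
```

A routine that only returns the sum drops its buffer inside the timed
region, while one that returns `(sum, buf)` defers the free until after
the clock stops.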
Benchmark results before vs. after all changes:
mat2_mul_m time: [1.1043 ns 1.1058 ns 1.1077 ns]
change: [+49.306% +49.651% +50.045%] (p = 0.00 < 0.05)
Performance has regressed.
Found 12 outliers among 100 measurements (12.00%)
4 (4.00%) low severe
2 (2.00%) high mild
6 (6.00%) high severe
mat3_mul_m time: [3.1885 ns 3.1945 ns 3.2038 ns]
change: [+102.62% +103.63% +104.86%] (p = 0.00 < 0.05)
Performance has regressed.
Found 7 outliers among 100 measurements (7.00%)
2 (2.00%) low mild
2 (2.00%) high mild
3 (3.00%) high severe
mat4_mul_m time: [6.7759 ns 6.7840 ns 6.7929 ns]
change: [+130.65% +131.50% +132.59%] (p = 0.00 < 0.05)
Performance has regressed.
Found 11 outliers among 100 measurements (11.00%)
4 (4.00%) low severe
3 (3.00%) high mild
4 (4.00%) high severe
mat2_tr_mul_m time: [1.2882 ns 1.2901 ns 1.2926 ns]
change: [+75.005% +75.472% +75.928%] (p = 0.00 < 0.05)
Performance has regressed.
Found 7 outliers among 100 measurements (7.00%)
3 (3.00%) low severe
1 (1.00%) high mild
3 (3.00%) high severe
mat3_tr_mul_m time: [3.1688 ns 3.1725 ns 3.1770 ns]
change: [+101.61% +102.10% +102.66%] (p = 0.00 < 0.05)
Performance has regressed.
Found 10 outliers among 100 measurements (10.00%)
2 (2.00%) low severe
4 (4.00%) high mild
4 (4.00%) high severe
mat4_tr_mul_m time: [6.5406 ns 6.5453 ns 6.5508 ns]
change: [+121.95% +122.66% +123.42%] (p = 0.00 < 0.05)
Performance has regressed.
Found 15 outliers among 100 measurements (15.00%)
3 (3.00%) low severe
1 (1.00%) low mild
5 (5.00%) high mild
6 (6.00%) high severe
mat2_add_m time: [644.68 ps 645.88 ps 647.24 ps]
change: [−13.049% −12.530% −11.972%] (p = 0.00 < 0.05)
Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
4 (4.00%) low severe
1 (1.00%) low mild
1 (1.00%) high mild
2 (2.00%) high severe
mat3_add_m time: [1.3543 ns 1.3572 ns 1.3607 ns]
change: [−14.707% −13.705% −12.403%] (p = 0.00 < 0.05)
Performance has improved.
Found 15 outliers among 100 measurements (15.00%)
6 (6.00%) low severe
5 (5.00%) high mild
4 (4.00%) high severe
mat4_add_m time: [2.3987 ns 2.4015 ns 2.4044 ns]
change: [−20.676% −19.615% −18.453%] (p = 0.00 < 0.05)
Performance has improved.
Found 14 outliers among 100 measurements (14.00%)
6 (6.00%) low severe
5 (5.00%) high mild
3 (3.00%) high severe
mat2_sub_m time: [637.47 ps 638.88 ps 640.62 ps]
change: [−13.604% −13.020% −12.333%] (p = 0.00 < 0.05)
Performance has improved.
Found 13 outliers among 100 measurements (13.00%)
4 (4.00%) low severe
2 (2.00%) low mild
2 (2.00%) high mild
5 (5.00%) high severe
mat3_sub_m time: [1.3531 ns 1.3546 ns 1.3562 ns]
change: [−15.139% −14.610% −14.084%] (p = 0.00 < 0.05)
Performance has improved.
Found 16 outliers among 100 measurements (16.00%)
5 (5.00%) low severe
1 (1.00%) low mild
6 (6.00%) high mild
4 (4.00%) high severe
mat4_sub_m time: [2.3972 ns 2.3996 ns 2.4021 ns]
change: [−20.412% −19.249% −18.330%] (p = 0.00 < 0.05)
Performance has improved.
Found 10 outliers among 100 measurements (10.00%)
6 (6.00%) low severe
1 (1.00%) high mild
3 (3.00%) high severe
mat2_mul_v time: [774.43 ps 775.48 ps 776.73 ps]
change: [+144.90% +145.51% +146.12%] (p = 0.00 < 0.05)
Performance has regressed.
Found 10 outliers among 100 measurements (10.00%)
2 (2.00%) low severe
5 (5.00%) high mild
3 (3.00%) high severe
mat3_mul_v time: [1.6843 ns 1.6858 ns 1.6874 ns]
change: [+284.57% +285.82% +287.43%] (p = 0.00 < 0.05)
Performance has regressed.
Found 7 outliers among 100 measurements (7.00%)
3 (3.00%) low severe
1 (1.00%) high mild
3 (3.00%) high severe
mat4_mul_v time: [2.6029 ns 2.6196 ns 2.6485 ns]
change: [+255.34% +257.62% +261.68%] (p = 0.00 < 0.05)
Performance has regressed.
Found 10 outliers among 100 measurements (10.00%)
2 (2.00%) low severe
1 (1.00%) low mild
2 (2.00%) high mild
5 (5.00%) high severe
single_mat2_mul_v time: [392.29 ps 393.45 ps 394.87 ps]
Found 8 outliers among 100 measurements (8.00%)
6 (6.00%) high mild
2 (2.00%) high severe
single_mat3_mul_v time: [650.16 ps 651.47 ps 653.07 ps]
Found 9 outliers among 100 measurements (9.00%)
2 (2.00%) low severe
3 (3.00%) high mild
4 (4.00%) high severe
single_mat4_mul_v time: [1.0665 ns 1.0690 ns 1.0722 ns]
Found 10 outliers among 100 measurements (10.00%)
2 (2.00%) low mild
4 (4.00%) high mild
4 (4.00%) high severe
mat2_tr_mul_v time: [719.95 ps 720.92 ps 722.16 ps]
change: [+127.86% +128.34% +128.98%] (p = 0.00 < 0.05)
Performance has regressed.
Found 14 outliers among 100 measurements (14.00%)
1 (1.00%) low severe
2 (2.00%) low mild
7 (7.00%) high mild
4 (4.00%) high severe
mat3_tr_mul_v time: [1.6551 ns 1.6564 ns 1.6577 ns]
change: [+277.57% +278.32% +279.16%] (p = 0.00 < 0.05)
Performance has regressed.
Found 10 outliers among 100 measurements (10.00%)
2 (2.00%) low severe
1 (1.00%) low mild
5 (5.00%) high mild
2 (2.00%) high severe
mat4_tr_mul_v time: [2.6477 ns 2.6546 ns 2.6666 ns]
change: [+259.47% +260.55% +261.67%] (p = 0.00 < 0.05)
Performance has regressed.
Found 9 outliers among 100 measurements (9.00%)
3 (3.00%) low severe
3 (3.00%) high mild
3 (3.00%) high severe
single_mat2_tr_mul_v time: [353.60 ps 355.50 ps 358.48 ps]
Found 10 outliers among 100 measurements (10.00%)
3 (3.00%) low mild
4 (4.00%) high mild
3 (3.00%) high severe
single_mat3_tr_mul_v time: [778.13 ps 779.43 ps 781.25 ps]
Found 10 outliers among 100 measurements (10.00%)
2 (2.00%) low severe
3 (3.00%) high mild
5 (5.00%) high severe
single_mat4_tr_mul_v time: [1.1887 ns 1.1906 ns 1.1930 ns]
Found 8 outliers among 100 measurements (8.00%)
3 (3.00%) low mild
2 (2.00%) high mild
3 (3.00%) high severe
mat2_mul_s time: [774.44 ps 775.33 ps 776.37 ps]
change: [+6.0947% +6.3308% +6.5936%] (p = 0.00 < 0.05)
Performance has regressed.
Found 12 outliers among 100 measurements (12.00%)
2 (2.00%) low severe
2 (2.00%) low mild
4 (4.00%) high mild
4 (4.00%) high severe
mat3_mul_s time: [962.59 ps 964.98 ps 967.43 ps]
change: [−38.097% −37.694% −37.145%] (p = 0.00 < 0.05)
Performance has improved.
Found 10 outliers among 100 measurements (10.00%)
1 (1.00%) low severe
3 (3.00%) low mild
2 (2.00%) high mild
4 (4.00%) high severe
mat4_mul_s time: [1.6589 ns 1.6640 ns 1.6684 ns]
change: [−43.668% −43.130% −42.518%] (p = 0.00 < 0.05)
Performance has improved.
Found 18 outliers among 100 measurements (18.00%)
8 (8.00%) low severe
3 (3.00%) low mild
1 (1.00%) high mild
6 (6.00%) high severe
mat2_div_s time: [803.09 ps 804.70 ps 806.56 ps]
change: [+10.272% +10.596% +10.960%] (p = 0.00 < 0.05)
Performance has regressed.
Found 10 outliers among 100 measurements (10.00%)
3 (3.00%) low severe
1 (1.00%) low mild
3 (3.00%) high mild
3 (3.00%) high severe
mat3_div_s time: [2.4929 ns 2.4947 ns 2.4967 ns]
change: [+58.793% +59.185% +59.709%] (p = 0.00 < 0.05)
Performance has regressed.
Found 12 outliers among 100 measurements (12.00%)
3 (3.00%) low severe
5 (5.00%) high mild
4 (4.00%) high severe
mat4_div_s time: [5.1650 ns 5.1688 ns 5.1735 ns]
change: [+76.816% +77.215% +77.629%] (p = 0.00 < 0.05)
Performance has regressed.
Found 9 outliers among 100 measurements (9.00%)
2 (2.00%) low severe
1 (1.00%) low mild
4 (4.00%) high mild
2 (2.00%) high severe
mat2_inv time: [1.1514 ns 1.1523 ns 1.1533 ns]
change: [−41.682% −41.556% −41.439%] (p = 0.00 < 0.05)
Performance has improved.
Found 11 outliers among 100 measurements (11.00%)
3 (3.00%) low severe
1 (1.00%) low mild
5 (5.00%) high mild
2 (2.00%) high severe
mat3_inv time: [3.3641 ns 3.3707 ns 3.3826 ns]
change: [−37.473% −37.358% −37.214%] (p = 0.00 < 0.05)
Performance has improved.
Found 12 outliers among 100 measurements (12.00%)
1 (1.00%) low severe
1 (1.00%) low mild
5 (5.00%) high mild
5 (5.00%) high severe
mat4_inv time: [25.970 ns 26.006 ns 26.062 ns]
change: [−9.0865% −8.9013% −8.6986%] (p = 0.00 < 0.05)
Performance has improved.
Found 14 outliers among 100 measurements (14.00%)
3 (3.00%) low severe
2 (2.00%) low mild
3 (3.00%) high mild
6 (6.00%) high severe
mat2_transpose time: [409.94 ps 410.77 ps 411.75 ps]
change: [−62.889% −62.624% −62.331%] (p = 0.00 < 0.05)
Performance has improved.
Found 17 outliers among 100 measurements (17.00%)
4 (4.00%) low severe
2 (2.00%) low mild
4 (4.00%) high mild
7 (7.00%) high severe
mat3_transpose time: [947.42 ps 953.20 ps 961.97 ps]
change: [−61.273% −60.195% −58.616%] (p = 0.00 < 0.05)
Performance has improved.
Found 11 outliers among 100 measurements (11.00%)
1 (1.00%) low mild
7 (7.00%) high mild
3 (3.00%) high severe
mat4_transpose time: [1.6510 ns 1.6551 ns 1.6612 ns]
change: [−65.877% −65.592% −65.225%] (p = 0.00 < 0.05)
Performance has improved.
Found 13 outliers among 100 measurements (13.00%)
5 (5.00%) low severe
1 (1.00%) low mild
2 (2.00%) high mild
5 (5.00%) high severe
mat_div_scalar time: [480.25 µs 480.55 µs 480.99 µs]
change: [−22.235% −22.169% −22.095%] (p = 0.00 < 0.05)
Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
3 (3.00%) high mild
3 (3.00%) high severe
mat100_add_mat100 time: [3.0426 µs 3.0910 µs 3.1351 µs]
change: [+81.145% +84.392% +88.112%] (p = 0.00 < 0.05)
Performance has regressed.
Found 13 outliers among 100 measurements (13.00%)
2 (2.00%) low severe
3 (3.00%) low mild
7 (7.00%) high mild
1 (1.00%) high severe
mat4_mul_mat4 time: [36.836 ns 36.859 ns 36.886 ns]
change: [+24.966% +25.568% +26.171%] (p = 0.00 < 0.05)
Performance has regressed.
Found 13 outliers among 100 measurements (13.00%)
7 (7.00%) low severe
4 (4.00%) high mild
2 (2.00%) high severe
mat5_mul_mat5 time: [56.715 ns 56.876 ns 57.015 ns]
change: [+10.239% +10.666% +11.091%] (p = 0.00 < 0.05)
Performance has regressed.
Found 8 outliers among 100 measurements (8.00%)
1 (1.00%) low severe
1 (1.00%) low mild
6 (6.00%) high mild
mat6_mul_mat6 time: [83.817 ns 83.999 ns 84.156 ns]
change: [+10.675% +10.890% +11.065%] (p = 0.00 < 0.05)
Performance has regressed.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) low mild
mat7_mul_mat7 time: [93.211 ns 93.386 ns 93.534 ns]
change: [+10.654% +10.892% +11.129%] (p = 0.00 < 0.05)
Performance has regressed.
Found 3 outliers among 100 measurements (3.00%)
1 (1.00%) low severe
2 (2.00%) low mild
mat8_mul_mat8 time: [88.919 ns 89.410 ns 89.884 ns]
change: [+22.808% +23.376% +23.888%] (p = 0.00 < 0.05)
Performance has regressed.
Found 2 outliers among 100 measurements (2.00%)
1 (1.00%) low mild
1 (1.00%) high mild
mat9_mul_mat9 time: [207.12 ns 209.04 ns 211.17 ns]
change: [+14.053% +14.646% +15.258%] (p = 0.00 < 0.05)
Performance has regressed.
Found 10 outliers among 100 measurements (10.00%)
9 (9.00%) low mild
1 (1.00%) high mild
mat10_mul_mat10 time: [236.75 ns 237.11 ns 237.47 ns]
change: [+20.055% +20.366% +20.651%] (p = 0.00 < 0.05)
Performance has regressed.
Found 13 outliers among 100 measurements (13.00%)
5 (5.00%) low severe
7 (7.00%) low mild
1 (1.00%) high mild
mat10_mul_mat10_static time: [116.68 ns 117.15 ns 117.62 ns]
change: [+11.160% +11.617% +12.049%] (p = 0.00 < 0.05)
Performance has regressed.
mat100_mul_mat100 time: [40.188 µs 40.327 µs 40.459 µs]
change: [+3.2490% +3.4765% +3.7130%] (p = 0.00 < 0.05)
Performance has regressed.
Found 15 outliers among 100 measurements (15.00%)
7 (7.00%) high mild
8 (8.00%) high severe
mat500_mul_mat500 time: [4.3909 ms 4.3944 ms 4.3978 ms]
change: [+0.8556% +0.9519% +1.0448%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 9 outliers among 100 measurements (9.00%)
6 (6.00%) low severe
2 (2.00%) high mild
1 (1.00%) high severe
iter time: [840.01 µs 840.39 µs 840.81 µs]
change: [+10.527% +10.726% +10.915%] (p = 0.00 < 0.05)
Performance has regressed.
Found 13 outliers among 100 measurements (13.00%)
2 (2.00%) high mild
11 (11.00%) high severe
iter_rev time: [210.14 µs 211.10 µs 212.84 µs]
change: [+0.2455% +0.7119% +1.7846%] (p = 0.02 < 0.05)
Change within noise threshold.
Found 8 outliers among 100 measurements (8.00%)
2 (2.00%) high mild
6 (6.00%) high severe
copy_from time: [199.77 µs 200.80 µs 202.55 µs]
change: [+41.195% +41.962% +43.287%] (p = 0.00 < 0.05)
Performance has regressed.
Found 9 outliers among 100 measurements (9.00%)
8 (8.00%) low mild
1 (1.00%) high severe
axpy time: [31.301 µs 33.301 µs 34.957 µs]
change: [+40.726% +52.001% +63.112%] (p = 0.00 < 0.05)
Performance has regressed.
tr_mul_to time: [126.46 µs 127.12 µs 128.09 µs]
change: [−4.0124% −3.5145% −2.7708%] (p = 0.00 < 0.05)
Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
2 (2.00%) high severe
mat_mul_mat time: [39.252 µs 39.443 µs 39.626 µs]
change: [−0.7084% −0.3800% −0.0130%] (p = 0.02 < 0.05)
Change within noise threshold.
Found 11 outliers among 100 measurements (11.00%)
1 (1.00%) low mild
8 (8.00%) high mild
2 (2.00%) high severe
mat100_from_fn time: [6.8398 µs 6.8418 µs 6.8446 µs]
change: [+519.35% +522.43% +524.76%] (p = 0.00 < 0.05)
Performance has regressed.
Found 13 outliers among 100 measurements (13.00%)
4 (4.00%) high mild
9 (9.00%) high severe
mat500_from_fn time: [172.11 µs 172.14 µs 172.18 µs]
change: [+498.70% +499.32% +499.93%] (p = 0.00 < 0.05)
Performance has regressed.
Found 13 outliers among 100 measurements (13.00%)
1 (1.00%) low mild
5 (5.00%) high mild
7 (7.00%) high severe
vec2_add_v_f32 time: [303.98 ps 304.76 ps 305.65 ps]
change: [−5.1499% −4.3536% −3.5996%] (p = 0.00 < 0.05)
Performance has improved.
Found 15 outliers among 100 measurements (15.00%)
4 (4.00%) low severe
5 (5.00%) high mild
6 (6.00%) high severe
vec3_add_v_f32 time: [586.36 ps 587.93 ps 589.92 ps]
change: [+34.275% +34.886% +35.631%] (p = 0.00 < 0.05)
Performance has regressed.
Found 12 outliers among 100 measurements (12.00%)
1 (1.00%) low mild
5 (5.00%) high mild
6 (6.00%) high severe
vec4_add_v_f32 time: [603.45 ps 604.44 ps 605.59 ps]
change: [−18.949% −18.215% −17.623%] (p = 0.00 < 0.05)
Performance has improved.
Found 14 outliers among 100 measurements (14.00%)
5 (5.00%) low severe
2 (2.00%) low mild
2 (2.00%) high mild
5 (5.00%) high severe
vec2_add_v_f64 time: [602.08 ps 602.83 ps 603.64 ps]
change: [+89.139% +90.573% +91.808%] (p = 0.00 < 0.05)
Performance has regressed.
Found 13 outliers among 100 measurements (13.00%)
4 (4.00%) low severe
1 (1.00%) low mild
3 (3.00%) high mild
5 (5.00%) high severe
vec3_add_v_f64 time: [910.94 ps 912.60 ps 914.56 ps]
change: [+107.10% +108.18% +109.41%] (p = 0.00 < 0.05)
Performance has regressed.
Found 12 outliers among 100 measurements (12.00%)
3 (3.00%) low severe
6 (6.00%) high mild
3 (3.00%) high severe
vec4_add_v_f64 time: [1.1894 ns 1.1933 ns 1.1963 ns]
change: [+82.607% +85.023% +86.911%] (p = 0.00 < 0.05)
Performance has regressed.
Found 13 outliers among 100 measurements (13.00%)
9 (9.00%) low severe
2 (2.00%) low mild
2 (2.00%) high severe
vec2_sub_v time: [303.45 ps 304.42 ps 305.37 ps]
change: [−5.3598% −4.4578% −3.6738%] (p = 0.00 < 0.05)
Performance has improved.
Found 15 outliers among 100 measurements (15.00%)
8 (8.00%) low severe
1 (1.00%) low mild
3 (3.00%) high mild
3 (3.00%) high severe
vec3_sub_v time: [672.95 ps 674.82 ps 676.51 ps]
change: [+51.463% +52.336% +53.346%] (p = 0.00 < 0.05)
Performance has regressed.
Found 4 outliers among 100 measurements (4.00%)
1 (1.00%) low mild
2 (2.00%) high mild
1 (1.00%) high severe
vec4_sub_v time: [602.84 ps 604.65 ps 607.70 ps]
change: [−19.744% −18.754% −17.881%] (p = 0.00 < 0.05)
Performance has improved.
Found 13 outliers among 100 measurements (13.00%)
6 (6.00%) low severe
1 (1.00%) low mild
2 (2.00%) high mild
4 (4.00%) high severe
vec2_mul_s time: [666.49 ps 667.29 ps 668.31 ps]
change: [+111.37% +111.81% +112.32%] (p = 0.00 < 0.05)
Performance has regressed.
Found 16 outliers among 100 measurements (16.00%)
4 (4.00%) low severe
6 (6.00%) high mild
6 (6.00%) high severe
vec3_mul_s time: [511.42 ps 513.44 ps 515.86 ps]
change: [+15.556% +16.273% +17.049%] (p = 0.00 < 0.05)
Performance has regressed.
Found 6 outliers among 100 measurements (6.00%)
5 (5.00%) high mild
1 (1.00%) high severe
vec4_mul_s time: [774.13 ps 775.22 ps 776.52 ps]
change: [+5.1602% +5.5545% +6.0225%] (p = 0.00 < 0.05)
Performance has regressed.
Found 13 outliers among 100 measurements (13.00%)
1 (1.00%) low severe
2 (2.00%) low mild
3 (3.00%) high mild
7 (7.00%) high severe
vec2_div_s time: [1.3658 ns 1.3694 ns 1.3726 ns]
change: [+328.67% +329.83% +331.09%] (p = 0.00 < 0.05)
Performance has regressed.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high severe
vec3_div_s time: [607.73 ps 608.63 ps 609.66 ps]
change: [+37.642% +38.017% +38.440%] (p = 0.00 < 0.05)
Performance has regressed.
Found 16 outliers among 100 measurements (16.00%)
2 (2.00%) low severe
8 (8.00%) high mild
6 (6.00%) high severe
vec4_div_s time: [802.59 ps 803.62 ps 804.82 ps]
change: [+8.9451% +9.3240% +9.7149%] (p = 0.00 < 0.05)
Performance has regressed.
Found 11 outliers among 100 measurements (11.00%)
3 (3.00%) low severe
6 (6.00%) high mild
2 (2.00%) high severe
vec2_dot_f32 time: [461.20 ps 461.73 ps 462.30 ps]
change: [+117.88% +119.27% +120.79%] (p = 0.00 < 0.05)
Performance has regressed.
Found 16 outliers among 100 measurements (16.00%)
2 (2.00%) low severe
2 (2.00%) low mild
3 (3.00%) high mild
9 (9.00%) high severe
vec3_dot_f32 time: [688.24 ps 689.05 ps 689.95 ps]
change: [+225.49% +227.19% +229.16%] (p = 0.00 < 0.05)
Performance has regressed.
Found 10 outliers among 100 measurements (10.00%)
1 (1.00%) low mild
4 (4.00%) high mild
5 (5.00%) high severe
vec4_dot_f32 time: [917.20 ps 921.23 ps 928.57 ps]
change: [+338.59% +341.30% +344.17%] (p = 0.00 < 0.05)
Performance has regressed.
Found 13 outliers among 100 measurements (13.00%)
8 (8.00%) high mild
5 (5.00%) high severe
vec2_dot_f64 time: [596.11 ps 597.51 ps 598.79 ps]
change: [+177.79% +179.60% +182.13%] (p = 0.00 < 0.05)
Performance has regressed.
Found 3 outliers among 100 measurements (3.00%)
2 (2.00%) high mild
1 (1.00%) high severe
vec3_dot_f64 time: [749.32 ps 751.02 ps 752.81 ps]
change: [+253.48% +257.12% +262.11%] (p = 0.00 < 0.05)
Performance has regressed.
Found 10 outliers among 100 measurements (10.00%)
3 (3.00%) high mild
7 (7.00%) high severe
vec4_dot_f64 time: [1.0145 ns 1.0185 ns 1.0230 ns]
change: [+376.34% +379.47% +383.46%] (p = 0.00 < 0.05)
Performance has regressed.
Found 5 outliers among 100 measurements (5.00%)
3 (3.00%) high mild
2 (2.00%) high severe
vec3_cross time: [971.01 ps 971.87 ps 972.73 ps]
change: [+122.34% +122.74% +123.17%] (p = 0.00 < 0.05)
Performance has regressed.
Found 10 outliers among 100 measurements (10.00%)
2 (2.00%) low severe
1 (1.00%) low mild
3 (3.00%) high mild
4 (4.00%) high severe
vec2_norm time: [1.0612 ns 1.0623 ns 1.0637 ns]
change: [−0.0722% +0.0499% +0.1765%] (p = 0.44 > 0.05)
No change in performance detected.
Found 6 outliers among 100 measurements (6.00%)
4 (4.00%) low mild
2 (2.00%) high severe
vec3_norm time: [1.0649 ns 1.0665 ns 1.0694 ns]
change: [−4.3787% −4.1856% −3.8679%] (p = 0.00 < 0.05)
Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
2 (2.00%) high mild
2 (2.00%) high severe
vec4_norm time: [1.0733 ns 1.0739 ns 1.0746 ns]
change: [−4.5616% −3.9738% −2.9157%] (p = 0.00 < 0.05)
Performance has improved.
Found 19 outliers among 100 measurements (19.00%)
2 (2.00%) low severe
7 (7.00%) low mild
5 (5.00%) high mild
5 (5.00%) high severe
vec2_normalize time: [2.5310 ns 2.5326 ns 2.5345 ns]
change: [+3.5769% +3.6696% +3.7678%] (p = 0.00 < 0.05)
Performance has regressed.
Found 2 outliers among 100 measurements (2.00%)
1 (1.00%) high mild
1 (1.00%) high severe
vec3_normalize time: [2.5389 ns 2.5409 ns 2.5424 ns]
change: [+1.1411% +1.2860% +1.4910%] (p = 0.00 < 0.05)
Performance has regressed.
Found 2 outliers among 100 measurements (2.00%)
1 (1.00%) high mild
1 (1.00%) high severe
vec4_normalize time: [1.8154 ns 1.8164 ns 1.8173 ns]
change: [−1.1191% −0.9926% −0.8485%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 8 outliers among 100 measurements (8.00%)
3 (3.00%) low severe
1 (1.00%) low mild
1 (1.00%) high mild
3 (3.00%) high severe
vec10000_dot_f64 time: [2.0296 µs 2.0337 µs 2.0383 µs]
change: [+71.107% +72.619% +74.228%] (p = 0.00 < 0.05)
Performance has regressed.
Found 11 outliers among 100 measurements (11.00%)
4 (4.00%) low severe
3 (3.00%) high mild
4 (4.00%) high severe
vec10000_dot_f32 time: [1.1891 µs 1.1926 µs 1.1962 µs]
change: [+6.3585% +7.1059% +7.9357%] (p = 0.00 < 0.05)
Performance has regressed.
Found 12 outliers among 100 measurements (12.00%)
1 (1.00%) low severe
1 (1.00%) low mild
4 (4.00%) high mild
6 (6.00%) high severe
vec10000_axpy_f64 time: [2.0702 µs 2.0739 µs 2.0777 µs]
change: [+39.373% +40.227% +41.210%] (p = 0.00 < 0.05)
Performance has regressed.
Found 10 outliers among 100 measurements (10.00%)
3 (3.00%) low severe
1 (1.00%) low mild
4 (4.00%) high mild
2 (2.00%) high severe
vec10000_axpy_beta_f64 time: [2.0914 µs 2.0962 µs 2.1012 µs]
change: [+31.958% +32.843% +33.467%] (p = 0.00 < 0.05)
Performance has regressed.
Found 11 outliers among 100 measurements (11.00%)
4 (4.00%) low severe
5 (5.00%) high mild
2 (2.00%) high severe
vec10000_axpy_f64_slice time: [2.0272 µs 2.0303 µs 2.0335 µs]
change: [+35.880% +36.621% +37.307%] (p = 0.00 < 0.05)
Performance has regressed.
Found 6 outliers among 100 measurements (6.00%)
3 (3.00%) low severe
2 (2.00%) high mild
1 (1.00%) high severe
vec10000_axpy_f64_static
time: [13.917 µs 13.965 µs 14.005 µs]
change: [+859.61% +869.73% +879.35%] (p = 0.00 < 0.05)
Performance has regressed.
Found 6 outliers among 100 measurements (6.00%)
1 (1.00%) low severe
3 (3.00%) high mild
2 (2.00%) high severe
vec10000_axpy_f32 time: [1.0402 µs 1.0421 µs 1.0437 µs]
change: [+38.710% +39.603% +40.363%] (p = 0.00 < 0.05)
Performance has regressed.
Found 9 outliers among 100 measurements (9.00%)
5 (5.00%) low severe
1 (1.00%) low mild
2 (2.00%) high mild
1 (1.00%) high severe
vec10000_axpy_beta_f32 time: [1.0329 µs 1.0346 µs 1.0364 µs]
change: [+30.705% +31.490% +32.040%] (p = 0.00 < 0.05)
Performance has regressed.
Found 8 outliers among 100 measurements (8.00%)
4 (4.00%) low severe
1 (1.00%) low mild
2 (2.00%) high mild
1 (1.00%) high severe
quaternion_add_q time: [642.58 ps 650.39 ps 662.45 ps]
change: [−11.788% −10.934% −9.9463%] (p = 0.00 < 0.05)
Performance has improved.
Found 14 outliers among 100 measurements (14.00%)
2 (2.00%) low severe
2 (2.00%) low mild
4 (4.00%) high mild
6 (6.00%) high severe
quaternion_sub_q time: [641.16 ps 643.22 ps 645.88 ps]
change: [−12.654% −11.822% −10.943%] (p = 0.00 < 0.05)
Performance has improved.
Found 15 outliers among 100 measurements (15.00%)
5 (5.00%) low severe
1 (1.00%) low mild
5 (5.00%) high mild
4 (4.00%) high severe
quaternion_mul_q time: [1.4252 ns 1.4271 ns 1.4294 ns]
change: [+94.545% +95.022% +95.499%] (p = 0.00 < 0.05)
Performance has regressed.
Found 12 outliers among 100 measurements (12.00%)
1 (1.00%) low severe
2 (2.00%) low mild
4 (4.00%) high mild
5 (5.00%) high severe
unit_quaternion_mul_v time: [1.4859 ns 1.4874 ns 1.4890 ns]
change: [+242.77% +243.56% +244.31%] (p = 0.00 < 0.05)
Performance has regressed.
Found 3 outliers among 100 measurements (3.00%)
3 (3.00%) high mild
single_unit_quaternion_mul_v
time: [1.0422 ns 1.0457 ns 1.0504 ns]
Found 9 outliers among 100 measurements (9.00%)
1 (1.00%) low severe
4 (4.00%) high mild
4 (4.00%) high severe
quaternion_mul_s time: [771.17 ps 772.18 ps 773.37 ps]
change: [+6.1278% +6.4276% +6.7583%] (p = 0.00 < 0.05)
Performance has regressed.
Found 9 outliers among 100 measurements (9.00%)
3 (3.00%) low mild
3 (3.00%) high mild
3 (3.00%) high severe
quaternion_div_s time: [798.54 ps 799.82 ps 801.43 ps]
change: [+9.2123% +9.7287% +10.338%] (p = 0.00 < 0.05)
Performance has regressed.
Found 13 outliers among 100 measurements (13.00%)
2 (2.00%) low severe
2 (2.00%) low mild
4 (4.00%) high mild
5 (5.00%) high severe
quaternion_inv time: [1.2401 ns 1.2408 ns 1.2417 ns]
change: [−43.660% −43.521% −43.317%] (p = 0.00 < 0.05)
Performance has improved.
Found 13 outliers among 100 measurements (13.00%)
2 (2.00%) low severe
5 (5.00%) high mild
6 (6.00%) high severe
unit_quaternion_inv time: [596.01 ps 598.93 ps 602.66 ps]
change: [−49.707% −49.184% −48.445%] (p = 0.00 < 0.05)
Performance has improved.
Found 15 outliers among 100 measurements (15.00%)
6 (6.00%) high mild
9 (9.00%) high severe
quaternion_conjugate time: [604.36 ps 608.60 ps 613.48 ps]
Found 12 outliers among 100 measurements (12.00%)
3 (3.00%) high mild
9 (9.00%) high severe
quaternion_normalize time: [1.8268 ns 1.8274 ns 1.8281 ns]
Found 18 outliers among 100 measurements (18.00%)
4 (4.00%) low severe
4 (4.00%) low mild
7 (7.00%) high mild
3 (3.00%) high severe
bidiagonalize_100x100 time: [265.91 µs 266.00 µs 266.11 µs]
change: [+0.7553% +0.8363% +0.9114%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 8 outliers among 100 measurements (8.00%)
5 (5.00%) high mild
3 (3.00%) high severe
bidiagonalize_100x500 time: [2.0053 ms 2.0060 ms 2.0065 ms]
change: [+4.0325% +4.2372% +4.3938%] (p = 0.00 < 0.05)
Performance has regressed.
Found 12 outliers among 100 measurements (12.00%)
5 (5.00%) low severe
2 (2.00%) high mild
5 (5.00%) high severe
bidiagonalize_4x4 time: [266.92 ns 267.24 ns 267.62 ns]
change: [+7.1063% +7.2057% +7.3231%] (p = 0.00 < 0.05)
Performance has regressed.
Found 23 outliers among 100 measurements (23.00%)
1 (1.00%) low severe
5 (5.00%) low mild
13 (13.00%) high mild
4 (4.00%) high severe
Benchmarking bidiagonalize_500x100: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 9.1s, enable flat sampling, or reduce sample count to 50.
bidiagonalize_500x100 time: [1.6781 ms 1.6793 ms 1.6804 ms]
change: [+1.3944% +1.5312% +1.6400%] (p = 0.00 < 0.05)
Performance has regressed.
bidiagonalize_unpack_100x100
time: [522.13 µs 522.36 µs 522.63 µs]
change: [−0.5318% −0.4044% −0.2627%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 12 outliers among 100 measurements (12.00%)
1 (1.00%) low mild
4 (4.00%) high mild
7 (7.00%) high severe
bidiagonalize_unpack_100x500
time: [2.9858 ms 2.9916 ms 2.9976 ms]
change: [−0.7824% −0.3995% −0.0370%] (p = 0.04 < 0.05)
Change within noise threshold.
bidiagonalize_unpack_500x100
time: [2.5884 ms 2.5896 ms 2.5910 ms]
change: [+0.0767% +0.1539% +0.2316%] (p = 0.00 < 0.05)
Change within noise threshold.
cholesky_100x100 time: [31.084 µs 31.101 µs 31.122 µs]
change: [−5.0365% −4.7949% −4.4205%] (p = 0.00 < 0.05)
Performance has improved.
Found 16 outliers among 100 measurements (16.00%)
2 (2.00%) low severe
4 (4.00%) low mild
1 (1.00%) high mild
9 (9.00%) high severe
cholesky_500x500 time: [4.4799 ms 4.4849 ms 4.4903 ms]
change: [−0.5985% −0.3685% −0.1374%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 3 outliers among 100 measurements (3.00%)
2 (2.00%) high mild
1 (1.00%) high severe
cholesky_decompose_unpack_100x100
time: [31.659 µs 31.685 µs 31.727 µs]
change: [−4.9712% −4.7445% −4.3325%] (p = 0.00 < 0.05)
Performance has improved.
Found 15 outliers among 100 measurements (15.00%)
4 (4.00%) low severe
4 (4.00%) low mild
2 (2.00%) high mild
5 (5.00%) high severe
cholesky_decompose_unpack_500x500
time: [4.4795 ms 4.4845 ms 4.4910 ms]
change: [−1.9595% −1.7121% −1.4978%] (p = 0.00 < 0.05)
Performance has improved.
Found 14 outliers among 100 measurements (14.00%)
3 (3.00%) low severe
1 (1.00%) low mild
3 (3.00%) high mild
7 (7.00%) high severe
cholesky_solve_10x10 time: [170.70 ns 170.76 ns 170.82 ns]
change: [+8.0936% +8.1777% +8.2764%] (p = 0.00 < 0.05)
Performance has regressed.
Found 10 outliers among 100 measurements (10.00%)
3 (3.00%) low mild
5 (5.00%) high mild
2 (2.00%) high severe
cholesky_solve_100x100 time: [2.9071 µs 2.9117 µs 2.9174 µs]
change: [+8.4770% +8.9956% +9.6254%] (p = 0.00 < 0.05)
Performance has regressed.
Found 7 outliers among 100 measurements (7.00%)
1 (1.00%) low severe
3 (3.00%) low mild
2 (2.00%) high mild
1 (1.00%) high severe
cholesky_solve_500x500 time: [54.193 µs 54.303 µs 54.417 µs]
change: [+3.9332% +4.1755% +4.4477%] (p = 0.00 < 0.05)
Performance has regressed.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high mild
cholesky_inverse_10x10 time: [1.3189 µs 1.3195 µs 1.3201 µs]
change: [+2.5360% +2.6238% +2.7131%] (p = 0.00 < 0.05)
Performance has regressed.
Found 7 outliers among 100 measurements (7.00%)
2 (2.00%) high mild
5 (5.00%) high severe
cholesky_inverse_100x100
time: [270.85 µs 270.88 µs 270.92 µs]
change: [−0.9726% −0.8524% −0.7319%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 9 outliers among 100 measurements (9.00%)
1 (1.00%) low severe
4 (4.00%) low mild
2 (2.00%) high mild
2 (2.00%) high severe
cholesky_inverse_500x500
time: [26.673 ms 26.694 ms 26.714 ms]
change: [+1.0784% +1.1816% +1.2794%] (p = 0.00 < 0.05)
Performance has regressed.
Found 23 outliers among 100 measurements (23.00%)
19 (19.00%) low severe
2 (2.00%) low mild
2 (2.00%) high severe
full_piv_lu_decompose_10x10
time: [582.31 ns 582.48 ns 582.67 ns]
change: [+19.583% +19.702% +19.795%] (p = 0.00 < 0.05)
Performance has regressed.
Found 10 outliers among 100 measurements (10.00%)
2 (2.00%) low severe
6 (6.00%) high mild
2 (2.00%) high severe
full_piv_lu_decompose_100x100
time: [218.73 µs 218.78 µs 218.84 µs]
change: [+5.8729% +5.9828% +6.0904%] (p = 0.00 < 0.05)
Performance has regressed.
Found 8 outliers among 100 measurements (8.00%)
2 (2.00%) low severe
5 (5.00%) low mild
1 (1.00%) high severe
full_piv_lu_solve_10x10 time: [124.88 ns 124.94 ns 125.02 ns]
change: [+7.4724% +7.6252% +7.7787%] (p = 0.00 < 0.05)
Performance has regressed.
Found 13 outliers among 100 measurements (13.00%)
3 (3.00%) low severe
6 (6.00%) high mild
4 (4.00%) high severe
full_piv_lu_solve_100x100
time: [2.5202 µs 2.5244 µs 2.5289 µs]
change: [+11.226% +11.847% +12.518%] (p = 0.00 < 0.05)
Performance has regressed.
Found 17 outliers among 100 measurements (17.00%)
14 (14.00%) low severe
2 (2.00%) low mild
1 (1.00%) high mild
full_piv_lu_inverse_10x10
time: [869.61 ns 870.27 ns 871.19 ns]
change: [+4.7996% +4.9224% +5.0608%] (p = 0.00 < 0.05)
Performance has regressed.
Found 7 outliers among 100 measurements (7.00%)
2 (2.00%) low severe
1 (1.00%) high mild
4 (4.00%) high severe
full_piv_lu_inverse_100x100
time: [212.68 µs 212.83 µs 213.05 µs]
change: [−0.2835% −0.0351% +0.1310%] (p = 0.80 > 0.05)
No change in performance detected.
Found 13 outliers among 100 measurements (13.00%)
1 (1.00%) low severe
4 (4.00%) low mild
3 (3.00%) high mild
5 (5.00%) high severe
full_piv_lu_determinant_10x10
time: [15.320 ns 15.338 ns 15.357 ns]
change: [+410.70% +421.41% +430.41%] (p = 0.00 < 0.05)
Performance has regressed.
Found 13 outliers among 100 measurements (13.00%)
9 (9.00%) low severe
1 (1.00%) low mild
3 (3.00%) high mild
full_piv_lu_determinant_100x100
time: [137.44 ns 139.37 ns 141.00 ns]
change: [+213.54% +227.75% +241.42%] (p = 0.00 < 0.05)
Performance has regressed.
hessenberg_decompose_4x4
time: [82.510 ns 82.538 ns 82.564 ns]
change: [−27.950% −27.887% −27.830%] (p = 0.00 < 0.05)
Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high mild
hessenberg_decompose_100x100
time: [295.98 µs 296.16 µs 296.44 µs]
change: [+3.3234% +3.5705% +3.7986%] (p = 0.00 < 0.05)
Performance has regressed.
Found 8 outliers among 100 measurements (8.00%)
2 (2.00%) low mild
2 (2.00%) high mild
4 (4.00%) high severe
hessenberg_decompose_200x200
time: [2.2647 ms 2.2681 ms 2.2714 ms]
change: [+4.8426% +4.9983% +5.1646%] (p = 0.00 < 0.05)
Performance has regressed.
hessenberg_decompose_unpack_100x100
time: [435.30 µs 435.75 µs 436.12 µs]
change: [+2.7479% +2.8420% +2.9424%] (p = 0.00 < 0.05)
Performance has regressed.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high severe
hessenberg_decompose_unpack_200x200
time: [3.2667 ms 3.2678 ms 3.2690 ms]
change: [+3.9624% +4.0021% +4.0423%] (p = 0.00 < 0.05)
Performance has regressed.
Found 22 outliers among 100 measurements (22.00%)
13 (13.00%) low severe
1 (1.00%) low mild
3 (3.00%) high mild
5 (5.00%) high severe
lu_decompose_10x10 time: [353.04 ns 353.16 ns 353.31 ns]
change: [−5.0408% −4.9435% −4.8487%] (p = 0.00 < 0.05)
Performance has improved.
Found 19 outliers among 100 measurements (19.00%)
4 (4.00%) low severe
4 (4.00%) low mild
6 (6.00%) high mild
5 (5.00%) high severe
lu_decompose_100x100 time: [71.544 µs 71.560 µs 71.579 µs]
change: [−1.7176% −1.6430% −1.5721%] (p = 0.00 < 0.05)
Performance has improved.
Found 9 outliers among 100 measurements (9.00%)
2 (2.00%) low severe
2 (2.00%) low mild
2 (2.00%) high mild
3 (3.00%) high severe
lu_solve_10x10 time: [115.42 ns 115.52 ns 115.61 ns]
change: [+3.9363% +4.1024% +4.2557%] (p = 0.00 < 0.05)
Performance has regressed.
Found 15 outliers among 100 measurements (15.00%)
4 (4.00%) low severe
8 (8.00%) low mild
2 (2.00%) high mild
1 (1.00%) high severe
lu_solve_100x100 time: [2.5152 µs 2.5190 µs 2.5225 µs]
change: [+15.120% +15.625% +16.088%] (p = 0.00 < 0.05)
Performance has regressed.
Found 7 outliers among 100 measurements (7.00%)
4 (4.00%) low severe
2 (2.00%) low mild
1 (1.00%) high mild
lu_inverse_10x10 time: [902.55 ns 903.32 ns 903.97 ns]
change: [+0.7407% +0.8734% +1.0263%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 2 outliers among 100 measurements (2.00%)
1 (1.00%) low mild
1 (1.00%) high severe
lu_inverse_100x100 time: [216.21 µs 216.47 µs 216.80 µs]
change: [−0.6663% −0.5584% −0.4316%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 18 outliers among 100 measurements (18.00%)
2 (2.00%) low severe
4 (4.00%) low mild
5 (5.00%) high mild
7 (7.00%) high severe
lu_determinant_10x10 time: [13.394 ns 13.481 ns 13.665 ns]
change: [+508.98% +524.96% +543.53%] (p = 0.00 < 0.05)
Performance has regressed.
Found 14 outliers among 100 measurements (14.00%)
6 (6.00%) low severe
1 (1.00%) low mild
5 (5.00%) high mild
2 (2.00%) high severe
lu_determinant_100x100 time: [149.12 ns 150.16 ns 151.08 ns]
change: [+265.69% +281.86% +296.23%] (p = 0.00 < 0.05)
Performance has regressed.
Found 14 outliers among 100 measurements (14.00%)
10 (10.00%) low severe
4 (4.00%) low mild
qr_decompose_100x100 time: [141.62 µs 141.65 µs 141.69 µs]
change: [+0.6391% +0.8447% +0.9784%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 9 outliers among 100 measurements (9.00%)
5 (5.00%) low mild
1 (1.00%) high mild
3 (3.00%) high severe
Benchmarking qr_decompose_100x500: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 5.7s, enable flat sampling, or reduce sample count to 60.
qr_decompose_100x500 time: [1.0071 ms 1.0082 ms 1.0097 ms]
change: [+0.9031% +1.2358% +1.6126%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 16 outliers among 100 measurements (16.00%)
12 (12.00%) low mild
2 (2.00%) high mild
2 (2.00%) high severe
qr_decompose_4x4 time: [100.40 ns 100.43 ns 100.45 ns]
change: [−19.315% −19.268% −19.224%] (p = 0.00 < 0.05)
Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
2 (2.00%) low mild
1 (1.00%) high mild
4 (4.00%) high severe
Benchmarking qr_decompose_500x100: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 5.2s, enable flat sampling, or reduce sample count to 60.
qr_decompose_500x100 time: [847.17 µs 847.68 µs 848.21 µs]
change: [+2.1441% +2.3425% +2.5069%] (p = 0.00 < 0.05)
Performance has regressed.
Found 4 outliers among 100 measurements (4.00%)
1 (1.00%) high mild
3 (3.00%) high severe
qr_decompose_unpack_100x100
time: [283.22 µs 283.26 µs 283.30 µs]
change: [−0.3591% −0.2383% −0.1147%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 23 outliers among 100 measurements (23.00%)
21 (21.00%) low severe
1 (1.00%) low mild
1 (1.00%) high severe
Benchmarking qr_decompose_unpack_100x500: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 6.8s, enable flat sampling, or reduce sample count to 60.
qr_decompose_unpack_100x500
time: [1.1399 ms 1.1429 ms 1.1457 ms]
change: [−1.9555% −1.8085% −1.6312%] (p = 0.00 < 0.05)
Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high mild
Benchmarking qr_decompose_unpack_500x100: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 9.6s, enable flat sampling, or reduce sample count to 50.
qr_decompose_unpack_500x100
time: [1.6633 ms 1.6640 ms 1.6648 ms]
change: [+1.4516% +1.5245% +1.5969%] (p = 0.00 < 0.05)
Performance has regressed.
Found 11 outliers among 100 measurements (11.00%)
2 (2.00%) low severe
5 (5.00%) low mild
4 (4.00%) high severe
qr_solve_10x10 time: [156.51 ns 156.56 ns 156.61 ns]
change: [+3.7415% +3.8709% +3.9947%] (p = 0.00 < 0.05)
Performance has regressed.
Found 12 outliers among 100 measurements (12.00%)
6 (6.00%) low severe
5 (5.00%) low mild
1 (1.00%) high mild
qr_solve_100x100 time: [3.5393 µs 3.5454 µs 3.5511 µs]
change: [+6.0908% +6.5747% +6.9798%] (p = 0.00 < 0.05)
Performance has regressed.
Found 6 outliers among 100 measurements (6.00%)
6 (6.00%) low mild
qr_inverse_10x10 time: [806.75 ns 807.99 ns 809.61 ns]
change: [+0.6973% +0.8242% +0.9558%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high severe
qr_inverse_100x100 time: [330.65 µs 330.74 µs 330.85 µs]
change: [+1.2238% +1.3244% +1.4518%] (p = 0.00 < 0.05)
Performance has regressed.
Found 12 outliers among 100 measurements (12.00%)
3 (3.00%) low mild
4 (4.00%) high mild
5 (5.00%) high severe
schur_decompose_4x4 time: [969.14 ns 969.71 ns 970.18 ns]
change: [−12.293% −12.223% −12.149%] (p = 0.00 < 0.05)
Performance has improved.
Found 10 outliers among 100 measurements (10.00%)
3 (3.00%) low severe
1 (1.00%) low mild
2 (2.00%) high mild
4 (4.00%) high severe
schur_decompose_10x10 time: [7.3226 µs 7.3237 µs 7.3247 µs]
change: [+0.3785% +0.4095% +0.4394%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 9 outliers among 100 measurements (9.00%)
2 (2.00%) low mild
4 (4.00%) high mild
3 (3.00%) high severe
schur_decompose_100x100 time: [2.5760 ms 2.5763 ms 2.5768 ms]
change: [+0.7992% +0.8504% +0.8935%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 4 outliers among 100 measurements (4.00%)
3 (3.00%) high mild
1 (1.00%) high severe
schur_decompose_200x200 time: [18.285 ms 18.296 ms 18.308 ms]
change: [+1.9360% +2.0941% +2.2427%] (p = 0.00 < 0.05)
Performance has regressed.
Found 6 outliers among 100 measurements (6.00%)
1 (1.00%) low mild
3 (3.00%) high mild
2 (2.00%) high severe
eigenvalues_4x4 time: [937.94 ns 938.15 ns 938.38 ns]
change: [+25.764% +25.898% +26.023%] (p = 0.00 < 0.05)
Performance has regressed.
Found 6 outliers among 100 measurements (6.00%)
2 (2.00%) low severe
2 (2.00%) low mild
2 (2.00%) high mild
eigenvalues_10x10 time: [5.9066 µs 5.9088 µs 5.9117 µs]
change: [+0.1208% +0.1938% +0.2740%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 8 outliers among 100 measurements (8.00%)
1 (1.00%) low mild
3 (3.00%) high mild
4 (4.00%) high severe
Benchmarking eigenvalues_100x100: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 8.2s, enable flat sampling, or reduce sample count to 50.
eigenvalues_100x100 time: [1.5870 ms 1.5873 ms 1.5876 ms]
change: [−0.8569% −0.8247% −0.7914%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 5 outliers among 100 measurements (5.00%)
3 (3.00%) high mild
2 (2.00%) high severe
eigenvalues_200x200 time: [11.081 ms 11.088 ms 11.102 ms]
change: [+0.0054% +0.2956% +0.4946%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 4 outliers among 100 measurements (4.00%)
1 (1.00%) low mild
1 (1.00%) high mild
2 (2.00%) high severe
solve_l_triangular_100x100
time: [1.3250 µs 1.3651 µs 1.4012 µs]
change: [+22.932% +24.999% +27.087%] (p = 0.00 < 0.05)
Performance has regressed.
Found 12 outliers among 100 measurements (12.00%)
10 (10.00%) high mild
2 (2.00%) high severe
solve_l_triangular_1000x1000
time: [101.52 µs 102.04 µs 102.85 µs]
change: [+1.5784% +2.0953% +2.8471%] (p = 0.00 < 0.05)
Performance has regressed.
Found 15 outliers among 100 measurements (15.00%)
9 (9.00%) high mild
6 (6.00%) high severe
tr_solve_l_triangular_100x100
time: [2.0144 µs 2.0537 µs 2.0902 µs]
change: [+13.600% +14.669% +15.998%] (p = 0.00 < 0.05)
Performance has regressed.
Found 16 outliers among 100 measurements (16.00%)
5 (5.00%) high mild
11 (11.00%) high severe
tr_solve_l_triangular_1000x1000
time: [93.569 µs 94.056 µs 94.857 µs]
change: [+1.2474% +1.7955% +2.5979%] (p = 0.00 < 0.05)
Performance has regressed.
Found 7 outliers among 100 measurements (7.00%)
3 (3.00%) high mild
4 (4.00%) high severe
solve_u_triangular_100x100
time: [1.5878 µs 1.6615 µs 1.7405 µs]
change: [+31.200% +34.370% +38.132%] (p = 0.00 < 0.05)
Performance has regressed.
Found 13 outliers among 100 measurements (13.00%)
10 (10.00%) high mild
3 (3.00%) high severe
solve_u_triangular_1000x1000
time: [105.07 µs 105.46 µs 106.12 µs]
change: [+6.6559% +7.0936% +7.8401%] (p = 0.00 < 0.05)
Performance has regressed.
Found 2 outliers among 100 measurements (2.00%)
2 (2.00%) high severe
tr_solve_u_triangular_100x100
time: [1.4369 µs 1.4697 µs 1.4986 µs]
change: [+17.195% +18.687% +20.307%] (p = 0.00 < 0.05)
Performance has regressed.
Found 13 outliers among 100 measurements (13.00%)
11 (11.00%) high mild
2 (2.00%) high severe
tr_solve_u_triangular_1000x1000
time: [88.868 µs 89.303 µs 90.014 µs]
change: [+4.2489% +4.7933% +5.6045%] (p = 0.00 < 0.05)
Performance has regressed.
Found 11 outliers among 100 measurements (11.00%)
4 (4.00%) high mild
7 (7.00%) high severe
svd_decompose_2x2 time: [22.913 ns 22.958 ns 23.017 ns]
change: [+9.3648% +9.7443% +10.253%] (p = 0.00 < 0.05)
Performance has regressed.
Found 7 outliers among 100 measurements (7.00%)
2 (2.00%) high mild
5 (5.00%) high severe
svd_decompose_3x3 time: [359.30 ns 359.72 ns 360.20 ns]
change: [+9.0123% +9.1174% +9.2394%] (p = 0.00 < 0.05)
Performance has regressed.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high mild
svd_decompose_4x4 time: [896.28 ns 896.55 ns 896.85 ns]
change: [−7.1192% −7.0496% −6.9853%] (p = 0.00 < 0.05)
Performance has improved.
Found 10 outliers among 100 measurements (10.00%)
2 (2.00%) low severe
3 (3.00%) low mild
3 (3.00%) high mild
2 (2.00%) high severe
svd_decompose_10x10 time: [5.7680 µs 5.7708 µs 5.7739 µs]
change: [+1.1933% +1.4155% +1.6347%] (p = 0.00 < 0.05)
Performance has regressed.
Found 3 outliers among 100 measurements (3.00%)
1 (1.00%) high mild
2 (2.00%) high severe
Benchmarking svd_decompose_100x100: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 8.2s, enable flat sampling, or reduce sample count to 50.
svd_decompose_100x100 time: [1.5704 ms 1.5709 ms 1.5715 ms]
change: [+1.4465% +1.4891% +1.5357%] (p = 0.00 < 0.05)
Performance has regressed.
Found 3 outliers among 100 measurements (3.00%)
2 (2.00%) high mild
1 (1.00%) high severe
svd_decompose_200x200 time: [11.845 ms 11.847 ms 11.850 ms]
change: [+1.4378% +1.4794% +1.5225%] (p = 0.00 < 0.05)
Performance has regressed.
Found 4 outliers among 100 measurements (4.00%)
4 (4.00%) high severe
rank_4x4 time: [716.49 ns 716.62 ns 716.74 ns]
change: [+4.9084% +4.9678% +5.0237%] (p = 0.00 < 0.05)
Performance has regressed.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) low mild
rank_10x10 time: [4.2304 µs 4.2341 µs 4.2377 µs]
change: [+0.4993% +0.6056% +0.7271%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high mild
rank_100x100 time: [522.74 µs 522.85 µs 522.97 µs]
change: [+0.2822% +0.3170% +0.3535%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 3 outliers among 100 measurements (3.00%)
1 (1.00%) low mild
2 (2.00%) high severe
rank_200x200 time: [3.0167 ms 3.0217 ms 3.0267 ms]
change: [+0.3924% +0.5333% +0.6946%] (p = 0.00 < 0.05)
Change within noise threshold.
singular_values_4x4 time: [735.97 ns 736.08 ns 736.21 ns]
change: [−7.6736% −7.6163% −7.5596%] (p = 0.00 < 0.05)
Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
1 (1.00%) low severe
2 (2.00%) low mild
2 (2.00%) high severe
singular_values_10x10 time: [4.2987 µs 4.2997 µs 4.3010 µs]
change: [+1.6193% +1.7215% +1.8186%] (p = 0.00 < 0.05)
Performance has regressed.
Found 8 outliers among 100 measurements (8.00%)
4 (4.00%) high mild
4 (4.00%) high severe
singular_values_100x100 time: [525.20 µs 525.36 µs 525.54 µs]
change: [+0.4054% +0.4526% +0.4982%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 9 outliers among 100 measurements (9.00%)
6 (6.00%) low mild
1 (1.00%) high mild
2 (2.00%) high severe
singular_values_200x200 time: [3.0712 ms 3.0729 ms 3.0750 ms]
change: [+2.1769% +2.2358% +2.3112%] (p = 0.00 < 0.05)
Performance has regressed.
Found 3 outliers among 100 measurements (3.00%)
1 (1.00%) low mild
1 (1.00%) high mild
1 (1.00%) high severe
pseudo_inverse_4x4 time: [877.64 ns 878.38 ns 879.12 ns]
change: [−8.2828% −8.2216% −8.1662%] (p = 0.00 < 0.05)
Performance has improved.
Found 13 outliers among 100 measurements (13.00%)
1 (1.00%) low severe
3 (3.00%) low mild
2 (2.00%) high mild
7 (7.00%) high severe
pseudo_inverse_10x10 time: [6.0008 µs 6.0034 µs 6.0064 µs]
change: [+0.2665% +0.3678% +0.4766%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 8 outliers among 100 measurements (8.00%)
4 (4.00%) high mild
4 (4.00%) high severe
Benchmarking pseudo_inverse_100x100: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 8.4s, enable flat sampling, or reduce sample count to 50.
pseudo_inverse_100x100 time: [1.6088 ms 1.6091 ms 1.6094 ms]
change: [+0.1161% +0.2007% +0.2937%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 12 outliers among 100 measurements (12.00%)
2 (2.00%) high mild
10 (10.00%) high severe
pseudo_inverse_200x200 time: [12.038 ms 12.042 ms 12.047 ms]
change: [−0.4351% −0.2531% −0.0699%] (p = 0.01 < 0.05)
Change within noise threshold.
Found 22 outliers among 100 measurements (22.00%)
16 (16.00%) low severe
2 (2.00%) low mild
1 (1.00%) high mild
3 (3.00%) high severe
symmetric_eigen_decompose_4x4
time: [518.00 ns 518.07 ns 518.15 ns]
change: [+4.7008% +4.7492% +4.8006%] (p = 0.00 < 0.05)
Performance has regressed.
Found 8 outliers among 100 measurements (8.00%)
2 (2.00%) low mild
2 (2.00%) high mild
4 (4.00%) high severe
symmetric_eigen_decompose_10x10
time: [3.6417 µs 3.6428 µs 3.6440 µs]
change: [−0.1549% −0.0998% −0.0483%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 12 outliers among 100 measurements (12.00%)
6 (6.00%) high mild
6 (6.00%) high severe
symmetric_eigen_decompose_100x100
time: [761.64 µs 762.66 µs 763.80 µs]
change: [−5.8109% −5.7178% −5.6284%] (p = 0.00 < 0.05)
Performance has improved.
Found 19 outliers among 100 measurements (19.00%)
9 (9.00%) low severe
9 (9.00%) low mild
1 (1.00%) high severe
symmetric_eigen_decompose_200x200
time: [5.1304 ms 5.1337 ms 5.1372 ms]
change: [−9.4434% −9.3646% −9.2959%] (p = 0.00 < 0.05)
Performance has improved.
Total run time of the full benchmark suite on my machine (AMD 5950X) has
not changed and is still around 30 minutes.
* fix: Add reproducible_smatrix()
Some algorithms may not converge when used on completely random values
with the default value of epsilon and unlimited iterations.
`reproducible_dmatrix()` already exists to circumvent this for `DMatrix`,
so I implemented the same for `SMatrix`.
In my tests this problem manifested itself only on
`schur_decompose_4x4`, but I decided to apply a similar fix to all
benchmarks that also use `reproducible_dmatrix()` for `DMatrix`.
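A minimal sketch of the idea behind a reproducible fixed-size matrix, using only the standard library. The names `Lcg` and `reproducible_matrix` are hypothetical and are not the helpers added by this commit; the real benches presumably use a seeded RNG from the `rand` ecosystem together with `SMatrix`. The point is only that the entries depend on nothing but the seed, so every run benchmarks the same input.

```rust
/// Minimal 64-bit linear congruential generator.
/// Hypothetical stand-in for a seeded RNG; not suitable for statistics.
struct Lcg(u64);

impl Lcg {
    fn next_f64(&mut self) -> f64 {
        // Multiplier and increment from Knuth's MMIX generator.
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        // Keep the top 53 bits and scale them into [0, 1).
        (self.0 >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Hypothetical helper: an R x C matrix whose entries depend only on `seed`.
fn reproducible_matrix<const R: usize, const C: usize>(seed: u64) -> [[f64; C]; R] {
    let mut rng = Lcg(seed);
    let mut m = [[0.0; C]; R];
    for row in m.iter_mut() {
        for x in row.iter_mut() {
            *x = rng.next_f64();
        }
    }
    m
}

fn main() {
    // The same seed always yields the same matrix, so an iterative
    // algorithm sees identical input on every benchmark run.
    let a = reproducible_matrix::<4, 4>(0);
    let b = reproducible_matrix::<4, 4>(0);
    assert_eq!(a, b);
    println!("{:?}", a[0]);
}
```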
* fix: Use reproducible_dmatrix() for Cholesky benches
Random matrices may not be positive-definite, and the Cholesky decomposition
benchmarks panic because of that:
Benchmarking cholesky_decompose_unpack_100x100: Warming up for 3.0000 s
thread 'main' panicked at benches/linalg/cholesky.rs:38:45:
called `Option::unwrap()` on a `None` value
* don't require a reproducible matrix for Cholesky; use a randomly generated positive-definite matrix instead
* remove constant elements where useful, and replace reproducible matrix calls with random matrices
* fix wrong test name
* update changelog
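
For the Cholesky items above, one common way to get a random matrix that is guaranteed to be positive-definite is to form `M * M^T + n * I` from a random `M`. The sketch below is standard-library Rust with hypothetical helper names (`next_f64`, `random_spd`, `cholesky`); it is an assumption about the general technique, not the code used in the benches.

```rust
/// Tiny seeded generator returning values in [0, 1).
/// Hypothetical stand-in for a real RNG crate.
fn next_f64(state: &mut u64) -> f64 {
    *state = state
        .wrapping_mul(6364136223846793005)
        .wrapping_add(1442695040888963407);
    (*state >> 11) as f64 / (1u64 << 53) as f64
}

/// Builds A = M * M^T + n * I from a random n x n matrix M.
/// M * M^T is positive semi-definite; adding n * I makes A strictly
/// positive-definite.
fn random_spd(n: usize, seed: u64) -> Vec<Vec<f64>> {
    let mut state = seed;
    let mut m = vec![vec![0.0; n]; n];
    for row in m.iter_mut() {
        for x in row.iter_mut() {
            *x = next_f64(&mut state);
        }
    }
    let mut a = vec![vec![0.0; n]; n];
    for i in 0..n {
        for j in 0..n {
            a[i][j] = (0..n).map(|k| m[i][k] * m[j][k]).sum::<f64>();
        }
        a[i][i] += n as f64;
    }
    a
}

/// Textbook Cholesky factorization A = L * L^T.
/// Returns None when a non-positive pivot shows that `a` is not positive-definite.
fn cholesky(a: &[Vec<f64>]) -> Option<Vec<Vec<f64>>> {
    let n = a.len();
    let mut l = vec![vec![0.0; n]; n];
    for i in 0..n {
        for j in 0..=i {
            let s: f64 = (0..j).map(|k| l[i][k] * l[j][k]).sum();
            if i == j {
                let d = a[i][i] - s;
                if d <= 0.0 {
                    return None;
                }
                l[i][j] = d.sqrt();
            } else {
                l[i][j] = (a[i][j] - s) / l[j][j];
            }
        }
    }
    Some(l)
}

fn main() {
    // The constructed matrix always factors, so unwrapping is safe here.
    let spd = random_spd(10, 42);
    assert!(cholesky(&spd).is_some());

    // A symmetric but indefinite matrix makes the factorization fail,
    // which is the situation behind the `Option::unwrap()` panic.
    let indefinite = vec![vec![1.0, 2.0], vec![2.0, 1.0]];
    assert!(cholesky(&indefinite).is_none());
}
```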
---------
Co-authored-by: geo-ant <54497890+geo-ant@users.noreply.github.com>
1 parent 27696b4 commit 2d785b4
File tree
16 files changed, +1136 -630 lines changed
- benches
- common
- core
- geometry
- linalg