Skip to content

Conversation

@devmotion
Copy link
Member

The specializations of wsum! for BlasReal are not completely correct, as detected by JET, since also Union{Float32, Float64} eltypes dispatch to those. This can be fixed by restricting the input types of these specializations (as currently done in #955), but since these specializations do not show clear performance improvements compared to the fallback in my benchmarks they are generally questionable. This PR removes the BlasReal specializations completely.

On the master branch

julia> using StatsBase, Chairmarks

julia> a = rand(1);

julia> @be wsum!($a, $(rand(100)), $(weights(rand(100))), 1)
Benchmark: 3939 samples with 460 evaluations
 min    46.196 ns
 median 49.728 ns
 mean   51.990 ns
 max    257.157 ns

julia> a = rand(150);

julia> @be wsum!($a, $(rand(100, 150)), $(weights(rand(100))), 1)
Benchmark: 3561 samples with 2 evaluations
 min    12.792 μs
 median 12.875 μs
 mean   13.067 μs
 max    21.000 μs

julia> @be wsum!($a, $(rand(150, 100)), $(weights(rand(100))), 2)
Benchmark: 3274 samples with 14 evaluations
 min    2.009 μs
 median 2.021 μs
 mean   2.040 μs
 max    3.238 μs

julia> a = rand(150, 200);

julia> @be wsum!($a, $(rand(100, 150, 200)), $(weights(rand(100))), 1)
Benchmark: 83 samples with 1 evaluation
 min    1.151 ms
 median 1.207 ms
 mean   1.213 ms
 max    1.352 ms

julia> @be wsum!($a, $(rand(150, 100, 200)), $(weights(rand(100))), 2)
Benchmark: 185 samples with 1 evaluation
 min    506.791 μs
 median 513.334 μs
 mean   516.295 μs
 max    550.791 μs

julia> @be wsum!($a, $(rand(150, 200, 100)), $(weights(rand(100))), 3)
Benchmark: 135 samples with 1 evaluation
 min    694.959 μs
 median 704.375 μs
 mean   704.803 μs
 max    739.167 μs

On this PR

julia> using StatsBase, Chairmarks

julia> a = rand(1);

julia> @be wsum!($a, $(rand(100)), $(weights(rand(100))), 1)
Benchmark: 3937 samples with 464 evaluations
 min    47.502 ns
 median 48.851 ns
 mean   51.484 ns
 max    192.080 ns

julia> a = rand(150);

julia> @be wsum!($a, $(rand(100, 150)), $(weights(rand(100))), 1)
Benchmark: 3536 samples with 2 evaluations
 min    12.792 μs
 median 13.229 μs
 mean   13.205 μs
 max    49.396 μs

julia> @be wsum!($a, $(rand(150, 100)), $(weights(rand(100))), 2)
Benchmark: 3293 samples with 14 evaluations
 min    2.018 μs
 median 2.030 μs
 mean   2.031 μs
 max    2.738 μs

julia> a = rand(150, 200);

julia> @be wsum!($a, $(rand(100, 150, 200)), $(weights(rand(100))), 1)
Benchmark: 84 samples with 1 evaluation
 min    1.186 ms
 median 1.190 ms
 mean   1.195 ms
 max    1.238 ms

julia> @be wsum!($a, $(rand(150, 100, 200)), $(weights(rand(100))), 2)
Benchmark: 181 samples with 1 evaluation
 min    509.500 μs
 median 518.333 μs
 mean   527.084 μs
 max    759.083 μs

julia> @be wsum!($a, $(rand(150, 200, 100)), $(weights(rand(100))), 3)
Benchmark: 135 samples with 1 evaluation
 min    695.083 μs
 median 705.042 μs
 mean   706.163 μs
 max    733.584 μs

@devmotion devmotion requested a review from andreasnoack May 3, 2025 14:30
Copy link
Member

@andreasnoack andreasnoack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a nice simplification of the code. Approving but let's wait a day to hear if there are any other views on the matter.

@andreasnoack andreasnoack merged commit 6ef1840 into master May 4, 2025
13 checks passed
@andreasnoack andreasnoack deleted the dw/wsum_remove_blas branch May 4, 2025 18:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants