A very simple comparison.
```julia
julia> A = rand(5,5,5,5);

julia> @benchmark mapreduce($abs2, $+, $A, dims=$(1,2,4))
BenchmarkTools.Trial: 10000 samples with 133 evaluations.
 Range (min … max): 661.038 ns … 139.234 μs ┊ GC (min … max): 0.00% … 99.24%
 Time (median):     746.880 ns              ┊ GC (median):    0.00%
 Time (mean ± σ):   798.069 ns ±  1.957 μs  ┊ GC (mean ± σ):  3.46% ±  1.40%

 Memory estimate: 368 bytes, allocs estimate: 8.

julia> @benchmark vvmapreduce($abs2, $+, $A, dims=$(1,2,4))
BenchmarkTools.Trial: 10000 samples with 788 evaluations.
 Range (min … max): 160.538 ns …  29.430 μs ┊ GC (min … max):  0.00% … 99.11%
 Time (median):     203.479 ns              ┊ GC (median):     0.00%
 Time (mean ± σ):   212.916 ns ± 761.848 ns ┊ GC (mean ± σ):  10.68% ±  2.97%

 Memory estimate: 240 bytes, allocs estimate: 6.

julia> @benchmark extrema($A, dims=$(1,2))
BenchmarkTools.Trial: 10000 samples with 9 evaluations.
 Range (min … max): 2.813 μs …   5.827 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time (median):     2.990 μs               ┊ GC (median):    0.00%
 Time (mean ± σ):   3.039 μs ± 149.676 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

 Memory estimate: 960 bytes, allocs estimate: 14.

julia> @benchmark vvextrema($A, dims=$(1,2))
BenchmarkTools.Trial: 10000 samples with 202 evaluations.
 Range (min … max): 381.743 ns …  86.288 μs ┊ GC (min … max):  0.00% … 99.05%
 Time (median):     689.658 ns              ┊ GC (median):     0.00%
 Time (mean ± σ):   712.113 ns ±   2.851 μs ┊ GC (mean ± σ):  13.84% ±  3.43%

 Memory estimate: 1.19 KiB, allocs estimate: 8.
```
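
As a quick sanity check (a sketch, not part of the benchmarks above), one would expect the vectorized versions to agree with Base:
```julia
# Illustrative agreement checks, not taken from the package docs.
# `≈` is used for the sum, since the vectorized reduction may associate
# floating-point additions in a different order than Base.
vvmapreduce(abs2, +, A, dims=(1,2,4)) ≈ mapreduce(abs2, +, A, dims=(1,2,4))

# min/max involve no reassociation, so exact agreement is expected, assuming
# vvextrema mirrors Base's array-of-tuples return layout.
vvextrema(A, dims=(1,2)) == extrema(A, dims=(1,2))
```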
</p>
</details>

### Varargs examples
<details>
<summary>Click me!</summary>
<p>

These are somewhat standard fare, but can be quite convenient for expressing certain Bayesian computations.
```julia
julia> A1, A2, A3, A4 = rand(5,5,5,5), rand(5,5,5,5), rand(5,5,5,5), rand(5,5,5,5);

julia> @benchmark mapreduce($+, $+, $A1, $A2, $A3, $A4, dims=$(1,2,4))
BenchmarkTools.Trial: 10000 samples with 10 evaluations.
 Range (min … max): 1.597 μs …   1.181 ms  ┊ GC (min … max): 0.00% … 97.71%
 Time (median):     1.867 μs               ┊ GC (median):    0.00%
 Time (mean ± σ):   2.257 μs ±  14.216 μs  ┊ GC (mean ± σ):  8.56% ±  1.38%

 Memory estimate: 5.66 KiB, allocs estimate: 14.

julia> @benchmark vvmapreduce($+, $+, $A1, $A2, $A3, $A4, dims=$(1,2,4))
BenchmarkTools.Trial: 10000 samples with 203 evaluations.
 Range (min … max): 384.768 ns … 150.041 μs ┊ GC (min … max): 0.00% … 99.57%
 Time (median):     437.601 ns              ┊ GC (median):    0.00%
 Time (mean ± σ):   478.179 ns ±   2.117 μs ┊ GC (mean ± σ):  7.50% ±  1.72%

 Memory estimate: 304 bytes, allocs estimate: 6.

# And for really strange stuff (e.g. posterior predictive transformations)
julia> @benchmark vvmapreduce((x,y,z) -> ifelse(x*y + z ≥ 1, 1, 0), +, $A1, $A2, $A3)
BenchmarkTools.Trial: 10000 samples with 198 evaluations.
 Range (min … max): 438.126 ns …  5.704 μs ┊ GC (min … max): 0.00% … 0.00%
 Time (median):     439.995 ns             ┊ GC (median):    0.00%
 Time (mean ± σ):   442.020 ns ± 63.038 ns ┊ GC (mean ± σ):  0.00% ± 0.00%

 Memory estimate: 0 bytes, allocs estimate: 0.

# Using ifelse just to produce a Boolean is quite slow; the above is only for demonstration.
julia> @benchmark vvmapreduce((x,y,z) -> ≥(x*y + z, 1), +, $A1, $A2, $A3)
BenchmarkTools.Trial: 10000 samples with 975 evaluations.
 Range (min … max): 70.558 ns …  2.085 μs ┊ GC (min … max): 0.00% … 0.00%
 Time (median):     70.888 ns             ┊ GC (median):    0.00%
 Time (mean ± σ):   71.425 ns ± 23.489 ns ┊ GC (mean ± σ):  0.00% ± 0.00%

 Memory estimate: 0 bytes, allocs estimate: 0.

# What do I mean by a posterior predictive transformation? One might encounter
# this in Bayesian model checking, which provides a convenient example.
# Suppose one wishes to compute Pr = ∫∫ 𝕀(T(yʳᵉᵖ, θ) ≥ T(y, θ)) p(yʳᵉᵖ|θ) p(θ|y) dyʳᵉᵖ dθ.
# Let's imagine that A1 represents T(yʳᵉᵖ, θ) and A2 represents T(y, θ),
# i.e. the test variable samples computed as a functional of the Markov chain (samples of θ).
# Then Pr is computed as
vvmapreduce(≥, +, A1, A2) / length(A1)
# Or, if only the probability is of interest, and we do not wish to use the functionals
# for any other purpose, we could compute it as:
vvmapreduce((x, y) -> ≥(f(x), f(y)), +, A1, A2) / length(A1)
# where `f` is the functional of interest, e.g.
vvmapreduce((x, y) -> ≥(abs2(x), abs2(y)), +, A1, A2) / length(A1)

# One can also express commonly encountered reductions with ease;
# these will be fused once a post-reduction operator can be specified.
# Mean squared error
vvmapreduce((x, y) -> abs2(x - y), +, A1, A2, dims=(2,4)) ./ (size(A1, 2) * size(A1, 4))
# Euclidean distance
(√).(vvmapreduce((x, y) -> abs2(x - y), +, A1, A2, dims=(2,4)))
```
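
Until a post-reduction operator can be specified, these patterns are easy to wrap in small helpers. A minimal sketch, assuming only `vvmapreduce` as used above; the names `vvmse` and `vveuclidean` are illustrative, not exported by the package:
```julia
# Hypothetical convenience wrappers around the patterns above; the names are
# illustrative only. `dims` defaults to a reduction over all dimensions.
vvmse(A, B; dims=ntuple(identity, ndims(A))) =
    vvmapreduce((x, y) -> abs2(x - y), +, A, B, dims=dims) ./
        prod(size(A, d) for d in dims)

vveuclidean(A, B; dims=ntuple(identity, ndims(A))) =
    (√).(vvmapreduce((x, y) -> abs2(x - y), +, A, B, dims=dims))

# Usage, matching the explicit expressions above:
vvmse(A1, A2, dims=(2,4))
vveuclidean(A1, A2, dims=(2,4))
```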
</p>
</details>

### `findmin`/`findmax` examples

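The acknowledgments below describe the original motivation: a vectorized, multithreaded, multidimensional `findmin` over a variable number of arrays. A usage sketch, assuming the exports follow the `vv` naming used above (`vvfindmin`/`vvfindmax`) and mirror Base's (value, index) return convention; the exact names and signatures should be checked against the package's exports:
```julia
# Hypothetical calls: the names vvfindmin/vvfindmax and the variadic form are
# assumptions based on the naming pattern above, not confirmed API.
A1, A2 = rand(5,5,5,5), rand(5,5,5,5);

# dims-wise minima and their indices, as with Base's findmin(A; dims)
vvfindmin(A1, dims=(1,2))

# the variadic form described in the acknowledgments: map a function across
# several arrays, then locate the extremum of the mapped values
vvfindmax(+, A1, A2, dims=(2,4))
```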
## Acknowledgments
The original motivation for this work was a vectorized & multithreaded multidimensional `findmin` taking a variable number of array arguments -- it's a long story, but the similarity between `findmin` and `mapreduce` motivated a broad approach. My initial attempt (visible in `/attic`) did not deliver all the performance possible -- this was only apparent through comparison to C. Elrod's approach to multidimensional forms in VectorizedStatistics. Having fully appreciated the beauty of branching through `@generated` functions, I decided to take a tour of some low-hanging fruit -- this package is the result.