Commit 07725da
By handling the `alpha == 0` case separately, we ensure that if
`alpha::Bool`, it must be `true`. This reduces the branches in
`_lscale_add` from 4 to 2 in the common case of 3-argument `mul!`. This
reduces latency, as each branch has to compile a different broadcast
expression, and we currently compile four but use only one.
The most visible effect of this PR is a reduction in allocations
during compilation:
```julia
julia> using LinearAlgebra
julia> v = 1:4; w = similar(v);
julia> @time mul!(w, 1, v);
0.171120 seconds (1.04 M allocations: 52.799 MiB, 99.98% compilation time) # nightly
0.163178 seconds (702.63 k allocations: 35.533 MiB, 99.98% compilation time) # this PR
```
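The branch reduction can be pictured with a minimal sketch. The names `mul_sketch!` and `scale_add_sketch!` are hypothetical and the broadcast expressions are simplified; this is not the LinearAlgebra source, only an illustration under the assumption that the result is `C = s*X*alpha + C*beta`.

```julia
# Hypothetical names throughout; a sketch of the idea, not the LinearAlgebra code.

# Generic path: alpha and beta are only known at run time, so all four
# broadcast expressions below have to be compiled.
function scale_add_sketch!(C, s, X, alpha::Number, beta::Number)
    if isone(alpha)
        if iszero(beta)
            C .= s .* X
        else
            C .= s .* X .+ C .* beta
        end
    else
        if iszero(beta)
            C .= s .* X .* alpha
        else
            C .= s .* X .* alpha .+ C .* beta
        end
    end
    return C
end

# Bool path: the caller has already handled alpha == 0, so a Bool alpha
# must be true and only the two beta branches remain.
function scale_add_sketch!(C, s, X, alpha::Bool, beta::Number)
    if iszero(beta)
        C .= s .* X
    else
        C .= s .* X .+ C .* beta
    end
    return C
end

# Entry point: deal with alpha == 0 first, then dispatch on the type of alpha.
function mul_sketch!(C, s::Number, X, alpha::Number=true, beta::Number=false)
    if iszero(alpha)
        if iszero(beta)
            fill!(C, zero(eltype(C)))
        else
            C .*= beta
        end
        return C
    end
    return scale_add_sketch!(C, s, X, alpha, beta)
end

v = 1:4
w = similar(v)
mul_sketch!(w, 2, v)   # 3-argument form: alpha = true, beta = false
```

With the default `alpha = true`, dispatch selects the `alpha::Bool` method, so only its two broadcast expressions need to be compiled for the 3-argument call.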
A similar change usually doesn't lead to a big gain in the
`_rscale_add` method, as `s * alpha` often has the same type as `s`, and
therefore the branches on `alpha` compile the same code.
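The claim about `s * alpha` can be checked directly with Julia's promotion rules. The helper `same_type_after_scaling` is a hypothetical name introduced only for this illustration.

```julia
# Hypothetical helper: does scaling s by alpha keep the type of s?
same_type_after_scaling(s, alpha) = typeof(s * alpha) === typeof(s)

same_type_after_scaling(2.0, true)   # Float64 * Bool stays Float64
same_type_after_scaling(2.0, 3)      # Float64 * Int stays Float64
same_type_after_scaling(2, 1.5)      # Int * Float64 promotes to Float64
```

In the first two cases the scaled value has the same type as `s`, so the broadcast that follows sees identical argument types whichever branch on `alpha` is taken. The third case shows why this holds only "often" rather than always.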
1 parent 61e444d commit 07725da
2 files changed, +58 -23 lines changed
(Diff contents not preserved. First file: +36 -23 in two hunks, original lines 202-251. Second file: +22 after original line 859.)