@@ -16,93 +16,32 @@ Some operations will be inherently slower due to extra computation, like `getind
16
16
Please file an issue for any examples that are significantly slower than their
17
17
the analogue in Base.
18
18
19
- ## Concatenation
19
+ ## Lazy operations
20
20
21
- ` Vcat ` is the lazy analogue of ` vcat ` . For lazy vectors like ranges, it
22
- creating such a vector is allocation-free. ` copyto! ` allows for allocation-free
23
- population of a vector.
21
+ To construct a lazy representation of a function call `f(x,y,z...), use the command
22
+ ` applied(f, x, y, z...) ` . This will return an unmaterialized object typically of type ` Applied `
23
+ that represents the operation. To realize that object, call ` materialize ` , which
24
+ will typically be equivalent to calling ` f(x,y,z...) ` . A macro ` @~ ` is available as a shorthand:
24
25
``` julia
25
- julia> using LazyArrays, BenchmarkTools
26
+ julia> using LazyArrays, LinearAlgebra
26
27
27
- julia> A = Vcat (1 : 5 ,2 : 3 ) # allocation-free
28
- 7 - element Vcat{Int64,1 ,Tuple{UnitRange{Int64},UnitRange{Int64}}}:
29
- 1
30
- 2
31
- 3
32
- 4
33
- 5
34
- 2
35
- 3
36
-
37
- julia> Vector (A) == vcat (1 : 5 , 2 : 3 )
38
- true
39
-
40
- julia> b = Array {Int} (undef, length (A)); @btime copyto! (b, A);
41
- 26.670 ns (0 allocations: 0 bytes)
42
-
43
- julia> @btime vcat (1 : 5 , 2 : 3 ); # takes twice as long due to memory creation
44
- 43.336 ns (1 allocation: 144 bytes)
45
- ```
46
- Similarly, ` Hcat ` is the lazy analogue of ` hcat ` .
47
- ``` julia
48
- julia> A = Hcat (1 : 3 , randn (3 ,10 ))
49
- 3 × 11 Hcat{Float64,Tuple{UnitRange{Int64},Array{Float64,2 }}}:
50
- 1.0 0.350927 0.339103 - 1.03526 … 0.786593 0.0416694
51
- 2.0 - 1.10206 1.52817 0.223099 0.851804 0.430933
52
- 3.0 - 1.26467 - 0.743712 - 0.828781 - 0.0637502 - 0.066743
53
-
54
- julia> Matrix (A) == hcat (A. arrays... )
55
- true
56
-
57
- julia> b = Array {Int} (undef, length (A)); @btime copyto! (b, A);
58
- 26.670 ns (0 allocations: 0 bytes)
59
-
60
- julia> B = Array {Float64} (undef, size (A)... ); @btime copyto! (B, A);
61
- 109.625 ns (1 allocation: 32 bytes)
62
-
63
- julia> @btime hcat (A. arrays... ); # takes twice as long due to memory creation
64
- 274.620 ns (6 allocations: 560 bytes)
65
- ```
66
-
67
- ## Broadcasting
68
-
69
- Base now includes a lazy broadcast object called ` Broadcasting ` , but this is
70
- not a subtype of ` AbstractArray ` . Here we have ` BroadcastArray ` which replicates
71
- the functionality of ` Broadcasting ` while supporting the array interface.
72
- ``` julia
73
- julia> A = randn (6 ,6 );
74
-
75
- julia> B = BroadcastArray (exp, A);
76
-
77
- julia> Matrix (B) == exp .(A)
78
- true
79
-
80
- julia> B = BroadcastArray (+ , A, 2 );
81
-
82
- julia> B == A .+ 2
83
- true
84
- ```
85
- Such arrays can also be created using the macro ` @~ ` which acts on ordinary
86
- broadcasting expressions combined with ` LazyArray ` :
87
- ``` julia
88
- julia> C = rand (1000 )' ;
28
+ julia> applied (exp, 1 )
29
+ Applied (exp,1 )
89
30
90
- julia> D = LazyArray (@~ exp .(C))
31
+ julia> materialize (applied (exp, 1 ))
32
+ 2.718281828459045
91
33
92
- julia> E = LazyArray (@~ @. 2 + log (C))
34
+ julia> materialize (@~ exp (1 ))
35
+ 2.718281828459045
93
36
94
- julia> @btime sum ( LazyArray (@ ~ C .* C ' ); dims = 1 ) # without `@~`, 1.438 ms (5 allocations: 7.64 MiB )
95
- 74.425 μs ( 7 allocations : 8.08 KiB)
37
+ julia> exp ( 1 )
38
+ 2.718281828459045
96
39
```
97
40
98
- ## Multiplication
99
-
100
- Following Base's lazy broadcasting, we introduce lazy multiplication. The type
101
- ` Mul ` support multiplication of any two objects, not necessarily arrays.
102
- (In the future there will be a ` MulArray ` a la ` BroadcastArray ` .)
103
-
104
- ` Mul ` is designed to work along with broadcasting, and to lower to BLAS calls
105
- whenever possible:
41
+ The benefit of lazy operations is that they can be materialized in-place,
42
+ possible using simplifications. For example, it is possible to
43
+ do BLAS-like Matrix-Vector operations of the form ` α*A*x + β*y ` as
44
+ implemented in ` BLAS.gemv! ` using a lazy applied object:
106
45
``` julia
107
46
julia> A = randn (5 ,5 ); b = randn (5 ); c = randn (5 ); d = similar (c);
108
47
@@ -140,50 +79,144 @@ julia> @btime 2*(A*b) + 3c; # does not call gemv!
140
79
241.659 ns (4 allocations: 512 bytes)
141
80
```
142
81
143
- ## Inverses
144
-
145
- We also have lazy inverses ` PInv(A) ` , designed to work alongside ` Mul ` to
146
- to lower to BLAS calls whenever possible:
82
+ This also works for inverses, which lower to BLAS calls whenever possible:
147
83
``` julia
148
84
julia> A = randn (5 ,5 ); b = randn (5 ); c = similar (b);
149
85
150
- julia> c .= Mul ( PInv (A), b)
86
+ julia> c .= @ ~ A \ b
151
87
5 - element Array{Float64,1 }:
152
88
- 2.5366335879717514
153
89
- 5.305097174484744
154
90
- 9.818431932350942
155
91
2.421562605495651
156
92
0.26792916096572983
93
+ ```
157
94
158
- julia> c .= Ldiv (A, b) # shorthand for above
159
- 5 - element Array{Float64,1 }:
160
- - 2.5366335879717514
161
- - 5.305097174484744
162
- - 9.818431932350942
163
- 2.421562605495651
164
- 0.26792916096572983
95
+
96
+
97
+ ## Lazy arrays
98
+
99
+ Often we want lazy realizations of matrices, which are supported via ` ApplyArray ` .
100
+ For example, the following creates a lazy matrix exponential:
101
+ ``` julia
102
+ julia> E = ApplyArray (exp, [1 2 ; 3 4 ])
103
+ 2 × 2 ApplyArray{Float64,2 ,typeof (exp),Tuple{Array{Int64,2 }}}:
104
+ 51.969 74.7366
105
+ 112.105 164.074
106
+ ```
107
+ A lazy matrix exponential is useful for, say, in-place matrix-exponetial* vector:
108
+ ``` julia
109
+ julia> b = Vector {Float64} (undef, 2 ); b .= @~ E* [4 ,4 ]
110
+ 2 - element Array{Float64,1 }:
111
+ 506.8220830628333
112
+ 1104.7145995988594
113
+ ```
114
+ While this works, it is not actually optimised (yet).
115
+
116
+ Other options do have special implementations that make them fast. We
117
+ now give some examples.
118
+
119
+
120
+ ### Concatenation
121
+
122
+ Lazy ` vcat ` and ` hcat ` allow for representing the concatenation of
123
+ vectors without actually allocating memory, and support a fast
124
+ ` copyto! ` for allocation-free population of a vector.
125
+ ``` julia
126
+ julia> using BenchmarkTools
127
+
128
+ julia> A = ApplyArray (vcat,1 : 5 ,2 : 3 ) # allocation-free
129
+ 7 - element ApplyArray{Int64,1 ,typeof (vcat),Tuple{UnitRange{Int64},UnitRange{Int64}}}:
130
+ 1
131
+ 2
132
+ 3
133
+ 4
134
+ 5
135
+ 2
136
+ 3
137
+
138
+ julia> Vector (A) == vcat (1 : 5 , 2 : 3 )
139
+ true
140
+
141
+ julia> b = Array {Int} (undef, length (A)); @btime copyto! (b, A);
142
+ 26.670 ns (0 allocations: 0 bytes)
143
+
144
+ julia> @btime vcat (1 : 5 , 2 : 3 ); # takes twice as long due to memory creation
145
+ 43.336 ns (1 allocation: 144 bytes)
146
+ ```
147
+ Similar is the lazy analogue of ` hcat ` :
148
+ ``` julia
149
+ julia> A = ApplyArray (hcat, 1 : 3 , randn (3 ,10 ))
150
+ 3 × 11 ApplyArray{Float64,2 ,typeof (hcat),Tuple{UnitRange{Int64},Array{Float64,2 }}}:
151
+ 1.0 1.16561 0.224871 - 1.36416 - 0.30675 0.103714 0.590141 0.982382 - 1.50045 0.323747 - 1.28173
152
+ 2.0 1.04648 1.35506 - 0.147157 0.995657 - 0.616321 - 0.128672 - 0.671445 - 0.563587 - 0.268389 - 1.71004
153
+ 3.0 - 0.433093 - 0.325207 - 1.38496 - 0.391113 - 0.0568739 - 1.55796 - 1.00747 0.473686 - 1.2113 0.0119156
154
+
155
+ julia> Matrix (A) == hcat (A. args... )
156
+ true
157
+
158
+ julia> B = Array {Float64} (undef, size (A)... ); @btime copyto! (B, A);
159
+ 109.625 ns (1 allocation: 32 bytes)
160
+
161
+ julia> @btime hcat (A. args... ); # takes twice as long due to memory creation
162
+ 274.620 ns (6 allocations: 560 bytes)
165
163
```
166
164
167
- ## Kronecker products
165
+
166
+
167
+ ### Kronecker products
168
168
169
169
We can represent Kronecker products of arrays without constructing the full
170
- array.
170
+ array:
171
171
172
172
``` julia
173
173
julia> A = randn (2 ,2 ); B = randn (3 ,3 );
174
174
175
- julia> K = Kron ( A,B)
176
- 6 × 6 Kron {Float64,2 ,Tuple{Array{Float64,2 },Array{Float64,2 }}}:
177
- 1.99255 - 1.45132 0.864789 - 0.785538 0.572163 - 0.340932
178
- - 2.7016 0.360785 - 1.78671 1.06507 - 0.142235 0.70439
179
- 1.89938 - 2.69996 0.200992 - 0.748806 1.06443 - 0.0792386
180
- - 1.84225 1.34184 - 0.799557 - 2.45355 1.7871 - 1.06487
181
- 2.49782 - 0.333571 1.65194 3.32665 - 0.444258 2.20009
182
- - 1.75611 2.4963 - 0.185831 - 2.33883 3.32464 - 0.247494
175
+ julia> K = ApplyArray (kron, A,B)
176
+ 6 × 6 ApplyArray {Float64,2 , typeof (kron) ,Tuple{Array{Float64,2 },Array{Float64,2 }}}:
177
+ - 1.08736 - 0.19547 - 0.132824 1.60531 0.288579 0.196093
178
+ 0.353898 0.445557 - 0.257776 - 0.522472 - 0.657791 0.380564
179
+ - 0.723707 0.911737 - 0.710378 1.06843 - 1.34603 1.04876
180
+ 1.40606 0.252761 0.171754 - 0.403809 - 0.0725908 - 0.0493262
181
+ - 0.457623 - 0.576146 0.333329 0.131426 0.165464 - 0.0957293
182
+ 0.935821 - 1.17896 0.918584 - 0.26876 0.338588 - 0.26381
183
183
184
184
julia> C = Matrix {Float64} (undef, 6 , 6 ); @btime copyto! (C, K);
185
185
61.528 ns (0 allocations: 0 bytes)
186
186
187
187
julia> C == kron (A,B)
188
188
true
189
189
```
190
+
191
+
192
+ ## Broadcasting
193
+
194
+ Base includes a lazy broadcast object called ` Broadcasting ` , but this is
195
+ not a subtype of ` AbstractArray ` . Here we have ` BroadcastArray ` which replicates
196
+ the functionality of ` Broadcasting ` while supporting the array interface.
197
+ ``` julia
198
+ julia> A = randn (6 ,6 );
199
+
200
+ julia> B = BroadcastArray (exp, A);
201
+
202
+ julia> Matrix (B) == exp .(A)
203
+ true
204
+
205
+ julia> B = BroadcastArray (+ , A, 2 );
206
+
207
+ julia> B == A .+ 2
208
+ true
209
+ ```
210
+ Such arrays can also be created using the macro ` @~ ` which acts on ordinary
211
+ broadcasting expressions combined with ` LazyArray ` :
212
+ ``` julia
213
+ julia> C = rand (1000 )' ;
214
+
215
+ julia> D = LazyArray (@~ exp .(C))
216
+
217
+ julia> E = LazyArray (@~ @. 2 + log (C))
218
+
219
+ julia> @btime sum (LazyArray (@~ C .* C' ); dims= 1 ) # without `@~`, 1.438 ms (5 allocations: 7.64 MiB)
220
+ 74.425 μs (7 allocations: 8.08 KiB)
221
+ ```
222
+
0 commit comments