@@ -30,20 +30,22 @@ which is what `mapcols` does, has some overhead:
 using BenchmarkTools
 mat1k = rand(3,1000);
 
-@btime mapreduce(fun, hcat, eachcol($mat1k)) # 1.522 ms
-@btime mapslices(fun, $mat1k, dims=1)        # 1.017 ms
-
-@btime mapcols(fun, $mat1k)    # 399.016 μs
-@btime MapCols{3}(fun, $mat1k) # 15.564 μs
-@btime MapCols(fun, $mat1k)    # 16.774 μs without size
-
-@btime ForwardDiff.gradient(m -> sum(sin, mapslices(fun, m, dims=1)), $mat1k); # 372.705 ms
-@btime Tracker.gradient(m -> sum(sin, mapcols(fun, m)), $mat1k);    # 70.203 ms
-@btime Tracker.gradient(m -> sum(sin, MapCols{3}(fun, m)), $mat1k); # 146.561 μs, 330.51 KiB
-@btime Zygote.gradient(m -> sum(sin, mapcols(fun, m)), $mat1k);     # 20.018 ms, 3.82 MiB
-@btime Zygote.gradient(m -> sum(sin, MapCols{3}(fun, m)), $mat1k);  # 245.550 μs
+@btime mapreduce(fun, hcat, eachcol($mat1k)) # 1.522 ms, 11.80 MiB
+@btime mapslices(fun, $mat1k, dims=1)        # 1.017 ms, 329.92 KiB
+
+@btime mapcols(fun, $mat1k)    # 399.016 μs, 219.02 KiB
+@btime MapCols{3}(fun, $mat1k) # 15.564 μs, 47.16 KiB
+@btime MapCols(fun, $mat1k)    # 16.774 μs (without slice size)
+
+@btime ForwardDiff.gradient(m -> sum(mapslices(fun, m, dims=1)), $mat1k); # 329.305 ms
+@btime Tracker.gradient(m -> sum(mapcols(fun, m)), $mat1k);    # 70.203 ms
+@btime Tracker.gradient(m -> sum(MapCols{3}(fun, m)), $mat1k); # 51.129 μs, 282.92 KiB
+@btime Zygote.gradient(m -> sum(mapcols(fun, m)), $mat1k);     # 20.454 ms, 3.52 MiB
+@btime Zygote.gradient(m -> sum(MapCols{3}(fun, m)), $mat1k);  # 28.229 μs, 164.63 KiB
 ```
 
+For such a simple function, timing the gradient of `sum(sin, MapCols{3}(fun, m))` takes 3 to 10 times longer than that of `sum(MapCols{3}(fun, m))`!
+
 ## Other packages
 
 This package also provides Zygote gradients for the slice/glue functions in
@@ -53,13 +55,13 @@ which can be used to write many mapslices-like operations.
 
 ```julia
 using TensorCast
-@cast [i,j] := fun(mat[:,j])[i] # same as mapcols
+@cast [i,j] := fun(mat[:,j])[i] # same as mapcols
 
 tcm(mat) = @cast out[i,j] := fun(mat[:,j])[i]
 Zygote.gradient(m -> sum(sin, tcm(m)), mat)[1]
 
-@btime tcm($mat1k) # 407.176 μs
-@btime Zygote.gradient(m -> sum(sin, tcm(m)), $mat1k); # 19.086 ms
+@btime tcm($mat1k) # 427.907 μs
+@btime Zygote.gradient(m -> sum(tcm(m)), $mat1k); # 18.358 ms
 ```
 
 Similar gradients work for the Slice/Align functions in
@@ -69,11 +71,11 @@ so it defines these too:
 ```julia
 using JuliennedArrays
 jumap(f,m) = Align(map(f, Slices(m, True(), False())), True(), False())
-jumap(fun, mat) # same as mapcols
+jumap(fun, mat) # same as mapcols
 Zygote.gradient(m -> sum(sin, jumap(fun, m)), mat)[1]
 
-@btime jumap(fun, $mat1k); # 408.259 μs
-@btime Zygote.gradient(m -> sum(sin, jumap(fun, m)), $mat1k); # 18.638 ms
+@btime jumap(fun, $mat1k); # 421.061 μs
+@btime Zygote.gradient(m -> sum(jumap(fun, m)), $mat1k); # 18.383 ms
 ```
 
 That's a 2-line gradient definition, so borrowing it may be easier than depending on this package.
@@ -102,11 +104,11 @@ Tracker.gradient(k -> sum(sin, MapCols{2}(g, k, 1:5)), kay)[1]
 This is quite efficient, and seems to go well with multi-threading:
 
 ```julia
-@btime MapCols{2}(g, $kay, 1:5)       # 1.423 ms
-@btime ThreadMapCols{2}(g, $kay, 1:5) # 713.748 μs
+@btime MapCols{2}(g, $kay, 1:5)       # 1.394 ms
+@btime ThreadMapCols{2}(g, $kay, 1:5) # 697.333 μs
 
-@btime Tracker.gradient(k -> sum(sin, MapCols{2}(g, k, 1:5)), $kay)[1]       # 2.535 ms
-@btime Tracker.gradient(k -> sum(sin, ThreadMapCols{2}(g, k, 1:5)), $kay)[1] # 1.333 ms
+@btime Tracker.gradient(k -> sum(MapCols{2}(g, k, 1:5)), $kay)[1]       # 2.561 ms
+@btime Tracker.gradient(k -> sum(ThreadMapCols{2}(g, k, 1:5)), $kay)[1] # 1.344 ms
 
 Threads.nthreads() == 4 # on my 2/4-core laptop
 ```