README.md: 31 additions & 15 deletions
@@ -14,7 +14,9 @@ CUDAnative.jl is using (via LLVM + SPIR-V).
 
 # Why another GPU array package in yet another language?
 
-Julia offers countless advantages for a GPU array package.
+Julia offers great advantages for programming the GPU.
+This [blog post](http://mikeinnes.github.io/2017/08/24/cudanative.html) outlines a few of those.
+
 E.g., we can use Julia's JIT to generate optimized kernels for map/broadcast operations.
 
 This works even for things like complex arithmetic, since we can compile what's already in Julia Base.
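
Since the fused map/broadcast claim is the core technical point of this hunk, here is a minimal sketch of what it looks like from the user's side. The `GPUArray` constructor, the `Array(...)` copy-back, and the `Complex64` element type (the Julia 0.6 spelling of `ComplexF32`) are assumptions about the package's API, not taken from this diff; only the broadcast syntax itself is standard Julia.

```julia
# Hedged sketch: everything except the broadcast syntax is an assumed API.
using GPUArrays

a = GPUArray(rand(Complex64, 1024))   # upload complex data to the default backend
b = GPUArray(rand(Complex64, 1024))

# Julia's JIT specializes this whole expression into one fused GPU kernel,
# reusing the complex arithmetic that already lives in Base.
c = a .* b .+ conj.(a)

Array(c)   # copy the result back to the CPU for inspection
```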
@@ -45,15 +47,6 @@ Checkout the examples, to see how this can be used to emit specialized code whil
 In theory, we could go as far as inspecting user defined callbacks (we can get the complete AST), count operations and estimate register usage and use those numbers to optimize our kernels!
 
 
-### Automatic Differentiation
-
-Because of neural networks, automatic differentiation is super hyped right now!
-Julia offers a couple of packages for that, e.g. [ReverseDiff](https://github.com/JuliaDiff/ReverseDiff.jl).
-It heavily relies on Julia's strength to specialize generic code and dispatch to different implementations depending on the Array type, allowing an almost overheadless automatic differentiation.
-Making this work with GPUArrays will be a bit more involved, but the
-first [prototype](https://github.com/JuliaGPU/GPUArrays.jl/blob/master/examples/logreg.jl) looks already promising!
-There is also [ReverseDiffSource](https://github.com/JuliaDiff/ReverseDiffSource.jl), which should already work for simple functions.
-
 # Scope
 
 Current backends: OpenCL, CUDA, Julia Threaded
@@ -80,10 +73,33 @@ gemm!, scal!, gemv! and the high level functions that are implemented with these
 
 # Usage
 
+A backend will be initialized by default,
+but can be explicitly set with `opencl()`, `cudanative()`, `threaded()`.
+There is also `GPUArrays.init(device_symbol, filterfuncs...)`, which can be used to programmatically
+initialize a backend.
+Filterfuncs can be used to select a device like this (`opencl()`, etc also support those):
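
The example that the last added line introduces falls outside the portion of the hunk shown here. As a placeholder, a hedged sketch of how such a filter function might be passed: `GPUArrays.init`, `opencl()`, `cudanative()` and `threaded()` come from the text above, while the `:cudanative` device symbol and the `GPUArrays.threads(dev)` capability query are illustrative assumptions, not documented API.

```julia
# Hedged sketch -- the capability query used inside the filter is an assumption.
using GPUArrays

# Programmatic initialization: pick the CUDA backend, but only accept devices
# that report at least 1024 threads per block.
GPUArrays.init(:cudanative, dev -> GPUArrays.threads(dev) >= 1024)

# Per the text above, the convenience initializers accept the same filter functions:
opencl(dev -> GPUArrays.threads(dev) >= 1024)
```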