Commit 149e29e

Update README.md
1 parent dc68bc6 commit 149e29e

File tree

1 file changed: +5 −88 lines changed

README.md

Lines changed: 5 additions & 88 deletions
@@ -6,11 +6,9 @@
 
 [Benchmarks](https://github.com/JuliaGPU/GPUBenchmarks.jl/blob/master/results/results.md)
 
-GPU Array package for Julia's various GPU backends.
-The compilation for the GPU is done with [CUDAnative.jl](https://github.com/JuliaGPU/CUDAnative.jl/),
-and for OpenCL [Transpiler.jl](https://github.com/SimonDanisch/Transpiler.jl) is used.
-In the future it is planned to replace the transpiler with an approach similar to the one
-CUDAnative.jl uses (via LLVM + SPIR-V).
+Abstract GPU Array package for Julia's various GPU backends.
+See it as a Julia Base.AbstractArray for GPUs.
+Currently, you need to install either [CLArrays](https://github.com/JuliaGPU/CLArrays.jl) or [CuArrays](https://github.com/JuliaGPU/CuArrays.jl) for a concrete implementation.
 
 
 # Why another GPU array package in yet another language?
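The added lines above describe GPUArrays as an abstract interface that concrete packages implement. A minimal sketch of what that looks like in practice, assuming the CuArrays backend (CLArrays should work the same way with `CLArray` in place of `CuArray`):

```Julia
# Minimal sketch, assuming a working CuArrays installation (Julia 0.6 era).
using CuArrays                      # concrete implementation of the abstract GPUArrays interface

a = CuArray(rand(Float32, 32, 32))  # construct a GPU array from a Julia Array
b = similar(a)                      # Base.AbstractArray operations are defined
b .= a .+ 1f0                       # broadcast runs on the GPU
c = a * b                           # matrix multiply dispatches to the GPU BLAS
```

Because the interface is shared, the same code should run against either backend.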
@@ -50,9 +48,7 @@ In theory, we could go as far as inspecting user defined callbacks (we can get t
 
 # Scope
 
-Current backends: OpenCL, CUDA, Julia Threaded
-
-Implemented for all backends:
+Interface offered for all backends:
 
 ```Julia
 map(f, ::GPUArray...)
@@ -71,68 +67,6 @@ From (CL/CU)BLAS
 gemm!, scal!, gemv! and the high level functions that are implemented with these, like A * B, A_mul_B!, etc.
 ```
 
-
-# Usage
-
-A backend will be initialized by default,
-but can be explicitly set with `opencl()`, `cudanative()`, `threaded()`.
-There is also `GPUArrays.init(device_symbol, filterfuncs...)`, which can be used to programmatically
-initialize a backend.
-Filter functions can be used to select a device like this (`opencl()`, etc. also support them):
-```Julia
-import GPUArrays: is_gpu, has_atleast, threads
-GPUArrays.init(:cudanative, is_gpu, dev -> has_atleast(dev, threads, 512))
-```
-You can also temporarily create a context on the currently selected backend with this construct:
-```Julia
-on_device([device = GPUArrays.current_device()]) do context
-    A = GPUArray(rand(Float32, 32, 32))
-    c = A .+ A
-end
-```
-Or you can run some code on all currently available devices like this:
-
-```Julia
-forall_devices(filterfuncs...) do context
-    A = GPUArray(rand(Float32, 32, 32))
-    c = A .+ A
-end
-```
-
-
-```Julia
-using GPUArrays
-
-a = GPUArray(rand(Float32, 32, 32)) # can be constructed from any Julia Array
-b = similar(a) # similar and other Julia.Base operations are defined
-b .= a .+ 1f0 # broadcast in action; works on 0.6 with .+, on 0.5 use: b .= (+).(a, 1f0)
-c = a * b # calls to BLAS
-function test(a, b)
-    Complex64(sin(a / b))
-end
-complex_c = test.(c, b)
-fft!(complex_c) # fft!/ifft!/plan_fft, plan_ifft, plan_fft!, plan_ifft!
-
-"""
-When you program with GPUArrays, you can write normal Julia functions, feed them to gpu_call, and depending on which backend you choose it will use Transpiler.jl or CUDAnative.
-"""
-# Signature: global_size == CUDA blocks, local_size == CUDA threads
-gpu_call(kernel::Function, DispatchDummy::GPUArray, args::Tuple, global_size = length(DispatchDummy), local_size = nothing)
-# with the kernel looking like this:
-
-function kernel(state, arg1, arg2, arg3) # args get splatted into the kernel call
-    # state is always passed as the first argument and is needed to offer the same
-    # functionality across backends, even though they have very different ways of getting e.g. the thread index
-    # arg1 can be any gpu array - this is needed to dispatch to the correct intrinsics
-    # if you call gpu_call without any further modifications to global/local size, this gives you a linear index into
-    # DispatchDummy
-    idx = linear_index(state)
-    arg1[idx] = arg2[idx] + arg3[idx]
-    return # kernel must return void
-end
-```
-Example for [gpu_call](https://github.com/JuliaGPU/GPUArrays.jl/blob/master/examples/custom_kernels.jl)
-
 # Currently supported subset of Julia Code
 
 working with immutable isbits (not containing pointers) type should be completely supported
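A short, hypothetical illustration of what the "immutable isbits" requirement above means in practice (the struct and function names here are made up for the example, not part of the package):

```Julia
# Hypothetical example: an isbits struct (only bits-type fields, no pointers)
# can be used in element-wise GPU code with any concrete backend.
struct Point2f
    x::Float32
    y::Float32
end

sqnorm(p::Point2f) = p.x * p.x + p.y * p.y  # plain Julia function, no GPU-specific code

isbits(Point2f(1f0, 2f0))  # true, so broadcasting sqnorm over a GPU array of Point2f should be supported
```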
@@ -150,21 +84,4 @@ Transpiler/OpenCL has problems with putting GPU arrays on the gpu into a struct
 
 # Installation
 
-I recently added a lot of features and bug fixes to the master branch, so you might want to check that out (`Pkg.checkout("GPUArrays")`).
-
-For the cudanative backend, you need to install [CUDAnative.jl manually](https://github.com/JuliaGPU/CUDAnative.jl/#installation); it works only on OSX + Linux with a Julia source build.
-Make sure to have either CUDA and/or OpenCL drivers installed correctly.
-`Pkg.build("GPUArrays")` will pick those up and should include the working backends.
-So if your system configuration changes, make sure to run `Pkg.build("GPUArrays")` again.
-The rest should work automatically:
-
-```Julia
-Pkg.add("GPUArrays")
-Pkg.checkout("GPUArrays") # optional but recommended, to check out the master branch
-Pkg.build("GPUArrays") # should print out information about which backends are added
-# Test it!
-Pkg.test("GPUArrays")
-```
-If a backend is not supported by the hardware, you will see build errors while running `Pkg.add("GPUArrays")`.
-Since GPUArrays selects only working backends when running `Pkg.build("GPUArrays")`,
-**these errors can be ignored**.
+See CuArrays or CLArrays for installation instructions.
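A hedged sketch of what the new, slimmer installation path looks like (assuming both packages are registered with the Julia 0.6-era package manager; pick whichever matches your hardware):

```Julia
# Assumption: CuArrays (CUDA hardware) and CLArrays (OpenCL devices) are registered packages.
Pkg.add("CuArrays")   # NVIDIA GPUs, compiled via CUDAnative.jl
# or
Pkg.add("CLArrays")   # OpenCL devices
Pkg.test("CuArrays")  # optional sanity check of the chosen backend
```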
