Commit 3ff82e8

Add interface docs.
1 parent 3f2ec79 commit 3ff82e8

7 files changed, +139 -41 lines changed
docs/make.jl

Lines changed: 26 additions & 22 deletions
@@ -1,26 +1,30 @@
 using Documenter, GPUArrays
 
-makedocs(
-    modules = [GPUArrays],
-    format = Documenter.HTML(
-        # Use clean URLs on CI
-        prettyurls = get(ENV, "CI", nothing) == "true",
-        assets = ["assets/favicon.ico"],
-        analytics = "UA-154489943-6",
-    ),
-    sitename = "GPUArrays.jl",
-    pages = [
-        "Home" => "index.md",
-        "Interface" => "interface.md",
-        "Functionality" => [
-            "functionality/host.md",
-            "functionality/device.md",
+function main()
+    makedocs(
+        modules = [GPUArrays],
+        format = Documenter.HTML(
+            # Use clean URLs on CI
+            prettyurls = get(ENV, "CI", nothing) == "true",
+            assets = ["assets/favicon.ico"],
+            analytics = "UA-154489943-6",
+        ),
+        sitename = "GPUArrays.jl",
+        pages = [
+            "Home" => "index.md",
+            "Interface" => "interface.md",
+            "Functionality" => [
+                "functionality/host.md",
+                "functionality/device.md",
+            ],
+            "Test suite" => "testsuite.md",
         ],
-        "Test suite" => "testsuite.md",
-    ],
-    doctest = true,
-)
+        doctest = true,
+    )
 
-deploydocs(
-    repo = "github.com/JuliaGPU/GPUArrays.jl.git"
-)
+    deploydocs(
+        repo = "github.com/JuliaGPU/GPUArrays.jl.git"
+    )
+end
+
+isinteractive() || main()

docs/src/interface.md

Lines changed: 84 additions & 3 deletions
@@ -1,6 +1,87 @@
 # Interface
 
-To extend the above functionality to a new array type, you should implement the following
-interfaces:
+To extend the above functionality to a new array type, you should use the types and
+implement the interfaces listed on this page. GPUArrays is designed around having two
+different array types to represent a GPU array: one that only ever lives on the host, and
+one that actually can be instantiated on the device (i.e. in kernels).
 
-TODO
+## Host-side
+
+Your host-side array type should build on the `AbstractGPUArray` supertype:
+
+```@docs
+AbstractGPUArray
+```
+
+First of all, you should implement operations that are expected to be defined for any
+`AbstractArray` type. Refer to the Julia manual for more details, or look at the `JLArray`
+reference implementation.
+
+To be able to actually use the functionality that is defined for `AbstractGPUArray`s, you
+should provide implementations of the following interfaces:
+
+```@docs
+GPUArrays.unsafe_reinterpret
+```
+
+### Devices
+
+```@docs
+GPUArrays.device
+GPUArrays.synchronize
+```
+
+### Execution
+
+```@docs
+GPUArrays.AbstractGPUBackend
+GPUArrays.backend
+```
+
+```@docs
+GPUArrays._gpu_call
+```
+
+### Linear algebra
+
+```@docs
+GPUArrays.blas_module
+GPUArrays.blasbuffer
+```
+
+
+## Device-side
+
+To work with GPU memory on the device itself, e.g. within a kernel, we need a different
+type: Most functionality will behave differently when running on the GPU, e.g., accessing
+memory directly instead of copying it to the host. We should also take care not to call into
+any host library, such as the Julia runtime or the system's math library.
+
+```@docs
+AbstractDeviceArray
+```
+
+Your device array type should again implement the core elements of the `AbstractArray`
+interface, such as indexing and certain getters. Refer to the Julia manual for more details,
+or look at the `JLDeviceArray` reference implementation.
+
+You should also provide implementations of several "GPU intrinsics". To make sure the
+correct implementation is called, the first argument to these intrinsics will be the kernel
+state object from before.
+
+```@docs
+GPUArrays.LocalMemory
+GPUArrays.synchronize_threads
+GPUArrays.blockidx_x
+GPUArrays.blockidx_y
+GPUArrays.blockidx_z
+GPUArrays.blockdim_x
+GPUArrays.blockdim_y
+GPUArrays.blockdim_z
+GPUArrays.threadidx_x
+GPUArrays.threadidx_y
+GPUArrays.threadidx_z
+GPUArrays.griddim_x
+GPUArrays.griddim_y
+GPUArrays.griddim_z
+```
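
Note on the two-type design documented in interface.md above: the following is a minimal sketch, not code from this commit. It assumes stand-in definitions of the two supertypes so that it runs without GPUArrays.jl, and the names `ToyArray`, `ToyDeviceArray`, `to_device` and `double!` are hypothetical. Plain `Array` storage stands in for GPU memory.

```julia
# Stand-in supertypes mirroring the definitions in this commit, so that the
# sketch runs without GPUArrays.jl installed.
abstract type AbstractGPUArray{T, N} <: AbstractArray{T, N} end
abstract type AbstractDeviceArray{T, N} <: AbstractArray{T, N} end

# Hypothetical host-side array: owns the buffer (an ordinary Array here).
struct ToyArray{T, N} <: AbstractGPUArray{T, N}
    data::Array{T, N}
end

# Hypothetical device-side counterpart: the type a kernel would receive.
struct ToyDeviceArray{T, N} <: AbstractDeviceArray{T, N}
    data::Array{T, N}
end

# Core AbstractArray interface for both types.
Base.size(a::ToyArray) = size(a.data)
Base.size(a::ToyDeviceArray) = size(a.data)
Base.IndexStyle(::Type{<:ToyArray}) = IndexLinear()
Base.IndexStyle(::Type{<:ToyDeviceArray}) = IndexLinear()

# On real hardware, host-side scalar indexing would imply a device-to-host copy.
Base.getindex(a::ToyArray, i::Int) = a.data[i]
Base.setindex!(a::ToyArray, v, i::Int) = (a.data[i] = v; a)

# Device-side indexing accesses the memory directly.
Base.getindex(a::ToyDeviceArray, i::Int) = a.data[i]
Base.setindex!(a::ToyDeviceArray, v, i::Int) = (a.data[i] = v; a)

# Hypothetical conversion applied to arguments at kernel launch.
to_device(a::ToyArray) = ToyDeviceArray(a.data)

# A toy "kernel" that only ever sees the device-side type.
function double!(a::AbstractDeviceArray)
    for i in eachindex(a)
        a[i] *= 2
    end
    return nothing
end

host = ToyArray([1, 2, 3])
double!(to_device(host))
```

Because both wrappers share the same buffer in this sketch, the host-side array observes the kernel's writes. A real backend would instead convert the host object into a device pointer or handle at launch time.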

src/device/abstractarray.jl

Lines changed: 7 additions & 0 deletions
@@ -5,6 +5,13 @@ export AbstractDeviceArray, @LocalMemory
 
 ## device array
 
+"""
+    AbstractDeviceArray{T, N}
+
+Supertype for `N`-dimensional GPU arrays (or array-like types) with elements of type `T`.
+This type is a subtype of `AbstractArray{T, N}`. Instances of this type are expected to live
+on the device, see [`AbstractGPUArray`](@ref) for host-side objects.
+"""
 abstract type AbstractDeviceArray{T, N} <: AbstractArray{T, N} end
 
 Base.IndexStyle(::AbstractDeviceArray) = IndexLinear()

src/host/abstractarray.jl

Lines changed: 14 additions & 8 deletions
@@ -2,14 +2,19 @@
 
 export AbstractGPUArray
 
-abstract type AbstractGPUArray{T, N} <: DenseArray{T, N} end
+"""
+    AbstractGPUArray{T, N}
+
+Supertype for `N`-dimensional GPU arrays (or array-like types) with elements of type `T`.
+This type is a subtype of `AbstractArray{T, N}`. Instances of this type are expected to live
+on the host, see [`AbstractDeviceArray`](@ref) for device-side objects.
+"""
+abstract type AbstractGPUArray{T, N} <: AbstractArray{T, N} end
 
-# Sampler type that acts like a texture/image and allows interpolated access
-abstract type Sampler{T, N} <: DenseArray{T, N} end
+const AbstractGPUVector{T} = AbstractGPUArray{T, 1}
+const AbstractGPUMatrix{T} = AbstractGPUArray{T, 2}
+const AbstractGPUVecOrMat{T} = Union{AbstractGPUArray{T, 1}, AbstractGPUArray{T, 2}}
 
-const GPUVector{T} = AbstractGPUArray{T, 1}
-const GPUMatrix{T} = AbstractGPUArray{T, 2}
-const GPUVecOrMat{T} = Union{AbstractGPUArray{T, 1}, AbstractGPUArray{T, 2}}
 
 # input/output
 
@@ -216,8 +221,9 @@ DEALINGS IN THE SOFTWARE.
 import Base.reinterpret
 
 """
-Unsafe reinterpret for backends to overload.
-This makes it easier to do checks just on the high level.
+    unsafe_reinterpret(T, a, dims)
+
+Reinterpret the array `a` to have a new element type `T` and size `dims`.
 """
 function unsafe_reinterpret end
 
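
Note on the renamed aliases in src/host/abstractarray.jl: the sketch below illustrates how methods can dispatch on `AbstractGPUVector`, `AbstractGPUMatrix` and `AbstractGPUVecOrMat`. It is not part of this commit. It assumes stand-in copies of the supertype and aliases so that it runs without GPUArrays.jl, and `AliasDemoArray` and `describe` are hypothetical names.

```julia
# Stand-in copies of the supertype and aliases introduced in this commit.
abstract type AbstractGPUArray{T, N} <: AbstractArray{T, N} end
const AbstractGPUVector{T} = AbstractGPUArray{T, 1}
const AbstractGPUMatrix{T} = AbstractGPUArray{T, 2}
const AbstractGPUVecOrMat{T} = Union{AbstractGPUArray{T, 1}, AbstractGPUArray{T, 2}}

# Hypothetical concrete array type backed by an ordinary Array.
struct AliasDemoArray{T, N} <: AbstractGPUArray{T, N}
    data::Array{T, N}
end
Base.size(a::AliasDemoArray) = size(a.data)
Base.getindex(a::AliasDemoArray, I::Int...) = a.data[I...]

# Methods select on dimensionality through the aliases; the most specific
# matching method wins, so the generic fallback only handles N >= 3 (or N == 0).
describe(::AbstractGPUVector) = "vector"
describe(::AbstractGPUMatrix) = "matrix"
describe(::AbstractGPUArray) = "array"

v = AliasDemoArray([1, 2, 3])
m = AliasDemoArray(zeros(2, 2))
t = AliasDemoArray(zeros(2, 2, 2))
```

With these definitions, `describe(v)`, `describe(m)` and `describe(t)` pick the vector, matrix and generic methods respectively, and only `v` and `m` satisfy `isa AbstractGPUVecOrMat`.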

src/host/base.jl

Lines changed: 3 additions & 3 deletions
@@ -54,12 +54,12 @@ function _sub2ind(inds, L, ind, i::IT, I::IT...) where IT
 end
 
 # This is pretty ugly, but I feel bad to add those to device arrays, since
-# we're never bound checking... So getindex(a::GPUVector, 10, 10) would silently go unnoticed
+# we're never bound checking... So getindex(a::AbstractGPUVector, 10, 10) would silently go unnoticed
 # we need this here for easier implementation of repeat
 @inline Base.@propagate_inbounds getidx_2d1d(x::AbstractVector, i, j) = x[i]
 @inline Base.@propagate_inbounds getidx_2d1d(x::AbstractMatrix, i, j) = x[i, j]
 
-function Base.repeat(a::GPUVecOrMat, m::Int, n::Int = 1)
+function Base.repeat(a::AbstractGPUVecOrMat, m::Int, n::Int = 1)
     o, p = size(a, 1), size(a, 2)
     b = similar(a, o*m, p*n)
     gpu_call(a, (b, a, o, p, m, n), n) do state, b, a, o, p, m, n
@@ -79,7 +79,7 @@ function Base.repeat(a::GPUVecOrMat, m::Int, n::Int = 1)
     return b
 end
 
-function Base.repeat(a::GPUVector, m::Int)
+function Base.repeat(a::AbstractGPUVector, m::Int)
     o = length(a)
     b = similar(a, o*m)
     gpu_call(a, (b, a, o, m), m) do state, b, a, o, m

src/host/linalg.jl

Lines changed: 4 additions & 4 deletions
@@ -14,8 +14,8 @@ for elty in (Float32, Float64, ComplexF32, ComplexF64)
     @eval begin
         function BLAS.gemm!(
             transA::AbstractChar, transB::AbstractChar, alpha::$T,
-            A::GPUVecOrMat{$elty}, B::GPUVecOrMat{$elty},
-            beta::$T, C::GPUVecOrMat{$elty}
+            A::AbstractGPUVecOrMat{$elty}, B::AbstractGPUVecOrMat{$elty},
+            beta::$T, C::AbstractGPUVecOrMat{$elty}
         )
             blasmod = blas_module(A)
             result = blasmod.gemm!(
@@ -56,7 +56,7 @@ end
 for elty in (Float32, Float64, ComplexF32, ComplexF64)
     T = VERSION >= v"1.3.0-alpha.115" ? :(Union{($elty), Bool}) : elty
     @eval begin
-        function BLAS.gemv!(trans::AbstractChar, alpha::$T, A::GPUVecOrMat{$elty}, X::GPUVector{$elty}, beta::$T, Y::GPUVector{$elty})
+        function BLAS.gemv!(trans::AbstractChar, alpha::$T, A::AbstractGPUVecOrMat{$elty}, X::AbstractGPUVector{$elty}, beta::$T, Y::AbstractGPUVector{$elty})
             m, n = size(A, 1), size(A, 2)
             if trans == 'N' && (length(X) != n || length(Y) != m)
                 throw(DimensionMismatch("A has dimensions $(size(A)), X has length $(length(X)) and Y has length $(length(Y))"))
@@ -92,7 +92,7 @@ end
 
 for elty in (Float32, Float64, ComplexF32, ComplexF64)
     @eval begin
-        function BLAS.gbmv!(trans::AbstractChar, m::Integer, kl::Integer, ku::Integer, alpha::($elty), A::GPUMatrix{$elty}, X::GPUVector{$elty}, beta::($elty), Y::GPUVector{$elty})
+        function BLAS.gbmv!(trans::AbstractChar, m::Integer, kl::Integer, ku::Integer, alpha::($elty), A::AbstractGPUMatrix{$elty}, X::AbstractGPUVector{$elty}, beta::($elty), Y::AbstractGPUVector{$elty})
             n = size(A, 2)
             if trans == 'N' && (length(X) != n || length(Y) != m)
                 throw(DimensionMismatch("A has dimensions $n, $m, X has length $(length(X)) and Y has length $(length(Y))"))

src/host/mapreduce.jl

Lines changed: 1 addition & 1 deletion
@@ -9,7 +9,7 @@ Base.count(pred::Function, A::AbstractGPUArray) = Int(mapreduce(pred, +, A; init
 
 Base.:(==)(A::AbstractGPUArray, B::AbstractGPUArray) = Bool(mapreduce(==, &, A, B; init = true))
 
-LinearAlgebra.ishermitian(A::GPUMatrix) = acc_mapreduce(==, &, true, A, (adjoint(A),))
+LinearAlgebra.ishermitian(A::AbstractGPUMatrix) = acc_mapreduce(==, &, true, A, (adjoint(A),))
 
 # hack to get around of fetching the first element of the AbstractGPUArray
 # as a startvalue, which is a bit complicated with the current reduce implementation
