# Use Adapt.jl to change storage and element type #2212
Merged · +1,195 −263

## Commits (38)
- 9391f2c · Use Adapt.jl to change storage and element type (vchuravy)
- 3784e65 · restore elixir (benegee)
- 542d168 · offload compute_coefficients (benegee)
- 018acc9 · fmt (benegee)
- dadd3de · test native version as well (benegee)
- ba0ff90 · adapt 1D and 3D version (benegee)
- 3c112c9 · Downgrade compat with Adapt (benegee)
- 5f3caa4 · update requires to 1.3 (vchuravy)
- 39b88b5 · add support for AMDGPU (vchuravy)
- 622d9cd · fix doctest (vchuravy)
- 5e1dec7 · Use `u_ode` to determine the computational backend (vchuravy)
- 84f43cd · Use KA 0.9.31 (vchuravy)
- 273b9d4 · handle VectorOfArray in trixi_backend (vchuravy)
- 54d24df · fixup: runtests (vchuravy)
- 66124de · format (vchuravy)
- c48bcab · fix trixi_backend for RecursiveArrayTools{StaticArray} (vchuravy)
- ff00327 · fixup: amdgpu test (vchuravy)
- 4de1df6 · use unsafe_wrap with lock=false for AMDGPU (vchuravy)
- 8867eb9 · Update src/solvers/dg.jl (vchuravy)
- 0a63915 · Update ext/TrixiAMDGPUExt.jl (vchuravy)
- 4f2fe99 · Update src/auxiliary/containers.jl (vchuravy)
- 2201422 · use eleixi_advection_basic_gpu.jl for adapt test (vchuravy)
- 46f168a · Update examples/p4est_2d_dgsem/elixir_advection_basic_gpu.jl (vchuravy)
- 6ce7e92 · something on KA (benegee)
- c2a6f79 · KA imports (vchuravy)
- a3d9104 · fix typo (vchuravy)
- be60e83 · fix CUDA compat in test/ (vchuravy)
- d8664d3 · add stepsize callback (vchuravy)
- 2e0dea4 · fixup! fix CUDA compat in test/ (vchuravy)
- 958d308 · upgrade minimal diffeqbase (vchuravy)
- 7e9ccf8 · upgrade StructArrays (vchuravy)
- cb135ec · Apply suggestions from code review (vchuravy)
- 0922acc · add Adapt compat in test/Project.toml (vchuravy)
- fd47ebe · Apply suggestions from code review (vchuravy)
- 232b3b5 · address feedback from in-person conversation (vchuravy)
- 4258c36 · add news (vchuravy)
- 432dcb8 · Apply suggestions from code review (vchuravy)
- 1acf007 · Merge branch 'main' into vc/adapt (ranocha)
# Heterogeneous computing

Support for heterogeneous computing is currently under development.

## The use of Adapt.jl

[Adapt.jl](https://github.com/JuliaGPU/Adapt.jl) is a package in the
[JuliaGPU](https://github.com/JuliaGPU) family that allows for
the translation of nested data structures. The primary goal is to allow the substitution of `Array`
at the storage level with a GPU array like `CuArray` from [CUDA.jl](https://github.com/JuliaGPU/CUDA.jl).

To facilitate this, data structures must be parameterized. Instead of:

```julia
struct Container <: Trixi.AbstractContainer
    data::Array{Float64, 2}
end
```

they must be written as:

```jldoctest adapt; output = false, setup=:(import Trixi)
struct Container{D<:AbstractArray} <: Trixi.AbstractContainer
    data::D
end

# output

```

Furthermore, we need to define a function that allows for the conversion of the storage
of our types:

```jldoctest adapt; output = false
using Adapt

function Adapt.adapt_structure(to, C::Container)
    return Container(adapt(to, C.data))
end

# output

```

or simply

```julia
Adapt.@adapt_structure(Container)
```
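The macro generates an `Adapt.adapt_structure` method equivalent to the one written out above,
adapting all fields of the struct.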
Additionally, we must define `Adapt.parent_type`, which returns the type of the
underlying storage:

```jldoctest adapt; output = false
function Adapt.parent_type(::Type{<:Container{D}}) where D
    return D
end

# output

```
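As a quick sanity check, we can query the storage parameter directly (the alias form in
which Julia prints the type may vary between Julia versions):

```julia-repl
julia> Adapt.parent_type(Container{Vector{Float64}})
Vector{Float64} (alias for Array{Float64, 1})
```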
Altogether, we can use this machinery to perform conversions of a container:

```jldoctest adapt
julia> C = Container(zeros(3))
Container{Vector{Float64}}([0.0, 0.0, 0.0])

julia> Trixi.storage_type(C)
Array
```
On a machine with a CUDA-capable GPU, we can then move the data to the device:

```julia-repl
julia> using CUDA

julia> GPU_C = adapt(CuArray, C)
Container{CuArray{Float64, 1, CUDA.DeviceMemory}}([0.0, 0.0, 0.0])

julia> Trixi.storage_type(GPU_C)
CuArray
```
## Element-type conversion with `Trixi.trixi_adapt`

We can use [`Trixi.trixi_adapt`](@ref) to perform both an element-type and a
storage-type adaptation:

```jldoctest adapt
julia> C = Container(zeros(3))
Container{Vector{Float64}}([0.0, 0.0, 0.0])

julia> Trixi.trixi_adapt(Array, Float32, C)
Container{Vector{Float32}}(Float32[0.0, 0.0, 0.0])
```

Similarly, on a GPU:

```julia-repl
julia> Trixi.trixi_adapt(CuArray, Float32, C)
Container{CuArray{Float32, 1, CUDA.DeviceMemory}}(Float32[0.0, 0.0, 0.0])
```
!!! note
    `adapt(Array{Float32}, C)` is tempting, but it will do the wrong thing
    in the presence of `SVector`s and similar arrays from StaticArrays.jl.
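To illustrate the difference, consider a hypothetical container holding static vectors
(the exact printed output below is illustrative): `trixi_adapt` swaps the element type
*inside* each `SVector`, whereas `adapt(Array{Float32}, ...)` would attempt to convert
each `SVector` itself into a single `Float32` scalar.

```julia-repl
julia> using StaticArrays

julia> S = Container([SVector(1.0, 2.0)]);

julia> Trixi.trixi_adapt(Array, Float32, S)  # recurses into the SVector element type
Container{Vector{SVector{2, Float32}}}(SVector{2, Float32}[[1.0, 2.0]])
```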
## Writing GPU kernels

Offloading computations to the GPU is done with
[KernelAbstractions.jl](https://github.com/JuliaGPU/KernelAbstractions.jl),
allowing for vendor-agnostic GPU code.

### Example

Given the following Trixi.jl code, which would typically be called from within `rhs!`:

```julia
function trixi_rhs_fct(mesh, equations, solver, cache, args)
    @threaded for element in eachelement(solver, cache)
        # code
    end
end
```
1. Put the inner code in a new function `rhs_fct_per_element`. Besides the index
   `element`, pass all required fields as arguments, but make sure to `@unpack` them from
   their structs in advance; a sketch is given below.
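   A minimal sketch of such a per-element function (the name and argument list are
   illustrative):

   ```julia
   # One unit of work: operates on a single `element`; all fields arrive as
   # plain arguments, so no struct access happens inside the loop or kernel.
   function rhs_fct_per_element(element, unpacked_args, args)
       # code previously inside the @threaded loop
   end
   ```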
2. Where `trixi_rhs_fct` is called, get the backend, i.e., the hardware we are currently
   running on, via `trixi_backend(x)`.
   This will, e.g., work with `u_ode`. Internally, KernelAbstractions.jl's `get_backend`
   will be called, i.e., KernelAbstractions.jl has to know the type of `x`.

   ```julia
   backend = trixi_backend(u_ode)
   ```
3. Add a new argument `backend` to `trixi_rhs_fct` that is used for dispatch.
   When `backend` is `nothing`, the legacy implementation should be used:

   ```julia
   function trixi_rhs_fct(backend::Nothing, mesh, equations, solver, cache, args)
       @unpack unpacked_args = cache
       @threaded for element in eachelement(solver, cache)
           rhs_fct_per_element(element, unpacked_args, args)
       end
   end
   ```
4. When `backend` is a `Backend` (a type defined by KernelAbstractions.jl), write a
   KernelAbstractions.jl kernel:

   ```julia
   function trixi_rhs_fct(backend::Backend, mesh, equations, solver, cache, args)
       nelements(solver, cache) == 0 && return nothing # return early when there are no elements
       @unpack unpacked_args = cache
       kernel! = rhs_fct_kernel!(backend)
       kernel!(unpacked_args, args,
               ndrange = nelements(solver, cache))
       return nothing
   end

   @kernel function rhs_fct_kernel!(unpacked_args, args)
       element = @index(Global)
       rhs_fct_per_element(element, unpacked_args, args)
   end
   ```
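Note that `@index(Global)` plays the role of the loop variable `element`: each GPU work
item processes one element, so the same `rhs_fct_per_element` serves both the threaded
CPU path and the GPU kernel.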