# GPU Support

GPU support is still an experimental feature that is actively being worked on.
As of now, the [`WeaklyCompressibleSPHSystem`](@ref) and the [`BoundarySPHSystem`](@ref)
are supported on GPUs.
We have tested this on GPUs by Nvidia and AMD.

To run a simulation on a GPU, we need to use the [`FullGridCellList`](@ref)
as the cell list for the [`GridNeighborhoodSearch`](@ref).
This cell list requires a bounding box for the domain, unlike the default cell list, which
works with an unbounded domain.
For simulations that are bounded by a closed tank, we can use the boundary of the tank
to obtain the bounding box as follows.
```jldoctest gpu; output=false, setup=:(using TrixiParticles; trixi_include(@__MODULE__, joinpath(examples_dir(), "fluid", "hydrostatic_water_column_2d.jl"), sol=nothing))
search_radius = TrixiParticles.compact_support(smoothing_kernel, smoothing_length)
min_corner = minimum(tank.boundary.coordinates, dims=2) .- search_radius
max_corner = maximum(tank.boundary.coordinates, dims=2) .+ search_radius
cell_list = TrixiParticles.PointNeighbors.FullGridCellList(; min_corner, max_corner)

# output
PointNeighbors.FullGridCellList{PointNeighbors.DynamicVectorOfVectors{Int32, Matrix{Int32}, Vector{Int32}, Base.RefValue{Int32}}, Nothing, SVector{2, Float64}, SVector{2, Float64}}(Vector{Int32}[], nothing, [-0.24500000000000002, -0.24500000000000002], [1.245, 1.245])
```

We then need to pass this cell list to the neighborhood search and the neighborhood search
to the [`Semidiscretization`](@ref).
```jldoctest gpu; output=false
semi = Semidiscretization(fluid_system, boundary_system,
                          neighborhood_search=GridNeighborhoodSearch{2}(; cell_list))

# output
┌──────────────────────────────────────────────────────────────────────────────────────────────────┐
│ Semidiscretization                                                                               │
│ ══════════════════                                                                               │
│ #spatial dimensions: ………………………… 2                                                                │
│ #systems: ……………………………………………………… 2                                                                │
│ neighborhood search: ………………………… GridNeighborhoodSearch                                           │
│ total #particles: ………………………………… 636                                                              │
└──────────────────────────────────────────────────────────────────────────────────────────────────┘
```

At this point, we should run the simulation once and verify that it still works and that
the bounding box is large enough.
For simulations in which particles move outside the initial tank coordinates,
for example when the tank is not closed or when the tank is moving, an appropriate
bounding box has to be specified manually.
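When specifying the bounding box manually, the same constructor can be used with explicit corner coordinates instead of corners derived from the tank boundary. A minimal sketch; the corner values below are placeholders for a hypothetical setup and must be chosen large enough to contain all particles throughout the simulation:
```julia
# Manually chosen bounding box (placeholder values): particles leaving this
# box during the simulation will cause an error, so add a generous margin.
min_corner = (-2.0, -1.0)
max_corner = (4.0, 3.0)
cell_list = TrixiParticles.PointNeighbors.FullGridCellList(; min_corner, max_corner)
```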

Then, we only need to specify the data type that is used for the simulation.
On an Nvidia GPU, we specify:
```julia
using CUDA
ode = semidiscretize(semi, tspan, data_type=CuArray)
```
On an AMD GPU, we use:
```julia
using AMDGPU
ode = semidiscretize(semi, tspan, data_type=ROCArray)
```
Then, we can run the simulation as usual.
All data is transferred to the GPU during initialization and all loops over particles
and their neighbors will be executed on the GPU as kernels generated by KernelAbstractions.jl.
Data is only copied to the CPU for saving VTK files via the [`SolutionSavingCallback`](@ref).
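
Time integration of the GPU-enabled `ode` then looks the same as on the CPU. A sketch following the example files; the solver and tolerances are only one possible choice, and `callbacks` is assumed to be defined as in the included example:
```julia
using OrdinaryDiffEq

# Solve the ODE on the GPU; all particle loops run as GPU kernels.
sol = solve(ode, RDPK3SpFSAL35(),
            abstol=1e-6, reltol=1e-4,
            save_everystep=false, callback=callbacks)
```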