Changed file: `docs/src/examples/beeler_reuter.md` (+23 −23)
Let's start by developing a CPU-only IMEX solver. The main idea is to use the *D…*

First, we define the model constants:
```julia
const v0 = -84.624
const v1 = 10.0
const C_K1 = 1.0f0
# …
```
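As a quick aside (a minimal sketch, not part of the tutorial), the reason `Float32` literals like `1.0f0` matter is type promotion: mixing in a single `Float64` literal silently promotes the whole computation to double precision, which is much slower on most GPUs.

```julia
# Float32 × Float32 stays single precision; a Float64 literal promotes.
a = 1.0f0
b = 2.0f0
println(typeof(a * b))     # Float32
println(typeof(a * 2.0))   # Float64 — accidental promotion
```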
Note that the constants are defined as `Float32` and not `Float64`. The reason is that most GPUs are considerably faster at single-precision arithmetic, and `Float32` constants avoid accidental promotion to `Float64`.
Next, we define a struct to contain our state. `BeelerReuterCpu` is a functor, and we will define a deriv function as its associated function.
```julia
mutable struct BeelerReuterCpu
    t::Float64            # the last timestep time, used to calculate Δt
    diff_coef::Float64    # the diffusion coefficient (coupling strength)
    # …
```
The finite-difference Laplacian is calculated in-place by a 5-point stencil. The Neumann boundary condition is enforced. Note that we could have also used [DiffEqOperators.jl](https://github.com/JuliaDiffEq/DiffEqOperators.jl) to automate this step.
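The stencil described above can be sketched as follows (a standalone illustration with a hypothetical `laplacian!` helper, not the tutorial's exact code); the Neumann zero-flux condition is enforced here by clamping the stencil indices at the edges:

```julia
# 5-point finite-difference Laplacian with zero-flux (Neumann) boundaries.
function laplacian!(Δv, v)
    n1, n2 = size(v)
    for j in 1:n2, i in 1:n1
        ip = min(i + 1, n1); im = max(i - 1, 1)   # clamped indices ⇒ Neumann BC
        jp = min(j + 1, n2); jm = max(j - 1, 1)
        Δv[i, j] = v[ip, j] + v[im, j] + v[i, jp] + v[i, jm] - 4v[i, j]
    end
    return Δv
end

v = ones(8, 8)               # a constant field should have zero Laplacian
Δv = laplacian!(similar(v), v)
println(maximum(abs.(Δv)))   # 0.0
```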
`rush_larsen` is a helper function that uses the Rush-Larsen method to integrate the gating variables.

```julia
@inline function rush_larsen(g, α, β, Δt)
    inf = α / (α + β)
    τ = 1f0 / (α + β)
    # …
```
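The fragment above can be completed as a sketch like the following: the gating ODE $dg/dt = \alpha(1-g) - \beta g$ has, for frozen rates, the exact solution $g(t+\Delta t) = g_\infty + (g - g_\infty)e^{-\Delta t/\tau}$ with $g_\infty = \alpha/(\alpha+\beta)$ and $\tau = 1/(\alpha+\beta)$, which `expm1` evaluates accurately even when $\Delta t/\tau$ is small.

```julia
# Rush-Larsen step: exact for constant α and β, hence unconditionally stable.
@inline function rush_larsen(g, α, β, Δt)
    inf = α / (α + β)
    τ = 1f0 / (α + β)
    return g + (g - inf) * expm1(-Δt / τ)   # = inf + (g - inf)·exp(-Δt/τ)
end

# For constant rates the update matches the analytic solution for any Δt:
g0, α, β, Δt = 0.2f0, 0.5f0, 1.5f0, 10.0f0
g_exact = α/(α+β) + (g0 - α/(α+β)) * exp(-Δt*(α+β))
println(rush_larsen(g0, α, β, Δt) ≈ g_exact)   # true
```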
The gating variables are updated as below. The details of how to calculate $\alpha$ and $\beta$ are based on the Beeler-Reuter model and are not of direct interest to this tutorial.

```julia
function update_M_cpu(g, v, Δt)
    # the condition is needed here to prevent NaN when v == 47.0
    # …
```
The intracellular calcium is not technically a gating variable, but we can use a similar explicit exponential integrator for it.

```julia
function update_C_cpu(g, d, f, v, Δt)
    ECa = D_Ca - 82.3f0 - 13.0278f0 * log(g)
    kCa = C_s * g_s * d * f
    # …
```
Now, it is time to define the derivative function as an associated function of *BeelerReuterCpu*.
Here, the deriv function is called three times per time step. We distinguish between two types of calls to it. When $t$ changes, the gating variables are updated by calling `update_gates_cpu`:

```julia
function update_gates_cpu(u, XI, M, H, J, D, F, C, Δt)
    let Δt = Float32(Δt)
        n1, n2 = size(u)
        # …
```
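The time-caching pattern behind this distinction can be mirrored in a toy functor (a hypothetical sketch, not the tutorial's code): the struct remembers the last `t` it saw, runs the expensive gating update only when `t` has moved, and recomputes `du` on every call.

```julia
# Toy functor demonstrating the "update gates only when t changes" pattern.
mutable struct ToyDeriv
    t::Float64          # last timestep time, used to detect a new t
    gate_updates::Int   # counts how often the expensive branch ran
end

function (f::ToyDeriv)(du, u, p, t)
    Δt = t - f.t
    if Δt != 0            # t changed ⇒ update the (stand-in) gating variables
        f.gate_updates += 1
        f.t = t
    end
    @. du = -u            # du is refreshed on every call
    return nothing
end

f = ToyDeriv(0.0, 0)
du = zeros(3); u = ones(3)
f(du, u, nothing, 0.1)    # new t: gates updated
f(du, u, nothing, 0.1)    # same t: gates skipped
f(du, u, nothing, 0.2)    # new t again
println(f.gate_updates)   # 2
```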
On the other hand, `du` is updated at each time step, since it is independent of $\Delta{t}$.

```julia
# iK1 is the inward-rectifying potassium current
function calc_iK1(v)
    ea = exp(0.04f0 * (v + 85f0))
    # …
```
Finally, we put everything together in our deriv function, which is a call on `BeelerReuterCpu`.

```julia
function (f::BeelerReuterCpu)(du, u, p, t)
    Δt = t - f.t
    # …
```
Time to test! We need to define the starting transmembrane potential with the help of the global constants **v0** and **v1**, which represent the resting and activated potentials.

```julia
const N = 192;
u0 = fill(v0, (N, N));
u0[90:102, 90:102] .= v1;   # a small square in the middle of the domain
```
The initial condition is a small square in the middle of the domain. For stiff reaction-diffusion equations, CVODE_BDF from the Sundials library is an excellent solver.

```julia
@time sol = solve(prob, CVODE_BDF(linear_solver=:GMRES), saveat=100.0);
```
```julia
heatmap(sol.u[end])
```
The key to fast CUDA programs is to minimize CPU/GPU memory transfers and global memory accesses.
We modify `BeelerReuterCpu` into `BeelerReuterGpu` by defining the state variables as *CuArray*s instead of standard Julia *Array*s. The name of each variable defined on the GPU is prefixed by *d_* for clarity. Note that $\Delta{v}$ is a temporary storage for the Laplacian and stays on the CPU side.

```julia
using CUDA

mutable struct BeelerReuterGpu <: Function
    # …
```
The Laplacian function remains unchanged. The main change to the explicit gating solvers is that the *exp* and *expm1* functions are prefixed by *CUDAnative.*. This is a technical nuisance that will hopefully be resolved in the future.

```julia
function rush_larsen_gpu(g, α, β, Δt)
    inf = α / (α + β)
    τ = 1.0 / (α + β)
    # …
```
Similarly, we modify the functions that calculate the individual currents by adding the CUDAnative prefix.

```julia
# iK1 is the inward-rectifying potassium current
function calc_iK1(v)
    ea = CUDAnative.exp(0.04f0 * (v + 85f0))
    # …
```
A CUDA programmer is free to interpret the calculated index however it fits the application. In the GPU version of the solver, each thread works on a single element of the medium, indexed by an (x, y) pair. `update_gates_gpu` and `update_du_gpu` are very similar to their CPU counterparts, but are in fact CUDA kernels where the *for* loops are replaced with CUDA-specific indexing. Note that CUDA kernels cannot return a value; hence the *nothing* at the end.

```julia
function update_gates_gpu(u, XI, M, H, J, D, F, C, Δt)
    i = (blockIdx().x - UInt32(1)) * blockDim().x + threadIdx().x
    # …
```
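The index arithmetic itself can be checked on the CPU (a sketch with assumed block and grid sizes): with CUDA's 1-based `blockIdx`/`threadIdx`, the expression `(blockIdx - 1) * blockDim + threadIdx` maps each (block, thread) pair to a unique global index that exactly covers `1:gridDim*blockDim`.

```julia
# CPU emulation of the 1-D CUDA index map for one axis of the 192×192 grid.
blockdim, griddim = 16, 12   # 12 blocks of 16 threads ⇒ 192 indices
indices = [(bx - 1) * blockdim + tx for bx in 1:griddim, tx in 1:blockdim]
println(sort(vec(indices)) == collect(1:griddim*blockdim))   # true
```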
Finally, the deriv function is modified to copy *u* to the GPU, copy *du* back, and invoke the CUDA kernels.

```julia
function (f::BeelerReuterGpu)(du, u, p, t)
    L = 16          # block size
    Δt = t - f.t
    # …
```
Ready to test!

```julia
using DifferentialEquations, Sundials

deriv_gpu = BeelerReuterGpu(u0, 1.0);
prob = ODEProblem(deriv_gpu, u0, (0.0, 50.0));
@time sol = solve(prob, CVODE_BDF(linear_solver=:GMRES), saveat=100.0);
```
```julia
heatmap(sol.u[end])
```
## Summary
We achieve around a 6x speedup by running the explicit portion of our IMEX solver on a GPU. The major bottleneck of this technique is the communication between the CPU and the GPU. In its current form, not all of the internals of the method utilize GPU acceleration. In particular, the implicit equations solved by GMRES are performed on the CPU. This partial CPU nature also increases the amount of data transfer required between the GPU and CPU (performed on every `f` call). Compiling the full ODE solver to the GPU would solve both of these issues and potentially give a much larger speedup. [JuliaDiffEq developers are currently working on solutions to alleviate these issues](http://www.stochasticlifestyle.com/solving-systems-stochastic-pdes-using-gpus-julia/), but these will only be compatible with native Julia solvers (and not Sundials).