README.md: 6 additions & 6 deletions
@@ -258,28 +258,28 @@ Here is the resulting movie when running the application on 8 GPUs, solving 3-D
The corresponding file can be found [here](/examples/diffusion3D_multigpucpu_hidecomm.jl).
## Interactive prototyping with runtime hardware selection
-The KernelAbstractions backend keeps the familiar parse-time `@init_parallel_stencil` workflow while enabling runtime hardware switches through the `select_hardware` and `current_hardware` functions; the runtime hardware target defaults to CPU and can be switched as many times as desired during a session without requiring redefinition of kernels or reinitialization of the backend. The following copy-pasteable example outlines this workflow with a simple SAXPY kernel, demonstrating initial execution on CPU followed by a switch to CUDA GPU and a second execution there:
+The KernelAbstractions backend keeps the familiar parse-time `@init_parallel_stencil` workflow while enabling runtime hardware switches through the `select_hardware` and `current_hardware` functions; the runtime hardware target defaults to CPU and can be switched as many times as desired during a session without requiring redefinition of kernels or reinitialization of the backend. The following copy-pasteable example outlines this workflow with a simple SAXPY kernel, demonstrating initial execution on CPU followed by a switch to CUDA-capable GPU and a second execution there:
-All miniapp codes follow a similar structure and permit serial and threaded CPU as well as Nvidia GPU execution. The first line of each miniapp code permits to enable the CUDA GPU backend upon setting the `USE_GPU` flag to `true`.
+All miniapp codes follow a similar structure and permit serial and threaded CPU as well as Nvidia GPU execution. The first line of each miniapp code permits to enable the CUDA.jl GPU backend upon setting the `USE_GPU` flag to `true`.
All the miniapps can be interactively executed within the [Julia REPL] (this includes the multi-xPU versions when using a single CPU or GPU). Note that for optimal performance the miniapp script of interest `<miniapp_code>` should be launched from the shell using the project's dependencies `--project`, disabling array bound checking `--check-bounds=no`, and using optimization level 3 `-O3`.
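As a minimal sketch of the launch described above, assuming the shell is started in the project directory and the miniapp is saved as `miniapp_code.jl` (a placeholder name standing in for `<miniapp_code>`, not a file from the repository), the invocation might look like:

```shell
# Placeholder script name (hypothetical): substitute the miniapp you want to run.
#   --project          use the dependencies of the project in the current directory
#   --check-bounds=no  disable array bounds checking
#   -O3                optimization level 3
julia --project --check-bounds=no -O3 miniapp_code.jl
```

Dropping `--check-bounds=no` while debugging keeps out-of-bounds accesses reported as errors, at some cost in speed.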