Commit 80adbb5

Better document CUDA_CACHE_PATH in CP2K docs (#43)
1 parent 88c0f1c commit 80adbb5

File tree

2 files changed: +18 -4 lines changed


docs/software/communication/cray-mpich.md

Lines changed: 1 addition & 0 deletions
@@ -7,6 +7,7 @@ It is available through uenvs like [prgenv-gnu][ref-uenv-prgenv-gnu] and [the ap
 The [Cray MPICH documentation](https://cpe.ext.hpe.com/docs/latest/mpt/mpich/index.html) contains detailed information about Cray MPICH.
 On this page we outline the most common workflows and issues that you may encounter on Alps.

+[](){#ref-communication-cray-mpich-gpu-aware}
 ## GPU-aware MPI

 We recommend using GPU-aware MPI whenever possible, as it almost always provides a significant performance improvement compared to communication through CPU memory.
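
For context on the anchor added above, here is a minimal sketch of what enabling GPU-aware MPI looks like at runtime with Cray MPICH in a Slurm batch script. The application name `my_app` and the resource counts are placeholders and not part of this commit.

```bash
#!/bin/bash
# Sketch only: run an MPI application with GPU-aware MPI enabled at runtime.
# `my_app` and the resource counts are placeholders.
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=4

# Enables GPU-aware MPI in Cray MPICH (set by default on the HPC platform).
export MPICH_GPU_SUPPORT_ENABLED=1

srun ./my_app
```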

docs/software/sciapps/cp2k.md

Lines changed: 17 additions & 4 deletions
@@ -36,8 +36,16 @@ On our systems, CP2K is built with the following dependencies:
 * [spla]

 !!! note "GPU-aware MPI"
-    [COSMA] and [DLA-Future] are built with GPU-aware MPI. On the HPC platform, `MPICH_GPU_SUPPORT_ENABLED=1` is set by
-    default, therefore there is no need to set it manually.
+    [COSMA] and [DLA-Future] are built with [GPU-aware MPI][ref-communication-cray-mpich-gpu-aware], which requires setting `MPICH_GPU_SUPPORT_ENABLED=1`.
+    On the HPC platform, `MPICH_GPU_SUPPORT_ENABLED=1` is set by
+    default.
+
+!!! note "CUDA cache path for JIT compilation"
+    [DBCSR] uses JIT compilation for CUDA kernels.
+    The default cache location is in the home directory, which can put an unnecessary burden on the filesystem and lead to performance degradation.
+    Because of this we set `CUDA_CACHE_PATH` to point to the in-memory filesystem in `/dev/shm`.
+    On the HPC platform, `CUDA_CACHE_PATH` is set to a directory under `/dev/shm` by
+    default.

 !!! warning "BLAS/LAPACK on Eiger"
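
As a usage aside on the two notes above, here is a sketch (not part of the commit) of setting both variables explicitly, for example on a system where the platform defaults described here are not present. The cache directory name mirrors the `/dev/shm/$USER/cuda_cache` path used later in this commit and is otherwise arbitrary.

```bash
# Sketch only: make the defaults described in the notes above explicit.
export MPICH_GPU_SUPPORT_ENABLED=1                  # required by the GPU-aware MPI builds of COSMA and DLA-Future
export CUDA_CACHE_PATH="/dev/shm/$USER/cuda_cache"  # keep DBCSR's CUDA JIT cache in memory, off the home filesystem
mkdir -p "$CUDA_CACHE_PATH"                         # precaution added for this sketch; not part of the commit
```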

@@ -67,7 +75,8 @@ MPS] daemon so that multiple MPI ranks can use the same GPU.
 #SBATCH --uenv=<CP2K_UENV>
 #SBATCH --view=cp2k

-export CUDA_CACHE_PATH="/dev/shm/$RANDOM" # (5)
+export CUDA_CACHE_PATH="/dev/shm/$USER/cuda_cache" # (5)
+export MPICH_GPU_SUPPORT_ENABLED=1 # (6)
 export MPICH_MALLOC_FALLBACK=1
 export OMP_NUM_THREADS=$((SLURM_CPUS_PER_TASK - 1)) # (4)

@@ -85,7 +94,11 @@ srun --cpu-bind=socket ./mps-wrapper.sh cp2k.psmp -i <CP2K_INPUT> -o <CP2K_OUTPU
    for good performance. With [Intel MKL], this is not necessary and one can set `OMP_NUM_THREADS` to
    `SLURM_CPUS_PER_TASK`.

-5. [DBCSR] relies on extensive JIT compilation and we store the cache in memory to avoid I/O overhead
+5. [DBCSR] relies on extensive JIT compilation and we store the cache in memory to avoid I/O overhead.
+   This is set by default on the HPC platform, but it is set here explicitly because it is essential for avoiding performance degradation.
+
+6. CP2K's dependencies use GPU-aware MPI, which requires enabling support at runtime.
+   This is set by default on the HPC platform, but it is set here explicitly because it is required in general to enable GPU-aware MPI.


 * Change <ACCOUNT> to your project account name
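
If it helps when adapting this script, a small sanity check (not part of the commit) can echo the two settings discussed in footnotes (5) and (6) before launching CP2K, so their values are visible in the job log.

```bash
# Illustrative only: print the settings from footnotes (5) and (6) before launching CP2K.
echo "CUDA_CACHE_PATH=${CUDA_CACHE_PATH:-<unset>}"
echo "MPICH_GPU_SUPPORT_ENABLED=${MPICH_GPU_SUPPORT_ENABLED:-<unset>}"
```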
