Commit 6631974

elpa-gpu modify
1 parent 524c959 commit 6631974

2 files changed

+23
-2
lines changed

toolchain/README.md

Lines changed: 22 additions & 0 deletions
@@ -243,6 +243,28 @@ then just build the abacus executable program by compiling it with `./build_abac

 The ELPA method needs more parameter settings, but it does not seem to be affected by the CUDA toolkit version, and there is no need to install and package it manually.

+Note: ELPA 2025.01.001 may fail to compile with NVIDIA GPU support on some machines with V100 GPUs and AMD CPUs, with error messages such as:
+```bash
+1872 | static __forceinline void CONCAT_8ARGS(hh_trafo_complex_kernel_,ROW_LENGTH,_,SIMD_SET,_,BLOCK,hv_,WORD_LENGTH) (DATA_TYPE_PTR q, DATA_TYPE_PTR hh, int nb, int ldq
+| ^~~~~~~~~~~~~~~~~~~~~~~~
+../src/elpa2/kernels/complex_128bit_256bit_512bit_BLOCK_template.c:51:47: note: in definition of macro 'CONCAT2_8ARGS'
+51 | #define CONCAT2_8ARGS(a, b, c, d, e, f, g, h) a ## b ## c ## d ## e ## f ## g ## h
+| ^
+../src/elpa2/kernels/complex_128bit_256bit_512bit_BLOCK_template.c:1872:27: note: in expansion of macro 'CONCAT_8ARGS'
+1872 | static __forceinline void CONCAT_8ARGS(hh_trafo_complex_kernel_,ROW_LENGTH,_,SIMD_SET,_,BLOCK,hv_,WORD_LENGTH) (DATA_TYPE_PTR q, DATA_TYPE_PTR hh, int nb, int ldq
+| ^~~~~~~~~~~~
+PPFC src/GPU/libelpa_openmp_private_la-mod_vendor_agnostic_general_layer.lo
+PPFC test/shared/GPU/libelpatest_openmp_la-test_gpu_vendor_agnostic_layer.lo
+../src/GPU/CUDA/./cudaFunctions_template.h(942): error: identifier "creal" is undefined
+double alpha_real = creal(alpha);
+^
+
+../src/GPU/CUDA/./cudaFunctions_template.h(960): error: identifier "creal" is undefined
+float alpha_real = creal(alpha);
+```
+
+In that case, you may need to switch the ELPA version to 2024.05.001 by editing `toolchain/scripts/stage3/install_elpa.sh`.
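
As a minimal sketch of that edit, the version pin could be switched with `sed`. It assumes the active `elpa_ver=` and `elpa_sha256=` assignments start at the beginning of a line in `toolchain/scripts/stage3/install_elpa.sh`, as the diff of that script suggests. The helper name `pin_elpa_version` is hypothetical and not part of the toolchain; the version and checksum are the values kept as comments in the script.

```shell
# Hypothetical helper (not part of the toolchain): pin the ELPA version
# and checksum by rewriting the active assignments in an install script.
pin_elpa_version() {
  script="$1"; ver="$2"; sha="$3"
  # Only assignments at the start of a line are rewritten, so the
  # commented-out "# elpa_ver=..." lines stay untouched.
  # A backup copy is written next to the script as "<script>.bak".
  sed -i.bak \
    -e "s|^elpa_ver=.*|elpa_ver=\"${ver}\"|" \
    -e "s|^elpa_sha256=.*|elpa_sha256=\"${sha}\"|" \
    "$script"
}

# Usage, run from the repository root:
# pin_elpa_version toolchain/scripts/stage3/install_elpa.sh \
#   2024.05.001 9caf41a3e600e2f6f4ce1931bd54185179dade9c171556d0c9b41bbc6940f2f6
```

After switching the pin, the toolchain script presumably has to be re-run so that ELPA is downloaded and rebuilt at the older version.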
+
 2. For the cusolvermp method, toolchain_*.sh does not need to be changed; just install the dependencies directly using `./toolchain_*.sh`, and then add
 ```shell
 -DUSE_CUDA=ON \

toolchain/scripts/stage3/install_elpa.sh

Lines changed: 1 addition & 2 deletions
@@ -12,6 +12,7 @@ SCRIPT_DIR="$(cd "$(dirname "$SCRIPT_NAME")/.." && pwd -P)"
 # From https://elpa.mpcdf.mpg.de/software/tarball-archive/ELPA_TARBALL_ARCHIVE.html
 # elpa_ver="2024.05.001"
 # elpa_sha256="9caf41a3e600e2f6f4ce1931bd54185179dade9c171556d0c9b41bbc6940f2f6"
+# newer versions of ELPA may have problems in GPU-ELPA compilation
 elpa_ver="2025.01.001"
 elpa_sha256="3ef0c6aed9a3e05db6efafe6e14d66eb88b2a1354d61e765b7cde0d3d5f3951e"

@@ -123,7 +124,6 @@ case "$with_elpa" in
 --with-cuda-path=${CUDA_PATH:-${CUDA_HOME:-/CUDA_HOME-notset}} \
 --enable-nvidia-gpu-kernels=$([ "$TARGET" = "nvidia" ] && echo "yes" || echo "no") \
 --with-NVIDIA-GPU-compute-capability=$([ "$TARGET" = "nvidia" ] && echo "sm_$ARCH_NUM" || echo "sm_70") \
-CUDA_CFLAGS="-std=c++14 -allow-unsupported-compiler" \
 OMPI_MCA_plm_rsh_agent=/bin/false \
 FC=${MPIFC} \
 CC=${MPICC} \
@@ -153,7 +153,6 @@ case "$with_elpa" in
 --enable-nvidia-gpu-kernels=$([ "$TARGET" = "nvidia" ] && echo "yes" || echo "no") \
 --with-cuda-path=${CUDA_PATH:-${CUDA_HOME:-/CUDA_HOME-notset}} \
 --with-NVIDIA-GPU-compute-capability=$([ "$TARGET" = "nvidia" ] && echo "sm_$ARCH_NUM" || echo "sm_70") \
-CUDA_CFLAGS="-std=c++14 -allow-unsupported-compiler" \
 FC=${MPIFC} \
 CC=${MPICC} \
 CXX=${MPICXX} \
