toolchain: document ELPA 2025.01.001 GPU build failure and drop the CUDA_CFLAGS override

QuantumMisaka · QuantumMisaka · commit 663197488785 · 2025-05-07T16:22:30.000+08:00
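
The README note in this commit asks users who hit the GPU build failure to fall back to ELPA 2024.05.001 by editing `toolchain/scripts/stage3/install_elpa.sh`. A minimal sketch of that edit follows, assuming it only requires changing the active `elpa_ver` and `elpa_sha256` assignments. The helper name `pin_elpa` is hypothetical and not part of the toolchain; the version and checksum in the usage comment are the ones already present as commented-out lines in the script.

```shell
# Hypothetical helper (not part of the toolchain): rewrite the active
# elpa_ver / elpa_sha256 assignments in an install script.
# Commented-out assignments ("# elpa_ver=...") are left untouched because
# the patterns are anchored at the start of the line.
pin_elpa() {
  script="$1"
  ver="$2"
  sha="$3"
  # -i.bak edits in place and keeps a backup copy next to the script.
  sed -i.bak \
    -e "s|^elpa_ver=.*|elpa_ver=\"${ver}\"|" \
    -e "s|^elpa_sha256=.*|elpa_sha256=\"${sha}\"|" \
    "$script"
}

# Usage, from the repository root:
# pin_elpa toolchain/scripts/stage3/install_elpa.sh 2024.05.001 \
#   9caf41a3e600e2f6f4ce1931bd54185179dade9c171556d0c9b41bbc6940f2f6
```

Editing the two lines by hand works equally well; the sketch only makes explicit that the version and its checksum must change together.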
diff --git a/toolchain/README.md b/toolchain/README.md
@@ -243,6 +243,28 @@ then just build the abacus executable program by compiling it with `./build_abac
 
 The ELPA method need more parameter setting, but it doesn't seem to be affected by the CUDA toolkits version, and it is no need to manually install and package. 
 
+Note: ELPA 2025.01.001 may fail to compile with NVIDIA GPU support on some machines that pair V100 GPUs with AMD CPUs. The error messages look like this:
+```bash
+ 1872 | static __forceinline void CONCAT_8ARGS(hh_trafo_complex_kernel_,ROW_LENGTH,_,SIMD_SET,_,BLOCK,hv_,WORD_LENGTH) (DATA_TYPE_PTR q, DATA_TYPE_PTR hh, int nb, int ldq
+      |                                        ^~~~~~~~~~~~~~~~~~~~~~~~
+../src/elpa2/kernels/complex_128bit_256bit_512bit_BLOCK_template.c:51:47: note: in definition of macro 'CONCAT2_8ARGS'
+   51 | #define CONCAT2_8ARGS(a, b, c, d, e, f, g, h) a ## b ## c ## d ## e ## f ## g ## h
+      |                                               ^
+../src/elpa2/kernels/complex_128bit_256bit_512bit_BLOCK_template.c:1872:27: note: in expansion of macro 'CONCAT_8ARGS'
+ 1872 | static __forceinline void CONCAT_8ARGS(hh_trafo_complex_kernel_,ROW_LENGTH,_,SIMD_SET,_,BLOCK,hv_,WORD_LENGTH) (DATA_TYPE_PTR q, DATA_TYPE_PTR hh, int nb, int ldq
+      |                           ^~~~~~~~~~~~
+  PPFC     src/GPU/libelpa_openmp_private_la-mod_vendor_agnostic_general_layer.lo
+  PPFC     test/shared/GPU/libelpatest_openmp_la-test_gpu_vendor_agnostic_layer.lo
+../src/GPU/CUDA/./cudaFunctions_template.h(942): error: identifier "creal" is undefined
+    double alpha_real = creal(alpha);
+                        ^
+
+../src/GPU/CUDA/./cudaFunctions_template.h(960): error: identifier "creal" is undefined
+    float alpha_real = creal(alpha);
+```
+
+If you hit these errors, you may need to switch back to ELPA 2024.05.001 by editing `elpa_ver` and `elpa_sha256` in `toolchain/scripts/stage3/install_elpa.sh`.
+
 2. For the cusolvermp method, toolchain_*.sh does not need to be changed, just follow it directly install dependencies using `./toolchain_*.sh`, and then add
 ```shell
 -DUSE_CUDA=ON \
diff --git a/toolchain/scripts/stage3/install_elpa.sh b/toolchain/scripts/stage3/install_elpa.sh
@@ -12,6 +12,7 @@ SCRIPT_DIR="$(cd "$(dirname "$SCRIPT_NAME")/.." && pwd -P)"
 # From https://elpa.mpcdf.mpg.de/software/tarball-archive/ELPA_TARBALL_ARCHIVE.html
 # elpa_ver="2024.05.001"
 # elpa_sha256="9caf41a3e600e2f6f4ce1931bd54185179dade9c171556d0c9b41bbc6940f2f6"
+# Newer ELPA versions may fail to compile with GPU support enabled
 elpa_ver="2025.01.001"
 elpa_sha256="3ef0c6aed9a3e05db6efafe6e14d66eb88b2a1354d61e765b7cde0d3d5f3951e"
 
@@ -123,7 +124,6 @@ case "$with_elpa" in
           --with-cuda-path=${CUDA_PATH:-${CUDA_HOME:-/CUDA_HOME-notset}} \
           --enable-nvidia-gpu-kernels=$([ "$TARGET" = "nvidia" ] && echo "yes" || echo "no") \
           --with-NVIDIA-GPU-compute-capability=$([ "$TARGET" = "nvidia" ] && echo "sm_$ARCH_NUM" || echo "sm_70") \
-          CUDA_CFLAGS="-std=c++14 -allow-unsupported-compiler" \
           OMPI_MCA_plm_rsh_agent=/bin/false \
           FC=${MPIFC} \
           CC=${MPICC} \
@@ -153,7 +153,6 @@ case "$with_elpa" in
           --enable-nvidia-gpu-kernels=$([ "$TARGET" = "nvidia" ] && echo "yes" || echo "no") \
           --with-cuda-path=${CUDA_PATH:-${CUDA_HOME:-/CUDA_HOME-notset}} \
           --with-NVIDIA-GPU-compute-capability=$([ "$TARGET" = "nvidia" ] && echo "sm_$ARCH_NUM" || echo "sm_70") \
-          CUDA_CFLAGS="-std=c++14 -allow-unsupported-compiler" \
           FC=${MPIFC} \
           CC=${MPICC} \
           CXX=${MPICXX} \