Commit 76af832
Add two LCAO base group GPU version compilation options in toolchain (#6014)
Squashed commit messages:

- Add optional LCAO-basis GPU versions supported by cusolvermp
- Add optional LCAO-basis GPU versions supported by elpa
- Add L40S as a GPUVER value for the sm_89 architecture
- Delete a few lines of content to enable Nvidia to compile
- Add a specified Fortran MPI compiler for elpa to use
- Add a CUDA path for use by ELPA-GPU
- Modify a small issue
- Change to manually specifying the link libraries for CAL and cusolverMp
- Add the use of the 'cusolvermp' or 'elpa' methods to compile ABACUS GPU-LCAO
- ELPA compiler flags modification
- GPU_VER setting modification: the user should specify the GPU compatibility number, not the GPU name
- Modify toolchain_[gnu,intel].sh and build_abacus_[gnu,intel].sh to use the above modifications
- Minor adjustment
- Update README and cusolvermp
- Give back the cmake default option

Co-authored-by: JamesMisaka <[email protected]>
1 parent 65171f9 · commit 76af832

12 files changed: +145 −99 lines

CMakeLists.txt

Lines changed: 12 additions & 2 deletions
@@ -352,9 +352,19 @@ if(USE_CUDA)
   endif()
   if (ENABLE_CUSOLVERMP)
     add_compile_definitions(__CUSOLVERMP)
+    find_library(CAL_LIBRARY
+      NAMES cal
+      PATHS ${CAL_CUSOLVERMP_PATH}
+      NO_DEFAULT_PATH
+    )
+    find_library(CUSOLVERMP_LIBRARY
+      NAMES cusolverMp
+      PATHS ${CAL_CUSOLVERMP_PATH}
+      NO_DEFAULT_PATH
+    )
     target_link_libraries(${ABACUS_BIN_NAME}
-      cal
-      cusolverMp
+      ${CAL_LIBRARY}
+      ${CUSOLVERMP_LIBRARY}
     )
   endif()
 endif()
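The switch from the bare `cal` / `cusolverMp` names to `find_library` with `NO_DEFAULT_PATH` means the directory passed as `CAL_CUSOLVERMP_PATH` must itself contain both shared libraries. A quick pre-flight check, as a sketch with a hypothetical staging path (not part of the commit):

```shell
# Hypothetical directory standing in for the real CAL_CUSOLVERMP_PATH
# (e.g. .../math_libs/<ver>/targets/x86_64-linux/lib)
CAL_CUSOLVERMP_PATH="/tmp/fake_mathlibs"
mkdir -p "$CAL_CUSOLVERMP_PATH"
touch "$CAL_CUSOLVERMP_PATH/libcal.so" "$CAL_CUSOLVERMP_PATH/libcusolverMp.so"

# find_library(... NO_DEFAULT_PATH) searches only this directory,
# so both libraries must be present in exactly this location
for lib in libcal.so libcusolverMp.so; do
  if [ -f "$CAL_CUSOLVERMP_PATH/$lib" ]; then
    echo "found $lib"
  else
    echo "missing $lib in $CAL_CUSOLVERMP_PATH" >&2
  fi
done
```

If either library is missing, `find_library` leaves its result variable as `*-NOTFOUND` and CMake reports the problem at configure time instead of at link time, which is the point of the change.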

toolchain/README.md

Lines changed: 76 additions & 22 deletions
@@ -2,7 +2,7 @@
 
 Version 2025.1
 
-## Author
+## Main Developer
 
 [QuantumMisaka](https://github.com/QuantumMisaka)
 (Zhaoqing Liu) @PKU @AISI
@@ -26,8 +26,9 @@ and give setup files that you can use to compile ABACUS.
 - [x] Automatic installation of [CEREAL](https://github.com/USCiLab/cereal) and [LIBNPY](https://github.com/llohse/libnpy) (by github.com)
 - [x] Support for [LibRI](https://github.com/abacusmodeling/LibRI) by submodule or automatic installation from github.com (but LibRI installed via `wget` seems to have some problems, please be cautious)
 - [x] A mirror station by the Bohrium database, which can download CEREAL, LibNPY, LibRI and LibComm via `wget` on the China Internet.
-- [x] Support for GPU compilation, users can add `-DUSE_CUDA=1` in builder scripts.
+- [x] Support for GPU-PW and GPU-LCAO compilation (elpa supported; cusolvermp in development); `-DUSE_CUDA=1` is needed in builder scripts.
 - [x] Support for AMD compilers and the math libs `AOCL` and `AOCC` (not fully complete due to flang and AOCC-ABACUS compilation errors)
+- [ ] Support for more GPU devices beyond Nvidia.
 - [ ] Change the download url from the cp2k mirror to another mirror, or download directly from the official website. (doing)
 - [ ] Support a JSON or YAML configuration file for the toolchain, which can be easily modified by users.
 - [ ] A better README and details markdown file.
@@ -138,7 +139,9 @@ Dependencies below are optional, which are NOT installed by default:
 - `LibComm` 0.1.1
 
 Users can install them by using `--with-*=install` in toolchain*.sh, which is `no` by default. Users can also specify the absolute path of a package via `--with-*=path/to/package` in toolchain*.sh to let the toolchain use that package.
-> Notice: LibRI, LibComm and Libnpy is on actively development, you should check-out the package version when using this toolchain. Also, LibRI and LibComm can be installed by github submodule, that is also work for libnpy, which is more recommended.
+> Notice: LibTorch always suffers from GLIBC_VERSION problems; if you encounter one, please downgrade the LibTorch version to 1.12.1 in scripts/stage4/install_torch.sh
+>
+> Notice: LibRI, LibComm, Rapidjson and Libnpy are under active development; you should check the package version when using this toolchain.
 
 Users can easily compile and install dependencies of ABACUS
 by running these scripts after loading `gcc` or `intel-mkl-mpi`
@@ -187,6 +190,74 @@ or you can also do it in a more completely way:
 > rm -rf install build/*/* build/OpenBLAS*/ build/setup_*
 ```
 
+## GPU version of ABACUS
+
+The toolchain supports compiling the GPU version of ABACUS with Nvidia GPUs and CUDA. To use it, add the following options in build*.sh:
+
+```shell
+# in build_abacus_gnu.sh
+cmake -B $BUILD_DIR -DCMAKE_INSTALL_PREFIX=$PREFIX \
+        -DCMAKE_CXX_COMPILER=g++ \
+        -DMPI_CXX_COMPILER=mpicxx \
+        ......
+        -DUSE_CUDA=ON \
+        # -DCMAKE_CUDA_COMPILER=${path to cuda toolkit}/bin/nvcc \ # add if needed
+        ......
+# in build_abacus_intel.sh
+cmake -B $BUILD_DIR -DCMAKE_INSTALL_PREFIX=$PREFIX \
+        -DCMAKE_CXX_COMPILER=icpc \
+        -DMPI_CXX_COMPILER=mpiicpc \
+        ......
+        -DUSE_CUDA=ON \
+        # -DCMAKE_CUDA_COMPILER=${path to cuda toolkit}/bin/nvcc \ # add if needed
+        ......
+```
+
+This enables the GPU version of ABACUS, and the `ks_solver cusolver` method can be used directly for PW and LCAO calculations.
+
+Notice: You CANNOT use the `icpx` compiler for the GPU version of ABACUS for now; see the discussions in [#2906](https://github.com/deepmodeling/abacus-develop/issues/2906) and [#4976](https://github.com/deepmodeling/abacus-develop/issues/4976)
+
+If you want to use ABACUS GPU-LCAO via `cusolvermp` or `elpa` for multi-GPU calculations, please compile as follows:
+
+1. For the elpa method, add
+```shell
+export CUDA_PATH=/path/to/CUDA
+# install_abacus_toolchain.sh part options
+--enable-cuda \
+--gpu-ver=(GPU-compatibility-number) \
+```
+to `toolchain_*.sh`, and then follow the normal steps to install the dependencies via `./toolchain_*.sh`. To check your GPU compatibility number, refer to [CUDA compatibility](https://developer.nvidia.com/cuda-gpus).
+
+Afterwards, make sure these options are enabled in your `build_abacus_*.sh` script
+```shell
+-DUSE_ELPA=ON \
+-DUSE_CUDA=ON \
+```
+then build the abacus executable by running `./build_abacus_*.sh`.
+
+The ELPA method needs more parameter settings, but it does not seem to be affected by the CUDA toolkit version, and no manual installation or packaging is needed.
+
+2. For the cusolvermp method, toolchain_*.sh does not need to be changed; install the dependencies directly via `./toolchain_*.sh`, and then add
+```shell
+-DUSE_CUDA=ON \
+-DENABLE_CUSOLVERMP=ON \
+-D CAL_CUSOLVERMP_PATH=/path/to/math_libs/1x.x/targets/x86_64-linux/lib \
+```
+to the `build_abacus_*.sh` file. Add the following three items to the environment (assuming you are using hpcsdk):
+```shell
+export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/path/to/comm_libs/1x.x/hpcx/hpcx-x.xx/ucc/lib
+export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/path/to/comm_libs/1x.x/hpcx/hpcx-x.xx/ucx/lib
+export CPATH=$CPATH:/path/to/math_libs/1x.x/targets/x86_64-linux/include
+```
+Then build the abacus executable by running `./build_abacus_*.sh`.
+
+You can refer to the linked video for help with compilation and installation: [Bilibili](https://www.bilibili.com/video/BV1eqr5YuETN/).
+
+cusolverMp can be installed from package sources such as apt or yum, which is suitable for containers or local machines.
+The second choice is installing it from the [NVIDIA HPC_SDK](https://developer.nvidia.com/hpc-sdk-downloads), which is relatively simple, but the packages from the NVIDIA HPC_SDK may not be suitable, especially for multi-GPU parallel runs. To make better use of cusolvermp and its dependencies (libcal, ucx, ucc) in multi-GPU runs, please contact your server manager.
+
+After compiling, you can specify `device GPU` in the INPUT file to use the GPU version of ABACUS.
+
 ## Common Problems and Solutions
 
 ### Intel-oneAPI problem
@@ -215,7 +286,7 @@ wget https://registrationcenter-download.intel.com/akdlm/IRC_NAS/0722521a-34b5-4
 
 Related discussion here: [#4976](https://github.com/deepmodeling/abacus-develop/issues/4976)
 
-#### link problem in early 2023 version oneAPI
+#### linking problem in early 2023 versions of oneAPI
 
 Sometimes Intel-oneAPI has problems linking `mpirun`,
 which always shows up with the 2023.2.0 version of MPI in Intel-oneAPI.
@@ -253,23 +324,6 @@ git clone https://github.com/abacusmodeling/LibComm
 
 OpenMPI version 5 has huge updates that lead to compatibility problems. If one wants to use OpenMPI version 4 (4.1.6), one can specify `--with-openmpi-4th=yes` in *toolchain_gnu.sh*
 
-### GPU version of ABACUS
-
-For GPU version of ABACUS (do not GPU version installer of ELPA, which is still doing work), add following options in build*.sh:
-
-```shell
-cmake -B $BUILD_DIR -DCMAKE_INSTALL_PREFIX=$PREFIX \
-        -DCMAKE_CXX_COMPILER=icpx \
-        -DMPI_CXX_COMPILER=mpiicpc \
-        ......
-        -DUSE_CUDA=1 \
-        -DCMAKE_CUDA_COMPILER=${path to cuda toolkit}/bin/nvcc \
-        ......
-```
-
-Notice: You CANNOT use `icpx` compiler for GPU version of ABACUS for now, see discussion here [#2906](https://github.com/deepmodeling/abacus-develop/issues/2906) and [#4976](https://github.com/deepmodeling/abacus-develop/issues/4976)
-
-If you wants to use ABACUS GPU-LCAO by `cusolvermp` or `elpa`, please contact the coresponding developer, toolchain do not fully support them now.
 
 ### Shell problem
@@ -325,4 +379,4 @@ of each packages, which may let the installation more fiexible.
 
 ## More
 
-More information can be read from `Details.md`.
+More information can be read from `Details.md`.
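The updated README asks users for a GPU compatibility number rather than a GPU name. On reasonably new NVIDIA drivers this number can be queried directly; the snippet below is a convenience sketch (the `compute_cap` query field is not available on older drivers, hence the fallback):

```shell
if command -v nvidia-smi >/dev/null 2>&1; then
  # prints e.g. "8.9" for an RTX 4090 or L40S, i.e. --gpu-ver=8.9
  msg=$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader | head -n1)
else
  # no driver available: fall back to the official lookup table
  msg="no nvidia-smi; see https://developer.nvidia.com/cuda-gpus"
fi
echo "$msg"
```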

toolchain/build_abacus_gnu-aocl.sh

Lines changed: 1 addition & 1 deletion
@@ -18,7 +18,7 @@ cd $ABACUS_DIR
 ABACUS_DIR=$(pwd)
 #AOCLhome=/opt/aocl # user can specify this parameter
 
-BUILD_DIR=build_abacus_gnu
+BUILD_DIR=build_abacus_aocl
 rm -rf $BUILD_DIR
 
 PREFIX=$ABACUS_DIR

toolchain/build_abacus_gnu.sh

Lines changed: 5 additions & 1 deletion
@@ -24,6 +24,7 @@ PREFIX=$ABACUS_DIR
 LAPACK=$INSTALL_DIR/openblas-0.3.28/lib
 SCALAPACK=$INSTALL_DIR/scalapack-2.2.1/lib
 ELPA=$INSTALL_DIR/elpa-2025.01.001/cpu
+# ELPA=$INSTALL_DIR/elpa-2025.01.001/nvidia # for gpu-lcao
 FFTW3=$INSTALL_DIR/fftw-3.3.10
 CEREAL=$INSTALL_DIR/cereal-1.3.2/include/cereal
 LIBXC=$INSTALL_DIR/libxc-7.0.0
@@ -49,13 +50,16 @@ cmake -B $BUILD_DIR -DCMAKE_INSTALL_PREFIX=$PREFIX \
         -DUSE_ELPA=ON \
         -DENABLE_RAPIDJSON=ON \
         -DRapidJSON_DIR=$RAPIDJSON \
+        # -DUSE_CUDA=ON \
         # -DENABLE_DEEPKS=1 \
         # -DTorch_DIR=$LIBTORCH \
         # -Dlibnpy_INCLUDE_DIR=$LIBNPY \
         # -DENABLE_LIBRI=ON \
         # -DLIBRI_DIR=$LIBRI \
         # -DLIBCOMM_DIR=$LIBCOMM \
         # -DDeePMD_DIR=$DEEPMD \
+        # -DENABLE_CUSOLVERMP=ON \
+        # -D CAL_CUSOLVERMP_PATH=/opt/nvidia/hpc_sdk/Linux_x86_64/2x.xx/math_libs/1x.x/targets/x86_64-linux/lib
 
 # # add mkl env for libtorch to link
 # if one want to install libtorch, mkl should be load in build process
@@ -81,4 +85,4 @@ Done!
 To use the installed ABACUS version
 You need to source ${TOOL}/abacus_env.sh first !
 """
-EOF
+EOF
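The commented `nvidia` ELPA path added above suggests the toolchain installs the CPU and GPU builds of ELPA side by side, so switching the script to GPU-LCAO could be driven by a single variable instead of editing the assignment by hand. A small sketch (the `GPU_LCAO` switch and install prefix are hypothetical, not part of the script):

```shell
INSTALL_DIR="/tmp/install"   # hypothetical install prefix
GPU_LCAO="yes"               # hypothetical switch: "yes" -> GPU build of ELPA

# pick the ELPA kernel directory by selecting the cpu/ or nvidia/ subtree
if [ "$GPU_LCAO" = "yes" ]; then
  ELPA="$INSTALL_DIR/elpa-2025.01.001/nvidia"
else
  ELPA="$INSTALL_DIR/elpa-2025.01.001/cpu"
fi
echo "$ELPA"
```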

toolchain/build_abacus_intel.sh

Lines changed: 4 additions & 2 deletions
@@ -23,6 +23,7 @@ rm -rf $BUILD_DIR
 
 PREFIX=$ABACUS_DIR
 ELPA=$INSTALL_DIR/elpa-2025.01.001/cpu
+# ELPA=$INSTALL_DIR/elpa-2025.01.001/nvidia # for gpu-lcao
 CEREAL=$INSTALL_DIR/cereal-1.3.2/include/cereal
 LIBXC=$INSTALL_DIR/libxc-7.0.0
 RAPIDJSON=$INSTALL_DIR/rapidjson-1.1.0/
@@ -32,7 +33,7 @@ RAPIDJSON=$INSTALL_DIR/rapidjson-1.1.0/
 # LIBCOMM=$INSTALL_DIR/LibComm-0.1.1
 # DEEPMD=$HOME/apps/anaconda3/envs/deepmd # v3.0 might have problem
 
-# if use deepks and deepmd
+# Notice: if you are compiling the AMD-CPU or GPU version of ABACUS, the `icpc` and `mpiicpc` compilers are recommended
 cmake -B $BUILD_DIR -DCMAKE_INSTALL_PREFIX=$PREFIX \
         -DCMAKE_CXX_COMPILER=icpx \
         -DMPI_CXX_COMPILER=mpiicpx \
@@ -46,6 +47,7 @@ cmake -B $BUILD_DIR -DCMAKE_INSTALL_PREFIX=$PREFIX \
         -DUSE_ELPA=ON \
         -DENABLE_RAPIDJSON=ON \
         -DRapidJSON_DIR=$RAPIDJSON \
+        # -DUSE_CUDA=ON \
         # -DENABLE_DEEPKS=1 \
         # -DTorch_DIR=$LIBTORCH \
         # -Dlibnpy_INCLUDE_DIR=$LIBNPY \
@@ -74,4 +76,4 @@ Done!
 To use the installed ABACUS version
 You need to source ${TOOL}/abacus_env.sh first !
 """
-EOF
+EOF
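The new comment recommends `icpc`/`mpiicpc` for AMD-CPU and GPU builds, but these classic compilers are no longer shipped in recent oneAPI releases, so it is worth checking availability before switching the `-DCMAKE_CXX_COMPILER`/`-DMPI_CXX_COMPILER` lines. A quick availability sketch (not part of the script):

```shell
# probe which Intel C++ compiler drivers are on PATH
summary=""
for c in icpc mpiicpc icpx mpiicpx; do
  if command -v "$c" >/dev/null 2>&1; then
    summary="$summary $c:yes"
  else
    summary="$summary $c:no"
  fi
done
echo "$summary"
```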

toolchain/install_abacus_toolchain.sh

Lines changed: 19 additions & 55 deletions
@@ -328,7 +328,7 @@ export intel_classic="no"
 # and will lead to problem in force calculation
 # but icx is recommended by intel compiler
 # option: --with-intel-classic can change it to yes/no
-# JamesMisaka by 2023.08
+# QuantumMisaka by 2023.08
 export intelmpi_classic="no"
 export with_ifx="yes" # whether ifx is used in oneapi
 export with_flang="no" # whether flang is used in aocc
@@ -397,7 +397,7 @@ while [ $# -ge 1 ]; do
         eval with_${ii}="__INSTALL__"
       fi
     done
-    # I'd like to use OpenMPI as default -- zhaoqing liu in 2023.09.17
+    # I'd like to use OpenMPI as default -- QuantumMisaka in 2023.09.17
     export MPI_MODE="openmpi"
     ;;
   --mpi-mode=*)
@@ -448,16 +448,7 @@ while [ $# -ge 1 ]; do
     ;;
   --gpu-ver=*)
     user_input="${1#*=}"
-    case "${user_input}" in
-      K20X | K40 | K80 | P100 | V100 | A100 | Mi50 | Mi100 | Mi250 | no)
-        export GPUVER="${user_input}"
-        ;;
-      *)
-        report_error ${LINENO} \
-          "--gpu-ver currently only supports K20X, K40, K80, P100, V100, A100, Mi50, Mi100, Mi250, and no as options"
-        exit 1
-        ;;
-    esac
+    export GPUVER="${user_input}"
     ;;
   --target-cpu=*)
     user_input="${1#*=}"
@@ -684,7 +675,7 @@ else
   esac
 fi
 # If MATH_MODE is mkl, then openblas, scalapack and fftw are not needed
-# zhaoqing in 2023-09-17
+# QuantumMisaka in 2023-09-17
 if [ "${MATH_MODE}" = "mkl" ]; then
   if [ "${with_openblas}" != "__DONTUSE__" ]; then
     echo "Using MKL, so openblas is disabled."
@@ -700,6 +691,17 @@ if [ "${MATH_MODE}" = "mkl" ]; then
   fi
 fi
 
+# Select the correct compute number based on the GPU architecture
+# QuantumMisaka in 2025-03-19
+export ARCH_NUM="${GPUVER//.}"
+if [[ "$ARCH_NUM" =~ ^[1-9][0-9]*$ ]] || [ "$ARCH_NUM" = "no" ]; then
+  echo "Notice: GPU compilation is enabled, and GPU compatibility is set via --gpu-ver to sm_${ARCH_NUM}."
+else
+  report_error ${LINENO} \
+    "When GPU compilation is enabled, the --gpu-ver variable should be set according to the GPU compatibility number. To check your GPU compatibility, visit https://developer.nvidia.com/cuda-gpus. For example: A100 -> 8.0 (or 80), V100 -> 7.0 (or 70), 4090 -> 8.9 (or 89)"
+  exit 1
+fi
+
 # If CUDA or HIP are enabled, make sure the GPU version has been defined.
 if [ "${ENABLE_CUDA}" = "__TRUE__" ] || [ "${ENABLE_HIP}" = "__TRUE__" ]; then
   if [ "${GPUVER}" = "no" ]; then
@@ -708,9 +710,10 @@ if [ "${ENABLE_CUDA}" = "__TRUE__" ] || [ "${ENABLE_HIP}" = "__TRUE__" ]; then
   fi
 fi
 
-# several packages require cmake.
-if [ "${with_scalapack}" = "__INSTALL__" ]; then
-  [ "${with_cmake}" = "__DONTUSE__" ] && with_cmake="__INSTALL__"
+# ABACUS itself and some dependencies require cmake.
+if [ "${with_cmake}" = "__DONTUSE__" ]; then
+  report_error "CMake is required for ABACUS and some dependencies. Please enable it."
+  exit 1
 fi
 
@@ -816,45 +819,6 @@ fi
 
 echo "Compiling with $(get_nprocs) processes for target ${TARGET_CPU}."
 
-# Select the correct compute number based on the GPU architecture
-case ${GPUVER} in
-  K20X)
-    export ARCH_NUM="35"
-    ;;
-  K40)
-    export ARCH_NUM="35"
-    ;;
-  K80)
-    export ARCH_NUM="37"
-    ;;
-  P100)
-    export ARCH_NUM="60"
-    ;;
-  V100)
-    export ARCH_NUM="70"
-    ;;
-  A100)
-    export ARCH_NUM="80"
-    ;;
-  Mi50)
-    # TODO: export ARCH_NUM=
-    ;;
-  Mi100)
-    # TODO: export ARCH_NUM=
-    ;;
-  Mi250)
-    # TODO: export ARCH_NUM=
-    ;;
-  no)
-    export ARCH_NUM="no"
-    ;;
-  *)
-    report_error ${LINENO} \
-      "--gpu-ver currently only supports K20X, K40, K80, P100, V100, A100, Mi50, Mi100, Mi250, and no as options"
-    exit 1
-    ;;
-esac
-
 write_toolchain_env ${INSTALLDIR}
 
 # write toolchain config
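The net effect of the `--gpu-ver` changes in this file: instead of a fixed name-to-number table, any compute-capability value now passes through, with dots stripped to form the `sm_` architecture suffix. A standalone sketch of the accepted forms (mirrors the new validation logic; requires bash for `[[ =~ ]]`):

```shell
# "8.9" and "89" both map to sm_89; "no" disables GPU compilation
for GPUVER in 8.9 89 7.0 no; do
  ARCH_NUM="${GPUVER//.}"        # strip dots: "8.9" -> "89"
  if [[ "$ARCH_NUM" =~ ^[1-9][0-9]*$ ]]; then
    echo "--gpu-ver=$GPUVER -> sm_$ARCH_NUM"
  elif [ "$ARCH_NUM" = "no" ]; then
    echo "--gpu-ver=no -> GPU disabled"
  else
    echo "--gpu-ver=$GPUVER -> rejected" >&2
  fi
done
# prints:
# --gpu-ver=8.9 -> sm_89
# --gpu-ver=89 -> sm_89
# --gpu-ver=7.0 -> sm_70
# --gpu-ver=no -> GPU disabled
```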
