@tang070205 tang070205 commented Mar 16, 2025

The first is the cusolverMp method.

In CMakeLists.txt, the link libraries for the CAL and cusolverMp sections are now specified manually. Pass the library directory to cmake with -D CAL_CUSOLVERMP_PATH=/path/to/lib

Add the following options to build_abacus_gnu.sh:
-DUSE_CUDA=ON \
-DENABLE_CUSOLVERMP=ON \
-D CAL_CUSOLVERMP_PATH=/opt/nvidia/hpc_sdk/Linux_x86_64/24.11/math_libs/12.6/targets/x86_64-linux/lib
CAL_CUSOLVERMP_PATH must be set to match your own environment.
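
Before passing CAL_CUSOLVERMP_PATH to cmake, it may help to confirm that the directory really holds both libraries. A minimal sketch, assuming the shared objects are named libcal.so and libcusolverMp.so (an assumption, so check your installation); the helper name check_cusolvermp_path is hypothetical and not part of ABACUS:

```shell
# Hypothetical helper: verify that a directory contains the libraries
# that CAL_CUSOLVERMP_PATH is expected to point at.
# The file names libcal.so and libcusolverMp.so are assumptions.
check_cusolvermp_path() {
    dir="$1"
    missing=0
    for lib in libcal.so libcusolverMp.so; do
        if [ ! -e "$dir/$lib" ]; then
            echo "missing: $dir/$lib" >&2
            missing=1
        fi
    done
    # Return 0 only when every expected library was found.
    return "$missing"
}
```

For example, `check_cusolvermp_path /path/to/lib && echo "path looks usable"` prints the message only when both files are present.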

Next, set up the environment for CAL and HPC-X, either by adding the following lines to ~/.bashrc or by putting them in a separate env.sh:
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/nvidia/hpc_sdk/Linux_x86_64/24.11/comm_libs/12.6/hpcx/hpcx-2.20/ucc/lib
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/nvidia/hpc_sdk/Linux_x86_64/24.11/comm_libs/12.6/hpcx/hpcx-2.20/ucx/lib
export CPATH=$CPATH:/opt/nvidia/hpc_sdk/Linux_x86_64/24.11/math_libs/12.6/targets/x86_64-linux/include
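
Because the SDK version and install prefix differ between machines, the same three exports could be derived from a couple of variables. A minimal env.sh sketch under that assumption; the variable names NVHPC_ROOT, CUDA_VER, HPCX_DIR and MATH_DIR are hypothetical, and the defaults simply mirror the paths quoted above:

```shell
# env.sh (sketch): source this file before building or running ABACUS.
# Variable names here are illustrative; the defaults mirror the paths above.
NVHPC_ROOT="${NVHPC_ROOT:-/opt/nvidia/hpc_sdk/Linux_x86_64/24.11}"
CUDA_VER="${CUDA_VER:-12.6}"
HPCX_DIR="$NVHPC_ROOT/comm_libs/$CUDA_VER/hpcx/hpcx-2.20"
MATH_DIR="$NVHPC_ROOT/math_libs/$CUDA_VER/targets/x86_64-linux"

# ${VAR:+$VAR:} avoids a leading ':' when the variable is empty or unset.
export LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}$HPCX_DIR/ucc/lib:$HPCX_DIR/ucx/lib"
export CPATH="${CPATH:+$CPATH:}$MATH_DIR/include"
```

Overriding NVHPC_ROOT or CUDA_VER before sourcing the file adapts all three paths at once.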

The second is the ELPA method

Set CUDA_PATH and add the following options to toolchain_gnu.sh:
export CUDA_PATH=/usr/local/cuda
--enable-cuda \
--gpu-ver=89 \
The 40-series entry used here is newly added to install_abacus_toolchain.sh and corresponds to sm_89.

With either of the two methods above, ABACUS compiles successfully using build_abacus_gnu.sh.
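
The value 89 is the CUDA compute capability 8.9 with the dot removed (sm_89). A minimal sketch of that conversion, using a hypothetical helper name cc_to_gpu_ver; the nvidia-smi query shown in the comment assumes a reasonably recent driver, since older drivers may not support the compute_cap field:

```shell
# Hypothetical helper: turn a compute capability such as "8.9"
# into the integer form used by --gpu-ver (e.g. 89).
cc_to_gpu_ver() {
    printf '%s\n' "$1" | tr -d '.'
}

# Example (needs an NVIDIA driver that supports compute_cap; not run here):
#   cc=$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader | head -n 1)
#   echo "--gpu-ver=$(cc_to_gpu_ver "$cc")"
```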

@QuantumMisaka QuantumMisaka self-assigned this Mar 16, 2025
@mohanchen mohanchen requested a review from QuantumMisaka March 17, 2025 06:33
@mohanchen mohanchen added the Compile & CICD & Docs & Dependencies Issues related to compiling ABACUS label Mar 17, 2025
@mohanchen mohanchen requested a review from dzzz2001 March 17, 2025 06:35
@QuantumMisaka QuantumMisaka mentioned this pull request Mar 19, 2025
10 tasks
@QuantumMisaka
Collaborator

QuantumMisaka commented Mar 19, 2025

I'll submit a modification. After that, ELPA-GPU compilation should work smoothly in any environment with CUDA > 11.6. The installation and usage of cusolverMp need more development.

QuantumMisaka and others added 5 commits March 19, 2025 17:16
- ELPA compiler flags modification
- GPU_VER setting modification: the user should specify the GPU compute capability number rather than the GPU name
- Modify toolchain_[gnu,intel].sh and build_abacus_[gnu,intel].sh to use the above modification
@QuantumMisaka
Collaborator

We will try to add cusolvermp installation & compilation inside toolchain

Update: cusolverMp and its dependencies (UCC/UCX/libcal) can be deployed in several ways (through the NVHPC SDK or as individual packages), and different versions of these packages may use different paths, which makes linking them automatically very difficult. Moreover, for multi-GPU calculations, simply installing the HPC SDK automatically is not enough: UCC, UCX and cusolverMp need to be compiled and deployed according to the server configuration. We will therefore not add automatic download or linking to the toolchain for now. Instead, we will add a simple HPC SDK deployment method to the README as a reference for users who want to install and use cusolverMp themselves.

Collaborator

@QuantumMisaka QuantumMisaka left a comment

All tests passed. cusolverMp is too complicated to be incorporated.

@QuantumMisaka
Collaborator

@mohanchen @dzzz2001 @goodchong We think this version of the toolchain is sufficient for now for a simple GPU-LCAO installation with CUDA and ELPA.

@mohanchen
Collaborator

LGTM

@mohanchen mohanchen added the GPU & DCU & HPC GPU and DCU and HPC related any issues label Mar 22, 2025
@mohanchen mohanchen merged commit 76af832 into deepmodeling:develop Mar 22, 2025
14 checks passed
Fisherd99 pushed a commit to Fisherd99/abacus-BSE that referenced this pull request Mar 31, 2025
…eepmodeling#6014)

* Add optional LCAO base GPU versions supported by cusolvermp

* Add optional LCAO base GPU versions supported by elpa

* Add optional LCAO base GPU versions supported by elpa

* Add L40S as GPUVER value for sm_89 architecture

* Delete a few lines of content to enable Nvidia to compile

* Add a specified Fortran mpi compiler for elpa to use

* Add CUDA path for use by ELPA-GPU

* Add optional LCAO base GPU versions supported by elpa

* Modify a small issue

* Change to manually specifying the link libraries for CAL and cusolverMp

* Add the use of 'cusolvermp' or 'elpa' methods to compile ABACUS GPU-LCAO

* Add the use of 'cusolvermp' or 'elpa' methods to compile ABACUS GPU-LCAO

* Add the use of 'cusolvermp' or 'elpa' methods to compile ABACUS GPU-LCAO

* Add modification
- ELPA compiler flags modification
- GPU_VER setting modification: the user should specify the GPU compute capability number rather than the GPU name
- Modify toolchain_[gnu,intel].sh and build_abacus_[gnu,intel].sh to use the above modification

* minor adjustment

* update README

* give back cmake default option

* update README and cusolvermp

* Update README.md

---------

Co-authored-by: JamesMisaka
dyzheng pushed a commit that referenced this pull request Apr 1, 2025
…6014)
