Commit 29b3abc

Fix: fix cusolvermp compiling error with icpc and update ks_solver doc (#5196)
* fix compilation error of icpc
* add cusolvermp
* update ks_solver related doc
1 parent 6b116be commit 29b3abc

4 files changed, 16 additions and 4 deletions

docs/advanced/acceleration/cuda.md

Lines changed: 1 addition & 1 deletion
@@ -36,7 +36,7 @@ The ABACUS program will automatically determine whether the current ELPA support
 ## Run with the GPU support by editing the INPUT script:
 
 In `INPUT` file we need to set the input parameter [device](../input_files/input-main.md#device) to `gpu`. If this parameter is not set, ABACUS will try to determine if there are available GPUs.
-- Set `ks_solver`: For the PW basis, CG, BPCG and Davidson methods are supported on GPU; set the input parameter [ks_solver](../input_files/input-main.md#ks_solver) to `cg`, `bpcg` or `dav`. For the LCAO basis, `cusolver` and `elpa` is supported on GPU.
+- Set `ks_solver`: For the PW basis, CG, BPCG and Davidson methods are supported on GPU; set the input parameter [ks_solver](../input_files/input-main.md#ks_solver) to `cg`, `bpcg` or `dav`. For the LCAO basis, `cusolver`, `cusolvermp` and `elpa` is supported on GPU.
 - **multi-card**: ABACUS allows for multi-GPU acceleration. If you have multiple GPU cards, you can run ABACUS with several MPI processes, and each process will utilize one GPU card. For example, the command `mpirun -n 2 abacus` will by default launch two GPUs for computation. If you only have one card, this command will only start one GPU.
 
 ## Examples
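
The **multi-card** paragraph in this hunk says that each MPI process uses one GPU, and that two processes on a single-card machine still start only one GPU. A minimal sketch of a mapping with that behavior is shown below. The helper `device_for_rank` is hypothetical and the round-robin assignment is an assumption for illustration; it is not taken from the ABACUS source.

```cpp
#include <cassert>

// Hypothetical helper (not ABACUS code): choose a GPU index for an MPI rank,
// ASSUMING ranks are spread round-robin over the visible devices.
int device_for_rank(int mpi_rank, int num_devices)
{
    assert(num_devices > 0); // at least one GPU must be visible
    return mpi_rank % num_devices;
}
```

Under this assumption, `mpirun -n 2` on a two-card node maps ranks 0 and 1 to devices 0 and 1, while on a one-card node both ranks map to device 0.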

docs/advanced/input_files/input-main.md

Lines changed: 9 additions & 1 deletion
@@ -933,6 +933,8 @@ calculations.
 - **genelpa**: This method should be used if you choose localized orbitals.
 - **scalapack_gvx**: Scalapack can also be used for localized orbitals.
 - **cusolver**: This method needs building with CUDA and at least one gpu is available.
+- **cusolvermp**: This method supports multi-GPU acceleration and needs building with CUDA。 Note that when using cusolvermp, you should set the number of MPI processes to be equal to the number of GPUs.
+- **elpa**: The ELPA solver supports both CPU and GPU. By setting the `device` to GPU, you can launch the ELPA solver with GPU acceleration (provided that you have installed a GPU-supported version of ELPA, which requires you to manually compile and install ELPA, and the ABACUS should be compiled with -DUSE_ELPA=ON and -DUSE_CUDA=ON). The ELPA solver also supports multi-GPU acceleration.
 
 If you set ks_solver=`genelpa` for basis_type=`pw`, the program will be stopped with an error message:
 
@@ -941,7 +943,13 @@ calculations.
 ```
 
 Then the user has to correct the input file and restart the calculation.
-- **Default**: cg (plane-wave basis), or genelpa (localized atomic orbital basis, if compiling option `USE_ELPA` has been set),lapack (localized atomic orbital basis, if compiling option `ENABLE_MPI` has not been set), scalapack_gvx, (localized atomic orbital basis, if compiling option `USE_ELPA` has not been set and if compiling option `ENABLE_MPI` has been set)
+- **Default**:
+  - **PW basis**: cg.
+  - **LCAO basis**:
+    - genelpa (if compiling option `USE_ELPA` has been set)
+    - lapack (if compiling option `ENABLE_MPI` has not been set)
+    - scalapack_gvx (if compiling option `USE_ELPA` has not been set and compiling option `ENABLE_MPI` has been set)
+    - cusolver (if compiling option `USE_CUDA` has been set)
 
 ### nbands
 
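
The new **cusolvermp** entry requires the number of MPI processes to equal the number of GPUs. A pre-flight check for that rule could look roughly like the sketch below. The function `process_count_ok` and its parameters are hypothetical names introduced here; the diff does not show whether ABACUS performs such a check.

```cpp
#include <cassert>
#include <string>

// Hypothetical pre-flight check (not ABACUS code): encodes only the rule stated
// in the documentation hunk, i.e. cusolvermp wants one MPI process per GPU.
bool process_count_ok(const std::string& ks_solver, int num_mpi_procs, int num_gpus)
{
    if (ks_solver == "cusolvermp")
    {
        return num_gpus > 0 && num_mpi_procs == num_gpus;
    }
    return true; // no constraint is modeled here for the other solvers
}
```

For example, `process_count_ok("cusolvermp", 4, 2)` is false, so a launcher script could refuse to start before any GPU work begins.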

source/module_hsolver/kernels/cuda/diag_cusolvermp.cu

Lines changed: 5 additions & 2 deletions
@@ -10,13 +10,15 @@ extern "C"
 #include "module_hsolver/genelpa/Cblacs.h"
 }
 #include <iostream>
+#include <cstdint>
 #include "helper_cusolver.h"
 #include "module_base/global_function.h"
 #include "module_base/module_device/device.h"
 static calError_t allgather(void* src_buf, void* recv_buf, size_t size, void* data, void** request)
 {
     MPI_Request req;
-    int err = MPI_Iallgather(src_buf, size, MPI_BYTE, recv_buf, size, MPI_BYTE, (MPI_Comm)(data), &req);
+    intptr_t ptr = reinterpret_cast<intptr_t>(data);
+    int err = MPI_Iallgather(src_buf, size, MPI_BYTE, recv_buf, size, MPI_BYTE, (MPI_Comm)ptr, &req);
     if (err != MPI_SUCCESS)
     {
         return CAL_ERROR;
@@ -27,7 +29,8 @@ static calError_t allgather(void* src_buf, void* recv_buf, size_t size, void* da
 
 static calError_t request_test(void* request)
 {
-    MPI_Request req = (MPI_Request)(request);
+    intptr_t ptr = reinterpret_cast<intptr_t>(request);
+    MPI_Request req = (MPI_Request)ptr;
     int completed;
     int err = MPI_Test(&req, &completed, MPI_STATUS_IGNORE);
     if (err != MPI_SUCCESS)
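
A plausible reading of this fix: in MPI implementations where `MPI_Comm` and `MPI_Request` are integer handles, casting a 64-bit `void*` straight to a 32-bit integer is rejected by the compiler, whereas going through `intptr_t` makes the narrowing an explicit integer conversion. The sketch below reproduces the pattern without MPI. `IntHandle`, `pack_handle` and `unpack_handle` are hypothetical stand-ins, and the claim about the handle type is an assumption, not something the commit states.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for an MPI handle that is a plain int (ASSUMPTION for illustration;
// some MPI implementations use pointer types for their handles instead).
using IntHandle = int;

// Store an integer handle in a void* callback argument.
void* pack_handle(IntHandle h)
{
    return reinterpret_cast<void*>(static_cast<std::intptr_t>(h));
}

// Recover the handle the same way the commit does: pointer -> intptr_t -> handle.
IntHandle unpack_handle(void* data)
{
    std::intptr_t ptr = reinterpret_cast<std::intptr_t>(data);
    return static_cast<IntHandle>(ptr);
}
```

The two-step conversion also compiles when the handle is a pointer type, which may be why the commit keeps the final C-style cast to `MPI_Comm` and `MPI_Request`.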

source/module_io/read_input_item_elec_stru.cpp

Lines changed: 1 addition & 0 deletions
@@ -69,6 +69,7 @@ void ReadInput::item_elec_stru()
     "lapack",
     "scalapack_gvx",
     "cusolver",
+    "cusolvermp",
     "pexsi",
     "cg_in_lcao",
 };
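
This hunk adds `"cusolvermp"` to a list of accepted `ks_solver` strings. How such a list can gate an input value is sketched below. The function `is_listed_ks_solver` is hypothetical, and the list contains only the entries visible in the hunk; the real list starts earlier in the file and its surrounding logic is not shown in the diff.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical validator (not ABACUS code). Only the entries visible in the
// diff hunk are listed; the actual list in the source file is longer.
bool is_listed_ks_solver(const std::string& value)
{
    static const std::vector<std::string> visible = {
        "lapack", "scalapack_gvx", "cusolver", "cusolvermp", "pexsi", "cg_in_lcao",
    };
    return std::find(visible.begin(), visible.end(), value) != visible.end();
}
```

Before this commit, a check of this kind would have rejected `cusolvermp` even in a build that compiled the solver, which is presumably why the one-line addition is needed.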
