Commit 7610dd6

pxlxingliang and pre-commit-ci-lite[bot] authored and committed
feature: parallel solve subspace diagonalization in dav_subspace (deepmodeling#5549)
* feature: parallel solve subspace diagonalization in dav_subspace
* [pre-commit.ci lite] apply automatic fixes
* fix Makefile
* fix ut
* fix
* fix
* fix test
* fix
* fix pyabacus
* [pre-commit.ci lite] apply automatic fixes
* fix doc
* update the doc

---------

Co-authored-by: root <pxlxingliang>
Co-authored-by: pre-commit-ci-lite[bot] <117423508+pre-commit-ci-lite[bot]@users.noreply.github.com>
1 parent b4ffa70 commit 7610dd6

29 files changed: +1683 -62 lines changed

docs/advanced/input_files/input-main.md

Lines changed: 27 additions & 11 deletions
@@ -21,6 +21,7 @@
 - [kspacing](#kspacing)
 - [min\_dist\_coef](#min_dist_coef)
 - [device](#device)
+- [nb2d](#nb2d)
 - [precision](#precision)
 - [Variables related to input files](#variables-related-to-input-files)
 - [stru\_file](#stru_file)
@@ -40,12 +41,12 @@
 - [diago\_smooth\_ethr](#diago_smooth_ethr)
 - [pw\_diag\_nmax](#pw_diag_nmax)
 - [pw\_diag\_ndim](#pw_diag_ndim)
+- [diag\_subspace](#diag_subspace)
 - [erf\_ecut](#erf_ecut)
 - [fft\_mode](#fft_mode)
 - [erf\_height](#erf_height)
 - [erf\_sigma](#erf_sigma)
 - [Numerical atomic orbitals related variables](#numerical-atomic-orbitals-related-variables)
-- [nb2d](#nb2d)
 - [lmaxmax](#lmaxmax)
 - [lcao\_ecut](#lcao_ecut)
 - [lcao\_dk](#lcao_dk)
@@ -667,6 +668,19 @@ If only one value is set (such as `kspacing 0.5`), then kspacing values of a/b/c
 - cg/bpcg/dav ks_solver: required by the `single` precision options
 - **Default**: double

+### nb2d
+
+- **Type**: Integer
+- **Description**: When using elpa or scalapack to solve the eigenvalue problem, the data should be distributed by the two-dimensional block-cyclic distribution. This parameter specifies the size of the block. It is valid for:
+  - [ks_solver](#ks_solver) is genelpa or scalapack_gvx. If nb2d is set to 0, then it will be automatically set in the program according to the size of atomic orbital basis:
+    - if size <= 500: nb2d = 1
+    - if 500 < size <= 1000: nb2d = 32
+    - if size > 1000: nb2d = 64;
+  - [ks_solver](#ks_solver) is dav_subspace, and [diag_subspace](#diag_subspace) is 1 or 2. It is the block size for the diagonalization of the subspace. If it is set to 0, then it will be automatically set in the program according to the number of bands:
+    - if number of bands > 500: nb2d = 32
+    - if number of bands < 500: nb2d = 16
+- **Default**: 0
+
 [back to top](#full-list-of-input-keywords)

 ## Variables related to input files
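
Note: the automatic choice that the new `nb2d` entry describes for `nb2d = 0` can be sketched as a small Python function. This is an illustration of the documented rule only, not the code added by this commit. The function name `auto_nb2d` is hypothetical, and because the documentation does not say what happens at exactly 500 bands, the sketch assumes that case falls into the smaller block size.

```python
def auto_nb2d(ks_solver: str, size: int) -> int:
    """Block size described in the documentation for nb2d = 0.

    `size` is the number of atomic-orbital basis functions for
    genelpa/scalapack_gvx, and the number of bands for dav_subspace.
    """
    if ks_solver in ("genelpa", "scalapack_gvx"):
        if size <= 500:
            return 1
        if size <= 1000:
            return 32
        return 64
    if ks_solver == "dav_subspace":
        # Assumption: exactly 500 bands is not covered by the
        # documentation; this sketch treats it like the "< 500" case.
        return 32 if size > 500 else 16
    raise ValueError(f"nb2d is not used by ks_solver={ks_solver!r}")


print(auto_nb2d("genelpa", 800))        # 32
print(auto_nb2d("dav_subspace", 1200))  # 32
```

A non-zero `nb2d` in the input would simply bypass this rule and be used as given.
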
@@ -794,7 +808,18 @@ These variables are used to control the plane wave related parameters.

 - **Type**: Integer
 - **Description**: Only useful when you use `ks_solver = dav` or `ks_solver = dav_subspace`. It indicates dimension of workspace(number of wavefunction packets, at least 2 needed) for the Davidson method. A larger value may yield a smaller number of iterations in the algorithm but uses more memory and more CPU time in subspace diagonalization.
-- **Default**: 4
+- **Default**: 4
+
+### diag_subspace
+
+- **Type**: Integer
+- **Description**: The method used to diagonalize the subspace in the dav_subspace method. The available options are:
+  - 0: by LAPACK
+  - 1: by GenELPA
+  - 2: by ScaLAPACK
+  LAPACK solves on a single core only, while GenELPA and ScaLAPACK can solve in parallel. If the system is small (for example, fewer than 100 bands), LAPACK is recommended. If the system is large and MPI parallelization is used, then GenELPA or ScaLAPACK is recommended, and GenELPA usually has better performance. For GenELPA and ScaLAPACK, the block size can be set by [nb2d](#nb2d).
+
+- **Default**: 0

 ### erf_ecut

@@ -837,15 +862,6 @@ These variables are used to control the plane wave related parameters.

 These variables are used to control the numerical atomic orbitals related parameters.

-### nb2d
-
-- **Type**: Integer
-- **Description**: In LCAO calculations, we arrange the total number of processors in an 2D array, so that we can partition the wavefunction matrix (number of bands*total size of atomic orbital basis) and distribute them in this 2D array. When the system is large, we group processors into sizes of nb2d, so that multiple processors take care of one row block (a group of atomic orbitals) in the wavefunction matrix. If set to 0, nb2d will be automatically set in the program according to the size of atomic orbital basis:
-- if size <= 500 : nb2d = 1
-- if 500 < size <= 1000 : nb2d = 32
-- if size > 1000 : nb2d = 64;
-- **Default**: 0
-
 ### lmaxmax

 - **Type**: Integer
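
Note: the guidance in the `diag_subspace` entry of this file (LAPACK for small systems, GenELPA or ScaLAPACK for large MPI runs) can be turned into a rough selection helper. The sketch below is only an illustration. The function name, the `prefer_elpa` switch, and the decision to treat every system with at least 100 bands as "large" are assumptions, since the documentation gives 100 bands only as an example of a small system.

```python
# Option values as listed in the diag_subspace documentation.
LAPACK, GENELPA, SCALAPACK = 0, 1, 2


def suggest_diag_subspace(nband: int, mpi_ranks: int, prefer_elpa: bool = True) -> int:
    """Suggest a diag_subspace value from the documented recommendation."""
    # LAPACK runs on one core, so it is the only sensible choice
    # without MPI, and the recommended one for small band counts.
    if mpi_ranks <= 1 or nband < 100:
        return LAPACK
    # Assumption: anything with 100 or more bands counts as "large".
    return GENELPA if prefer_elpa else SCALAPACK


print(suggest_diag_subspace(nband=64, mpi_ranks=16))    # 0
print(suggest_diag_subspace(nband=2000, mpi_ranks=16))  # 1
```

The returned integer is what would go into the `diag_subspace` keyword of the input file. Real crossover points depend on the machine and should be benchmarked.
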

python/pyabacus/src/hsolver/CMakeLists.txt

Lines changed: 3 additions & 0 deletions
@@ -5,6 +5,9 @@ list(APPEND _diago
 ${HSOLVER_PATH}/diago_cg.cpp
 ${HSOLVER_PATH}/diag_const_nums.cpp
 ${HSOLVER_PATH}/diago_iter_assist.cpp
+${HSOLVER_PATH}/diag_hs_para.cpp
+${HSOLVER_PATH}/diago_pxxxgvx.cpp
+

 ${HSOLVER_PATH}/kernels/dngvd_op.cpp
 ${HSOLVER_PATH}/kernels/math_kernel_op.cpp

python/pyabacus/src/hsolver/py_diago_dav_subspace.hpp

Lines changed: 6 additions & 2 deletions
@@ -108,7 +108,9 @@ class PyDiagoDavSubspace
 bool need_subspace,
 std::vector<double>& diag_ethr,
 bool scf_type,
-hsolver::diag_comm_info comm_info
+hsolver::diag_comm_info comm_info,
+int diag_subspace,
+int nb2d
 ) {
 auto hpsi_func = [mm_op] (
 std::complex<double> *psi_in,
@@ -138,7 +140,9 @@ class PyDiagoDavSubspace
 tol,
 max_iter,
 need_subspace,
-comm_info
+comm_info,
+diag_subspace,
+nb2d
 );

 return obj->diag(hpsi_func, psi, nbasis, eigenvalue, diag_ethr, scf_type);

python/pyabacus/src/hsolver/py_hsolver.cpp

Lines changed: 10 additions & 1 deletion
@@ -67,6 +67,13 @@ void bind_hsolver(py::module& m)
 where the initial precision of eigenvalue calculation can be coarse.
 If false, it indicates a non-self-consistent field (non-SCF) calculation,
 where high precision in eigenvalue calculation is required from the start.
+comm_info : diag_comm_info
+The communicator information.
+diago_subspace : int
+The method to solve the generalized eigenvalue problem.
+0: LAPACK, 1: Gen-ELPA, 2: ScaLAPACK
+nb2d : int
+The block size in 2d block cyclic distribution if use elpa or scalapack.
 )pbdoc",
 "mm_op"_a,
 "precond_vec"_a,
@@ -76,7 +83,9 @@ void bind_hsolver(py::module& m)
 "need_subspace"_a,
 "diag_ethr"_a,
 "scf_type"_a,
-"comm_info"_a)
+"comm_info"_a,
+"diago_subspace"_a,
+"nb2d"_a)
 .def("set_psi", &py_hsolver::PyDiagoDavSubspace::set_psi, R"pbdoc(
 Set the initial guess of the eigenvectors, i.e. the wave functions.
 )pbdoc", "psi_in"_a)
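
Note: the docstring above describes `nb2d` as the block size of a 2D block-cyclic distribution. For readers unfamiliar with that layout, the sketch below shows the standard mapping from a global matrix index to its owning process and local index, assuming 0-based indices and that the first block lives on process 0. It is a generic illustration of the layout, not code from this commit, and the function names are hypothetical.

```python
from typing import Tuple


def owner_and_local(g: int, nb: int, nprocs: int) -> Tuple[int, int]:
    """Map a 0-based global index to (owning process, local index)."""
    block = g // nb        # index of the block that contains g
    proc = block % nprocs  # blocks are dealt out round-robin
    local = (block // nprocs) * nb + g % nb
    return proc, local


def owner_2d(i: int, j: int, nb: int, nprow: int, npcol: int) -> Tuple[int, int]:
    """Process-grid coordinates that own matrix element (i, j)."""
    return owner_and_local(i, nb, nprow)[0], owner_and_local(j, nb, npcol)[0]


# Row ownership for an 8-row matrix with block size 2 on 2 process rows.
print([owner_and_local(i, 2, 2)[0] for i in range(8)])  # [0, 0, 1, 1, 0, 0, 1, 1]
```

A smaller block size spreads rows and columns more evenly over the process grid, while a larger one gives each process bigger contiguous blocks to work on. That trade-off is what the `nb2d` setting exposes.
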

python/pyabacus/src/pyabacus/hsolver/_hsolver.py

Lines changed: 11 additions & 2 deletions
@@ -34,7 +34,9 @@ def dav_subspace(
 max_iter: int = 1000,
 need_subspace: bool = False,
 diag_ethr: Union[List[float], None] = None,
-scf_type: bool = False
+scf_type: bool = False,
+diag_subspace: int = 0,
+nb2d: int = 0
 ) -> Tuple[NDArray[np.float64], NDArray[np.complex128]]:
 """ A function to diagonalize a matrix using the Davidson-Subspace method.

@@ -67,6 +69,11 @@ def dav_subspace(
 If True, the initial precision of eigenvalue calculation can be coarse.
 If False, it indicates a non-self-consistent field (non-SCF) calculation,
 where high precision in eigenvalue calculation is required from the start.
+diag_subspace : int, optional
+The method to do the diagonalization, by default 0.
+0: LAPACK, 1: Gen-elpa, 2: Scalapack
+nb2d : int, optional
+The block size for 2D decomposition, by default 0, which will be automatically set.

 Returns
 -------
@@ -101,7 +108,9 @@ def dav_subspace(
 need_subspace,
 diag_ethr,
 scf_type,
-comm_info
+comm_info,
+diag_subspace,
+nb2d
 )

 e = _diago_obj_dav_subspace.get_eigenvalue()

source/Makefile.Objects

Lines changed: 2 additions & 0 deletions
@@ -339,6 +339,8 @@ OBJS_HSOLVER=diago_cg.o\
 math_kernel_op.o\
 dngvd_op.o\
 diag_const_nums.o\
+diag_hs_para.o\
+diago_pxxxgvx.o\

 OBJS_HSOLVER_LCAO=hsolver_lcao.o\
 diago_scalapack.o\

source/module_base/blacs_connector.h

Lines changed: 1 addition & 1 deletion
@@ -39,7 +39,7 @@ extern "C"
 // Informational and Miscellaneous
 void Cblacs_gridinfo(int icontxt, int* nprow, int *npcol, int *myprow, int *mypcol);
 void Cblacs_gridinit(int* icontxt, char* layout, int nprow, int npcol);
-void Cblacs_gridexit(int* icontxt);
+void Cblacs_gridexit(int icontxt);
 int Cblacs_pnum(int icontxt, int prow, int pcol);
 void Cblacs_pcoord(int icontxt, int pnum, int *prow, int *pcol);
 void Cblacs_exit(int icontxt);

source/module_base/scalapack_connector.h

Lines changed: 14 additions & 0 deletions
@@ -80,12 +80,26 @@ extern "C"
 const double* vl, const double* vu, const int* il, const int* iu,
 const double* abstol, int* m, int* nz, double* w, const double*orfac, double* Z, const int* iz, const int* jz, const int*descz,
 double* work, int* lwork, int*iwork, int*liwork, int* ifail, int*iclustr, double*gap, int* info);
+
 void pzhegvx_(const int* itype, const char* jobz, const char* range, const char* uplo,
 const int* n, std::complex<double>* A, const int* ia, const int* ja, const int*desca, std::complex<double>* B, const int* ib, const int* jb, const int*descb,
 const double* vl, const double* vu, const int* il, const int* iu,
 const double* abstol, int* m, int* nz, double* w, const double*orfac, std::complex<double>* Z, const int* iz, const int* jz, const int*descz,
 std::complex<double>* work, int* lwork, double* rwork, int* lrwork, int*iwork, int*liwork, int* ifail, int*iclustr, double*gap, int* info);

+void pssygvx_(const int* itype, const char* jobz, const char* range, const char* uplo,
+const int* n, float* A, const int* ia, const int* ja, const int*desca, float* B, const int* ib, const int* jb, const int*descb,
+const float* vl, const float* vu, const int* il, const int* iu,
+const float* abstol, int* m, int* nz, float* w, const float*orfac, float* Z, const int* iz, const int* jz, const int*descz,
+float* work, int* lwork, int*iwork, int*liwork, int* ifail, int*iclustr, float*gap, int* info);
+
+void pchegvx_(const int* itype, const char* jobz, const char* range, const char* uplo,
+const int* n, std::complex<float>* A, const int* ia, const int* ja, const int*desca, std::complex<float>* B, const int* ib, const int* jb, const int*descb,
+const float* vl, const float* vu, const int* il, const int* iu,
+const float* abstol, int* m, int* nz, float* w, const float*orfac, std::complex<float>* Z, const int* iz, const int* jz, const int*descz,
+std::complex<float>* work, int* lwork, float* rwork, int* lrwork, int*iwork, int*liwork, int* ifail, int*iclustr, float*gap, int* info);
+
+
 void pzgetri_(
 const int *n,
 const std::complex<double> *A, const int *ia, const int *ja, const int *desca,
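
Note: the two declarations added above, `pssygvx_` and `pchegvx_`, follow the usual ScaLAPACK naming scheme. The name is a `p` prefix, one precision letter (`s`, `d`, `c`, `z`), then `sy` for real symmetric or `he` for complex Hermitian matrices, then `gvx`. The short Python sketch below only spells out that naming rule. The helper name is hypothetical and it does not reproduce how `diago_pxxxgvx.cpp` dispatches internally.

```python
def pxxxgvx_name(is_complex: bool, single_precision: bool) -> str:
    """Fortran symbol of the ScaLAPACK generalized eigensolver for a data type."""
    precision_letter = {
        (False, True): "s",   # float
        (False, False): "d",  # double
        (True, True): "c",    # std::complex<float>
        (True, False): "z",   # std::complex<double>
    }[(is_complex, single_precision)]
    matrix_kind = "he" if is_complex else "sy"
    # The trailing underscore matches the Fortran symbols in the header.
    return f"p{precision_letter}{matrix_kind}gvx_"


for cplx in (False, True):
    for single in (True, False):
        print(pxxxgvx_name(cplx, single))
```

As the declarations show, the complex variants take two extra workspace arguments, `rwork` and `lrwork`, that the real variants do not.
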

source/module_hsolver/CMakeLists.txt

Lines changed: 3 additions & 0 deletions
@@ -9,6 +9,9 @@ list(APPEND objects
 hsolver_pw_sdft.cpp
 diago_iter_assist.cpp
 hsolver.cpp
+diago_pxxxgvx.cpp
+diag_hs_para.cpp
+
 )

 if(ENABLE_LCAO)
