
Commit bd474b2

Merge branch 'develop' of https://github.com/deepmodeling/abacus-develop into develop
2 parents 998b19f + 72b1d7c commit bd474b2

File tree

224 files changed, +5361 −2619 lines



docs/advanced/acceleration/cuda.md

Lines changed: 4 additions & 2 deletions
@@ -29,12 +29,14 @@ To compile and use ABACUS in CUDA mode, you currently need to have an NVIDIA GPU

Check the [Advanced Installation Options](https://abacus-rtd.readthedocs.io/en/latest/advanced/install.html#build-with-cuda-support) for the installation of CUDA version support.

- When the compilation parameter USE_ELPA is ON (which is the default value) and USE_CUDA is also set to ON, the ELPA library needs to [enable GPU support](https://github.com/marekandreas/elpa/blob/master/documentation/INSTALL.md) at compile time.
+ Setting both USE_ELPA and USE_CUDA to ON does not automatically enable ELPA to run on GPUs: GPU support must be [enabled when ELPA itself is compiled](https://github.com/marekandreas/elpa/blob/master/documentation/INSTALL.md).
+
+ The ABACUS program automatically determines whether the current ELPA supports GPU from the elpa/elpa_configured_options.h header file; users can also inspect this header to check the GPU support of ELPA in their environment. ELPA introduced the new API elpa_setup_gpu in version 2023.11.001, so to enable ELPA GPU in ABACUS the ELPA version must be 2023.11.001 or later.

## Run with the GPU support by editing the INPUT script:

In `INPUT` file we need to set the input parameter [device](../input_files/input-main.md#device) to `gpu`. If this parameter is not set, ABACUS will try to determine if there are available GPUs.
- - Set `ks_solver`: For the PW basis, CG, BPCG and Davidson methods are supported on GPU; set the input parameter [ks_solver](../input_files/input-main.md#ks_solver) to `cg`, `bpcg` or `dav`. For the LCAO basis, `cusolver` and `elpa` is supported on GPU.
+ - Set `ks_solver`: For the PW basis, CG, BPCG and Davidson methods are supported on GPU; set the input parameter [ks_solver](../input_files/input-main.md#ks_solver) to `cg`, `bpcg` or `dav`. For the LCAO basis, `cusolver`, `cusolvermp` and `elpa` are supported on GPU.
- **multi-card**: ABACUS allows for multi-GPU acceleration. If you have multiple GPU cards, you can run ABACUS with several MPI processes, and each process will utilize one GPU card. For example, the command `mpirun -n 2 abacus` will by default launch two GPUs for computation. If you only have one card, this command will only start one GPU.
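For illustration only (an editor's sketch, not part of this commit), a minimal `INPUT` fragment requesting a GPU run with the LCAO basis and the ELPA solver could look as follows; every keyword except `device` and `ks_solver` is an assumption about a typical SCF setup rather than something taken from this diff:

```
INPUT_PARAMETERS
calculation  scf
basis_type   lcao
device       gpu
ks_solver    elpa
```

Launching with `mpirun -n 2 abacus` would then, as described above, use two GPU cards if they are available.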
## Examples

docs/advanced/input_files/input-main.md

Lines changed: 29 additions & 2 deletions
@@ -161,6 +161,7 @@
- [nbands\_istate](#nbands_istate)
- [bands\_to\_print](#bands_to_print)
- [if\_separate\_k](#if_separate_k)
+ - [out\_elf](#out_elf)
- [Density of states](#density-of-states)
- [dos\_edelta\_ev](#dos_edelta_ev)
- [dos\_sigma](#dos_sigma)
@@ -666,6 +667,7 @@ These variables are used to control parameters related to input files.
- **Type**: String
- **Description**: the name of the structure file
  - Containing various information about atom species, including pseudopotential files, local orbitals files, cell information, atom positions, and whether atoms should be allowed to move.
+ - When [calculation](#calculation) is set to `md` and [md_restart](#md_restart) is set to `true`, this keyword will NOT work.
  - Refer to [Doc](https://github.com/deepmodeling/abacus-develop/blob/develop/docs/advanced/input_files/stru.md)
- **Default**: STRU
@@ -931,6 +933,8 @@ calculations.
- **genelpa**: This method should be used if you choose localized orbitals.
- **scalapack_gvx**: Scalapack can also be used for localized orbitals.
- **cusolver**: This method needs building with CUDA and at least one gpu is available.
+ - **cusolvermp**: This method supports multi-GPU acceleration and needs building with CUDA. Note that when using cusolvermp, you should set the number of MPI processes equal to the number of GPUs.
+ - **elpa**: The ELPA solver supports both CPU and GPU. By setting `device` to `gpu`, you can launch the ELPA solver with GPU acceleration (provided that you have installed a GPU-supported version of ELPA, which requires compiling and installing ELPA manually; ABACUS should also be compiled with -DUSE_ELPA=ON and -DUSE_CUDA=ON). The ELPA solver also supports multi-GPU acceleration.

If you set ks_solver=`genelpa` for basis_type=`pw`, the program will be stopped with an error message:

@@ -939,7 +943,13 @@ calculations.
```

Then the user has to correct the input file and restart the calculation.
- - **Default**: cg (plane-wave basis), or genelpa (localized atomic orbital basis, if compiling option `USE_ELPA` has been set),lapack (localized atomic orbital basis, if compiling option `ENABLE_MPI` has not been set), scalapack_gvx, (localized atomic orbital basis, if compiling option `USE_ELPA` has not been set and if compiling option `ENABLE_MPI` has been set)
+ - **Default**:
+   - **PW basis**: cg
+   - **LCAO basis**:
+     - genelpa (if compiling option `USE_ELPA` has been set)
+     - lapack (if compiling option `ENABLE_MPI` has not been set)
+     - scalapack_gvx (if compiling option `USE_ELPA` has not been set and compiling option `ENABLE_MPI` has been set)
+     - cusolver (if compiling option `USE_CUDA` has been set)
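As an editor's illustration of the GPU-related entries above (not part of this commit), an LCAO calculation using the new `cusolvermp` solver on two GPUs could be set up as follows, keeping the number of MPI processes equal to the number of GPUs; all values besides `ks_solver` and `device` are assumed for the example:

```
INPUT_PARAMETERS
calculation  scf
basis_type   lcao
device       gpu
ks_solver    cusolvermp
```

Such a run would then be launched with `mpirun -n 2 abacus` when two GPU cards are available.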
### nbands

@@ -1520,7 +1530,7 @@ These variables are used to control the output of properties.
- **Type**: Integer \[Integer\](optional)
- **Description**:
  The first integer controls whether to output the charge density on real space grids:
-   - 1. Output the charge density (in Bohr^-3) on real space grids into the density files in the folder `OUT.${suffix}`. The files are named as:
+   - 1: Output the charge density (in Bohr^-3) on real space grids into the density files in the folder `OUT.${suffix}`. The files are named as:
    - nspin = 1: SPIN1_CHG.cube;
    - nspin = 2: SPIN1_CHG.cube, and SPIN2_CHG.cube;
    - nspin = 4: SPIN1_CHG.cube, SPIN2_CHG.cube, SPIN3_CHG.cube, and SPIN4_CHG.cube.
@@ -1800,6 +1810,23 @@
- **Description**: Specifies whether to write the partial charge densities for all k-points to individual files or merge them. **Warning**: Enabling symmetry may produce incorrect results due to incorrect k-point weights. Therefore, when calculating partial charge densities, it is strongly recommended to set `symmetry = -1`.
- **Default**: false

+ ### out_elf
+
+ - **Type**: Integer \[Integer\](optional)
+ - **Availability**: Only for Kohn-Sham DFT and Orbital Free DFT.
+ - **Description**: Whether to output the electron localization function (ELF) in the folder `OUT.${suffix}`. The files are named as
+   - nspin = 1:
+     - ELF.cube: ${\rm{ELF}} = \frac{1}{1+\chi^2}$, $\chi = \frac{\frac{1}{2}\sum_{i}{f_i |\nabla\psi_{i}|^2} - \frac{|\nabla\rho|^2}{8\rho}}{\frac{3}{10}(3\pi^2)^{2/3}\rho^{5/3}}$;
+   - nspin = 2:
+     - ELF_SPIN1.cube, ELF_SPIN2.cube: ${\rm{ELF}}_\sigma = \frac{1}{1+\chi_\sigma^2}$, $\chi_\sigma = \frac{\frac{1}{2}\sum_{i}{f_i |\nabla\psi_{i,\sigma}|^2} - \frac{|\nabla\rho_\sigma|^2}{8\rho_\sigma}}{\frac{3}{10}(6\pi^2)^{2/3}\rho_\sigma^{5/3}}$;
+     - ELF.cube: ${\rm{ELF}} = \frac{1}{1+\chi^2}$, $\chi = \frac{\frac{1}{2}\sum_{i,\sigma}{f_i |\nabla\psi_{i,\sigma}|^2} - \sum_{\sigma}{\frac{|\nabla\rho_\sigma|^2}{8\rho_\sigma}}}{\sum_{\sigma}{\frac{3}{10}(6\pi^2)^{2/3}\rho_\sigma^{5/3}}}$;
+
+   The second integer controls the precision of the kinetic energy density output; if it is not given, `3` is used by default. For restarting from this file and for other calculations that require high precision, `10` is recommended.
+
+   ---
+   In molecular dynamics calculations, the output frequency is controlled by [out_interval](#out_interval).
+ - **Default**: 0 3
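As a reading aid for the nspin = 1 expression above (an editor's sketch, not code from ABACUS or from this commit), the ELF can be evaluated on a real-space grid once the density `rho`, the magnitude of its gradient `grad_rho`, and the positive kinetic energy density `tau` = 1/2 Σ_i f_i |∇ψ_i|² are available:

```python
import numpy as np

def elf_nspin1(rho, grad_rho, tau):
    """Illustrative electron localization function for nspin = 1 (atomic units).

    rho      : electron density on the grid
    grad_rho : |nabla rho| on the grid
    tau      : 1/2 * sum_i f_i |nabla psi_i|^2 on the grid
    """
    # von Weizsaecker term |nabla rho|^2 / (8 rho)
    tau_w = grad_rho**2 / (8.0 * rho)
    # Thomas-Fermi kinetic energy density (3/10) (3 pi^2)^(2/3) rho^(5/3)
    tau_tf = 0.3 * (3.0 * np.pi**2) ** (2.0 / 3.0) * rho ** (5.0 / 3.0)
    chi = (tau - tau_w) / tau_tf
    return 1.0 / (1.0 + chi**2)
```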
[back to top](#full-list-of-input-keywords)

## Density of states

python/pyabacus/src/py_diago_dav_subspace.hpp

Lines changed: 9 additions & 8 deletions
@@ -110,23 +110,24 @@ class PyDiagoDavSubspace
        bool scf_type,
        hsolver::diag_comm_info comm_info
    ) {
-       auto hpsi_func = [mm_op] (std::complex<double> *hpsi_out,
-                                 std::complex<double> *psi_in, const int nband_in,
-                                 const int nbasis_in, const int band_index1,
-                                 const int band_index2)
-       {
+       auto hpsi_func = [mm_op] (
+           std::complex<double> *psi_in,
+           std::complex<double> *hpsi_out,
+           const int ld_psi,
+           const int nvec
+       ) {
            // Note: numpy's py::array_t is row-major, but
            // our raw pointer-array is column-major
-           py::array_t<std::complex<double>, py::array::f_style> psi({nbasis_in, band_index2 - band_index1 + 1});
+           py::array_t<std::complex<double>, py::array::f_style> psi({ld_psi, nvec});
            py::buffer_info psi_buf = psi.request();
            std::complex<double>* psi_ptr = static_cast<std::complex<double>*>(psi_buf.ptr);
-           std::copy(psi_in + band_index1 * nbasis_in, psi_in + (band_index2 + 1) * nbasis_in, psi_ptr);
+           std::copy(psi_in, psi_in + nvec * ld_psi, psi_ptr);

            py::array_t<std::complex<double>, py::array::f_style> hpsi = mm_op(psi);

            py::buffer_info hpsi_buf = hpsi.request();
            std::complex<double>* hpsi_ptr = static_cast<std::complex<double>*>(hpsi_buf.ptr);
-           std::copy(hpsi_ptr, hpsi_ptr + (band_index2 - band_index1 + 1) * nbasis_in, hpsi_out);
+           std::copy(hpsi_ptr, hpsi_ptr + nvec * ld_psi, hpsi_out);
        };

        obj = std::make_unique<hsolver::Diago_DavSubspace<std::complex<double>, base_device::DEVICE_CPU>>(

python/pyabacus/src/py_diago_david.hpp

Lines changed: 7 additions & 9 deletions
@@ -109,25 +109,23 @@ class PyDiagoDavid
        hsolver::diag_comm_info comm_info
    ) {
        auto hpsi_func = [mm_op] (
-           std::complex<double> *hpsi_out,
-           std::complex<double> *psi_in,
-           const int nband_in,
-           const int nbasis_in,
-           const int band_index1,
-           const int band_index2
+           std::complex<double> *psi_in,
+           std::complex<double> *hpsi_out,
+           const int ld_psi,
+           const int nvec
        ) {
            // Note: numpy's py::array_t is row-major, but
            // our raw pointer-array is column-major
-           py::array_t<std::complex<double>, py::array::f_style> psi({nbasis_in, band_index2 - band_index1 + 1});
+           py::array_t<std::complex<double>, py::array::f_style> psi({ld_psi, nvec});
            py::buffer_info psi_buf = psi.request();
            std::complex<double>* psi_ptr = static_cast<std::complex<double>*>(psi_buf.ptr);
-           std::copy(psi_in + band_index1 * nbasis_in, psi_in + (band_index2 + 1) * nbasis_in, psi_ptr);
+           std::copy(psi_in, psi_in + nvec * ld_psi, psi_ptr);

            py::array_t<std::complex<double>, py::array::f_style> hpsi = mm_op(psi);

            py::buffer_info hpsi_buf = hpsi.request();
            std::complex<double>* hpsi_ptr = static_cast<std::complex<double>*>(hpsi_buf.ptr);
-           std::copy(hpsi_ptr, hpsi_ptr + (band_index2 - band_index1 + 1) * nbasis_in, hpsi_out);
+           std::copy(hpsi_ptr, hpsi_ptr + nvec * ld_psi, hpsi_out);
        };

        auto spsi_func = [this] (

python/pyabacus/src/pyabacus/hsolver/_hsolver.py

Lines changed: 16 additions & 14 deletions
@@ -16,7 +16,7 @@ def rank(self) -> int: ...
    def nproc(self) -> int: ...

def dav_subspace(
-   mm_op: Callable[[NDArray[np.complex128]], NDArray[np.complex128]],
+   mvv_op: Callable[[NDArray[np.complex128]], NDArray[np.complex128]],
    init_v: NDArray[np.complex128],
    dim: int,
    num_eigs: int,
@@ -32,9 +32,10 @@

    Parameters
    ----------
-   mm_op : Callable[[NDArray[np.complex128]], NDArray[np.complex128]],
-       The operator to be diagonalized, which is a function that takes a matrix as input
-       and returns a matrix mv_op(X) = H * X as output.
+   mvv_op : Callable[[NDArray[np.complex128]], NDArray[np.complex128]],
+       The operator to be diagonalized, which is a function that takes a set of
+       vectors X = [x1, ..., xN] as input and returns a matrix (vector block)
+       mvv_op(X) = H * X ([Hx1, ..., HxN]) as output.
    init_v : NDArray[np.complex128]
        The initial guess for the eigenvectors.
    dim : int
@@ -68,8 +69,8 @@
    v : NDArray[np.complex128]
        The eigenvectors corresponding to the eigenvalues.
    """
-   if not callable(mm_op):
-       raise TypeError("mm_op must be a callable object.")
+   if not callable(mvv_op):
+       raise TypeError("mvv_op must be a callable object.")

    if is_occupied is None:
        is_occupied = [True] * num_eigs
@@ -86,7 +87,7 @@
    assert dav_ndim * num_eigs < dim * comm_info.nproc, "dav_ndim * num_eigs must be less than dim * comm_info.nproc."

    _ = _diago_obj_dav_subspace.diag(
-       mm_op,
+       mvv_op,
        pre_condition,
        dav_ndim,
        tol,
@@ -103,7 +104,7 @@
    return e, v

def davidson(
-   mm_op: Callable[[NDArray[np.complex128]], NDArray[np.complex128]],
+   mvv_op: Callable[[NDArray[np.complex128]], NDArray[np.complex128]],
    init_v: NDArray[np.complex128],
    dim: int,
    num_eigs: int,
@@ -119,9 +120,10 @@

    Parameters
    ----------
-   mm_op : Callable[[NDArray[np.complex128]], NDArray[np.complex128]],
-       The operator to be diagonalized, which is a function that takes a matrix as input
-       and returns a matrix mv_op(X) = H * X as output.
+   mvv_op : Callable[[NDArray[np.complex128]], NDArray[np.complex128]],
+       The operator to be diagonalized, which is a function that takes a set of
+       vectors X = [x1, ..., xN] as input and returns a matrix (vector block)
+       mvv_op(X) = H * X ([Hx1, ..., HxN]) as output.
    init_v : NDArray[np.complex128]
        The initial guess for the eigenvectors.
    dim : int
@@ -146,8 +148,8 @@
    v : NDArray[np.complex128]
        The eigenvectors corresponding to the eigenvalues.
    """
-   if not callable(mm_op):
-       raise TypeError("mm_op must be a callable object.")
+   if not callable(mvv_op):
+       raise TypeError("mvv_op must be a callable object.")

    if init_v.ndim != 1 or init_v.dtype != np.complex128:
        init_v = init_v.flatten().astype(np.complex128, order='C')
@@ -159,7 +161,7 @@
    comm_info = hsolver.diag_comm_info(0, 1)

    _ = _diago_obj_dav_subspace.diag(
-       mm_op,
+       mvv_op,
        pre_condition,
        dav_ndim,
        tol,
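To show how the renamed `mvv_op` argument is meant to be used from Python, here is an editor's usage sketch (not part of this commit): it builds a small Hermitian matrix, wraps it in a block matrix-vector product, and passes it to `hsolver.dav_subspace`. The keyword names `pre_condition`, `dav_ndim` and `tol` are taken from the call shown above, but their exact defaults and ordering in the public signature are assumptions.

```python
import numpy as np
from pyabacus import hsolver

n, k = 64, 4  # basis size and number of requested eigenpairs

# Small Hermitian test operator.
rng = np.random.default_rng(0)
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
H = (A + A.conj().T) / 2

def mvv_op(X):
    # X is a block of trial vectors [x1, ..., xN]; return [Hx1, ..., HxN].
    return H @ X

# Flattened complex initial guess for the k eigenvectors.
v0 = (rng.standard_normal(n * k) + 1j * rng.standard_normal(n * k)).astype(np.complex128)

e, v = hsolver.dav_subspace(
    mvv_op, v0, n, k,
    pre_condition=np.ones(n),  # assumed keyword; mirrors the positional argument above
    dav_ndim=8,
    tol=1e-8,
)
print(e)  # approximate lowest k eigenvalues of H
```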

source/Makefile.Objects

Lines changed: 10 additions & 1 deletion
@@ -213,6 +213,7 @@ OBJS_ELECSTAT=elecstate.o\
    elecstate_print.o\
    elecstate_pw.o\
    elecstate_pw_sdft.o\
+   elecstate_pw_cal_tau.o\
    elecstate_op.o\
    efield.o\
    gatefield.o\
@@ -226,6 +227,7 @@ OBJS_ELECSTAT=elecstate.o\

OBJS_ELECSTAT_LCAO=elecstate_lcao.o\
    elecstate_lcao_tddft.o\
+   elecstate_lcao_cal_tau.o\
    density_matrix.o\
    cal_dm_psi.o\

@@ -454,7 +456,12 @@ OBJS_XC=xc_functional.o\
    xc_functional_gradcorr.o\
    xc_functional_wrapper_xc.o\
    xc_functional_wrapper_gcxc.o\
-   xc_functional_wrapper_tauxc.o\
+   xc_functional_libxc.o\
+   xc_functional_libxc_tools.o\
+   xc_functional_libxc_vxc.o\
+   xc_functional_libxc_wrapper_xc.o\
+   xc_functional_libxc_wrapper_gcxc.o\
+   xc_functional_libxc_wrapper_tauxc.o\
    xc_funct_exch_lda.o\
    xc_funct_corr_lda.o\
    xc_funct_exch_gga.o\
@@ -496,6 +503,7 @@ OBJS_IO=input_conv.o\
    winput.o\
    write_cube.o\
    write_elecstat_pot.o\
+   write_elf.o\
    write_dipole.o\
    td_current_io.o\
    write_wfc_r.o\
@@ -523,6 +531,7 @@ OBJS_IO=input_conv.o\
    read_input_item_other.o\
    read_input_item_output.o\
    read_set_globalv.o\
+   orb_io.o\

OBJS_IO_LCAO=cal_r_overlap_R.o\
    write_orb_info.o\

source/driver_run.cpp

Lines changed: 1 addition & 1 deletion
@@ -40,7 +40,7 @@ void Driver::driver_run() {

    // the life of ucell should begin here, mohan 2024-05-12
    // delete ucell as a GlobalC in near future
-   GlobalC::ucell.setup_cell(PARAM.inp.stru_file, GlobalV::ofs_running);
+   GlobalC::ucell.setup_cell(PARAM.globalv.global_in_stru, GlobalV::ofs_running);
    Check_Atomic_Stru::check_atomic_stru(GlobalC::ucell,
                                         PARAM.inp.min_dist_coef);

source/module_base/global_variable.cpp

Lines changed: 0 additions & 3 deletions
@@ -21,7 +21,6 @@ namespace GlobalV
int NBANDS = 0;
int NLOCAL = 0; // total number of local basis.

- int NSPIN = 1; // LDA
double nupdown = 0.0;

bool use_uspp = false;
@@ -55,8 +54,6 @@ int GSIZE = DSIZE;
//----------------------------------------------------------
// EXPLAIN : The input file name and directory
//----------------------------------------------------------
- std::string stru_file = "STRU";
-
std::ofstream ofs_running;
std::ofstream ofs_warning;
std::ofstream ofs_info; // output math lib info

source/module_base/global_variable.h

Lines changed: 0 additions & 3 deletions
@@ -20,8 +20,6 @@ namespace GlobalV
extern int NBANDS;
extern int NLOCAL; // 1.1 // mohan add 2009-05-29

-
- extern int NSPIN; // 7
extern double nupdown;
extern bool use_uspp;

@@ -80,7 +78,6 @@ extern int KPAR_LCAO;
// NAME : ofs_running( contain information during runnnig)
// NAME : ofs_warning( contain warning information, including error)
//==========================================================
- extern std::string stru_file;
// extern std::string global_pseudo_type; // mohan add 2013-05-20 (xiaohui add
// 2013-06-23)
extern std::ofstream ofs_running;
