Commit 1c9712a

Merge to abacus-v3.7850, not yet debugged (many functions and interfaces have changed). The scaling_factor_xc issue is not handled.
2 parents e0f5436 + 72b1d7c commit 1c9712a

145 files changed, +3574 -1547 lines changed

docs/advanced/acceleration/cuda.md

Lines changed: 1 addition & 1 deletion
@@ -36,7 +36,7 @@ The ABACUS program will automatically determine whether the current ELPA support
 ## Run with the GPU support by editing the INPUT script:

 In `INPUT` file we need to set the input parameter [device](../input_files/input-main.md#device) to `gpu`. If this parameter is not set, ABACUS will try to determine if there are available GPUs.
-- Set `ks_solver`: For the PW basis, CG, BPCG and Davidson methods are supported on GPU; set the input parameter [ks_solver](../input_files/input-main.md#ks_solver) to `cg`, `bpcg` or `dav`. For the LCAO basis, `cusolver` and `elpa` is supported on GPU.
+- Set `ks_solver`: For the PW basis, CG, BPCG and Davidson methods are supported on GPU; set the input parameter [ks_solver](../input_files/input-main.md#ks_solver) to `cg`, `bpcg` or `dav`. For the LCAO basis, `cusolver`, `cusolvermp` and `elpa` are supported on GPU.
 - **multi-card**: ABACUS allows for multi-GPU acceleration. If you have multiple GPU cards, you can run ABACUS with several MPI processes, and each process will utilize one GPU card. For example, the command `mpirun -n 2 abacus` will by default launch two GPUs for computation. If you only have one card, this command will only start one GPU.

 ## Examples

docs/advanced/input_files/input-main.md

Lines changed: 28 additions & 2 deletions
@@ -161,6 +161,7 @@
 - [nbands\_istate](#nbands_istate)
 - [bands\_to\_print](#bands_to_print)
 - [if\_separate\_k](#if_separate_k)
+- [out\_elf](#out_elf)
 - [Density of states](#density-of-states)
 - [dos\_edelta\_ev](#dos_edelta_ev)
 - [dos\_sigma](#dos_sigma)
@@ -932,6 +933,8 @@ calculations.
 - **genelpa**: This method should be used if you choose localized orbitals.
 - **scalapack_gvx**: Scalapack can also be used for localized orbitals.
 - **cusolver**: This method needs building with CUDA and at least one gpu is available.
+- **cusolvermp**: This method supports multi-GPU acceleration and requires building with CUDA. Note that when using cusolvermp, the number of MPI processes should equal the number of GPUs.
+- **elpa**: The ELPA solver supports both CPU and GPU. Setting `device` to `gpu` launches the ELPA solver with GPU acceleration, provided that a GPU-enabled version of ELPA is installed (this requires compiling and installing ELPA manually) and that ABACUS is compiled with `-DUSE_ELPA=ON` and `-DUSE_CUDA=ON`. The ELPA solver also supports multi-GPU acceleration.

 If you set ks_solver=`genelpa` for basis_type=`pw`, the program will be stopped with an error message:

@@ -940,7 +943,13 @@ calculations.
 ```

 Then the user has to correct the input file and restart the calculation.
-- **Default**: cg (plane-wave basis), or genelpa (localized atomic orbital basis, if compiling option `USE_ELPA` has been set),lapack (localized atomic orbital basis, if compiling option `ENABLE_MPI` has not been set), scalapack_gvx, (localized atomic orbital basis, if compiling option `USE_ELPA` has not been set and if compiling option `ENABLE_MPI` has been set)
+- **Default**:
+  - **PW basis**: cg.
+  - **LCAO basis**:
+    - genelpa (if compiling option `USE_ELPA` has been set)
+    - lapack (if compiling option `ENABLE_MPI` has not been set)
+    - scalapack_gvx (if compiling option `USE_ELPA` has not been set and compiling option `ENABLE_MPI` has been set)
+    - cusolver (if compiling option `USE_CUDA` has been set)

 ### nbands

@@ -1521,7 +1530,7 @@ These variables are used to control the output of properties.
 - **Type**: Integer \[Integer\](optional)
 - **Description**:
 The first integer controls whether to output the charge density on real space grids:
-- 1. Output the charge density (in Bohr^-3) on real space grids into the density files in the folder `OUT.${suffix}`. The files are named as:
+- 1: Output the charge density (in Bohr^-3) on real space grids into the density files in the folder `OUT.${suffix}`. The files are named as:
 - nspin = 1: SPIN1_CHG.cube;
 - nspin = 2: SPIN1_CHG.cube, and SPIN2_CHG.cube;
 - nspin = 4: SPIN1_CHG.cube, SPIN2_CHG.cube, SPIN3_CHG.cube, and SPIN4_CHG.cube.
@@ -1801,6 +1810,23 @@ The band (KS orbital) energy for each (k-point, spin, band) will be printed in t
 - **Description**: Specifies whether to write the partial charge densities for all k-points to individual files or merge them. **Warning**: Enabling symmetry may produce incorrect results due to incorrect k-point weights. Therefore, when calculating partial charge densities, it is strongly recommended to set `symmetry = -1`.
 - **Default**: false

+### out_elf
+
+- **Type**: Integer \[Integer\](optional)
+- **Availability**: Only for Kohn-Sham DFT and Orbital Free DFT.
+- **Description**: Whether to output the electron localization function (ELF) in the folder `OUT.${suffix}`. The files are named as:
+  - nspin = 1:
+    - ELF.cube: ${\rm{ELF}} = \frac{1}{1+\chi^2}$, $\chi = \frac{\frac{1}{2}\sum_{i}{f_i |\nabla\psi_{i}|^2} - \frac{|\nabla\rho|^2}{8\rho}}{\frac{3}{10}(3\pi^2)^{2/3}\rho^{5/3}}$;
+  - nspin = 2:
+    - ELF_SPIN1.cube, ELF_SPIN2.cube: ${\rm{ELF}}_\sigma = \frac{1}{1+\chi_\sigma^2}$, $\chi_\sigma = \frac{\frac{1}{2}\sum_{i}{f_i |\nabla\psi_{i,\sigma}|^2} - \frac{|\nabla\rho_\sigma|^2}{8\rho_\sigma}}{\frac{3}{10}(6\pi^2)^{2/3}\rho_\sigma^{5/3}}$;
+    - ELF.cube: ${\rm{ELF}} = \frac{1}{1+\chi^2}$, $\chi = \frac{\frac{1}{2}\sum_{i,\sigma}{f_i |\nabla\psi_{i,\sigma}|^2} - \sum_{\sigma}{\frac{|\nabla\rho_\sigma|^2}{8\rho_\sigma}}}{\sum_{\sigma}{\frac{3}{10}(6\pi^2)^{2/3}\rho_\sigma^{5/3}}}$;
+
+  The second integer controls the precision of the kinetic energy density output; if it is not given, `3` is used by default. For restarting from this file and for other calculations that need high precision, `10` is recommended.
+
+  ---
+  In molecular dynamics calculations, the output frequency is controlled by [out_interval](#out_interval).
+- **Default**: 0 3
+
 [back to top](#full-list-of-input-keywords)

 ## Density of states
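The nspin = 1 formula documented for `out_elf` can be checked with a small pointwise calculation. The following is a minimal sketch, not ABACUS code: `elf_point` is a hypothetical helper, Hartree atomic units are assumed, and the kinetic energy density, the density and the squared density gradient are assumed to be already available at the grid point.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not part of ABACUS): ELF at one grid point, nspin = 1.
//   tau         : (1/2) * sum_i f_i |grad psi_i|^2
//   rho         : electron density, must be positive
//   grad_rho_sq : |grad rho|^2
double elf_point(double tau, double rho, double grad_rho_sq)
{
    assert(rho > 0.0);
    const double pi = std::acos(-1.0);
    // von Weizsaecker term: |grad rho|^2 / (8 rho)
    const double tau_vw = grad_rho_sq / (8.0 * rho);
    // Thomas-Fermi term: (3/10) (3 pi^2)^(2/3) rho^(5/3)
    const double tau_tf = 0.3 * std::pow(3.0 * pi * pi, 2.0 / 3.0) * std::pow(rho, 5.0 / 3.0);
    const double chi = (tau - tau_vw) / tau_tf;
    return 1.0 / (1.0 + chi * chi);
}
```

Two limits follow directly from the formula: a uniform electron gas (zero gradient, `tau` equal to the Thomas-Fermi value) gives chi = 1 and ELF = 0.5, and a point where `tau` equals the von Weizsaecker term gives chi = 0 and ELF = 1.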

python/pyabacus/src/py_diago_dav_subspace.hpp

Lines changed: 5 additions & 7 deletions
@@ -113,23 +113,21 @@ class PyDiagoDavSubspace
 auto hpsi_func = [mm_op] (
     std::complex<double> *psi_in,
     std::complex<double> *hpsi_out,
-    const int nband_in,
-    const int nbasis_in,
-    const int band_index1,
-    const int band_index2
+    const int ld_psi,
+    const int nvec
 ) {
     // Note: numpy's py::array_t is row-major, but
     // our raw pointer-array is column-major
-    py::array_t<std::complex<double>, py::array::f_style> psi({nbasis_in, band_index2 - band_index1 + 1});
+    py::array_t<std::complex<double>, py::array::f_style> psi({ld_psi, nvec});
     py::buffer_info psi_buf = psi.request();
     std::complex<double>* psi_ptr = static_cast<std::complex<double>*>(psi_buf.ptr);
-    std::copy(psi_in + band_index1 * nbasis_in, psi_in + (band_index2 + 1) * nbasis_in, psi_ptr);
+    std::copy(psi_in, psi_in + nvec * ld_psi, psi_ptr);

     py::array_t<std::complex<double>, py::array::f_style> hpsi = mm_op(psi);

     py::buffer_info hpsi_buf = hpsi.request();
     std::complex<double>* hpsi_ptr = static_cast<std::complex<double>*>(hpsi_buf.ptr);
-    std::copy(hpsi_ptr, hpsi_ptr + (band_index2 - band_index1 + 1) * nbasis_in, hpsi_out);
+    std::copy(hpsi_ptr, hpsi_ptr + nvec * ld_psi, hpsi_out);
 };

 obj = std::make_unique<hsolver::Diago_DavSubspace<std::complex<double>, base_device::DEVICE_CPU>>(

python/pyabacus/src/py_diago_david.hpp

Lines changed: 5 additions & 7 deletions
@@ -111,23 +111,21 @@ class PyDiagoDavid
 auto hpsi_func = [mm_op] (
     std::complex<double> *psi_in,
     std::complex<double> *hpsi_out,
-    const int nband_in,
-    const int nbasis_in,
-    const int band_index1,
-    const int band_index2
+    const int ld_psi,
+    const int nvec
 ) {
     // Note: numpy's py::array_t is row-major, but
     // our raw pointer-array is column-major
-    py::array_t<std::complex<double>, py::array::f_style> psi({nbasis_in, band_index2 - band_index1 + 1});
+    py::array_t<std::complex<double>, py::array::f_style> psi({ld_psi, nvec});
     py::buffer_info psi_buf = psi.request();
     std::complex<double>* psi_ptr = static_cast<std::complex<double>*>(psi_buf.ptr);
-    std::copy(psi_in + band_index1 * nbasis_in, psi_in + (band_index2 + 1) * nbasis_in, psi_ptr);
+    std::copy(psi_in, psi_in + nvec * ld_psi, psi_ptr);

     py::array_t<std::complex<double>, py::array::f_style> hpsi = mm_op(psi);

     py::buffer_info hpsi_buf = hpsi.request();
     std::complex<double>* hpsi_ptr = static_cast<std::complex<double>*>(hpsi_buf.ptr);
-    std::copy(hpsi_ptr, hpsi_ptr + (band_index2 - band_index1 + 1) * nbasis_in, hpsi_out);
+    std::copy(hpsi_ptr, hpsi_ptr + nvec * ld_psi, hpsi_out);
 };

 auto spsi_func = [this] (
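Both pyabacus hunks replace the four-argument band-range interface with `(ld_psi, nvec)`: the callback now appears to receive a pointer to the first vector of a contiguous column-major block and to copy `nvec * ld_psi` entries in and out. A minimal sketch of that convention without pybind11 follows. `BlockOp`, `apply_hpsi` and `diagonal_op` are hypothetical names introduced for illustration, and `BlockOp` merely stands in for the Python callable `mm_op`.

```cpp
#include <algorithm>
#include <cassert>
#include <complex>
#include <cstddef>
#include <functional>
#include <vector>

using cplx = std::complex<double>;

// Hypothetical stand-in for mm_op: maps a column-major ld_psi x nvec block
// to H * block in the same layout.
using BlockOp = std::function<std::vector<cplx>(const std::vector<cplx>&, int, int)>;

// Sketch of the (ld_psi, nvec) convention: the caller already points psi_in
// at the first vector to process, so no band-index offsets are needed here.
inline void apply_hpsi(const BlockOp& op, const cplx* psi_in, cplx* hpsi_out, int ld_psi, int nvec)
{
    assert(ld_psi > 0 && nvec > 0);
    const std::size_t count = static_cast<std::size_t>(nvec) * static_cast<std::size_t>(ld_psi);
    std::vector<cplx> psi(psi_in, psi_in + count);  // copy the block in
    std::vector<cplx> hpsi = op(psi, ld_psi, nvec); // apply the operator
    assert(hpsi.size() == count);
    std::copy(hpsi.begin(), hpsi.end(), hpsi_out);  // copy the result out
}

// Example operator for testing: a diagonal matrix with H(i,i) = i + 1.
inline std::vector<cplx> diagonal_op(const std::vector<cplx>& psi, int ld_psi, int nvec)
{
    std::vector<cplx> out(psi.size());
    for (int v = 0; v < nvec; ++v)
    {
        for (int i = 0; i < ld_psi; ++i)
        {
            const std::size_t idx = static_cast<std::size_t>(v) * ld_psi + i;
            out[idx] = static_cast<double>(i + 1) * psi[idx];
        }
    }
    return out;
}
```

With this convention, a caller that wants vectors 1 and 2 of a three-vector block would pass `psi.data() + 1 * ld_psi` and `nvec = 2`, which is the offset arithmetic the old `band_index1`/`band_index2` arguments performed inside the callback.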

source/Makefile.Objects

Lines changed: 9 additions & 1 deletion
@@ -213,6 +213,7 @@ OBJS_ELECSTAT=elecstate.o\
     elecstate_print.o\
     elecstate_pw.o\
     elecstate_pw_sdft.o\
+    elecstate_pw_cal_tau.o\
     elecstate_op.o\
     efield.o\
     gatefield.o\
@@ -226,6 +227,7 @@ OBJS_ELECSTAT=elecstate.o\

 OBJS_ELECSTAT_LCAO=elecstate_lcao.o\
     elecstate_lcao_tddft.o\
+    elecstate_lcao_cal_tau.o\
     density_matrix.o\
     cal_dm_psi.o\

@@ -454,7 +456,12 @@ OBJS_XC=xc_functional.o\
     xc_functional_gradcorr.o\
     xc_functional_wrapper_xc.o\
     xc_functional_wrapper_gcxc.o\
-    xc_functional_wrapper_tauxc.o\
+    xc_functional_libxc.o\
+    xc_functional_libxc_tools.o\
+    xc_functional_libxc_vxc.o\
+    xc_functional_libxc_wrapper_xc.o\
+    xc_functional_libxc_wrapper_gcxc.o\
+    xc_functional_libxc_wrapper_tauxc.o\
     xc_funct_exch_lda.o\
     xc_funct_corr_lda.o\
     xc_funct_exch_gga.o\
@@ -496,6 +503,7 @@ OBJS_IO=input_conv.o\
     winput.o\
     write_cube.o\
     write_elecstat_pot.o\
+    write_elf.o\
     write_dipole.o\
     td_current_io.o\
     write_wfc_r.o\

source/module_base/global_variable.cpp

Lines changed: 0 additions & 1 deletion
@@ -21,7 +21,6 @@ namespace GlobalV
 int NBANDS = 0;
 int NLOCAL = 0; // total number of local basis.

-int NSPIN = 1; // LDA
 double nupdown = 0.0;

 bool use_uspp = false;

source/module_base/global_variable.h

Lines changed: 0 additions & 2 deletions
@@ -20,8 +20,6 @@ namespace GlobalV
 extern int NBANDS;
 extern int NLOCAL; // 1.1 // mohan add 2009-05-29

-
-extern int NSPIN; // 7
 extern double nupdown;
 extern bool use_uspp;


source/module_elecstate/CMakeLists.txt

Lines changed: 2 additions & 0 deletions
@@ -7,6 +7,7 @@ list(APPEND objects
     elecstate_print.cpp
     elecstate_pw.cpp
     elecstate_pw_sdft.cpp
+    elecstate_pw_cal_tau.cpp
     potentials/gatefield.cpp
     potentials/efield.cpp
     potentials/H_Hartree_pw.cpp
@@ -31,6 +32,7 @@ if(ENABLE_LCAO)
     list(APPEND objects
         elecstate_lcao.cpp
         elecstate_lcao_tddft.cpp
+        elecstate_lcao_cal_tau.cpp
         potentials/H_TDDFT_pw.cpp
         module_dm/density_matrix.cpp
         module_dm/cal_dm_psi.cpp

source/module_elecstate/elecstate.h

Lines changed: 8 additions & 0 deletions
@@ -53,6 +53,14 @@ class ElecState
     }
     // virtual void updateRhoK(const psi::Psi<std::complex<double>> &psi) = 0;
     // virtual void updateRhoK(const psi::Psi<double> &psi)=0
+    virtual void cal_tau(const psi::Psi<std::complex<double>>& psi)
+    {
+        return;
+    }
+    virtual void cal_tau(const psi::Psi<double>& psi)
+    {
+        return;
+    }

     // update charge density for next scf step
     // in this function, 1. input rho for construct Hamilt and 2. calculated rho from Psi will mix to 3. new charge

source/module_elecstate/elecstate_lcao.cpp

Lines changed: 2 additions & 12 deletions
@@ -67,12 +67,7 @@ void ElecStateLCAO<std::complex<double>>::psiToRho(const psi::Psi<std::complex<d

     if (XC_Functional::get_func_type() == 3 || XC_Functional::get_func_type() == 5)
     {
-        for (int is = 0; is < PARAM.inp.nspin; is++)
-        {
-            ModuleBase::GlobalFunc::ZEROS(this->charge->kin_r[is], this->charge->nrxx);
-        }
-        Gint_inout inout1(this->charge->kin_r, Gint_Tools::job_type::tau);
-        this->gint_k->cal_gint(&inout1);
+        this->cal_tau(psi);
     }

     this->charge->renormalize_rho();
@@ -124,12 +119,7 @@ void ElecStateLCAO<double>::psiToRho(const psi::Psi<double>& psi)

     if (XC_Functional::get_func_type() == 3 || XC_Functional::get_func_type() == 5)
     {
-        for (int is = 0; is < PARAM.inp.nspin; is++)
-        {
-            ModuleBase::GlobalFunc::ZEROS(this->charge->kin_r[is], this->charge->nrxx);
-        }
-        Gint_inout inout1(this->charge->kin_r, Gint_Tools::job_type::tau);
-        this->gint_gamma->cal_gint(&inout1);
+        this->cal_tau(psi);
     }

     this->charge->renormalize_rho();
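Taken together, the `elecstate.h` and `elecstate_lcao.cpp` hunks move the inline kinetic energy density (`kin_r`) calculation behind a virtual `cal_tau` whose base-class default does nothing. The pattern can be sketched in isolation as below. All type and function names are hypothetical simplifications, not the ABACUS classes, and the counter exists only to make the dispatch observable.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, heavily simplified stand-in for psi::Psi<T>.
template <typename T>
struct PsiSketch
{
    std::vector<T> coeff;
};

// Hypothetical stand-in for the ElecState base class.
class ElecStateSketch
{
  public:
    virtual ~ElecStateSketch() = default;

    // Default implementation does nothing, mirroring the empty cal_tau bodies in the diff.
    virtual void cal_tau(const PsiSketch<double>& /*psi*/) {}

    int tau_calls = 0; // bookkeeping for this example only
};

// Hypothetical LCAO-like subclass that provides a real cal_tau.
class ElecStateLcaoSketch : public ElecStateSketch
{
  public:
    void cal_tau(const PsiSketch<double>& psi) override
    {
        assert(!psi.coeff.empty());
        ++tau_calls; // a real implementation would fill the kinetic energy density here
    }
};

// Mirrors the call site in the diff: cal_tau is reached only for functional types 3 and 5.
inline void psi_to_rho_sketch(ElecStateSketch& es, const PsiSketch<double>& psi, int func_type)
{
    if (func_type == 3 || func_type == 5)
    {
        es.cal_tau(psi);
    }
}
```

A no-op default keeps subclasses without a tau implementation compiling and lets the caller stay unaware of the basis-specific details, at the cost that a missing override fails silently rather than at compile time.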
