Commit 3df2804

Merge branch 'pr/1041176461/9' into develop
2 parents 8c7c0e6 + dd862ea commit 3df2804

42 files changed: +2084 -494 lines changed

docs/advanced/input_files/input-main.md

Lines changed: 14 additions & 18 deletions
Original file line number | Diff line number | Diff line change
@@ -243,8 +243,8 @@
243243
- [exx\_opt\_orb\_ecut](#exx_opt_orb_ecut)
244244
- [exx\_opt\_orb\_tolerence](#exx_opt_orb_tolerence)
245245
- [exx\_real\_number](#exx_real_number)
246-
- [exx\_symmetry\_realspace](#exx_symmetry_realspace)
247246
- [rpa\_ccp\_rmesh\_times](#rpa_ccp_rmesh_times)
247+
- [exx\_symmetry\_realspace](#exx_symmetry_realspace)
248248
- [out\_ri\_cv](#out_ri_cv)
249249
- [Molecular dynamics](#molecular-dynamics)
250250
- [md\_type](#md_type)
@@ -273,6 +273,9 @@
273273
- [lj\_epsilon](#lj_epsilon)
274274
- [lj\_sigma](#lj_sigma)
275275
- [pot\_file](#pot_file)
276+
- [dp\_rescaling](#dp_rescaling)
277+
- [dp\_fparam](#dp_fparam)
278+
- [dp\_aparam](#dp_aparam)
276279
- [msst\_direction](#msst_direction)
277280
- [msst\_vel](#msst_vel)
278281
- [msst\_vis](#msst_vis)
@@ -422,11 +425,12 @@
422425
- [nocc](#nocc)
423426
- [nvirt](#nvirt)
424427
- [lr\_nstates](#lr_nstates)
428+
- [lr\_unrestricted](#lr_unrestricted)
425429
- [abs\_wavelen\_range](#abs_wavelen_range)
426430
- [out\_wfc\_lr](#out_wfc_lr)
427431
- [abs\_broadening](#abs_broadening)
428432
- [ri\_hartree\_benchmark](#ri_hartree_benchmark)
429-
- [aims_nbasis](#aims_nbasis)
433+
- [aims\_nbasis](#aims_nbasis)
430434

431435
[back to top](#full-list-of-input-keywords)
432436
## System variables
@@ -2908,46 +2912,38 @@ These variables are used to control vdW-corrected related parameters.
29082912
- **Type**: String
29092913
- **Description**: Specifies the method used for Van der Waals (VdW) correction. Available options are:
29102914
- `d2`: [Grimme's D2](https://onlinelibrary.wiley.com/doi/abs/10.1002/jcc.20495) dispersion correction method
2911-
- `d3_0`: [Grimme's DFT-D3(0)](https://aip.scitation.org/doi/10.1063/1.3382344) dispersion correction method
2912-
- `d3_bj`: [Grimme's DFTD3(BJ)](https://onlinelibrary.wiley.com/doi/abs/10.1002/jcc.21759) dispersion correction method
2915+
- `d3_0`: [Grimme's DFT-D3(0)](https://aip.scitation.org/doi/10.1063/1.3382344) dispersion correction method (zero-damping)
2916+
- `d3_bj`: [Grimme's DFTD3(BJ)](https://onlinelibrary.wiley.com/doi/abs/10.1002/jcc.21759) dispersion correction method (BJ-damping)
29132917
- `none`: no vdW correction
29142918
- **Default**: none
2919+
- **Note**: Since version 3.8.3 (and in several earlier develop versions), ABACUS can set the DFT-D3 parameters automatically for common functionals. To use this feature, specify the parameter `dft_functional` explicitly (see [dft_functional](#dft_functional) for details); otherwise the automatic setup stops with an error message such as `cannot find DFT-D3 parameter for XC(***)`. Any value set manually for `vdw_s6`, `vdw_s8`, `vdw_a1`, or `vdw_a2` overrides the corresponding built-in value.
2920+
- **Special**: The wB97 (omega-B97) functional family is a special case. To use the functional wB97X-D3BJ, set `dft_functional` to `HYB_GGA_WB97X_V` and `vdw_method` to `d3_bj`. To use the functional wB97X-D3, set `dft_functional` to `HYB_GGA_WB97X_D3` and `vdw_method` to `d3_0`.
29152921

29162922
### vdw_s6
29172923

29182924
- **Type**: Real
29192925
- **Availability**: `vdw_method` is set to `d2`, `d3_0`, or `d3_bj`
2920-
- **Description**: This scale factor is used to optimize the interaction energy deviations in van der Waals (vdW) corrected calculations. The recommended values of this parameter are dependent on the chosen vdW correction method and the DFT functional being used. For DFT-D2, the recommended values are 0.75 (PBE), 1.2 (BLYP), 1.05 (B-P86), 1.0 (TPSS), and 1.05 (B3LYP). For DFT-D3, recommended values with different DFT functionals can be found on the [here](https://www.chemiebn.uni-bonn.de/pctc/mulliken-center/software/dft-d3/dft-d3). The default value of this parameter in ABACUS is set to be the recommended value for PBE.
2926+
- **Description**: This scale factor is used to optimize the interaction energy deviations in van der Waals (vdW) corrected calculations. The recommended values depend on the chosen vdW correction method and the DFT functional being used. For DFT-D2, the recommended values are 0.75 (PBE), 1.2 (BLYP), 1.05 (B-P86), 1.0 (TPSS), and 1.05 (B3LYP); if the parameter is not set, the PBE value is used. For DFT-D3, recommended values for different DFT functionals can be found [here](https://github.com/dftd3/simple-dftd3/blob/main/assets/parameters.toml); if the parameter is not set, ABACUS searches its built-in dataset based on the `dft_functional` keyword. A user-set value overrides the value found by the search.
29212927
- **Default**:
29222928
- 0.75: if `vdw_method` is set to `d2`
2923-
- 1.0: if `vdw_method` is set to `d3_0` or `d3_bj`
29242929

29252930
### vdw_s8
29262931

29272932
- **Type**: Real
29282933
- **Availability**: `vdw_method` is set to `d3_0` or `d3_bj`
2929-
- **Description**: This scale factor is relevant for D3(0) and D3(BJ) van der Waals (vdW) correction methods. The recommended values of this parameter with different DFT functionals can be found on the [webpage](https://www.chemiebn.uni-bonn.de/pctc/mulliken-center/software/dft-d3/dft-d3). The default value of this parameter in ABACUS is set to be the recommended value for PBE.
2930-
- **Default**:
2931-
- 0.722: if `vdw_method` is set to `d3_0`
2932-
- 0.7875: if `vdw_method` is set to `d3_bj`
2934+
- **Description**: This scale factor is relevant for the D3(0) and D3(BJ) van der Waals (vdW) correction methods. Recommended values for different DFT functionals can be found on this [webpage](https://github.com/dftd3/simple-dftd3/blob/main/assets/parameters.toml). If the parameter is not set, ABACUS searches its built-in dataset based on the `dft_functional` keyword. A user-set value overrides the value found by the search.
29332935

29342936
### vdw_a1
29352937

29362938
- **Type**: Real
29372939
- **Availability**: `vdw_method` is set to `d3_0` or `d3_bj`
2938-
- **Description**: This damping function parameter is relevant for D3(0) and D3(BJ) van der Waals (vdW) correction methods. The recommended values of this parameter with different DFT functionals can be found on the [webpage](https://www.chemiebn.uni-bonn.de/pctc/mulliken-center/software/dft-d3/dft-d3). The default value of this parameter in ABACUS is set to be the recommended value for PBE.
2939-
- **Default**:
2940-
- 1.217: if `vdw_method` is set to `d3_0`
2941-
- 0.4289: if `vdw_method` is set to `d3_bj`
2940+
- **Description**: This damping function parameter is relevant for the D3(0) and D3(BJ) van der Waals (vdW) correction methods. Recommended values for different DFT functionals can be found on this [webpage](https://github.com/dftd3/simple-dftd3/blob/main/assets/parameters.toml). If the parameter is not set, ABACUS searches its built-in dataset based on the `dft_functional` keyword. A user-set value overrides the value found by the search.
29422941

29432942
### vdw_a2
29442943

29452944
- **Type**: Real
29462945
- **Availability**: `vdw_method` is set to `d3_0` or `d3_bj`
2947-
- **Description**: This damping function parameter is only relevant for D3(0) and D3(BJ) van der Waals (vdW) correction methods. The recommended values of this parameter with different DFT functionals can be found on the [webpage](https://www.chemiebn.uni-bonn.de/pctc/mulliken-center/software/dft-d3/dft-d3). The default value of this parameter in ABACUS is set to be the recommended value for PBE.
2948-
- **Default**:
2949-
- 1.0: if `vdw_method` is set to `d3_0`
2950-
- 4.4407: if `vdw_method` is set to `d3_bj`
2946+
- **Description**: This damping function parameter is only relevant for the D3(0) and D3(BJ) van der Waals (vdW) correction methods. Recommended values for different DFT functionals can be found on this [webpage](https://github.com/dftd3/simple-dftd3/blob/main/assets/parameters.toml). If the parameter is not set, ABACUS searches its built-in dataset based on the `dft_functional` keyword. A user-set value overrides the value found by the search.
29512947

29522948
### vdw_d
29532949
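The `vdw_s6`, `vdw_s8`, `vdw_a1`, and `vdw_a2` descriptions in the documentation diff above share one rule: look the value up in a built-in dataset keyed by the functional, then let any user-set value win. The C++17 sketch below illustrates that rule only; it is not the ABACUS implementation. The names `D3Params`, `UserOverrides`, `builtin_table`, and `resolve_d3_params` are hypothetical, and the table holds just the PBE values that this diff removes as hard-coded defaults.

```cpp
#include <map>
#include <optional>
#include <stdexcept>
#include <string>
#include <utility>

// One D3 parameter set: scale factors s6, s8 and damping parameters a1, a2.
struct D3Params
{
    double s6, s8, a1, a2;
};

// Values the user may or may not have set in the input file (hypothetical type).
struct UserOverrides
{
    std::optional<double> s6, s8, a1, a2;
};

// Hypothetical stand-in for the built-in dataset, keyed by (vdw_method, functional).
// Only the former PBE defaults shown in this diff are included.
inline const std::map<std::pair<std::string, std::string>, D3Params>& builtin_table()
{
    static const std::map<std::pair<std::string, std::string>, D3Params> table = {
        {{"d3_0", "PBE"}, {1.0, 0.722, 1.217, 1.0}},
        {{"d3_bj", "PBE"}, {1.0, 0.7875, 0.4289, 4.4407}},
    };
    return table;
}

// Look up the built-in set, then apply user overrides field by field.
// Assumption: an unknown functional is an error even if the user set some values.
inline D3Params resolve_d3_params(const std::string& method,
                                  const std::string& functional,
                                  const UserOverrides& user)
{
    const auto it = builtin_table().find({method, functional});
    if (it == builtin_table().end())
    {
        throw std::runtime_error("cannot find DFT-D3 parameter for XC(" + functional + ")");
    }
    D3Params p = it->second;
    if (user.s6) { p.s6 = *user.s6; }
    if (user.s8) { p.s8 = *user.s8; }
    if (user.a1) { p.a1 = *user.a1; }
    if (user.a2) { p.a2 = *user.a2; }
    return p;
}
```

Calling `resolve_d3_params("d3_bj", "PBE", {})` returns the table entry unchanged, while setting only `s8` in `UserOverrides` replaces that one field and leaves the other three at their looked-up values.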

source/module_base/blas_connector.cpp

Lines changed: 4 additions & 4 deletions
@@ -93,7 +93,7 @@ void BlasConnector::gemm(const char transa, const char transb, const int m, cons
9393
}
9494
#ifdef __DSP
9595
else if (device_type == base_device::AbacusDevice_t::DspDevice){
96-
sgemm_mt_(&transb, &transa, &n, &m, &k,
96+
sgemm_mth_(&transb, &transa, &n, &m, &k,
9797
&alpha, b, &ldb, a, &lda,
9898
&beta, c, &ldc, GlobalV::MY_RANK);
9999
}
@@ -111,7 +111,7 @@ void BlasConnector::gemm(const char transa, const char transb, const int m, cons
111111
}
112112
#ifdef __DSP
113113
else if (device_type == base_device::AbacusDevice_t::DspDevice){
114-
dgemm_mt_(&transb, &transa, &n, &m, &k,
114+
dgemm_mth_(&transb, &transa, &n, &m, &k,
115115
&alpha, b, &ldb, a, &lda,
116116
&beta, c, &ldc, GlobalV::MY_RANK);
117117
}
@@ -129,7 +129,7 @@ void BlasConnector::gemm(const char transa, const char transb, const int m, cons
129129
}
130130
#ifdef __DSP
131131
else if (device_type == base_device::AbacusDevice_t::DspDevice) {
132-
cgemm_mt_(&transb, &transa, &n, &m, &k,
132+
cgemm_mth_(&transb, &transa, &n, &m, &k,
133133
&alpha, b, &ldb, a, &lda,
134134
&beta, c, &ldc, GlobalV::MY_RANK);
135135
}
@@ -147,7 +147,7 @@ void BlasConnector::gemm(const char transa, const char transb, const int m, cons
147147
}
148148
#ifdef __DSP
149149
else if (device_type == base_device::AbacusDevice_t::DspDevice) {
150-
zgemm_mt_(&transb, &transa, &n, &m, &k,
150+
zgemm_mth_(&transb, &transa, &n, &m, &k,
151151
&alpha, b, &ldb, a, &lda,
152152
&beta, c, &ldc, GlobalV::MY_RANK);
153153
}
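The DSP calls above pass `transb, transa`, `n, m`, and `b, a` in swapped order. This looks like the common way to drive a column-major GEMM from row-major data: a row-major matrix occupies the same memory as its column-major transpose, so computing C^T = B^T A^T in column-major storage yields row-major C = A B. The sketch below demonstrates the idea with a naive reference kernel. The functions `gemm_colmajor` and `gemm_rowmajor` are hypothetical and do not model the `*gemm_mth_` interface (no transpose flags, no alpha/beta, no rank argument).

```cpp
#include <vector>

// Naive column-major GEMM without transposition: C (m x n) = A (m x k) * B (k x n).
// Element (i, j) of a matrix with leading dimension ld lives at index i + j * ld.
inline void gemm_colmajor(int m, int n, int k,
                          const double* a, int lda,
                          const double* b, int ldb,
                          double* c, int ldc)
{
    for (int j = 0; j < n; ++j)
    {
        for (int i = 0; i < m; ++i)
        {
            double sum = 0.0;
            for (int p = 0; p < k; ++p)
            {
                sum += a[i + p * lda] * b[p + j * ldb];
            }
            c[i + j * ldc] = sum;
        }
    }
}

// Row-major C (m x n) = A (m x k) * B (k x n), computed by the column-major kernel.
// Operands and the m/n dimensions are swapped, so the kernel sees C^T = B^T * A^T.
inline void gemm_rowmajor(int m, int n, int k,
                          const double* a, const double* b, double* c)
{
    // Leading dimensions are the row lengths of the row-major matrices.
    gemm_colmajor(n, m, k, b, n, a, k, c, n);
}
```

With row-major `A = {1, 2, 3, 4, 5, 6}` (2 x 3) and `B = {7, 8, 9, 10, 11, 12}` (3 x 2), `gemm_rowmajor(2, 2, 3, ...)` fills `C` with `{58, 64, 139, 154}`, the ordinary row-major product.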

source/module_base/kernels/dsp/dsp_connector.h

Lines changed: 65 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -2,6 +2,10 @@
22
#define DSP_CONNECTOR_H
33
#ifdef __DSP
44

5+
#include "module_base/module_device/device.h"
6+
#include "module_base/module_device/memory_op.h"
7+
#include "module_hsolver/diag_comm_info.h"
8+
59
// Base dsp functions
610
void dspInitHandle(int id);
711
void dspDestoryHandle(int id);
@@ -62,5 +66,66 @@ void cgemm_mth_(const char *transa, const char *transb,
6266

6367
//#define zgemm_ zgemm_mt
6468

69+
// DSP utilities follow. They may be moved to other files if this file gets too large.
70+
71+
template <typename T>
72+
void dsp_dav_subspace_reduce(T* hcc, T* scc, int nbase, int nbase_x, int notconv, MPI_Comm diag_comm){
73+
74+
using syncmem_complex_op = base_device::memory::synchronize_memory_op<T, base_device::DEVICE_CPU, base_device::DEVICE_CPU>;
75+
76+
auto* swap = new T[notconv * nbase_x];
77+
auto* target = new T[notconv * nbase_x];
78+
syncmem_complex_op()(cpu_ctx, cpu_ctx, swap, hcc + nbase * nbase_x, notconv * nbase_x);
79+
if (base_device::get_current_precision(swap) == "single")
80+
{
81+
MPI_Reduce(swap,
82+
target,
83+
notconv * nbase_x,
84+
MPI_COMPLEX,
85+
MPI_SUM,
86+
0,
87+
diag_comm);
88+
}
89+
else
90+
{
91+
MPI_Reduce(swap,
92+
target,
93+
notconv * nbase_x,
94+
MPI_DOUBLE_COMPLEX,
95+
MPI_SUM,
96+
0,
97+
diag_comm);
98+
}
99+
100+
syncmem_complex_op()(cpu_ctx, cpu_ctx, hcc + nbase * nbase_x, target, notconv * nbase_x);
101+
syncmem_complex_op()(cpu_ctx, cpu_ctx, swap, scc + nbase * nbase_x, notconv * nbase_x);
102+
103+
if (base_device::get_current_precision(swap) == "single")
104+
{
105+
MPI_Reduce(swap,
106+
target,
107+
notconv * nbase_x,
108+
MPI_COMPLEX,
109+
MPI_SUM,
110+
0,
111+
diag_comm);
112+
}
113+
else
114+
{
115+
MPI_Reduce(swap,
116+
target,
117+
notconv * nbase_x,
118+
MPI_DOUBLE_COMPLEX,
119+
MPI_SUM,
120+
0,
121+
diag_comm);
122+
}
123+
124+
syncmem_complex_op()(cpu_ctx, cpu_ctx, scc + nbase * nbase_x, target, notconv * nbase_x);
125+
delete[] swap;
126+
delete[] target;
127+
}
128+
129+
65130
#endif
66131
#endif
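`dsp_dav_subspace_reduce` above does the same thing twice, once for `hcc` and once for `scc`: it takes the contiguous block of `notconv * nbase_x` elements starting at offset `nbase * nbase_x`, sums it across the ranks of `diag_comm` onto rank 0, and copies the result back into that block. The sketch below reproduces only this data movement in a single process, with one `std::vector` per simulated rank, so it runs without MPI. The name `reduce_block_to_root` is hypothetical, and nothing here models the DSP memory or the precision branch.

```cpp
#include <cstddef>
#include <vector>

// In-process stand-in for MPI_Reduce(..., MPI_SUM, root = 0, ...):
// ranks[r] plays the role of the hcc (or scc) buffer held by rank r.
// Only rank 0's block is overwritten, as with a reduce to root.
template <typename T>
void reduce_block_to_root(std::vector<std::vector<T>>& ranks,
                          int nbase, int nbase_x, int notconv)
{
    if (ranks.empty())
    {
        return;
    }
    const std::size_t offset = static_cast<std::size_t>(nbase) * nbase_x;
    const std::size_t count = static_cast<std::size_t>(notconv) * nbase_x;

    // Accumulate the block from every rank (this is the "target" buffer in the diff).
    std::vector<T> target(count, T{});
    for (const auto& buf : ranks)
    {
        for (std::size_t i = 0; i < count; ++i)
        {
            target[i] += buf[offset + i];
        }
    }

    // Copy the summed block back into the root's matrix.
    for (std::size_t i = 0; i < count; ++i)
    {
        ranks[0][offset + i] = target[i];
    }
}
```

Using `std::vector` for the temporary buffer releases it automatically, which is one alternative to the paired `new[]`/`delete[]` calls in the diff. The template also works with `std::complex<float>` or `std::complex<double>`, the types for which the original selects `MPI_COMPLEX` or `MPI_DOUBLE_COMPLEX`.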

source/module_base/module_device/memory_op.cpp

Lines changed: 52 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -346,5 +346,57 @@ template struct delete_memory_op<std::complex<float>, base_device::DEVICE_GPU>;
346346
template struct delete_memory_op<std::complex<double>, base_device::DEVICE_GPU>;
347347
#endif
348348

349+
#ifdef __DSP
350+
351+
template <typename FPTYPE>
352+
struct resize_memory_op_mt<FPTYPE, base_device::DEVICE_CPU>
353+
{
354+
void operator()(const base_device::DEVICE_CPU* dev, FPTYPE*& arr, const size_t size, const char* record_in)
355+
{
356+
if (arr != nullptr)
357+
{
358+
free_ht(arr);
359+
}
360+
arr = (FPTYPE*)malloc_ht(sizeof(FPTYPE) * size, GlobalV::MY_RANK);
361+
std::string record_string;
362+
if (record_in != nullptr)
363+
{
364+
record_string = record_in;
365+
}
366+
else
367+
{
368+
record_string = "no_record";
369+
}
370+
371+
if (record_string != "no_record")
372+
{
373+
ModuleBase::Memory::record(record_string, sizeof(FPTYPE) * size);
374+
}
375+
}
376+
};
377+
378+
template <typename FPTYPE>
379+
struct delete_memory_op_mt<FPTYPE, base_device::DEVICE_CPU>
380+
{
381+
void operator()(const base_device::DEVICE_CPU* dev, FPTYPE* arr)
382+
{
383+
free_ht(arr);
384+
}
385+
};
386+
387+
388+
template struct resize_memory_op_mt<int, base_device::DEVICE_CPU>;
389+
template struct resize_memory_op_mt<float, base_device::DEVICE_CPU>;
390+
template struct resize_memory_op_mt<double, base_device::DEVICE_CPU>;
391+
template struct resize_memory_op_mt<std::complex<float>, base_device::DEVICE_CPU>;
392+
template struct resize_memory_op_mt<std::complex<double>, base_device::DEVICE_CPU>;
393+
394+
template struct delete_memory_op_mt<int, base_device::DEVICE_CPU>;
395+
template struct delete_memory_op_mt<float, base_device::DEVICE_CPU>;
396+
template struct delete_memory_op_mt<double, base_device::DEVICE_CPU>;
397+
template struct delete_memory_op_mt<std::complex<float>, base_device::DEVICE_CPU>;
398+
template struct delete_memory_op_mt<std::complex<double>, base_device::DEVICE_CPU>;
399+
#endif
400+
349401
} // namespace memory
350402
} // namespace base_device
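The `resize_memory_op_mt` specialization above frees the old buffer when the pointer is non-null and then allocates a new one, so the caller must initialize the pointer to `nullptr` before the first resize. The sketch below shows that contract with plain `std::malloc`/`std::free` standing in for `malloc_ht`/`free_ht`, whose real signatures are not part of this sketch. All names (`sketch_malloc`, `sketch_free`, `resize_op_sketch`, `delete_op_sketch`) are hypothetical, and memory recording is left out.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-ins for the DSP allocator calls used in the diff.
inline void* sketch_malloc(std::size_t bytes) { return std::malloc(bytes); }
inline void sketch_free(void* p) { std::free(p); }

template <typename T>
struct resize_op_sketch
{
    // Frees any previous allocation, then allocates room for `size` elements.
    // The old contents are NOT preserved, matching the free-then-allocate order in the diff.
    void operator()(T*& arr, std::size_t size) const
    {
        if (arr != nullptr)
        {
            sketch_free(arr);
        }
        arr = static_cast<T*>(sketch_malloc(sizeof(T) * size));
    }
};

template <typename T>
struct delete_op_sketch
{
    // Unlike the diff, this variant takes the pointer by reference and resets it,
    // so a later resize cannot free the same block twice.
    void operator()(T*& arr) const
    {
        sketch_free(arr);
        arr = nullptr;
    }
};
```

A typical sequence is `double* p = nullptr; resize_op_sketch<double>()(p, 4);` followed later by `delete_op_sketch<double>()(p);`. Starting from an uninitialized pointer would make the first resize free a garbage address.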

source/module_base/module_device/memory_op.h

Lines changed: 30 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -146,6 +146,36 @@ struct delete_memory_op<FPTYPE, base_device::DEVICE_GPU>
146146
};
147147
#endif // __CUDA || __UT_USE_CUDA || __ROCM || __UT_USE_ROCM
148148

149+
#ifdef __DSP
150+
151+
template <typename FPTYPE, typename Device>
152+
struct resize_memory_op_mt
153+
{
154+
/// @brief Allocate memory for a given pointer. Note this op will free the pointer first.
155+
///
156+
/// Input Parameters
157+
/// \param dev : the type of computing device
158+
/// \param size : array size
159+
/// \param record_string : label for memory record
160+
///
161+
/// Output Parameters
162+
/// \param arr : allocated array
163+
void operator()(const Device* dev, FPTYPE*& arr, const size_t size, const char* record_in = nullptr);
164+
};
165+
166+
template <typename FPTYPE, typename Device>
167+
struct delete_memory_op_mt
168+
{
169+
/// @brief free memory for multi-device
170+
///
171+
/// Input Parameters
172+
/// \param dev : the type of computing device
173+
/// \param arr : the input array
174+
void operator()(const Device* dev, FPTYPE* arr);
175+
};
176+
177+
#endif // __DSP
178+
149179
} // end of namespace memory
150180
} // end of namespace base_device
151181

@@ -233,5 +263,4 @@ using castmem_z2c_d2h_op = base_device::memory::
233263

234264
static base_device::DEVICE_CPU* cpu_ctx = {};
235265
static base_device::DEVICE_GPU* gpu_ctx = {};
236-
237266
#endif // MODULE_DEVICE_MEMORY_H_

source/module_base/module_device/types.h

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -6,6 +6,7 @@ namespace base_device
66

77
struct DEVICE_CPU;
88
struct DEVICE_GPU;
9+
struct DEVICE_DSP;
910

1011
enum AbacusDevice_t
1112
{

source/module_esolver/esolver.cpp

Lines changed: 0 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -23,11 +23,6 @@ extern "C"
2323
namespace ModuleESolver
2424
{
2525

26-
void ESolver::printname()
27-
{
28-
std::cout << classname << std::endl;
29-
}
30-
3126
std::string determine_type()
3227
{
3328
std::string esolver_type = "none";

source/module_esolver/esolver.h

Lines changed: 0 additions & 17 deletions
Original file line numberDiff line numberDiff line change
@@ -41,23 +41,6 @@ class ESolver
4141
//! calculate stress of given cell
4242
virtual void cal_stress(ModuleBase::matrix& stress) = 0;
4343

44-
45-
// Print current classname.
46-
void printname();
47-
48-
// temporarily
49-
// get iterstep used in current scf
50-
virtual int get_niter()
51-
{
52-
return 0;
53-
}
54-
55-
// get maxniter used in current scf
56-
virtual int get_maxniter()
57-
{
58-
return 0;
59-
}
60-
6144
bool conv_esolver = true; // whether esolver is converged
6245

6346
std::string classname;

source/module_esolver/esolver_fp.cpp

Lines changed: 10 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -5,6 +5,8 @@
55
#include "module_hamilt_pw/hamilt_pwdft/global.h"
66
#include "module_io/cif_io.h"
77
#include "module_io/cube_io.h"
8+
#include "module_io/json_output/init_info.h"
9+
#include "module_io/json_output/output_info.h"
810
#include "module_io/output_log.h"
911
#include "module_io/print_info.h"
1012
#include "module_io/rhog_io.h"
@@ -260,6 +262,14 @@ void ESolver_FP::after_scf(const int istep)
260262
PARAM.inp.out_elf[1]);
261263
}
262264
}
265+
266+
// #ifdef __RAPIDJSON
267+
// // add Json of efermi energy converge
268+
// Json::add_output_efermi_converge(this->pelec->eferm.ef * ModuleBase::Ry_to_eV, this->conv_esolver);
269+
// // add nkstot,nkstot_ibz to output json
270+
// int Jnkstot = this->pelec->klist->get_nkstot();
271+
// Json::add_nkstot(Jnkstot);
272+
// #endif //__RAPIDJSON
263273
}
264274

265275
void ESolver_FP::init_after_vc(const Input_para& inp, UnitCell& cell)
