Commit 2807490

Merge branch 'develop' into psi-ngk
2 parents 5373bdc + c53f445 commit 2807490

195 files changed: +5445 -4755 lines changed

docs/advanced/input_files/input-main.md

Lines changed: 3 additions & 5 deletions
@@ -1561,10 +1561,8 @@ These variables are used to control the output of properties.
 ### out_freq_elec

 - **Type**: Integer
-- **Description**: The output frequency of the charge density (controlled by [out_chg](#out_chg)), wavefunction (controlled by [out_wfc_pw](#out_wfc_pw) or [out_wfc_r](#out_wfc_r)), and density matrix of localized orbitals (controlled by [out_dm](#out_dm)).
-- \>0: Output them every `out_freq_elec` iteration numbers in electronic iterations.
-- 0: Output them when the electronic iteration is converged or reaches the maximal iteration number.
-- **Default**: 0
+- **Description**: Output the charge density (only binary format, controlled by [out_chg](#out_chg)), wavefunction (controlled by [out_wfc_pw](#out_wfc_pw) or [out_wfc_r](#out_wfc_r)) per `out_freq_elec` electronic iterations. Note that they are always output when converged or reach the maximum iterations [scf_nmax](#scf_nmax).
+- **Default**: [scf_nmax](#scf_nmax)

 ### out_chg

@@ -2060,7 +2058,7 @@ Warning: this function is not robust enough for the current version. Please try
 - **Type**: int
 - **Availability**: numerical atomic orbital basis
 - **Description**: Include V_delta label for DeePKS training. When `deepks_out_labels` is true and `deepks_v_delta` > 0, ABACUS will output h_base.npy, v_delta.npy and h_tot.npy(h_tot=h_base+v_delta).
-Meanwhile, when `deepks_v_delta` equals 1, ABACUS will also output v_delta_precalc.npy, which is used to calculate V_delta during DeePKS training. However, when the number of atoms grows, the size of v_delta_precalc.npy will be very large. In this case, it's recommended to set `deepks_v_delta` as 2, and ABACUS will output psialpha.npy and grad_evdm.npy but not v_delta_precalc.npy. These two files are small and can be used to calculate v_delta_precalc in the procedure of training DeePKS.
+Meanwhile, when `deepks_v_delta` equals 1, ABACUS will also output v_delta_precalc.npy, which is used to calculate V_delta during DeePKS training. However, when the number of atoms grows, the size of v_delta_precalc.npy will be very large. In this case, it's recommended to set `deepks_v_delta` as 2, and ABACUS will output phialpha.npy and grad_evdm.npy but not v_delta_precalc.npy. These two files are small and can be used to calculate v_delta_precalc in the procedure of training DeePKS.
 - **Default**: 0

 ### deepks_out_unittest
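
The revised `out_freq_elec` description in this file reads as a simple predicate: write every `out_freq_elec` electronic steps, and always write on convergence or at step `scf_nmax`. A minimal sketch of that reading follows. The function name `should_output`, its parameters, and the 1-based step counter are assumptions made for illustration and are not taken from the ABACUS source.

```cpp
#include <cassert>

// Hypothetical helper (not ABACUS code): decides whether the charge density
// and wavefunction would be written at electronic step `iter` (1-based).
bool should_output(int iter, int out_freq_elec, int scf_nmax, bool converged)
{
    assert(iter >= 1 && iter <= scf_nmax);
    // The new description says output always happens at the end of the SCF
    // loop, i.e. on convergence or at the last allowed iteration.
    if (converged || iter == scf_nmax) {
        return true;
    }
    // Periodic output every `out_freq_elec` steps. Treating a non-positive
    // value as "no periodic output" is an assumption of this sketch.
    return out_freq_elec > 0 && iter % out_freq_elec == 0;
}
```

Under this reading, the new default (`out_freq_elec` equal to `scf_nmax`) produces output only once, at the end of the SCF loop, which matches the behaviour the old default `0` described.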

docs/quick_start/input.md

Lines changed: 18 additions & 17 deletions
@@ -8,17 +8,17 @@ The `INPUT` file contains parameters that control the type of calculation as wel

 Below is an example `INPUT` file with some of the most important parameters that need to be set:

-```
+```plaintext
 INPUT_PARAMETERS
 suffix MgO
 ntype 2
 pseudo_dir ./
-orbital_dir ./
-ecutwfc 100 # Rydberg
-scf_thr 1e-4 # Rydberg
-basis_type lcao
-calculation scf # this is the key parameter telling abacus to do a scf calculation
-out_chg True
+orbital_dir ./
+ecutwfc 100 # in Rydberg
+scf_thr 1e-4 # Rydberg
+basis_type lcao
+calculation scf # this is the key parameter telling abacus to do a scf calculation
+out_chg True
 ```

 The parameter list always starts with key word `INPUT_PARAMETERS`. Any content before `INPUT_PARAMETERS` will be ignored.
@@ -40,22 +40,23 @@ In the above example, the meanings of the parameters are:
 - `ntype` : how many types of elements in the unit cell
 - `pseudo_dir` : the directory where pseudopotential files are provided
 - `orbital_dir` : the directory where orbital files are provided
-- `ecutwfc` : the plane-wave energy cutoff for the wave function expansion (UNIT: Rydberg)
-- `scf_thr` : the threshold for the convergence of charge density (UNIT: Rydberg)
+- `ecutwfc` : the plane-wave energy cutoff for the wave function expansion (UNIT: Rydberg)
+- `scf_thr` : the threshold for the convergence of charge density (UNIT: Rydberg)
 - `basis_type` : the type of basis set for expanding the electronic wave functions
 - `calculation` : the type of calculation to be performed by ABACUS
-- `out_chg` : if true, output thee charge density oon real space grid
+- `out_chg` : if true, output the charge density on real space grid

 For a complete list of input parameters, please consult this [instruction](../advanced/input_files/input-main.md).

-> **Note:** Users cannot change the filename “INPUT” to other names. Boolean paramerters such as `out_chg` can be set by using `True` and `False`, `1` and `0`, or `T` and `F`. It is case insensitive so that other preferences such as `true` and `false`, `TRUE` and `FALSE`, and `t` and `f` for setting boolean values are also supported.
+> **Note:** Users cannot change the filename “INPUT” to other names. Boolean paramerters such as `out_chg` can be set by using `True` and `False`, `1` and `0`, or `T` and `F`. It is case insensitive so that other preferences such as `true` and `false`, `TRUE` and `FALSE`, and `t` and `f` for setting boolean values are also supported. Specifically for the `out_chg`, `-1` option is also available, which means turn off the checkpoint of charge density in binary (always dumped in `OUT.{suffix}`, whose name ends with `CHARGE-DENSITY.restart`). Some parameters controlling the output also support a second option to control the output precision, e.g., `out_chg True 8` will output the charge density on realspace grid with 8 digits after the decimal point.

 ## *STRU*

-The structure file contains structural information about the system, e.g., lattice constant, lattice vectors, and positions of the atoms within a unit cell. The positions can be given either in direct or Cartesian coordinates.
+The structure file contains structural information about the system, e.g., lattice constant, lattice vectors, and positions of the atoms within a unit cell. The positions can be given either in direct or Cartesian coordinates.

 An example of the `STRU` file is given as follows :
-```
+
+```plaintext
 #This is the atom file containing all the information
 #about the lattice structure.

@@ -68,7 +69,7 @@ Mg_gga_8au_100Ry_4s2p1d.orb
 O_gga_8au_100Ry_2s2p1d.orb

 LATTICE_CONSTANT
-1.8897259886 # 1.8897259886 Bohr = 1.0 Angstrom
+1.8897259886 # 1.8897259886 Bohr = 1.0 Angstrom

 LATTICE_VECTORS
 4.25648 0.00000 0.00000
@@ -100,9 +101,10 @@ For a more detailed description of STRU file, please consult [here](../advanced/
 ## *KPT*

 This file contains information of the kpoint grid setting for the Brillouin zone sampling.
-
+
 An example of the `KPT` file is given below:
-```
+
+```plaintext
 K_POINTS
 0
 Gamma
@@ -111,7 +113,6 @@ Gamma

 > **Note:** users may choose a different name for their k-point file using keyword `kpoint_file`

-
 For a more detailed description, please consult [here](../advanced/input_files/kpt.md).

 - The pseudopotential files
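
The note on boolean parameters in this file lists the accepted spellings (`True`/`False`, `1`/`0`, `T`/`F`, in any letter case). A minimal sketch of a parser that accepts exactly those spellings is given below. The function `parse_bool` is hypothetical and is not the routine ABACUS uses; it only illustrates the rule stated in the note.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical parser (not the ABACUS implementation). Returns true and
// stores the value in *out when `text` is one of the accepted spellings;
// returns false and leaves *out untouched otherwise.
bool parse_bool(std::string text, bool* out)
{
    assert(out != nullptr);
    // Lower-case the token so that "TRUE", "True" and "true" compare equal.
    std::transform(text.begin(), text.end(), text.begin(),
                   [](unsigned char ch) { return static_cast<char>(std::tolower(ch)); });
    if (text == "true" || text == "t" || text == "1") {
        *out = true;
        return true;
    }
    if (text == "false" || text == "f" || text == "0") {
        *out = false;
        return true;
    }
    return false;
}
```

The extra `-1` value that the note describes for `out_chg` is not a boolean, so this sketch rejects it; a real reader would have to handle that parameter separately, along with the optional precision field in `out_chg True 8`.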

examples/lr-tddft/lcao_H2O/INPUT

Lines changed: 3 additions & 0 deletions
@@ -6,6 +6,7 @@ orbital_dir ../../../tests/PP_ORB
 calculation scf
 nbands 23
 symmetry -1
+nspin 2

 #Parameters (2.Iteration)
 ecutwfc 60 ###Energy cutoff needs to be tested to ensure your calculation is reliable.[1]
@@ -30,6 +31,7 @@ xc_kernel lda
 lr_solver dav
 lr_thr 1e-2
 pw_diag_ndim 2
+# lr_unrestricted 1 ### use this to do TDUKS calculation for closeshell systems (openshell system will force TDUKS)

 esolver_type ks-lr
 out_alllog 1
@@ -39,6 +41,7 @@ out_alllog 1
 nvirt 19
 abs_wavelen_range 40 180
 abs_broadening 0.01
+abs_gauge length

 ### [1] Energy cutoff determines the quality of numerical quadratures in your calculations.
 ### So it is strongly recommended to test whether your result (such as converged SCF energies) is

examples/lr-tddft/lcao_Si2/INPUT

Lines changed: 4 additions & 1 deletion
@@ -5,7 +5,8 @@ pseudo_dir ../../../tests/PP_ORB
 orbital_dir ../../../tests/PP_ORB
 calculation scf
 nbands 23
-symmetry 0
+symmetry -1
+nspin 2

 #Parameters (2.Iteration)
 ecutwfc 60 ###Energy cutoff needs to be tested to ensure your calculation is reliable.[1]
@@ -37,6 +38,8 @@ out_alllog 1

 nvirt 19
 abs_wavelen_range 100 175
+abs_broadening 0.01 # in Ry
+abs_gauge velocity ### velocity gauge is recommended for periodic systems


 ### [1] Energy cutoff determines the quality of numerical quadratures in your calculations.

source/Makefile.Objects

Lines changed: 7 additions & 5 deletions
@@ -187,20 +187,21 @@ OBJS_CELL=atom_pseudo.o\
 klist.o\
 cell_index.o\
 check_atomic_stru.o\
+update_cell.o\
+bcast_cell.o\

 OBJS_DEEPKS=LCAO_deepks.o\
-deepks_fgamma.o\
-deepks_fk.o\
-LCAO_deepks_odelta.o\
+deepks_force.o\
+deepks_orbital.o\
 LCAO_deepks_io.o\
 LCAO_deepks_mpi.o\
 LCAO_deepks_pdm.o\
-LCAO_deepks_psialpha.o\
+LCAO_deepks_phialpha.o\
 LCAO_deepks_torch.o\
 LCAO_deepks_vdelta.o\
 deepks_hmat.o\
 LCAO_deepks_interface.o\
-orbital_precalc.o\
+deepks_orbpre.o\
 cal_gdmx.o\
 cal_gedm.o\
 cal_gvx.o\
@@ -731,6 +732,7 @@ OBJS_TENSOR=tensor.o\
 xc_kernel.o\
 pot_hxc_lrtd.o\
 lr_spectrum.o\
+lr_spectrum_velocity.o\
 hamilt_casida.o\
 esolver_lrtd_lcao.o\

source/module_base/blas_connector.cpp

Lines changed: 142 additions & 1 deletion
@@ -82,6 +82,7 @@ double BlasConnector::dot( const int n, const double *X, const int incX, const d
 }

 // C = a * A.? * B.? + b * C
+// Row-Major part
 void BlasConnector::gemm(const char transa, const char transb, const int m, const int n, const int k,
     const float alpha, const float *a, const int lda, const float *b, const int ldb,
     const float beta, float *c, const int ldc, base_device::AbacusDevice_t device_type)
@@ -154,6 +155,147 @@ void BlasConnector::gemm(const char transa, const char transb, const int m, cons
 #endif
 }

+// Col-Major part
+void BlasConnector::gemm_cm(const char transa, const char transb, const int m, const int n, const int k,
+    const float alpha, const float *a, const int lda, const float *b, const int ldb,
+    const float beta, float *c, const int ldc, base_device::AbacusDevice_t device_type)
+{
+    if (device_type == base_device::AbacusDevice_t::CpuDevice) {
+        sgemm_(&transa, &transb, &m, &n, &k,
+            &alpha, a, &lda, b, &ldb,
+            &beta, c, &ldc);
+    }
+#ifdef __DSP
+    else if (device_type == base_device::AbacusDevice_t::DspDevice){
+        sgemm_mth_(&transb, &transa, &m, &n, &k,
+            &alpha, a, &lda, b, &ldb,
+            &beta, c, &ldc, GlobalV::MY_RANK);
+    }
+#endif
+}
+
+void BlasConnector::gemm_cm(const char transa, const char transb, const int m, const int n, const int k,
+    const double alpha, const double *a, const int lda, const double *b, const int ldb,
+    const double beta, double *c, const int ldc, base_device::AbacusDevice_t device_type)
+{
+    if (device_type == base_device::AbacusDevice_t::CpuDevice) {
+        dgemm_(&transa, &transb, &m, &n, &k,
+            &alpha, a, &lda, b, &ldb,
+            &beta, c, &ldc);
+    }
+#ifdef __DSP
+    else if (device_type == base_device::AbacusDevice_t::DspDevice){
+        dgemm_mth_(&transa, &transb, &m, &n, &k,
+            &alpha, a, &lda, b, &ldb,
+            &beta, c, &ldc, GlobalV::MY_RANK);
+    }
+#endif
+}
+
+void BlasConnector::gemm_cm(const char transa, const char transb, const int m, const int n, const int k,
+    const std::complex<float> alpha, const std::complex<float> *a, const int lda, const std::complex<float> *b, const int ldb,
+    const std::complex<float> beta, std::complex<float> *c, const int ldc, base_device::AbacusDevice_t device_type)
+{
+    if (device_type == base_device::AbacusDevice_t::CpuDevice) {
+        cgemm_(&transa, &transb, &m, &n, &k,
+            &alpha, a, &lda, b, &ldb,
+            &beta, c, &ldc);
+    }
+#ifdef __DSP
+    else if (device_type == base_device::AbacusDevice_t::DspDevice) {
+        cgemm_mth_(&transa, &transb, &m, &n, &k,
+            &alpha, a, &lda, b, &ldb,
+            &beta, c, &ldc, GlobalV::MY_RANK);
+    }
+#endif
+}
+
+void BlasConnector::gemm_cm(const char transa, const char transb, const int m, const int n, const int k,
+    const std::complex<double> alpha, const std::complex<double> *a, const int lda, const std::complex<double> *b, const int ldb,
+    const std::complex<double> beta, std::complex<double> *c, const int ldc, base_device::AbacusDevice_t device_type)
+{
+    if (device_type == base_device::AbacusDevice_t::CpuDevice) {
+        zgemm_(&transa, &transb, &m, &n, &k,
+            &alpha, a, &lda, b, &ldb,
+            &beta, c, &ldc);
+    }
+#ifdef __DSP
+    else if (device_type == base_device::AbacusDevice_t::DspDevice) {
+        zgemm_mth_(&transa, &transb, &m, &n, &k,
+            &alpha, a, &lda, b, &ldb,
+            &beta, c, &ldc, GlobalV::MY_RANK);
+    }
+#endif
+}
+
+// Symm and Hemm part. Only col-major is supported.
+
+void BlasConnector::symm_cm(const char side, const char uplo, const int m, const int n,
+    const float alpha, const float *a, const int lda, const float *b, const int ldb,
+    const float beta, float *c, const int ldc, base_device::AbacusDevice_t device_type)
+{
+    if (device_type == base_device::AbacusDevice_t::CpuDevice) {
+        ssymm_(&side, &uplo, &m, &n,
+            &alpha, a, &lda, b, &ldb,
+            &beta, c, &ldc);
+    }
+}
+
+void BlasConnector::symm_cm(const char side, const char uplo, const int m, const int n,
+    const double alpha, const double *a, const int lda, const double *b, const int ldb,
+    const double beta, double *c, const int ldc, base_device::AbacusDevice_t device_type)
+{
+    if (device_type == base_device::AbacusDevice_t::CpuDevice) {
+        dsymm_(&side, &uplo, &m, &n,
+            &alpha, a, &lda, b, &ldb,
+            &beta, c, &ldc);
+    }
+}
+
+void BlasConnector::symm_cm(const char side, const char uplo, const int m, const int n,
+    const std::complex<float> alpha, const std::complex<float> *a, const int lda, const std::complex<float> *b, const int ldb,
+    const std::complex<float> beta, std::complex<float> *c, const int ldc, base_device::AbacusDevice_t device_type)
+{
+    if (device_type == base_device::AbacusDevice_t::CpuDevice) {
+        csymm_(&side, &uplo, &m, &n,
+            &alpha, a, &lda, b, &ldb,
+            &beta, c, &ldc);
+    }
+}
+
+void BlasConnector::symm_cm(const char side, const char uplo, const int m, const int n,
+    const std::complex<double> alpha, const std::complex<double> *a, const int lda, const std::complex<double> *b, const int ldb,
+    const std::complex<double> beta, std::complex<double> *c, const int ldc, base_device::AbacusDevice_t device_type)
+{
+    if (device_type == base_device::AbacusDevice_t::CpuDevice) {
+        zsymm_(&side, &uplo, &m, &n,
+            &alpha, a, &lda, b, &ldb,
+            &beta, c, &ldc);
+    }
+}
+
+void BlasConnector::hemm_cm(char side, char uplo, int m, int n,
+    std::complex<float> alpha, std::complex<float> *a, int lda, std::complex<float> *b, int ldb,
+    std::complex<float> beta, std::complex<float> *c, int ldc, base_device::AbacusDevice_t device_type)
+{
+    if (device_type == base_device::AbacusDevice_t::CpuDevice) {
+        chemm_(&side, &uplo, &m, &n,
+            &alpha, a, &lda, b, &ldb,
+            &beta, c, &ldc);
+    }
+}
+
+void BlasConnector::hemm_cm(char side, char uplo, int m, int n,
+    std::complex<double> alpha, std::complex<double> *a, int lda, std::complex<double> *b, int ldb,
+    std::complex<double> beta, std::complex<double> *c, int ldc, base_device::AbacusDevice_t device_type)
+{
+    if (device_type == base_device::AbacusDevice_t::CpuDevice) {
+        zhemm_(&side, &uplo, &m, &n,
+            &alpha, a, &lda, b, &ldb,
+            &beta, c, &ldc);
+    }
+}
+
 void BlasConnector::gemv(const char trans, const int m, const int n,
     const float alpha, const float* A, const int lda, const float* X, const int incx,
     const float beta, float* Y, const int incy, base_device::AbacusDevice_t device_type)
@@ -190,7 +332,6 @@ void BlasConnector::gemv(const char trans, const int m, const int n,
     }
 }

-
 // out = ||x||_2
 float BlasConnector::nrm2( const int n, const float *X, const int incX, base_device::AbacusDevice_t device_type )
 {
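
The "Row-Major part" and "Col-Major part" comments added in this file separate two calling conventions for the same Fortran-style BLAS, which is column-major. A common way to serve row-major callers is to use the identity C^T = B^T A^T: a row-major m x n array has the same memory layout as a column-major n x m array, so swapping the operands and the m/n dimensions gives the row-major product. The sketch below shows this with a naive reference loop in place of a real BLAS call. Both function names are hypothetical, transposes are left out for brevity, and the hunks shown here do not reveal whether `BlasConnector::gemm` is implemented this way.

```cpp
#include <cassert>

// Reference column-major GEMM without transposes (illustrative only):
// C = alpha * A * B + beta * C, with A (m x k), B (k x n), C (m x n) and
// element (i, j) of a matrix stored at index i + j * ld.
void gemm_cm_ref(int m, int n, int k, double alpha,
                 const double* a, int lda, const double* b, int ldb,
                 double beta, double* c, int ldc)
{
    assert(m >= 0 && n >= 0 && k >= 0);
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < m; ++i) {
            double sum = 0.0;
            for (int p = 0; p < k; ++p) {
                sum += a[i + p * lda] * b[p + j * ldb];
            }
            c[i + j * ldc] = alpha * sum + beta * c[i + j * ldc];
        }
    }
}

// Row-major GEMM built on the column-major routine. The row-major buffers of
// A, B and C are read as the column-major matrices A^T, B^T and C^T, so the
// call computes C^T = B^T * A^T by swapping the operands and m with n.
void gemm_rm_ref(int m, int n, int k, double alpha,
                 const double* a, int lda, const double* b, int ldb,
                 double beta, double* c, int ldc)
{
    gemm_cm_ref(n, m, k, alpha, b, ldb, a, lda, beta, c, ldc);
}
```

Calling both functions on the same 2 x 2 buffers gives different results, because each one interprets the memory under its own layout. This is why the commit exposes `gemm` and `gemm_cm` as separate entry points rather than a single one.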
