Skip to content

Commit 917d689

Browse files
authored
Merge branch 'deepmodeling:develop' into neighbor_atom
2 parents 6ee0913 + f084d42 commit 917d689

File tree

213 files changed

+2421
-3094
lines changed

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

213 files changed

+2421
-3094
lines changed

CMakeLists.txt

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -709,6 +709,7 @@ target_link_libraries(
709709
hamilt_stodft
710710
psi
711711
psi_initializer
712+
psi_overall_init
712713
esolver
713714
vdw
714715
device

docs/advanced/input_files/input-main.md

Lines changed: 4 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -989,7 +989,7 @@ calculations.
989989

990990
- **Type**: String
991991
- **Description**: In our package, the XC functional can either be set explicitly using the `dft_functional` keyword in `INPUT` file. If `dft_functional` is not specified, ABACUS will use the xc functional indicated in the pseudopotential file.
992-
On the other hand, if dft_functional is specified, it will overwrite the functional from pseudopotentials and performs calculation with whichever functional the user prefers. We further offer two ways of supplying exchange-correlation functional. The first is using 'short-hand' names such as 'LDA', 'PBE', 'SCAN'. A complete list of 'short-hand' expressions can be found in [the source code](../../../source/module_hamilt_general/module_xc/xc_functional.cpp). The other way is only available when ***compiling with LIBXC***, and it allows for supplying exchange-correlation functionals as combinations of LIBXC keywords for functional components, joined by a plus sign, for example, 'dft_functional='LDA_X_1D_EXPONENTIAL+LDA_C_1D_CSC'. The list of LIBXC keywords can be found on its [website](https://www.tddft.org/programs/libxc/functionals/). In this way, **we support all the LDA,GGA and mGGA functionals provided by LIBXC**.
992+
On the other hand, if dft_functional is specified, it will overwrite the functional from pseudopotentials and performs calculation with whichever functional the user prefers. We further offer two ways of supplying exchange-correlation functional. The first is using 'short-hand' names such as 'LDA', 'PBE', 'SCAN'. A complete list of 'short-hand' expressions can be found in [the source code](../../../source/module_hamilt_general/module_xc/xc_functional.cpp). The other way is only available when ***compiling with LIBXC***, and it allows for supplying exchange-correlation functionals as combinations of LIBXC keywords for functional components, joined by a plus sign, for example, dft_functional='LDA_X_1D_EXPONENTIAL+LDA_C_1D_CSC'. The list of LIBXC keywords can be found on its [website](https://libxc.gitlab.io/functionals/). In this way, **we support all the LDA,GGA and mGGA functionals provided by LIBXC**.
993993

994994
Furthermore, the old INPUT parameter exx_hybrid_type for hybrid functionals has been absorbed into dft_functional. Options are `hf` (pure Hartree-Fock), `pbe0`(PBE0), `hse` (Note: in order to use HSE functional, LIBXC is required). Note also that HSE has been tested while PBE0 has NOT been fully tested yet, and the maximum CPU cores for running exx in parallel is $N(N+1)/2$, with N being the number of atoms. And forces for hybrid functionals are not supported yet.
995995

@@ -1389,7 +1389,7 @@ These variables are used to control the geometry relaxation.
13891389
### relax_nmax
13901390

13911391
- **Type**: Integer
1392-
- **Description**: The maximal number of ionic iteration steps, the minimum value is 1.
1392+
- **Description**: The maximal number of ionic iteration steps. If set to 0, the code performs a quick "dry run", stopping just after initialization. This is useful to check for input correctness and to have the summary printed.
13931393
- **Default**: 1 for SCF, 50 for relax and cell-relax calcualtions
13941394

13951395
### relax_cg_thr
@@ -1760,14 +1760,14 @@ The band (KS orbital) energy for each (k-point, spin, band) will be printed in t
17601760

17611761
- **Type**: Boolean
17621762
- **Availability**: Numerical atomic orbital basis
1763-
- **Description**: Whether to print Hamiltonian matrices H(R)/density matrics DM(R) in npz format. This feature does not work for gamma-only calculations. Currently only intended for internal usage.
1763+
- **Description**: Whether to print Hamiltonian matrices $H(R)$/density matrics $DM(R)$ in npz format. This feature does not work for gamma-only calculations. Currently only intended for internal usage.
17641764
- **Default**: False
17651765

17661766
### dm_to_rho
17671767

17681768
- **Type**: Boolean
17691769
- **Availability**: Numerical atomic orbital basis
1770-
- **Description**: Reads density matrix DM(R) in npz format and creates electron density on grids. This feature does not work for gamma-only calculations. Only supports serial calculations. Currently only intended for internal usage.
1770+
- **Description**: Reads density matrix $DM(R)$ in npz format and creates electron density on grids. This feature does not work for gamma-only calculations. Only supports serial calculations. Currently only intended for internal usage.
17711771
- **Default**: False
17721772

17731773
### out_app_flag

source/Makefile.Objects

Lines changed: 3 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -226,6 +226,7 @@ OBJS_ELECSTAT=elecstate.o\
226226
H_Hartree_pw.o\
227227
H_TDDFT_pw.o\
228228
pot_xc.o\
229+
cal_ux.o\
229230

230231
OBJS_ELECSTAT_LCAO=elecstate_lcao.o\
231232
elecstate_lcao_cal_tau.o\
@@ -245,12 +246,10 @@ OBJS_ESOLVER=esolver.o\
245246
esolver_of.o\
246247
esolver_of_tool.o\
247248
esolver_of_interface.o\
248-
pw_init_globalc.o\
249249
pw_others.o\
250250

251251
OBJS_ESOLVER_LCAO=esolver_ks_lcao.o\
252252
esolver_ks_lcao_tddft.o\
253-
dpks_cal_e_delta_band.o\
254253
lcao_before_scf.o\
255254
esolver_gets.o\
256255
lcao_others.o\
@@ -403,9 +402,7 @@ OBJS_PSI_INITIALIZER=psi_initializer.o\
403402
psi_initializer_nao.o\
404403
psi_initializer_nao_random.o\
405404

406-
OBJS_PW=fft.o\
407-
fft_bundle.o\
408-
fft_base.o\
405+
OBJS_PW=fft_bundle.o\
409406
fft_cpu.o\
410407
pw_basis.o\
411408
pw_basis_k.o\
@@ -668,7 +665,7 @@ OBJS_SRCPW=H_Ewald_pw.o\
668665
symmetry_rhog.o\
669666
wavefunc.o\
670667
wf_atomic.o\
671-
wfinit.o\
668+
psi_init.o\
672669
elecond.o\
673670
sto_tool.o\
674671
sto_elecond.o\

source/driver.cpp

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -183,7 +183,7 @@ void Driver::atomic_world()
183183
//--------------------------------------------------
184184

185185
// where the actual stuff is done
186-
this->driver_run();
186+
this->driver_run(GlobalC::ucell);
187187

188188
ModuleBase::timer::finish(GlobalV::ofs_running);
189189
ModuleBase::Memory::print_all(GlobalV::ofs_running);

source/driver.h

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,8 @@
11
#ifndef DRIVER_H
22
#define DRIVER_H
33

4+
#include "module_cell/unitcell.h"
5+
46
class Driver
57
{
68
public:
@@ -34,7 +36,7 @@ class Driver
3436
void atomic_world();
3537

3638
// the actual calculations
37-
void driver_run();
39+
void driver_run(UnitCell& ucell);
3840
};
3941

4042
#endif

source/driver_run.cpp

Lines changed: 12 additions & 13 deletions
Original file line numberDiff line numberDiff line change
@@ -24,7 +24,8 @@
2424
* the configuration-changing subroutine takes force and stress and updates the
2525
* configuration
2626
*/
27-
void Driver::driver_run() {
27+
void Driver::driver_run(UnitCell& ucell)
28+
{
2829
ModuleBase::TITLE("Driver", "driver_line");
2930
ModuleBase::timer::tick("Driver", "driver_line");
3031

@@ -39,37 +40,35 @@ void Driver::driver_run() {
3940
#endif
4041

4142
// the life of ucell should begin here, mohan 2024-05-12
42-
// delete ucell as a GlobalC in near future
43-
GlobalC::ucell.setup_cell(PARAM.globalv.global_in_stru, GlobalV::ofs_running);
44-
Check_Atomic_Stru::check_atomic_stru(GlobalC::ucell,
45-
PARAM.inp.min_dist_coef);
43+
ucell.setup_cell(PARAM.globalv.global_in_stru, GlobalV::ofs_running);
44+
Check_Atomic_Stru::check_atomic_stru(ucell, PARAM.inp.min_dist_coef);
4645

4746
//! 2: initialize the ESolver (depends on a set-up ucell after `setup_cell`)
48-
ModuleESolver::ESolver* p_esolver = ModuleESolver::init_esolver(PARAM.inp, GlobalC::ucell);
47+
ModuleESolver::ESolver* p_esolver = ModuleESolver::init_esolver(PARAM.inp, ucell);
4948

5049
//! 3: initialize Esolver and fill json-structure
51-
p_esolver->before_all_runners(PARAM.inp, GlobalC::ucell);
50+
p_esolver->before_all_runners(ucell, PARAM.inp);
5251

5352
// this Json part should be moved to before_all_runners, mohan 2024-05-12
5453
#ifdef __RAPIDJSON
55-
Json::gen_stru_wrapper(&GlobalC::ucell);
54+
Json::gen_stru_wrapper(&ucell);
5655
#endif
5756

5857
const std::string cal_type = PARAM.inp.calculation;
5958

6059
//! 4: different types of calculations
6160
if (cal_type == "md")
6261
{
63-
Run_MD::md_line(GlobalC::ucell, p_esolver, PARAM);
62+
Run_MD::md_line(ucell, p_esolver, PARAM);
6463
}
6564
else if (cal_type == "scf" || cal_type == "relax" || cal_type == "cell-relax" || cal_type == "nscf")
6665
{
6766
Relax_Driver rl_driver;
68-
rl_driver.relax_driver(p_esolver);
67+
rl_driver.relax_driver(p_esolver, ucell);
6968
}
7069
else if (cal_type == "get_S")
7170
{
72-
p_esolver->runner(0, GlobalC::ucell);
71+
p_esolver->runner(ucell, 0);
7372
}
7473
else
7574
{
@@ -79,11 +78,11 @@ void Driver::driver_run() {
7978
//! test_neighbour(LCAO),
8079
//! gen_bessel(PW), et al.
8180
const int istep = 0;
82-
p_esolver->others(istep);
81+
p_esolver->others(ucell, istep);
8382
}
8483

8584
//! 5: clean up esolver
86-
p_esolver->after_all_runners();
85+
p_esolver->after_all_runners(ucell);
8786

8887
ModuleESolver::clean_esolver(p_esolver);
8988

source/module_base/blas_connector.h

Lines changed: 15 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -39,6 +39,20 @@ extern "C"
3939
double dnrm2_( const int *n, const double *X, const int *incX );
4040
double dznrm2_( const int *n, const std::complex<double> *X, const int *incX );
4141

42+
// symmetric rank-k update
43+
void dsyrk_(
44+
const char* uplo,
45+
const char* trans,
46+
const int* n,
47+
const int* k,
48+
const double* alpha,
49+
const double* a,
50+
const int* lda,
51+
const double* beta,
52+
double* c,
53+
const int* ldc
54+
);
55+
4256
// level 2: matrix-std::vector operations, O(n^2) data and O(n^2) work.
4357
void sgemv_(const char*const transa, const int*const m, const int*const n,
4458
const float*const alpha, const float*const a, const int*const lda, const float*const x, const int*const incx,
@@ -267,4 +281,4 @@ void zgemv_i(const char *trans,
267281
*/
268282

269283
#endif // GATHER_INFO
270-
#endif // BLAS_CONNECTOR_H
284+
#endif // BLAS_CONNECTOR_H

source/module_base/grid/batch.cpp

Lines changed: 123 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,123 @@
1+
#include "module_base/grid/batch.h"
2+
3+
#include <algorithm>
4+
#include <cassert>
5+
#include <iterator>
6+
7+
#include "module_base/blas_connector.h"
8+
#include "module_base/lapack_connector.h"
9+
10+
namespace {
11+
12+
/**
13+
* @brief Divide a set of points into two subsets by the "MaxMin" algorithm.
14+
*
15+
* This function divides a given set of grid points by a cut plane
16+
* {x|n^T*(x-c) = 0} where the normal vector n and the point c are
17+
* determined by the "MaxMin" problem:
18+
*
19+
* max min sum_{i=1}^{m} [n^T * (r[idx[i]] - c)]^2
20+
* n c
21+
*
22+
* here r[j] = (grid[3*j], grid[3*j+1], grid[3*j+2]) is the position of
23+
* the j-th point.
24+
*
25+
* It can be shown that the optimal c is the centroid of the points, and
26+
* the optimal n is the eigenvector corresponding to the largest eigenvalue
27+
* of the matrix R*R^T, where the i-th column of R is r[idx[i]] - c.
28+
*
29+
* @param[in] grid Coordinates of all grid points.
30+
* grid[3*j], grid[3*j+1], grid[3*j+2] are the
31+
* x, y, z coordinates of the j-th point.
32+
* @param[in,out] idx Indices of the selected points within grid.
33+
* On return, idx will be rearranged such that
34+
* points belonging to the same subset have their
35+
* indices placed together.
36+
* @param[in] m Number of selected points (length of idx).
37+
*
38+
* @return The number of points in the first subset within idx.
39+
*
40+
*/
41+
int _maxmin_divide(const double* grid, int* idx, int m) {
42+
assert(m > 1);
43+
if (m == 2) {
44+
return 1;
45+
}
46+
47+
std::vector<double> centroid(3, 0.0);
48+
for (int i = 0; i < m; ++i) {
49+
int j = idx[i];
50+
centroid[0] += grid[3*j ];
51+
centroid[1] += grid[3*j + 1];
52+
centroid[2] += grid[3*j + 2];
53+
}
54+
centroid[0] /= m;
55+
centroid[1] /= m;
56+
centroid[2] /= m;
57+
58+
// positions w.r.t. the centroid
59+
std::vector<double> R(3*m, 0.0);
60+
for (int i = 0; i < m; ++i) {
61+
int j = idx[i];
62+
R[3*i ] = grid[3*j ] - centroid[0];
63+
R[3*i + 1] = grid[3*j + 1] - centroid[1];
64+
R[3*i + 2] = grid[3*j + 2] - centroid[2];
65+
}
66+
67+
// The normal vector of the cut plane is taken to be the eigenvector
68+
// corresponding to the largest eigenvalue of the 3x3 matrix A = R*R^T.
69+
std::vector<double> A(9, 0.0);
70+
int i3 = 3, i1 = 1;
71+
double d0 = 0.0, d1 = 1.0;
72+
dsyrk_("U", "N", &i3, &m, &d1, R.data(), &i3, &d0, A.data(), &i3);
73+
74+
int info = 0, lwork = 102 /* determined by a work space query */;
75+
std::vector<double> e(3), work(lwork);
76+
dsyev_("V", "U", &i3, A.data(), &i3, e.data(), work.data(), &lwork, &info);
77+
double* n = A.data() + 6; // normal vector of the cut plane
78+
79+
// Rearrange the indices to put points in each subset together by
80+
// examining the signed distances of points to the cut plane (R^T*n).
81+
std::vector<double> dist(m);
82+
dgemv_("T", &i3, &m, &d1, R.data(), &i3, n, &i1, &d0, dist.data(), &i1);
83+
84+
int *head = idx;
85+
std::reverse_iterator<int*> tail(idx + m), rend(idx);
86+
auto is_negative = [&dist, &idx](int& j) { return dist[&j - idx] < 0; };
87+
while ( ( head = std::find_if(head, idx + m, is_negative) ) <
88+
( tail = std::find_if_not(tail, rend, is_negative) ).base() ) {
89+
std::swap(*head, *tail);
90+
std::swap(dist[head - idx], dist[tail.base() - idx - 1]);
91+
++head;
92+
++tail;
93+
}
94+
95+
return head - idx;
96+
}
97+
98+
} // end of anonymous namespace
99+
100+
101+
std::vector<int> Grid::Batch::maxmin(
102+
const double* grid,
103+
int* idx,
104+
int m,
105+
int m_thr
106+
) {
107+
if (m <= m_thr) {
108+
return std::vector<int>{0};
109+
}
110+
111+
int m_left = _maxmin_divide(grid, idx, m);
112+
113+
std::vector<int> left = maxmin(grid, idx, m_left, m_thr);
114+
std::vector<int> right = maxmin(grid, idx + m_left, m - m_left, m_thr);
115+
std::for_each(right.begin(), right.end(),
116+
[m_left](int& x) { x += m_left; }
117+
);
118+
119+
left.insert(left.end(), right.begin(), right.end());
120+
return left;
121+
}
122+
123+

source/module_base/grid/batch.h

Lines changed: 54 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,54 @@
1+
#ifndef GRID_BATCH_H
2+
#define GRID_BATCH_H
3+
4+
#include <vector>
5+
6+
namespace Grid {
7+
namespace Batch {
8+
9+
/**
10+
* @brief Divide a set of points into batches by the "MaxMin" algorithm.
11+
*
12+
* This function recursively uses cut planes to divide grid points into
13+
* two subsets using the "MaxMin" algorithm, until the number of points
14+
* in each subset (batch) is no more than m_thr.
15+
*
16+
* @param[in] grid Coordinates of all grid points.
17+
* grid[3*j], grid[3*j+1], grid[3*j+2] are
18+
* the x, y, z coordinates of the j-th point.
19+
* @param[in,out] idx Indices of the initial set within grid.
20+
* On return, idx will be rearranged such
21+
* that points belonging to the same batch
22+
* have their indices placed together.
23+
* @param[in] m Number of points in the initial set.
24+
* (length of idx)
25+
* @param[in] m_thr Size limit of a batch.
26+
*
27+
* @return Indices (for idx) of the first point in each batch.
28+
*
29+
* For example, given grid (illustrated by their indices) located as follows:
30+
*
31+
* 0 1 16 2 3 18
32+
* 4 5 17 6 7
33+
*
34+
*
35+
* 8 9 20 10 11
36+
* 12 13 19 14 15
37+
*
38+
* a possible outcome with m_thr = 4 and idx(in) = {0, 1, 2, ..., 15}
39+
* (idx may correspond to a subset of grid and does not have to be sorted,
40+
* but it must not contain duplicates) is:
41+
*
42+
* idx(out): 0, 1, 4, 5, 8, 9, 12, 13, 2, 3, 6, 7, 10, 11, 14, 15
43+
* return : {0, 4, 8, 12}
44+
*
45+
* which means the selected set (labeled 0-15) is divided into 4 batches:
46+
* {0, 1, 4, 5}, {8, 9, 12, 13}, {2, 3, 6, 7}, {10, 11, 14, 15}.
47+
*
48+
*/
49+
std::vector<int> maxmin(const double* grid, int* idx, int m, int m_thr);
50+
51+
} // end of namespace Batch
52+
} // end of namespace Grid
53+
54+
#endif

source/module_base/grid/test/CMakeLists.txt

Lines changed: 7 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -20,3 +20,10 @@ AddTest(
2020
../radial.cpp
2121
../delley.cpp
2222
)
23+
24+
AddTest(
25+
TARGET test_batch
26+
SOURCES test_batch.cpp
27+
../batch.cpp
28+
LIBS ${math_libs}
29+
)

0 commit comments

Comments
 (0)