**File: `docs/advanced/acceleration/cuda.md`**
## Run with GPU support by editing the INPUT script
In the `INPUT` file, set the input parameter [device](../input_files/input-main.md#device) to `gpu`. If this parameter is not set, ABACUS will try to determine automatically whether GPUs are available.
- Set `ks_solver`: For the PW basis, the CG, BPCG and Davidson methods are supported on GPU; set the input parameter [ks_solver](../input_files/input-main.md#ks_solver) to `cg`, `bpcg` or `dav`. For the LCAO basis, `cusolver`, `cusolvermp` and `elpa` are supported on GPU.
- **multi-card**: ABACUS allows for multi-GPU acceleration. If you have multiple GPU cards, you can run ABACUS with several MPI processes, and each process will utilize one GPU card. For example, the command `mpirun -n 2 abacus` will by default launch two GPUs for the computation. If you only have one card, this command will start only one GPU.
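For example, a minimal sketch of an `INPUT` fragment that enables GPU acceleration for a PW calculation (the `basis_type` choice is illustrative; `cg` or `dav` would work as well):

```
INPUT_PARAMETERS
# run on GPU; if unset, ABACUS tries to detect available GPUs
device      gpu
basis_type  pw
# BPCG typically parallelizes best in a GPU environment
ks_solver   bpcg
```

Launching this with `mpirun -n 2 abacus` would then use two GPU cards, one per MPI process.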
---

The results are shown as follows:

```
P = 0.8906925 (mod 2.1748536) ( 0.0000000, 0.0000000, 0.8906925) C/m^2
```
The electric polarization **P** is multivalued, defined modulo a quantum e**R**/V~cell~.
Note: The vectors **R**1, **R**2, and **R**3 refer to the three lattice vectors of the unit cell. When `gdir = 3`, the calculated polarization is along the **R**3 direction. The three values in parentheses are the re-projection of the polarization along the **R**3 direction onto the Cartesian (xyz) coordinate system. To obtain the full polarization components in the Cartesian system, calculate the polarization for **R**1, **R**2, and **R**3 separately, and then sum their respective x, y, and z components.
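In other words, restating the standard Berry-phase convention described above, any reported value is physically equivalent to

$$
\mathbf{P} = \mathbf{P}_{\text{reported}} + n\,\frac{e\mathbf{R}}{V_{\text{cell}}}, \qquad n \in \mathbb{Z},
$$

so in the output above the polarization along the chosen lattice vector is defined only up to multiples of the quantum 2.1748536 C/m^2.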
**File: `docs/advanced/input_files/input-main.md`**
- [pw\_diag\_thr](#pw_diag_thr)
- [pw\_diag\_nmax](#pw_diag_nmax)
- [pw\_diag\_ndim](#pw_diag_ndim)
- [diago\_full\_acc](#diago_full_acc)
- [erf\_ecut](#erf_ecut)
- [fft\_mode](#fft_mode)
- [erf\_height](#erf_height)
### pw_diag_ndim
- **Description**: Only useful when you use `ks_solver = dav` or `ks_solver = dav_subspace`. It indicates the dimension of the workspace (the number of wavefunction packets; at least 2 are needed) for the Davidson method. A larger value may reduce the number of iterations in the algorithm, but uses more memory and more CPU time for the subspace diagonalization.
- **Default**: 4
### diago_full_acc
- **Type**: bool
- **Description**: Only useful when you use `ks_solver = dav_subspace`. If `TRUE`, all the empty states are diagonalized at the same level of accuracy as the occupied ones. Otherwise, the empty states are diagonalized using a larger threshold (10^-5); this should not affect the total energy, forces, and other ground-state properties.
- **Default**: false
### erf_ecut
- **Type**: Real
### ks_solver

For the plane-wave basis,
- **cg**: the conjugate gradient (CG) method.
- **bpcg**: the block-parallel conjugate gradient (BPCG) method, which typically exhibits higher acceleration in a GPU environment.
- **dav**: the Davidson algorithm.
- **dav_subspace**: the Davidson algorithm without the orthogonalization operation; this method is the most recommended for efficiency. `pw_diag_ndim` can be set to 2 for this method.
For the atomic orbital basis,
- **lapack**: This method is only available for the serial version. For the parallel version, please use **scalapack_gvx**.
- **genelpa**: This method should be used if you choose localized orbitals.
- **scalapack_gvx**: ScaLAPACK can also be used for localized orbitals.
- **cusolver**: This method requires building with CUDA, and at least one GPU must be available.
- **cusolvermp**: This method supports multi-GPU acceleration and requires building with CUDA. Note that when using `cusolvermp`, you should set the number of MPI processes equal to the number of GPUs.
- **elpa**: The ELPA solver supports both CPU and GPU. By setting `device` to `gpu`, you can launch the ELPA solver with GPU acceleration (provided that you have installed a GPU-supported version of ELPA, which requires you to compile and install ELPA manually, and ABACUS has been compiled with `-DUSE_ELPA=ON` and `-DUSE_CUDA=ON`). The ELPA solver also supports multi-GPU acceleration.
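For reference, a sketch of the CMake configuration implied by the note above (any additional options, such as the CUDA compiler path, are omitted here):

```
cmake -B build -DUSE_ELPA=ON -DUSE_CUDA=ON
```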
If you set `ks_solver=genelpa` for `basis_type=pw`, the program will stop with an error message. The user then has to correct the input file and restart the calculation.
- **Default**:
  - **PW basis**: cg.
  - **LCAO basis**:
    - genelpa (if the compile option `USE_ELPA` has been set)
    - lapack (if the compile option `ENABLE_MPI` has not been set)
    - scalapack_gvx (if `USE_ELPA` has not been set and `ENABLE_MPI` has been set)
    - cusolver (if the compile option `USE_CUDA` has been set)
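For instance, a minimal sketch of an `INPUT` fragment selecting a GPU solver for the LCAO basis (assuming ABACUS was built with `-DUSE_CUDA=ON`):

```
INPUT_PARAMETERS
basis_type  lcao
device      gpu
# or elpa, if a GPU-enabled ELPA build is installed
ks_solver   cusolver
```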
**File: `docs/advanced/input_files/kpt.md`**
ABACUS uses periodic boundary conditions for both crystals and finite systems.
## Gamma-only Calculations
In ABACUS, we offer the option of running gamma-only calculations for the LCAO basis by setting [gamma_only](./input-main.md#gamma_only) to 1. Due to implementation details, a gamma-only calculation will be slightly faster than running a non gamma-only calculation while explicitly setting the gamma point to be the only k-point, but the results should be consistent.
> If gamma_only is set to 1, the KPT file will be overwritten. So make sure to turn off gamma_only for multi-k calculations.
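A minimal sketch of the corresponding `INPUT` setting (the `basis_type` line is shown for context, since gamma-only runs apply to the LCAO basis):

```
INPUT_PARAMETERS
basis_type  lcao
gamma_only  1
```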
ABACUS uses the Monkhorst-Pack method to generate the k-mesh, and the following is an example input k-point (`KPT`) file:

```
K_POINTS            //keyword for start
0                   //total number of k-points; `0' means generate automatically
Gamma               //which kind of Monkhorst-Pack method, `Gamma' or `MP'
2 2 2 0 0 0         //first three numbers: subdivisions along reciprocal vectors
                    //last three numbers: shift of the mesh
```
## Band structure calculations
ABACUS uses specified high-symmetry directions of the Brillouin zone for band structure calculations. The third line of the k-point file should start with 'Line' or 'Line_Cartesian' for line mode. 'Line' means the positions below are in Direct coordinates, while 'Line_Cartesian' means they are in Cartesian coordinates.
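For illustration, a hypothetical line-mode `KPT` file might look as follows (the k-path and point counts are made up for the example):

```
K_POINTS            //keyword for start
4                   //number of high-symmetry points below
Line                //line mode; positions in Direct coordinates
0.0 0.0 0.0 20      //Gamma; 20 k-points toward the next point
0.5 0.0 0.0 20      //X
0.5 0.5 0.0 20      //M
0.0 0.0 0.0 1       //back to Gamma; the last point takes 1
```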
---
## Build math library from source
> Note: We recommend using the latest available compiler sets, since they offer faster implementations of math functions.
This flag is disabled by default. To build the math functions from source code, define the `USE_ABACUS_LIBM` flag. It is expected to yield better performance on legacy versions of `gcc` and `clang`.
Currently supported math functions:
`sin`, `cos`, `sincos`, `exp`, `cexp`
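As a sketch, the flag can be toggled at configure time with CMake (other options omitted):

```
cmake -B build -DUSE_ABACUS_LIBM=ON
```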
### Add DeePMD-kit Support
> Note: This part is only required if you want to load a trained Deep Potential and run molecular dynamics with it. To train the Deep Potential with DP-GEN, no extra prerequisite is needed; please refer to [this page](http://abacus.deepmodeling.com/en/latest/advanced/interface/dpgen.html) for the ABACUS interface with DP-GEN.
To compile ABACUS with DeePMD-kit, you need to define `DeePMD_DIR` and `TensorFlow_DIR` (TensorFlow backend, optional) and/or `LIBTORCH_DIR` (PyTorch backend, optional) in the file `Makefile.vars`.
Alternatively, if the `tensorflow_cc` and `torch` libraries are in the same directory as the `deepmd_c`/`deepmd_cc` libraries, then
```makefile
make DeePMD_DIR=/dir_to_deepmd-kit
```
If your DeePMD-kit supports the TensorFlow backend but its libraries are placed in another directory, then
```makefile
make DeePMD_DIR=/dir_to_deepmd-kit TensorFlow_DIR=/dir_to_tensorflow
```

Similarly, if your DeePMD-kit supports the PyTorch backend but its libraries are placed in another directory, then

```makefile
make DeePMD_DIR=/dir_to_deepmd-kit Torch_DIR=/dir_to_pytorch
```

> The `deepmd_c`/`deepmd_cc` and `tensorflow_cc` libraries will be called according to `DeePMD_DIR` and `TensorFlow_DIR`, as shown in detail on [this page](https://github.com/deepmodeling/deepmd-kit/blob/master/doc/inference/cxx.md). If `TensorFlow_DIR` is not defined, it defaults to `DeePMD_DIR`. Note that `tensorflow_cc` is not required if `deepmd_c` is found.
### Add LibRI Support
To use the new EXX, you need two libraries, [LibRI](https://github.com/abacusmodeling/LibRI) and [LibComm](https://github.com/abacusmodeling/LibComm), and you need to define `LIBRI_DIR` and `LIBCOMM_DIR` in the file `Makefile.vars` or on the `make` command line.
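Presumably, following the pattern of the DeePMD-kit example above (the directory paths are placeholders):

```makefile
make LIBRI_DIR=/dir_to_LibRI LIBCOMM_DIR=/dir_to_LibComm
```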
0 commit comments