# docs/advanced/acceleration/cuda.md
To compile and use ABACUS in CUDA mode, you currently need to have an NVIDIA GPU.
Check the [Advanced Installation Options](https://abacus-rtd.readthedocs.io/en/latest/advanced/install.html#build-with-cuda-support) for instructions on installing the CUDA version of ABACUS.
Setting both `USE_ELPA` and `USE_CUDA` to ON does not automatically enable ELPA to run on GPUs: GPU support must be [enabled when ELPA itself is compiled](https://github.com/marekandreas/elpa/blob/master/documentation/INSTALL.md).
ABACUS automatically determines whether the installed ELPA supports GPUs based on the `elpa/elpa_configured_options.h` header file; users can also check this header to determine the GPU support of ELPA in their environment. ELPA introduced the new API `elpa_setup_gpu` in version 2023.11.001, so enabling ELPA GPU support in ABACUS requires an ELPA version greater than or equal to 2023.11.001.
## Run with GPU support by editing the INPUT script
In the `INPUT` file, we need to set the input parameter [device](../input_files/input-main.md#device) to `gpu`. If this parameter is not set, ABACUS will try to determine whether there are available GPUs.
- Set `ks_solver`: For the PW basis, the CG, BPCG and Davidson methods are supported on GPU; set the input parameter [ks_solver](../input_files/input-main.md#ks_solver) to `cg`, `bpcg` or `dav`. For the LCAO basis, `cusolver`, `cusolvermp` and `elpa` are supported on GPU.
- **multi-card**: ABACUS allows for multi-GPU acceleration. If you have multiple GPU cards, you can run ABACUS with several MPI processes, and each process will utilize one GPU card. For example, the command `mpirun -n 2 abacus` will by default launch two GPUs for the computation. If you only have one card, this command will start only one GPU.
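Putting these settings together, a minimal `INPUT` sketch for a GPU run with the PW basis might look as follows (the solver chosen here is just one of the supported options):

```
# INPUT (sketch): PW basis on GPU with the Davidson solver
device      gpu
ks_solver   dav
```

For the LCAO basis, `ks_solver` would instead be set to `cusolver`, `cusolvermp` or `elpa`.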
These variables are used to control parameters related to input files.

- **Type**: String
- **Description**: The name of the structure file.
  - It contains various information about atom species, including pseudopotential files, local orbital files, cell information, atom positions, and whether atoms should be allowed to move.
  - When [calculation](#calculation) is set to `md` and [md_restart](#md_restart) is set to `true`, this keyword will NOT work.
  - Refer to [Doc](https://github.com/deepmodeling/abacus-develop/blob/develop/docs/advanced/input_files/stru.md)
- **Default**: STRU
- **genelpa**: This method should be used if you choose localized orbitals.
- **scalapack_gvx**: Scalapack can also be used for localized orbitals.
- **cusolver**: This method requires building with CUDA, and at least one GPU must be available.
- **cusolvermp**: This method supports multi-GPU acceleration and requires building with CUDA. Note that when using `cusolvermp`, you should set the number of MPI processes to be equal to the number of GPUs.
- **elpa**: The ELPA solver supports both CPU and GPU. By setting `device` to `gpu`, you can launch the ELPA solver with GPU acceleration, provided that you have installed a GPU-enabled version of ELPA (which requires compiling and installing ELPA manually) and that ABACUS is compiled with `-DUSE_ELPA=ON` and `-DUSE_CUDA=ON`. The ELPA solver also supports multi-GPU acceleration.

If you set ks_solver=`genelpa` for basis_type=`pw`, the program will stop with an error message. Then the user has to correct the input file and restart the calculation.
- **Default**:
  - **PW basis**: cg
  - **LCAO basis**:
    - genelpa (if compiling option `USE_ELPA` has been set)
    - lapack (if compiling option `ENABLE_MPI` has not been set)
    - scalapack_gvx (if compiling option `USE_ELPA` has not been set and compiling option `ENABLE_MPI` has been set)
    - cusolver (if compiling option `USE_CUDA` has been set)
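As a sketch of the multi-GPU LCAO case described above, with `cusolvermp` the number of MPI processes should match the number of GPUs (four GPUs are assumed here):

```
# INPUT (sketch): LCAO basis diagonalized with cusolvermp
device      gpu
basis_type  lcao
ks_solver   cusolvermp
```

launched with, e.g., `mpirun -n 4 abacus` on a node with four GPUs.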
### nbands
These variables are used to control the output of properties.

- **Type**: Integer \[Integer\](optional)
- **Description**: The first integer controls whether to output the charge density on real space grids:
  - 1: Output the charge density (in Bohr^-3) on real space grids into the density files in the folder `OUT.${suffix}`. The files are named as:
    - nspin = 1: SPIN1_CHG.cube;
    - nspin = 2: SPIN1_CHG.cube, and SPIN2_CHG.cube;
    - nspin = 4: SPIN1_CHG.cube, SPIN2_CHG.cube, SPIN3_CHG.cube, and SPIN4_CHG.cube.
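As an illustration, and assuming this description belongs to the `out_chg` keyword (its heading falls outside the lines shown here), a minimal sketch that writes the density cube files:

```
# INPUT (sketch): write the charge density to OUT.${suffix}
out_chg  1
```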
- **Description**: Specifies whether to write the partial charge densities for all k-points to individual files or to merge them. **Warning**: Enabling symmetry may produce incorrect results due to incorrect k-point weights; therefore, when calculating partial charge densities, it is strongly recommended to set `symmetry = -1`.
- **Default**: false
### out_elf

- **Type**: Integer \[Integer\](optional)
- **Availability**: Only for Kohn-Sham DFT and Orbital-Free DFT.
- **Description**: Whether to output the electron localization function (ELF) in the folder `OUT.${suffix}`. The files are named as:

  The second integer controls the precision of the kinetic energy density output; if it is not given, `3` is used by default. For restarting from this file, and for other calculations that require high precision, `10` is recommended.

  ---

  In molecular dynamics calculations, the output frequency is controlled by [out_interval](#out_interval).

- **Default**: 0 3
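For example, a hedged `INPUT` fragment using both integers described above:

```
# INPUT (sketch): output the ELF with high (10-digit) precision
out_elf  1 10
```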
[back to top](#full-list-of-input-keywords)
## Density of states
These variables are relevant when using hybrid functionals.

- True: rotate both D(k) and Hexx(R) to accelerate both diagonalization and the EXX calculation
- **Default**: True
### out_ri_cv

- **Type**: Boolean
- **Description**: Whether to output the coefficient tensor C(R) and the ABFs-representation Coulomb matrix V(R) for each atom pair and cell in real space.
- **Default**: false
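A minimal sketch enabling this output in `INPUT`:

```
# INPUT (sketch): dump the RI-LVL tensors Cs and Vs
out_ri_cv  1
```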
[back to top](#full-list-of-input-keywords)
## Molecular dynamics
- **Description**: The broadening factor $\eta$ for the absorption spectrum calculation.
- **Default**: 0.01
### ri_hartree_benchmark

- **Type**: String
- **Description**: Whether to use the localized resolution-of-identity (LRI) approximation for the **Hartree** term of the kernel in the $A$ matrix of LR-TDDFT, for benchmarking (against FHI-aims or another ABACUS calculation). Currently it only supports molecular systems running on a single processor, and a large enough supercell should be used so that the LRI C and V tensors contain only the R=(0 0 0) cell.
  - `aims`: The `OUT.${suffix}` directory should contain the FHI-aims output files: the RI-LVL tensors `Cs_data_0.txt` and `coulomb_mat_0.txt`, and the KS eigenstates from FHI-aims: `band_out` and `KS_eigenvectors.out`. The Casida equation will be constructed from FHI-aims' KS eigenpairs.
    - LRI tensor files (`Cs_data_0.txt` and `coulomb_mat_0.txt`) and Kohn-Sham eigenvalues (`band_out`): run FHI-aims with periodic boundary conditions and with `total_energy_method rpa` and `output librpa`.
    - Kohn-Sham eigenstates under aims NAOs (`KS_eigenvectors.out`): run FHI-aims with `output eigenvectors`.
    - If the number of atomic orbitals of any atom type in FHI-aims differs from that in ABACUS, `aims_nbasis` should be set.
  - `abacus`: The `OUT.${suffix}` directory should contain the RI-LVL tensors `Cs` and `Vs` (written by setting `out_ri_cv` to 1). The Casida equation will be constructed from ABACUS' KS eigenpairs, with the only difference that the Hartree term is constructed with the RI approximation.
  - `none`: Construct the Hartree term by the Poisson equation and grid integration, as usual.
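The `abacus` benchmark mode described above can be sketched as a two-step workflow (both fragments are illustrative assumptions built from the keywords documented here):

```
# Step 1 (INPUT, sketch): reference run that dumps the RI-LVL tensors
out_ri_cv             1

# Step 2 (INPUT, sketch): rerun with the Hartree term built from Cs/Vs
ri_hartree_benchmark  abacus
```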
# docs/advanced/install.md
## Build math library from source
> Note: We recommend using the latest available compiler sets, since they offer faster implementations of math functions.
This flag is disabled by default. To build math functions from source code, instead of using the C++ standard implementation, define the `USE_ABACUS_LIBM` flag. It is expected to give better performance on legacy versions of `gcc` and `clang`.
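As a sketch, the corresponding configure step (other flags from elsewhere on this page can be combined with it as needed):

```shell
# Enable the bundled math implementations (off by default)
cmake -B build -DUSE_ABACUS_LIBM=ON
```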