diff --git a/docs/community/faq.md b/docs/community/faq.md index 724519a35e..df5db1ec5b 100644 --- a/docs/community/faq.md +++ b/docs/community/faq.md @@ -109,7 +109,7 @@ write(cs_stru, cs_atoms, format='abacus', pp=pp, basis=basis) ABACUS applies the density difference between two SCF steps (labeled as `DRHO` in the screen output) as the convergence criterion, which is considered as a more robust choice compared with the energy difference. `DRHO` is calculated via `DRHO = |rho(G)-rho_previous(G)|^2`. Note that the energy difference between two SCF steps (labed as `EDIFF`) is also printed out in the screen output. -**4. Why EDIFF is much slower than DRHO? +**4. Why EDIFF is much slower than DRHO?** For metaGGA calculations, it is normal because in addition to charge density, kinetic density also needs to be considered in metaGGA calculations. In this case, you can try set `mixing_tau = true`. If you find EDIFF is much slower than DRHO for non-metaGGA calculations, please start a new issue to us. diff --git a/docs/quick_start/easy_install.md b/docs/quick_start/easy_install.md index fb1306396b..332300475a 100644 --- a/docs/quick_start/easy_install.md +++ b/docs/quick_start/easy_install.md @@ -192,6 +192,8 @@ OMP_NUM_THREADS=4 mpirun -n 4 abacus In this case, the total thread count is 16. +> Notice: If the MPI library you are using is OpenMPI, which is commonly the case, when you set the number of processes to 1 or 2, OpenMPI will default to `--bind-to core`. This means that no matter how many threads you set, these threads will be restricted to run on 1 or 2 CPU cores. Therefore, setting a higher number of OpenMP threads might result in slower program execution. Hence, when using `mpirun -n` set to 1 or 2, it is recommended to set `--bind-to none` to avoid performance degradation. For example:`OMP_NUM_THREADS=6 mpirun --bind-to none -n 1 abacus`. The detailed binding strategy of OpenMPI can be referred to at https://docs.open-mpi.org/en/v5.0.x/man-openmpi/man1/mpirun.1.html#quick-summary. + ABACUS will try to determine the number of threads used by each process if `OMP_NUM_THREADS` is not set. However, it is **required** to set `OMP_NUM_THREADS` before running `mpirun` to avoid potential performance issues. Please refer to [hands-on guide](./hands_on.md) for more instructions.