[](){#ref-uenv-vasp}
# VASP

The Vienna Ab initio Simulation Package ([VASP]) is a computer program for atomic-scale materials modelling, e.g. electronic structure calculations and quantum-mechanical molecular dynamics, from first principles.

VASP computes an approximate solution to the many-body Schrödinger equation, either within density functional theory (DFT), solving the Kohn-Sham equations, or within the Hartree-Fock (HF) approximation, solving the Roothaan equations.
Hybrid functionals that mix the Hartree-Fock approach with density functional theory are implemented as well.
Furthermore, Green's function methods (GW quasiparticles and ACFDT-RPA) and many-body perturbation theory (second-order Møller-Plesset) are available in VASP.

In VASP, central quantities, like the one-electron orbitals, the electronic charge density, and the local potential, are expressed in plane-wave basis sets.
The interactions between the electrons and ions are described using norm-conserving or ultrasoft pseudopotentials, or the projector-augmented-wave method.
To determine the electronic ground state, VASP makes use of efficient iterative matrix diagonalisation techniques, like the residual minimisation method with direct inversion in the iterative subspace (RMM-DIIS) or blocked Davidson algorithms.
These are coupled to highly efficient Broyden and Pulay density-mixing schemes to speed up the self-consistency cycle.

!!! note "Licensing Terms and Conditions"
    Access to VASP is restricted to users who have purchased a license from VASP Software GmbH.
    CSCS cannot provide free access to the code and is required to keep VASP Software GmbH informed with an updated list of users.
    Access to the precompiled `VASP.6` executables and library files is therefore granted only to users who have already purchased a `VASP.6` license and who, upon request, become members of the CSCS unix group `vasp6`.
    Once you have a license, submit a request on the [CSCS service desk](https://jira.cscs.ch/plugins/servlet/desk) (with a copy of your license) to be added to the `vasp6` unix group, which grants access to the `vasp` uenv.

    To access VASP, follow the [`Accessing Restricted Software`][ref-uenv-restricted-software] guide.
    Please refer to the [VASP web site](https://www.vasp.at) for more information about licensing.

## Running VASP

### Running on the HPC platform
A precompiled uenv containing VASP with MPI, OpenMP, OpenACC, HDF5 and Wannier90 support is available.
Due to license restrictions, the VASP images are not directly accessible in the same way as other applications.

To access the VASP uenv images, please see the guide to [accessing restricted software][ref-uenv-restricted-software].

To load the VASP uenv:
```bash
uenv start vasp/v6.5.0:v1 --view=vasp
```
The `vasp_std`, `vasp_ncl` and `vasp_gam` executables are now available for use.
Loading the uenv can also be done directly inside a SLURM script.

```bash title="SLURM script for running VASP on a single node"
#!/bin/bash -l

#SBATCH --job-name=vasp
#SBATCH --time=24:00:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=4
#SBATCH --cpus-per-task=16
#SBATCH --gpus-per-task=1
#SBATCH --uenv=vasp/v6.5.0:v1
#SBATCH --view=vasp
#SBATCH --account=<ACCOUNT>
#SBATCH --partition=normal

export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
export MPICH_GPU_SUPPORT_ENABLED=1

srun vasp_std
```

!!! note
    It is recommended to use the SLURM option `--gpus-per-task=1`, since VASP may fail to assign ranks to GPUs correctly when running on more than one node.
    This is not required when using the CUDA MPS wrapper for oversubscription of GPUs.

!!! note
    VASP relies on CUDA-aware MPI, which requires `MPICH_GPU_SUPPORT_ENABLED=1` to be set when using Cray MPICH.
    On the HPC platform, including `daint`, this is set by default and does not have to be included in SLURM scripts.

### Multiple Tasks per GPU
Using more than one task per GPU is possible with VASP and may lead to better GPU utilization.
However, VASP relies on [NCCL] for efficient communication and falls back to MPI when multiple tasks share a GPU.
In many cases, the loss of NCCL outweighs the gain in utilization, and it is best to use one task per GPU.

To run with multiple tasks per GPU, a wrapper script is required to start a CUDA MPS service.
This script can be found at [NVIDIA GH200 GPU nodes: multiple ranks per GPU][ref-slurm-gh200-multi-rank-per-gpu].
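
For orientation, such a wrapper typically starts one MPS control daemon per node and then launches the application. The sketch below is an assumed, simplified structure, not the maintained script; use the version from the linked page in production.

```shell
#!/bin/bash
# Minimal sketch of a CUDA MPS wrapper (assumed structure; prefer the
# maintained mps-wrapper.sh from the page linked above).
export CUDA_MPS_PIPE_DIRECTORY=/tmp/nvidia-mps
export CUDA_MPS_LOG_DIRECTORY=/tmp/nvidia-mps-log

# Start the MPS control daemon once per node, from local rank 0 only.
if [ "${SLURM_LOCALID}" -eq 0 ]; then
    nvidia-cuda-mps-control -d
fi

# Give the daemon a moment to come up before launching the application.
sleep 1

# Run the actual command (e.g. vasp_std) passed as arguments.
exec "$@"
```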

```bash title="SLURM script for running VASP on a single node with two tasks per GPU"
#!/bin/bash -l

#SBATCH --job-name=vasp
#SBATCH --time=24:00:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=8
#SBATCH --cpus-per-task=16
#SBATCH --uenv=vasp/v6.5.0:v1
#SBATCH --view=vasp
#SBATCH --account=<ACCOUNT>
#SBATCH --partition=normal

export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
export MPICH_GPU_SUPPORT_ENABLED=1

srun ./mps-wrapper.sh vasp_std
```

## Building VASP from source

To build VASP from source, the `develop` view must first be loaded:
```bash
uenv start vasp/v6.5.0:v1 --view=develop
```

All required dependencies can now be found in `/user-environment/env/develop`.
Note that shared libraries might not be found when executing VASP if the makefile does not add rpath linker options or if `LD_LIBRARY_PATH` has not been extended accordingly.
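
With the `develop` view active, a build typically follows the standard VASP procedure. The sketch below assumes a VASP 6.5.0 source tree; the paths are illustrative, and it also shows one way to check for unresolved shared libraries after the build:

```shell
# Sketch of a typical build (paths are examples, not fixed locations).
cd vasp.6.5.0
cp /path/to/makefile.include .   # e.g. one of the example makefiles on this page
make DEPS=1 -j std gam ncl       # builds vasp_std, vasp_gam and vasp_ncl in bin/

# Check that all shared libraries resolve; "not found" entries indicate a
# missing rpath or LD_LIBRARY_PATH entry.
ldd bin/vasp_std | grep "not found"

# If needed, extend LD_LIBRARY_PATH with the uenv's library directories.
export LD_LIBRARY_PATH=/user-environment/env/develop/lib:/user-environment/env/develop/lib64:$LD_LIBRARY_PATH
```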

!!! warning
    The detection of MPI CUDA support does not work properly with Cray MPICH.
    After compiling from source, it is therefore required to set `export PMPI_GPU_AWARE=1` at runtime to disable the CUDA support check within VASP.
    Alternatively, since version 6.5.0, the build option `-DCRAY_MPICH` can be added to disable the check at compile time.
    The precompiled binaries of VASP provided by CSCS are patched and do not require special settings.

Examples of makefiles that set the necessary rpath and link options on GH200:

??? note "Makefile for v6.5.0"
    ```make
    # Default precompiler options
    CPP_OPTIONS = -DHOST=\"LinuxNV\" \
                  -DMPI -DMPI_INPLACE -DMPI_BLOCK=8000 -Duse_collective \
                  -DscaLAPACK \
                  -DCACHE_SIZE=4000 \
                  -Davoidalloc \
                  -Dvasp6 \
                  -Dtbdyn \
                  -Dqd_emulate \
                  -Dfock_dblbuf \
                  -D_OPENMP \
                  -DACC_OFFLOAD \
                  -DNVCUDA \
                  -DUSENCCL \
                  -DCRAY_MPICH

    CPP = nvfortran -Mpreprocess -Mfree -Mextend -E $(CPP_OPTIONS) $*$(FUFFIX) > $*$(SUFFIX)

    CUDA_VERSION = $(shell nvcc -V | grep -E -o -m 1 "[0-9][0-9]\.[0-9]," | rev | cut -c 2- | rev)

    CC  = mpicc -acc -gpu=cc90,cuda${CUDA_VERSION} -mp
    FC  = mpif90 -acc -gpu=cc90,cuda${CUDA_VERSION} -mp
    FCL = mpif90 -acc -gpu=cc90,cuda${CUDA_VERSION} -mp -c++libs

    FREE = -Mfree

    FFLAGS = -Mbackslash -Mlarge_arrays

    OFLAG = -fast

    DEBUG = -Mfree -O0 -traceback

    LLIBS = -cudalib=cublas,cusolver,cufft,nccl -cuda

    # Redefine the standard list of O1 and O2 objects
    SOURCE_O1 := pade_fit.o minimax_dependence.o
    SOURCE_O2 := pead.o

    # For what used to be vasp.5.lib
    CPP_LIB    = $(CPP)
    FC_LIB     = $(FC)
    CC_LIB     = $(CC)
    CFLAGS_LIB = -O -w
    FFLAGS_LIB = -O1 -Mfixed
    FREE_LIB   = $(FREE)

    OBJECTS_LIB = linpack_double.o

    # For the parser library
    CXX_PARS = nvc++ --no_warnings

    ##
    ## Customize as of this point! Of course you may change the preceding
    ## part of this file as well if you like, but it should rarely be
    ## necessary ...
    ##
    # When compiling on the target machine itself, change this to the
    # relevant target when cross-compiling for another architecture
    #
    # NOTE: Using "-tp neoverse-v2" causes some tests to fail. On the GH200
    # architecture, "-tp host" is recommended.
    VASP_TARGET_CPU ?= -tp host
    FFLAGS += $(VASP_TARGET_CPU)

    # Specify your NV HPC-SDK installation (mandatory)
    #... first try to set it automatically
    NVROOT = $(shell which nvfortran | awk -F /compilers/bin/nvfortran '{ print $$1 }')

    # If the above fails, then NVROOT needs to be set manually
    #NVHPC    ?= /opt/nvidia/hpc_sdk
    #NVVERSION = 21.11
    #NVROOT    = $(NVHPC)/Linux_x86_64/$(NVVERSION)

    ## Improves performance when using NV HPC-SDK >=21.11 and CUDA >11.2
    #OFLAG_IN  = -fast -Mwarperf
    #SOURCE_IN := nonlr.o

    # Software emulation of quadruple precision (mandatory)
    QD ?= $(NVROOT)/compilers/extras/qd
    LLIBS += -L$(QD)/lib -lqdmod -lqd -Wl,-rpath,$(QD)/lib
    INCS  += -I$(QD)/include/qd

    # BLAS (mandatory)
    BLAS = -lnvpl_blas_lp64_gomp -lnvpl_blas_core

    # LAPACK (mandatory)
    LAPACK = -lnvpl_lapack_lp64_gomp -lnvpl_lapack_core

    # scaLAPACK (mandatory)
    SCALAPACK = -lscalapack

    LLIBS += $(SCALAPACK) $(LAPACK) $(BLAS) -Wl,-rpath,/user-environment/env/develop/lib -Wl,-rpath,/user-environment/env/develop/lib64 -Wl,--disable-new-dtags

    # FFTW (mandatory)
    FFTW_ROOT ?= /user-environment/env/develop
    LLIBS += -L$(FFTW_ROOT)/lib -lfftw3 -lfftw3_omp
    INCS  += -I$(FFTW_ROOT)/include

    # Use cusolvermp (optional)
    # supported as of NVHPC-SDK 24.1 (and needs CUDA-11.8)
    #CPP_OPTIONS += -DCUSOLVERMP -DCUBLASMP
    #LLIBS       += -cudalib=cusolvermp,cublasmp -lnvhpcwrapcal

    # HDF5 support (optional but strongly recommended)
    CPP_OPTIONS += -DVASP_HDF5
    HDF5_ROOT   ?= /user-environment/env/develop
    LLIBS       += -L$(HDF5_ROOT)/lib -lhdf5_fortran
    INCS        += -I$(HDF5_ROOT)/include

    # For the VASP-2-Wannier90 interface (optional)
    CPP_OPTIONS    += -DVASP2WANNIER90
    WANNIER90_ROOT ?= /user-environment/env/develop
    LLIBS          += -L$(WANNIER90_ROOT)/lib -lwannier

    # For the fftlib library (recommended)
    #CPP_OPTIONS += -Dsysv
    #FCL         += fftlib.o
    #CXX_FFTLIB   = nvc++ -mp --no_warnings -std=c++11 -DFFTLIB_THREADSAFE
    #INCS_FFTLIB  = -I./include -I$(FFTW_ROOT)/include
    #LIBS        += fftlib
    #LLIBS       += -ldl
    ```
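
As an aside, the `CUDA_VERSION` rule in these makefiles extracts the toolkit version (e.g. `12.4`) from the `nvcc -V` banner by matching the `NN.N,` token and stripping the trailing comma. The pipeline can be demonstrated on a captured sample line; the version string here is only an example, not a statement about any particular system:

```shell
# Demonstrate the CUDA_VERSION extraction pipeline on a sample `nvcc -V` line.
sample='Cuda compilation tools, release 12.4, V12.4.131'
echo "$sample" | grep -E -o -m 1 "[0-9][0-9]\.[0-9]," | rev | cut -c 2- | rev
# prints: 12.4
```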

??? note "Makefile for v6.4.3"
    ```make
    # Default precompiler options
    CPP_OPTIONS = -DHOST=\"LinuxNV\" \
                  -DMPI -DMPI_INPLACE -DMPI_BLOCK=8000 -Duse_collective \
                  -DscaLAPACK \
                  -DCACHE_SIZE=4000 \
                  -Davoidalloc \
                  -Dvasp6 \
                  -Duse_bse_te \
                  -Dtbdyn \
                  -Dqd_emulate \
                  -Dfock_dblbuf \
                  -D_OPENMP \
                  -D_OPENACC \
                  -DUSENCCL -DUSENCCLP2P

    CPP = nvfortran -Mpreprocess -Mfree -Mextend -E $(CPP_OPTIONS) $*$(FUFFIX) > $*$(SUFFIX)

    CUDA_VERSION = $(shell nvcc -V | grep -E -o -m 1 "[0-9][0-9]\.[0-9]," | rev | cut -c 2- | rev)

    CC  = mpicc -acc -gpu=cc90,cuda${CUDA_VERSION} -mp
    FC  = mpif90 -acc -gpu=cc90,cuda${CUDA_VERSION} -mp
    FCL = mpif90 -acc -gpu=cc90,cuda${CUDA_VERSION} -mp -c++libs

    FREE = -Mfree

    FFLAGS = -Mbackslash -Mlarge_arrays

    OFLAG = -fast

    DEBUG = -Mfree -O0 -traceback

    OBJECTS = fftmpiw.o fftmpi_map.o fftw3d.o fft3dlib.o

    LLIBS = -cudalib=cublas,cusolver,cufft,nccl -cuda

    # Redefine the standard list of O1 and O2 objects
    SOURCE_O1 := pade_fit.o minimax_dependence.o
    SOURCE_O2 := pead.o

    # For what used to be vasp.5.lib
    CPP_LIB    = $(CPP)
    FC_LIB     = $(FC)
    CC_LIB     = $(CC)
    CFLAGS_LIB = -O -w
    FFLAGS_LIB = -O1 -Mfixed
    FREE_LIB   = $(FREE)

    OBJECTS_LIB = linpack_double.o

    # For the parser library
    CXX_PARS = nvc++ --no_warnings

    ##
    ## Customize as of this point! Of course you may change the preceding
    ## part of this file as well if you like, but it should rarely be
    ## necessary ...
    ##
    # When compiling on the target machine itself, change this to the
    # relevant target when cross-compiling for another architecture
    #
    # NOTE: Using "-tp neoverse-v2" causes some tests to fail. On the GH200
    # architecture, "-tp host" is recommended.
    VASP_TARGET_CPU ?= -tp host
    FFLAGS += $(VASP_TARGET_CPU)

    # Specify your NV HPC-SDK installation (mandatory)
    #... first try to set it automatically
    NVROOT = $(shell which nvfortran | awk -F /compilers/bin/nvfortran '{ print $$1 }')

    # If the above fails, then NVROOT needs to be set manually
    #NVHPC    ?= /opt/nvidia/hpc_sdk
    #NVVERSION = 21.11
    #NVROOT    = $(NVHPC)/Linux_x86_64/$(NVVERSION)

    ## Improves performance when using NV HPC-SDK >=21.11 and CUDA >11.2
    #OFLAG_IN  = -fast -Mwarperf
    #SOURCE_IN := nonlr.o

    # Software emulation of quadruple precision (mandatory)
    QD ?= $(NVROOT)/compilers/extras/qd
    LLIBS += -L$(QD)/lib -lqdmod -lqd -Wl,-rpath,$(QD)/lib
    INCS  += -I$(QD)/include/qd

    # BLAS (mandatory)
    BLAS = -lnvpl_blas_lp64_gomp -lnvpl_blas_core

    # LAPACK (mandatory)
    LAPACK = -lnvpl_lapack_lp64_gomp -lnvpl_lapack_core

    # scaLAPACK (mandatory)
    SCALAPACK = -lscalapack

    LLIBS += $(SCALAPACK) $(LAPACK) $(BLAS) -Wl,-rpath,/user-environment/env/develop/lib -Wl,-rpath,/user-environment/env/develop/lib64 -Wl,--disable-new-dtags

    # FFTW (mandatory)
    FFTW_ROOT ?= /user-environment/env/develop
    LLIBS += -L$(FFTW_ROOT)/lib -lfftw3 -lfftw3_omp
    INCS  += -I$(FFTW_ROOT)/include

    # Use cusolvermp (optional)
    # supported as of NVHPC-SDK 24.1 (and needs CUDA-11.8)
    #CPP_OPTIONS += -DCUSOLVERMP -DCUBLASMP
    #LLIBS       += -cudalib=cusolvermp,cublasmp -lnvhpcwrapcal

    # HDF5 support (optional but strongly recommended)
    CPP_OPTIONS += -DVASP_HDF5
    HDF5_ROOT   ?= /user-environment/env/develop
    LLIBS       += -L$(HDF5_ROOT)/lib -lhdf5_fortran
    INCS        += -I$(HDF5_ROOT)/include

    # For the VASP-2-Wannier90 interface (optional)
    CPP_OPTIONS    += -DVASP2WANNIER90
    WANNIER90_ROOT ?= /user-environment/env/develop
    LLIBS          += -L$(WANNIER90_ROOT)/lib -lwannier

    # For the fftlib library (recommended)
    #CPP_OPTIONS += -Dsysv
    #FCL         += fftlib.o
    #CXX_FFTLIB   = nvc++ -mp --no_warnings -std=c++11 -DFFTLIB_THREADSAFE
    #INCS_FFTLIB  = -I./include -I$(FFTW_ROOT)/include
    #LIBS        += fftlib
    #LLIBS       += -ldl
    ```

[VASP]: https://vasp.at/
[NCCL]: https://docs.nvidia.com/deeplearning/nccl/user-guide/docs/overview.html