|
2 | 2 | Installation |
3 | 3 | ============ |
4 | 4 |
|
5 | | -IN DEV! |
| 5 | +# Building and installing 2decomp-fft |
| 6 | + |
| 7 | +## Building |
| 8 | + |
| 9 | +The build system is driven by `cmake`. It is good practice to point directly to the MPI Fortran wrapper that you would like to use, to guarantee consistency between the Fortran compiler and MPI. This can be done by setting the default Fortran environment variable:
| 10 | +``` |
| 11 | +export FC=my_mpif90 |
| 12 | +``` |
| 13 | +To generate the build system run |
| 14 | +``` |
| 15 | +cmake -S $path_to_sources -B $path_to_build_directory -DOPTION1 -DOPTION2 ... |
| 16 | +``` |
| 17 | +If the directory does not exist it will be generated and it will contain the configuration files. The configuration can be further |
| 18 | +edited by using the `ccmake` utility as |
| 19 | +``` |
| 20 | +ccmake $path_to_build_directory |
| 21 | +``` |
| 22 | +and editing as desired. Variables that are likely of interest are `CMAKE_BUILD_TYPE` and `FFT_Choice`;
| 23 | +additional variables can be shown by pressing `t` to enter "advanced mode".
| 24 | +By default a `RELEASE` build will be built. Other options for `CMAKE_BUILD_TYPE` are `DEBUG` and `DEV`, which
| 25 | +turn on debugging flags and additionally try to catch coding errors at compile time, respectively.
| 26 | +The behaviour of debug and development versions of the library can be changed before the |
| 27 | +initialization using the variable ``decomp_debug`` or the environment variable ``DECOMP_2D_DEBUG``. |
| 28 | +The value provided with the environment variable must be a positive integer below 9999. |
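The constraint on `DECOMP_2D_DEBUG` can be checked in a launch script before the program starts. The following is a minimal sketch, not part of 2decomp-fft: the function name `valid_decomp_debug` is hypothetical, and it assumes that "positive integer below 9999" means the range 1 to 9998.

```shell
#!/bin/sh
# Hypothetical helper (not part of 2decomp-fft): returns 0 if the argument
# is an integer in the assumed valid range 1..9998, and 1 otherwise.
valid_decomp_debug() {
    case "$1" in
        '' | *[!0-9]* | ?????*) return 1 ;;   # empty, non-digit, or 5+ characters
    esac
    [ "$1" -gt 0 ] && [ "$1" -lt 9999 ]
}

# Example: only export the variable when the candidate value passes the check.
candidate=2
if valid_decomp_debug "$candidate"; then
    export DECOMP_2D_DEBUG="$candidate"
    echo "DECOMP_2D_DEBUG=$DECOMP_2D_DEBUG"
else
    echo "invalid debug level: $candidate" >&2
fi
```

Rejecting bad values in the script keeps the failure close to where the value was set, rather than relying on how the library reacts to an out-of-range value.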
| 29 | + |
| 30 | +Two values of `BUILD_TARGET` are available, namely `mpi` and `gpu`. For the `mpi` target no additional options should be required, whereas for `gpu` extra options are necessary at the configure stage. Please see section [GPU Compilation](#gpu-compilation).
| 31 | + |
| 32 | +Once the build system has been configured, you can build `2decomp&fft` by running |
| 33 | +``` |
| 34 | +cmake --build $path_to_build_directory -j <nproc> |
| 35 | +``` |
| 36 | +Appending `-v` will display additional information about the build, such as compiler flags.
| 37 | + |
| 38 | +After building, the library can be tested. Please see section [Testing and examples](#testing-and-examples).
| 39 | + |
| 40 | +Options can be added to change the level of verbosity. Finally, the built library can be installed by running
| 41 | +``` |
| 42 | +cmake --install $path_to_build_directory |
| 43 | +``` |
| 44 | +The default location for `libdecomp2d.a` is `$path_to_build_directory/opt/lib` or `$path_to_build_directory/opt/lib64` unless the variable `CMAKE_INSTALL_PREFIX` is modified.
| 45 | +The module files generated by the build process will similarly be installed to `$path_to_build_directory/opt/include`. Users of the library should add this to the include paths for their program.
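Because the library may end up in either `lib` or `lib64`, a build script can probe both locations. The sketch below is illustrative only: the function name `find_decomp_libdir` is hypothetical, and the `./build` fallback is an assumption used when `path_to_build_directory` is not set.

```shell
#!/bin/sh
# Hypothetical helper: print the directory under an install prefix that
# contains libdecomp2d.a, trying lib first and then lib64.
find_decomp_libdir() {
    for dir in "$1/lib" "$1/lib64"; do
        if [ -f "$dir/libdecomp2d.a" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    return 1
}

# Example usage; the prefix below assumes the default install location.
prefix="${path_to_build_directory:-./build}/opt"
if libdir=$(find_decomp_libdir "$prefix"); then
    echo "link with: -L$libdir -ldecomp2d"
else
    echo "libdecomp2d.a not found under $prefix" >&2
fi
```

This only looks for the static library; a shared build would need the corresponding shared-library file name instead.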
| 46 | + |
| 47 | +As indicated above, by default a static `libdecomp2d.a` will be compiled. If desired, a shared library can be built by setting `BUILD_SHARED_LIBS=ON`, either on the command line:
| 48 | +``` |
| 49 | +cmake -S $path_to_sources -B $path_to_build_directory -DBUILD_SHARED_LIBS=ON |
| 50 | +``` |
| 51 | +or by editing the configuration using `ccmake`. |
| 52 | +This might be useful for a centralised install supporting multiple users that is upgraded over time. |
| 53 | + |
| 54 | +Occasionally a clean build is required. This can be performed by running
| 55 | +``` |
| 56 | +cmake --build $path_to_build_directory --target clean |
| 57 | +``` |
| 58 | + |
| 59 | +## GPU compilation |
| 60 | + |
| 61 | +The library can perform multi-GPU offloading using the NVHPC compiler suite for NVIDIA hardware.
| 62 | +The implementation is based on CUDA-aware MPI and NVIDIA Collective Communication Library (NCCL). |
| 63 | +The FFT is based on cuFFT. |
| 64 | + |
| 65 | +To properly configure a GPU build, the following needs to be used:
| 66 | +``` |
| 67 | +cmake -S $path_to_sources -B $path_to_build_directory -DBUILD_TARGET=gpu |
| 68 | +``` |
| 69 | +Note that further configuration can be performed using `ccmake`; however, the initial configuration of GPU builds must include the `-DBUILD_TARGET=gpu` flag as shown above.
| 70 | + |
| 71 | +By default CUDA-aware MPI will be used together with `cuFFT` for the FFT library. The configure step will automatically look for the GPU architecture available on the system. If you are building on an HPC system, please use a compute node for the installation. Useful variables to be added are
| 72 | + |
| 73 | + - `-DENABLE_NCCL=yes` to activate the NCCL collectives |
| 74 | + - `-DENABLE_MANAGED=yes` to activate the automatic memory management from the NVHPC compiler
| 75 | +If you get the following error
| 76 | +``` |
| 77 | +-- The CUDA compiler identification is unknown |
| 78 | +CMake Error at /usr/share/cmake/Modules/CMakeDetermineCUDACompiler.cmake:633 (message): |
| 79 | +Failed to detect a default CUDA architecture. |
| 80 | +``` |
| 81 | +It is possible that your default C compiler is too recent and not supported by `nvcc`. You might be able to solve the issue by adding
| 82 | + - `-DCMAKE_CUDA_HOST_COMPILER=$supported_gcc` |
| 83 | + |
| 84 | + At the moment the supported CUDA host compilers are `gcc11` and earlier. |
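Before configuring a GPU build, it can help to check whether the default `gcc` falls within the range stated above. This is a rough sketch only: the function name `gcc_ok_for_nvcc` is hypothetical, and the limit of 11 is taken from the statement above, so it should be adjusted if the supported range changes.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a GCC version string such as "11.4.0"
# has a major version of 11 or lower.
gcc_ok_for_nvcc() {
    major=${1%%.*}                      # keep the text before the first dot
    case "$major" in
        '' | *[!0-9]*) return 1 ;;      # reject empty or non-numeric input
    esac
    [ "$major" -le 11 ]
}

# Only query gcc if it is available on this machine.
if command -v gcc >/dev/null 2>&1; then
    version=$(gcc -dumpversion)
    if gcc_ok_for_nvcc "$version"; then
        echo "gcc $version should be usable as CUDA host compiler"
    else
        echo "gcc $version is newer than gcc 11; consider -DCMAKE_CUDA_HOST_COMPILER" >&2
    fi
fi
```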
| 85 | + |
| 86 | +## Linking from external codes |
| 87 | + |
| 88 | +### Codes using Makefiles |
| 89 | + |
| 90 | +When building a code that links 2decomp-fft using a Makefile, you will need to add the include and link paths as appropriate (`include/` and `lib/` under the installation directory, respectively).
| 91 | +``` |
| 92 | +DECOMP_ROOT = /path/to/2decomp-fft |
| 93 | +DECOMP_BUILD_DIR = $(DECOMP_ROOT)/build |
| 94 | +# Use default unless set by user (a trailing comment on the assignment would leave a trailing space in the value)
| 95 | +DECOMP_INSTALL_DIR ?= $(DECOMP_BUILD_DIR)/opt
| 96 | +INC += -I$(DECOMP_INSTALL_DIR)/include |
| 97 | + |
| 98 | +# Users build/link targets |
| 99 | +LIBS = -L$(DECOMP_INSTALL_DIR)/lib64 -L$(DECOMP_INSTALL_DIR)/lib -ldecomp2d |
| 100 | + |
| 101 | +OBJ = my_exec.o |
| 102 | + |
| 103 | +my_exec: $(OBJ) |
| 104 | + $(F90) -o $@ $(OBJ) $(LIBS) |
| 105 | + |
| 106 | +``` |
| 107 | +In case 2decomp-fft has been compiled with an external FFT, such as FFTW3, `LIBS` should also contain the following |
| 108 | +``` |
| 109 | +FFTW3_PATH=/my_path_to_FFTW/lib |
| 110 | +LIBFFT=-L$(FFTW3_PATH) -lfftw3 -lfftw3f |
| 111 | +LIBS += $(LIBFFT) |
| 112 | +``` |
| 113 | +In case of 2decomp-fft compiled for GPU with NVHPC, linking against cuFFT is mandatory |
| 114 | +``` |
| 115 | +LIBS += -cudalib=cufft |
| 116 | +``` |
| 117 | +In case of NCCL the following is required |
| 118 | +``` |
| 119 | +LIBS += -cudalib=cufft,nccl |
| 120 | +``` |
| 121 | +It is also possible to drive the build and installation of 2decomp-fft from a Makefile such as in the following example code |
| 122 | +``` |
| 123 | +FC = mpif90 |
| 124 | +BUILD = Release |
| 125 | + |
| 126 | +DECOMP_ROOT = /path/to/2decomp-fft |
| 127 | +DECOMP_BUILD_DIR = $(DECOMP_ROOT)/build |
| 128 | +# Use default unless set by user (a trailing comment on the assignment would leave a trailing space in the value)
| 129 | +DECOMP_INSTALL_DIR ?= $(DECOMP_BUILD_DIR)/opt
| 130 | +INC += -I$(DECOMP_INSTALL_DIR)/include |
| 131 | + |
| 132 | +# Users build/link targets |
| 133 | +LIBS = -L$(DECOMP_INSTALL_DIR)/lib64 -L$(DECOMP_INSTALL_DIR)/lib -ldecomp2d |
| 134 | + |
| 135 | +# Building libdecomp2d.a
| 136 | +$(DECOMP_INSTALL_DIR)/lib/libdecomp2d.a:
| 137 | + FC=$(FC) cmake -S $(DECOMP_ROOT) -B $(DECOMP_BUILD_DIR) -DCMAKE_BUILD_TYPE=$(BUILD) -DCMAKE_INSTALL_PREFIX=$(DECOMP_INSTALL_DIR) |
| 138 | + cmake --build $(DECOMP_BUILD_DIR) --target decomp2d |
| 139 | + cmake --build $(DECOMP_BUILD_DIR) --target install |
| 140 | + |
| 141 | +# Clean libdecomp2d.a
| 142 | +clean-decomp: |
| 143 | + cmake --build $(DECOMP_BUILD_DIR) --target clean |
| 144 | +	rm -f $(DECOMP_INSTALL_DIR)/lib/libdecomp2d.a
| 145 | +``` |
| 146 | + |
| 147 | +## Profiling |
| 148 | + |
| 149 | +Profiling can be activated via the `cmake` configuration. The recommended approach is to run the initial configuration as follows:
| 150 | +``` |
| 151 | +export caliper_DIR=/path/to/caliper/install/share/cmake/caliper |
| 152 | +export CXX=mpicxx |
| 153 | +cmake -S $path_to_sources -B $path_to_build_directory -DENABLE_PROFILER=caliper |
| 154 | +``` |
| 155 | +where `ENABLE_PROFILER` is set to the desired profiling tool. The only currently supported value is `caliper`.
| 156 | +Note that when using `caliper` a C++ compiler is required as indicated in the above command line. |
| 157 | + |
| 158 | +## Miscellaneous |
| 159 | + |
| 160 | +### List of preprocessor variables |
| 161 | + |
| 162 | +#### DEBUG |
| 163 | + |
| 164 | +This variable is automatically added in debug and dev builds. Extra information is printed when it is present. |
| 165 | + |
| 166 | +#### DOUBLE_PREC |
| 167 | + |
| 168 | +When this variable is not present, the library uses single precision. When it is present, the library uses double precision. This preprocessor variable is driven by the CMake on/off variable `DOUBLE_PRECISION`. |
| 169 | + |
| 170 | +#### SAVE_SINGLE |
| 171 | + |
| 172 | +This variable is valid for double precision builds only. When it is present, snapshots are written in single precision. This preprocessor variable is driven by the CMake on/off variable `SINGLE_PRECISION_OUTPUT`. |
| 173 | + |
| 174 | +#### PROFILER |
| 175 | + |
| 176 | +This variable is automatically added when selecting the profiler. It activates the profiling sections of the code. |
| 177 | + |
| 178 | +#### EVEN |
| 179 | + |
| 180 | +This preprocessor variable is not valid for GPU builds. It leads to padded alltoall operations. This preprocessor variable is driven by the CMake on/off variable `EVEN`. |
| 181 | + |
| 182 | +#### OVERWRITE |
| 183 | + |
| 184 | +When this variable is present, the input array is overwritten when computing the FFT. The support of this flag does not always correspond to in-place transforms, depending on the FFT backend selected, as described above. This preprocessor variable is driven by the CMake on/off variable `ENABLE_INPLACE`.
| 185 | + |
| 186 | +#### HALO_DEBUG |
| 187 | + |
| 188 | +This variable is used to debug the halo operations. This preprocessor variable is driven by the CMake on/off variable `HALO_DEBUG`. |
| 189 | + |
| 190 | +#### _GPU |
| 191 | + |
| 192 | +This variable is automatically added in GPU builds. |
| 193 | + |
| 194 | +#### _NCCL |
| 195 | + |
| 196 | +This variable is valid only for GPU builds. The NVIDIA Collective Communication Library (NCCL) implements multi-GPU and multi-node communication primitives optimized for NVIDIA GPUs and Networking. |
| 197 | + |
| 198 | +## Optional dependencies |
| 199 | + |
| 200 | +### FFTW |
| 201 | + |
| 202 | +The library [fftw](http://www.fftw.org/index.html) can be used as a backend for the FFT engine. The version 3.3.10 was tested, is supported and can be downloaded [here](http://www.fftw.org/download.html). Please note that one should build fftw and decomp2d against the same compilers. For build instructions, please check [here](http://www.fftw.org/fftw3_doc/Installation-on-Unix.html). Below is a suggestion for the compilation of the library in double precision (add `--enable-single` for a single precision build): |
| 203 | + |
| 204 | +``` |
| 205 | +wget http://www.fftw.org/fftw-3.3.10.tar.gz |
| 206 | +tar xzf fftw-3.3.10.tar.gz |
| 207 | +mkdir fftw-3.3.10_tmp && cd fftw-3.3.10_tmp |
| 208 | +../fftw-3.3.10/configure --prefix=xxxxxxx/fftw3/fftw-3.3.10_bld --enable-shared |
| 209 | +make -j |
| 210 | +make -j check |
| 211 | +make install |
| 212 | +``` |
| 213 | +Please note that the resulting build is not compatible with CMake (https://github.com/FFTW/fftw3/issues/130). As a workaround, one can open the file `/path/to/fftw3/install/lib/cmake/fftw3/FFTW3Config.cmake` and comment out the line
| 214 | +``` |
| 215 | +include ("${CMAKE_CURRENT_LIST_DIR}/FFTW3LibraryDepends.cmake") |
| 216 | +``` |
| 217 | + |
| 218 | +To build `2decomp&fft` against fftw3, one can provide the package configuration for fftw3 in the `PKG_CONFIG_PATH` environment variable; this should be found under `/path/to/fftw3/install/lib/pkgconfig`. One can also provide the option `-DFFTW_ROOT=/path/to/fftw3/install`. Then either specify the FFT backend on the command line when configuring the build
| 219 | +``` |
| 220 | +cmake -S . -B build -DFFT_Choice=<fftw|fftw_f03> -DFFTW_ROOT=/path/to/fftw3/install |
| 221 | +``` |
| 222 | +or modify the build configuration using `ccmake`. |
| 223 | + |
| 224 | +Note that the legacy `fftw` interface lacks interface definitions and will fail when stricter compilation flags are used (e.g. with `-DCMAKE_BUILD_TYPE=Dev`). For this reason it is recommended to use `fftw_f03`, which provides proper interfaces.
| 225 | + |
| 226 | +### Caliper |
| 227 | + |
| 228 | +The library [caliper](https://github.com/LLNL/Caliper) can be used to profile the execution of the code. Version 2.9.1 was tested and is supported; version 2.8.0 has also been tested and is still expected to work. Please note that one must build caliper and decomp2d against the same C/C++/Fortran compilers and MPI library. For build instructions, please check [here](https://github.com/LLNL/Caliper#building-and-installing) and [here](https://software.llnl.gov/Caliper/CaliperBasics.html#build-and-install). Below is a suggestion for the compilation of the library using the GNU compilers:
| 229 | + |
| 230 | +``` |
| 231 | +git clone https://github.com/LLNL/Caliper.git caliper_github |
| 232 | +cd caliper_github |
| 233 | +git checkout v2.9.1 |
| 234 | +mkdir build && cd build |
| 235 | +cmake -DCMAKE_C_COMPILER=gcc -DCMAKE_CXX_COMPILER=g++ -DCMAKE_Fortran_COMPILER=gfortran -DCMAKE_INSTALL_PREFIX=../../caliper_build_2.9.1 -DWITH_FORTRAN=yes -DWITH_MPI=yes -DBUILD_TESTING=yes ../ |
| 236 | +make -j |
| 237 | +make test |
| 238 | +make install |
| 239 | +``` |
| 240 | + |
| 241 | +After installing Caliper, make sure to set `caliper_DIR=/path/to/caliper/install/share/cmake/caliper`.
| 242 | +Following this, the `2decomp-fft` build can be configured to use Caliper profiling as
| 243 | +``` |
| 244 | +cmake -S . -B build -DENABLE_PROFILER=caliper
| 245 | +``` |
| 246 | +or by modifying the configuration to set `ENABLE_PROFILER=caliper` via `ccmake`. |
| 247 | + |