Kokkos Resilience is an experimental extension to Kokkos for providing convenient resilience and checkpointing to scientific applications.
Kokkos Resilience is built using CMake version 3.17 or later. It has been tested with GCC 11.2.0 and LLVM/Clang 11.0.0, and it should work with any compiler that supports C++14, but your mileage may vary.
Kokkos Resilience maintains a Spack package that installs Kokkos Resilience and its dependencies with just a few commands:

    spack repo add --name kr https://github.com/kokkos/kokkos-resilience.git
    spack info kokkos-resilience # Optionally, check the Spack variants/versions/dependencies
    spack install kokkos-resilience

First and foremost, Kokkos Resilience requires an install of Kokkos. This can be a version you compiled yourself, a version bundled with other software (such as Trilinos), or a package provided on the machine.
Note: Kokkos Resilience currently requires the develop branch of Kokkos for compile-time view hooking capabilities.
Kokkos Resilience depends on Boost when KR_ENABLE_DATA_SPACES is enabled.
Kokkos Resilience optionally uses the VeloC library for efficient asynchronous checkpointing.
It is recommended to use the CMake presets to configure the project. More information on presets can be found in the CMake documentation (the cmake-presets manual). Note that CMake 3.19 or higher is required to use presets, and inheriting from the presets bundled with Kokkos Resilience requires at least CMake 3.21.
Kokkos Resilience includes a set of presets in CMakePresets.json. They represent common application configurations and can be inherited from in your own presets.
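The version thresholds mentioned above can be checked before relying on presets. The following is a minimal sketch, not part of Kokkos Resilience: `version_ge` and `preset_support` are hypothetical helper names, and only the major and minor version components are compared.

```shell
# version_ge A B: succeed when dotted version A is at least B
# (hypothetical helper; compares major and minor components only).
version_ge() {
  a_major=${1%%.*}; a_rest=${1#*.}; a_minor=${a_rest%%.*}
  b_major=${2%%.*}; b_rest=${2#*.}; b_minor=${b_rest%%.*}
  [ "$a_major" -gt "$b_major" ] ||
    { [ "$a_major" -eq "$b_major" ] && [ "$a_minor" -ge "$b_minor" ]; }
}

# preset_support VERSION: print which preset features that CMake version
# covers, using the thresholds from the text (3.19 and 3.21).
preset_support() {
  if version_ge "$1" 3.21; then
    echo "inherit"   # can inherit from the bundled presets
  elif version_ge "$1" 3.19; then
    echo "presets"   # can use presets, but not inherit from the bundled ones
  else
    echo "none"      # too old for presets
  fi
}

# Report on the installed CMake, if one is on PATH.
if command -v cmake >/dev/null 2>&1; then
  preset_support "$(cmake --version | sed -n '1s/[^0-9]*\([0-9][0-9.]*\).*/\1/p')"
fi
```

With a sufficiently recent CMake, running `cmake --list-presets` from the source directory lists the available presets, and `cmake --preset <name>` configures with one of them.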
| Path | Description |
|---|---|
| Kokkos_ROOT | Path to the root of the Kokkos install |
| VeloC_ROOT | Path to the root of VeloC if it is enabled (see below) |
| HDF5_ROOT | Path to the root of HDF5 if HDF5 is enabled (see below) |

| Variable | Default | Description |
|---|---|---|
| KR_ENABLE_VELOC | ON | Enables the VeloC backend |
| KR_ENABLE_TRACING | OFF | Enable performance tracing of resilience functions |
| KR_ENABLE_STDIO | OFF | Use stdio for manual checkpoint |
| KR_ENABLE_HDF5 | OFF | Add HDF5 support for manual checkpoint |
| KR_ENABLE_HDF5_PARALLEL | OFF | Use parallel version of HDF5 for manual checkpoint |
| KR_ENABLE_TESTS | ON | Enable tests in the build |
| KR_ENABLE_EXAMPLES | ON | Enable examples in the build |
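As an illustration of how the path variables and options above combine on the command line, here is a hedged sketch of a manual configure step. The install prefixes are hypothetical placeholders, and the chosen option values are only an example, not a recommended configuration.

```shell
# Hypothetical install prefixes; replace them with the paths on your system.
KOKKOS_PREFIX="${KOKKOS_PREFIX:-$HOME/opt/kokkos}"
VELOC_PREFIX="${VELOC_PREFIX:-$HOME/opt/veloc}"

# Collect the cache entries as positional parameters (works in plain POSIX sh).
# VeloC stays at its default (ON), so VeloC_ROOT is provided as well.
set -- \
  -DKokkos_ROOT="$KOKKOS_PREFIX" \
  -DVeloC_ROOT="$VELOC_PREFIX" \
  -DKR_ENABLE_TESTS=OFF \
  -DKR_ENABLE_EXAMPLES=OFF

# Dry run: print the configure command for review.
# Remove the leading `echo` to configure an out-of-source build in ./build.
echo cmake -S . -B build "$@"
```

Printing the command first makes it easy to confirm the paths before CMake writes them into its cache.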
Kokkos Resilience is designed to work with CMake projects, so using CMake is typically much easier. In your own project, call:

    find_package(resilience)
    target_link_libraries(target PRIVATE Kokkos::resilience)

Ensure that one of the following holds: the build or install directory of Kokkos Resilience is in CMAKE_PREFIX_PATH, the variable resilience_ROOT points to the build/install directory, or the variable resilience_DIR points to the location of the Kokkos Resilience resilienceConfig.cmake file. This file is located in the root build directory of Kokkos Resilience or in <install directory>/share/resilience/cmake. See the CMake documentation for more details on how packages are found.
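The three ways of pointing CMake at Kokkos Resilience can be sketched as configure commands for a consuming project. The install prefix below is a hypothetical placeholder, and the commands are printed rather than executed so that you can pick one.

```shell
# Hypothetical location of a Kokkos Resilience install; adjust to your system.
KR_PREFIX="${KR_PREFIX:-$HOME/opt/kokkos-resilience}"

# Dry run: each line prints one alternative. Remove `echo` from the one you want.

# 1. Add the build or install directory to CMAKE_PREFIX_PATH.
echo cmake -S . -B build -DCMAKE_PREFIX_PATH="$KR_PREFIX"

# 2. Point resilience_ROOT at the build or install directory.
echo cmake -S . -B build -Dresilience_ROOT="$KR_PREFIX"

# 3. Point resilience_DIR at the directory that holds resilienceConfig.cmake
#    (for an install tree, the share/resilience/cmake subdirectory).
KR_CONFIG_DIR="$KR_PREFIX/share/resilience/cmake"
echo cmake -S . -B build -Dresilience_DIR="$KR_CONFIG_DIR"
```

Only one of the three is needed. CMAKE_PREFIX_PATH is convenient when several dependencies share a prefix, while resilience_DIR is the most explicit.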