## Quick Start
To use the Stan Math library, include exactly one of the following header files.
| include | contents | also includes |
|---|---|---|
| `stan/math/prim.hpp` | primitives | n/a |
| `stan/math/rev.hpp` | reverse-mode autodiff | `prim.hpp` |
| `stan/math/fwd.hpp` | forward-mode autodiff | `prim.hpp` |
| `stan/math/mix.hpp` | mixed-mode autodiff | `prim.hpp`, `fwd.hpp`, `rev.hpp` |
These top-level header files ensure that all necessary traits files are included in the appropriate order. For more detail, see:

- the Stack Overflow discussion,
- the Google forums topic on using the C preprocessor to handle order-dependent includes, and
- Math issue #383: use preprocessor guards to error on include order problems.
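For example, a minimal reverse-mode program needs only the single `stan/math/rev.hpp` include. The sketch below is illustrative; the compile command in the comment is an assumption, and the exact include paths (and any TBB link flags newer versions require) depend on where Stan Math and its bundled Eigen, Boost, and TBB libraries live on your system.

```cpp
// example.cpp -- minimal reverse-mode autodiff sketch (illustrative).
// A possible compile command (paths are placeholders, not exact flags):
//   g++ -std=c++14 -I /path/to/stan-math -I /path/to/eigen \
//       -I /path/to/boost -I /path/to/tbb/include example.cpp -o example
#include <stan/math/rev.hpp>  // include exactly one top-level Stan Math header
#include <iostream>

int main() {
  stan::math::var x = 2.0;          // autodiff variable
  stan::math::var y = x * x + 3.0;  // y = x^2 + 3
  y.grad();                         // propagate adjoints back to x
  std::cout << "y = " << y.val()        // 7
            << ", dy/dx = " << x.adj()  // 4
            << std::endl;
  return 0;
}
```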
## MPI

The message passing interface (MPI) allows messages to be exchanged between different processes. Stan can use MPI to parallelize the computation of a single log probability across multiple processes. The target audience for MPI is users with large compute clusters. For parallel computation on a single computer, please turn to a threading-based approach instead, which is easier to use and provides similar performance gains.
Stan supports MPI on Mac OS X and Linux. Windows is not supported, though we have heard that some users have gotten it to work.
A base MPI installation must be available on the system. See the instructions from boost.mpi to verify that you have a working MPI setup. For Mac OS X and Linux, any MPI installation that works with boost.mpi is supported. The two major open-source base MPI implementations are mpich and openMPI; the Math library is tested against these two, while others supported by boost.mpi may work as well. The base MPI installation provides the following command line tools:
- `mpicxx`: the recommended compiler command to use when building any MPI application.
- `mpirun`: a wrapper binary used to start an MPI-enabled binary on a given machine.
Please ask your system administrator for details on how to compile, execute, and submit MPI applications on a cluster.
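If you want a quick, Stan-independent sanity check that the base MPI installation and these tools work, a minimal MPI program can be compiled with mpicxx and launched with mpirun. The file and binary names below are placeholders.

```cpp
// hello_mpi.cpp -- checks that the base MPI installation works (not Stan-specific).
// Build and run, e.g.:
//   mpicxx hello_mpi.cpp -o hello_mpi
//   mpirun -np 2 ./hello_mpi
#include <mpi.h>
#include <iostream>

int main(int argc, char** argv) {
  MPI_Init(&argc, &argv);                // start the MPI runtime
  int rank = 0, size = 1;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);  // id of this process
  MPI_Comm_size(MPI_COMM_WORLD, &size);  // total number of processes
  std::cout << "Hello from rank " << rank << " of " << size << std::endl;
  MPI_Finalize();                        // shut down the MPI runtime
  return 0;
}
```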
On Mac OS X, install mpich from MacPorts or Homebrew.
On Linux, the package distribution system of your distribution should have pre-built mpich (or openmpi) packages available. In addition, you must have the following packages installed (Ubuntu package names listed): python-dev, libxml2-dev, and libxslt-dev. You may also be required to add the following to your `make/local`: `LDLIBS+=-lpthread`.
Stan builds its own boost.mpi and boost.serialization libraries and installs them into its library subfolder. If the operating system provides these Boost libraries and you are required to use them, additional configuration needs to be done (through `make/local`) to use that installation. The Boost libraries are built using the Boost build system, which will attempt to auto-detect the specifics of the MPI installation on your system and the toolset to use. Should Boost's auto-detection fail, or should a specific configuration be required, you can manually configure the Boost build system as needed through the configuration file stan-math/lib/boost_1.xx.x/user_config.jam.
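As a rough sketch, such a manual configuration might contain lines like the following. These are standard Boost.Build directives rather than anything Stan-specific, and the compiler and MPI wrapper paths are assumptions that depend on your system.

```
# user_config.jam (illustrative)
using gcc : : /usr/bin/g++ ;    # pin the toolset to a specific C++ compiler
using mpi : /usr/bin/mpicxx ;   # tell boost.mpi's build which MPI wrapper to use
```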
We strongly recommend using the mpicxx command to build any program that uses MPI within Math. While it is possible to change the compiler used by this wrapper (openMPI has a -cxx= option, for example), this should only be done with great caution. The complication is that during compilation of the base MPI libraries the exact bit representation of each type is analyzed, and strong deviations due to compiler changes may lead to unexpected behavior. If the compiler used for the base MPI libraries does not match the one used for boost.mpi (and Math), changes in the compiler ABI can lead to unexpected segfaults. We therefore recommend using mpicxx as the compiler and do not recommend deviating from the compiler used to build MPI. Often this means using the system default compiler, which may be rather old and not ideal for Stan; in such cases a more modern gcc (if gcc is the system compiler) can be considered, as long as no ABI changes are known.
Stan uses the boost.mpi library to interface with the installed MPI implementation; boost.mpi is built automatically when the Math library is configured for MPI. To configure MPI for the Math library:

- Ensure that a base MPI installation is available and accessible on the system (see the requirements above).
- Open the text file `make/local` in the top level of the Math library; if it does not exist, create it.
- Add these lines to `make/local`:

```
STAN_MPI=true
CXX=mpicxx
TBB_CXX_TYPE=gcc
```

- Optional: instead of `CXX=mpicxx`, you may specify the compiler yourself along with the compiler and linker options needed to build an MPI-enabled binary (`mpicxx -show` displays these for mpich, while openmpi uses `mpicxx -show-me`), but please read the note on compilers above.
- Clean all binaries. After changing the configuration through `make/local`, all of the tests should be rebuilt. Please type:

```
make clean-all
```
Once the Math library is configured for MPI, the tests will be built with MPI. Note that the boost.mpi and boost.serialization libraries are built as shared libraries and linked against dynamically.
## OpenCL

OpenCL is an open standard for writing programs that run on platforms with heterogeneous hardware. Stan uses OpenCL to implement GPU routines for the Cholesky decomposition and its derivative; other routines will be available in the future. These routines are suitable for programs that require solving large NxM matrices (N>600), such as algorithms that utilize large covariance matrices.
Users must have suitable hardware (e.g. an Nvidia or AMD GPU) that supports OpenCL 1.2, a valid OpenCL driver, and a suitable C/C++ compiler installed on their computer.
The following guide is for Ubuntu, but it should be similar for any other Linux distribution. You should have the GNU compiler suite or clang compiler installed beforehand.
If you have an Nvidia GPU, install the Nvidia CUDA toolkit and the clinfo tool:

```
apt update
apt install nvidia-cuda-toolkit clinfo
```

Those with AMD devices can install the OpenCL driver available through:

```
apt install -y libclc-amdgcn mesa-opencl-icd clinfo
```

If your device is not supported by the currently available drivers, you can try the Paulo Miguel Dias PPA:

```
add-apt-repository ppa:paulo-miguel-dias/mesa
apt-get update
apt-get install libclc-amdgcn mesa-opencl-icd
```

Macs should already have the OpenCL driver installed if you have the appropriate hardware.
Note that if you are building on a Mac laptop you may not have a GPU device; you can still use the OpenCL routines for parallelization on your CPU.
Install the latest Rtools suite if you don't already have it. During the installation, make sure that the 64-bit toolchain is installed. You also need to verify that the Path system environment variable includes the path to the g++ compiler (<Rtools installation path>\mingw_64\bin).
If you have an Nvidia card, install the latest Nvidia CUDA toolkit. AMD users should use the AMD APP SDK.
Users can check that their installation is valid by downloading and running clinfo.
To turn on GPU computation:
- Check and record which device and platform you would like to use with clinfo; you will need the platform and device indices, as in the printout below:

```
clinfo -l
# Platform #0: Clover
# Platform #1: Portable Computing Language
#  `-- Device #0: pthread-AMD Ryzen Threadripper 2950X 16-Core Processor
# Platform #2: NVIDIA CUDA
#  +-- Device #0: TITAN Xp
#  `-- Device #1: GeForce GTX 1080 Ti
```

- In the top level of the Math library, open a text file called make/local; if it does not exist, create it. If you are using the OpenCL functionality via CmdStan, you can instead edit the file in the make folder of CmdStan (cmdstan/make/local).
- Add these lines to the make/local file:
```
STAN_OPENCL=true
OPENCL_DEVICE_ID=${CHOSEN_INDEX}
OPENCL_PLATFORM_ID=${CHOSEN_INDEX}
```

where ${CHOSEN_INDEX} is replaced with the indices of the device and platform you would like to use; in most cases both will be 0. If you are using Windows, append the following lines at the end of the make/local file in order to link with the appropriate OpenCL library:
- Nvidia:

```
CC = g++
LDFLAGS_OPENCL= -L"$(CUDA_PATH)\lib\x64" -lOpenCL
```

- AMD:

```
CC = g++
LDFLAGS_OPENCL= -L"$(AMDAPPSDKROOT)lib\x86_64" -lOpenCL
```

Once you have done the above, runTests.py should execute with the GPU enabled. All OpenCL tests match the phrase *_opencl_* and can be filtered, for example:

```
./runTests.py test/unit -f opencl
```

We currently have support for the following methods:
- bernoulli_logit_glm_lpmf
- cholesky_decompose
- categorical_logit_glm_lpmf
- gp_exp_quad_cov
- mdivide_right_tri
- mdivide_left_tri
- multiplication
- neg_binomial_2_log_glm_lpmf
- normal_id_glm_lpdf
- ordered_logistic_glm_lpmf
- poisson_log_glm_lpmf
TODO(Rok): provide example models for GLMs and GP
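As a hedged illustration of how one of these routines is called, the sketch below uses cholesky_decompose from C++. When the library is built with STAN_OPENCL=true and the input crosses the internal size threshold, the computation can be routed through the OpenCL device; the calling code itself does not change. The matrix construction here is just a placeholder.

```cpp
// cholesky_example.cpp -- calling stan::math::cholesky_decompose (sketch).
#include <stan/math/prim.hpp>  // also pulls in Eigen
#include <iostream>

int main() {
  const int n = 4;  // the OpenCL path pays off for much larger n (N > 600)
  // A = B * B^T + n * I is symmetric positive definite.
  Eigen::MatrixXd B = Eigen::MatrixXd::Random(n, n);
  Eigen::MatrixXd A = B * B.transpose()
                      + static_cast<double>(n) * Eigen::MatrixXd::Identity(n, n);

  Eigen::MatrixXd L = stan::math::cholesky_decompose(A);  // lower-triangular factor
  std::cout << "L =\n" << L << std::endl;
  return 0;
}
```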
If you see the following error:
```
clBuildProgram CL_OUT_OF_HOST_MEMORY: Unknown error -6
```

you have most likely run out of available memory on your host system. OpenCL kernels are compiled just-in-time at the start of any OpenCL-enabled Stan/Stan Math program and thus may require more memory than running without OpenCL support. If several CmdStan processes are started at the same time, each process needs that memory for a moment; if there is not enough memory to compile the OpenCL kernels, you will see this error. Try running your model with fewer processes. Upgrading your GPU driver may also reduce the RAM needed for OpenCL kernel compilation.