To use Horovod with the Intel(R) oneAPI Collective Communications Library (oneCCL), follow the steps below.
- Install oneCCL.
To install oneCCL, follow these steps.
Source setvars.sh to start using oneCCL.
$ source <install_dir>/env/setvars.sh
- Install the Intel(R) MPI Library.
To install the Intel MPI Library, follow these steps.
Source the mpivars.sh script to establish the proper environment settings.
$ source <installdir_MPI>/intel64/bin/mpivars.sh release_mt
- Set the HOROVOD_CPU_OPERATIONS variable.
$ export HOROVOD_CPU_OPERATIONS=CCL
- Install Horovod from source code
$ python setup.py build
$ python setup.py install
or via pip
$ pip install horovod
Advanced: You can specify the affinity for BackgroundThread with the HOROVOD_CCL_BGT_AFFINITY environment variable. See the instructions below.
Set Horovod background thread affinity:
$ export HOROVOD_CCL_BGT_AFFINITY=c0
where c0 is the ID of the core to attach the background thread to.
Set the number of oneCCL workers:
$ export CCL_WORKER_COUNT=X
where X is the number of threads you’d like to dedicate to driving communication. This means that for every rank there are X oneCCL workers available.
Set oneCCL workers affinity:
$ export CCL_WORKER_AFFINITY=c1,c2,..,cX
where c1,c2,..,cX are the core IDs dedicated to oneCCL workers (by default, the X ‘last’ cores are used). This variable sets the affinity for all oneCCL workers available to all the ranks running on one node, so the list covers CCL_WORKER_COUNT * (number of ranks per node) workers.
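
The arithmetic behind the three settings above can be sketched in a small shell snippet. This is only an illustration under stated assumptions: the node size (16 cores), the number of ranks per node (2), and the helper name last_cores are hypothetical and do not come from Horovod or oneCCL. Placing the background thread on the core just below the worker cores is likewise an arbitrary choice made for the example. The sketch reserves the 'last' cores for the workers explicitly and derives the other values from that.

```shell
# Hypothetical helper (not part of Horovod or oneCCL):
# print the last COUNT core IDs out of TOTAL as a comma-separated list.
last_cores() {
    total=$1
    count=$2
    core=$(( total - count ))
    list=""
    while [ "$core" -lt "$total" ]; do
        list="${list:+$list,}$core"   # append with a comma unless the list is empty
        core=$(( core + 1 ))
    done
    printf '%s\n' "$list"
}

# Assumed example values; adjust them to the actual node.
TOTAL_CORES=16        # cores per node (assumption)
RANKS_PER_NODE=2      # ranks launched on each node (assumption)

# Two oneCCL workers for every rank.
export CCL_WORKER_COUNT=2

# One core per worker, for all ranks on the node.
WORKER_CORES=$(( CCL_WORKER_COUNT * RANKS_PER_NODE ))
export CCL_WORKER_AFFINITY="$(last_cores "$TOTAL_CORES" "$WORKER_CORES")"

# Assumed placement: background thread on the core just below the worker cores.
export HOROVOD_CCL_BGT_AFFINITY=$(( TOTAL_CORES - WORKER_CORES - 1 ))

echo "CCL_WORKER_COUNT=$CCL_WORKER_COUNT"
echo "CCL_WORKER_AFFINITY=$CCL_WORKER_AFFINITY"
echo "HOROVOD_CCL_BGT_AFFINITY=$HOROVOD_CCL_BGT_AFFINITY"
```

With these example values, four cores are reserved for oneCCL workers (two workers for each of the two ranks), and one more core is reserved for the Horovod background thread. The remaining cores stay available for computation.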