ParLib is a CPU-GPU collaborative parallel library. It is built with CMake and uses mixed GCC and NVCC compilation.
- A modern C++ compiler compliant with the C++17 standard (gcc >= 9)
- CMake (>= 2.8)
- Facebook folly library (>= v2022.11.28.00)
- nvcc (>= 12.4)
- gflags (>= 2.2)
First, clone the project and install the dependencies in your environment.
# Clone the project (SSH).
# Make sure your public key has been uploaded to GitHub!
git clone git@github.com:luck-seu/parlib.git
# Install dependencies.
SRC_DIR=parlib # top-level parlib source dir
cd "$SRC_DIR"
./dependencies.sh
Then, build the project.
./build.sh
Matrix multiplication by CPU multicore parallelism:
./bin/matrix_mul_gpu_exec -n_rows 5000 -n_cols 2000 -lb 0 -ub 100 -parallelism 80 -n_workers 256
Matrix multiplication by GPU SIMD parallelism:
./bin/matrix_mul_gpu_exec -n_rows 5000 -n_cols 2000 -lb 0 -ub 100 -use_gpu
where:
- `-n_rows` specifies the number of rows of the matrix;
- `-n_cols` specifies the number of columns of the matrix (the multiplication takes a matrix and its transpose as input, hence we only need to specify the shape of a single matrix);
- `-lb` specifies the minimum value that can appear in the matrix;
- `-ub` specifies the maximum value that can appear in the matrix;
- `-parallelism` specifies the maximum parallelism on the CPU;
- `-n_workers` specifies the granularity of the task unit, i.e., the task is partitioned into `n_workers` task units, each of which is processed by a thread (worker);
- `-use_gpu` indicates whether GPU acceleration is enabled. The default GPU parallelism is 64 * 64.
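To make the roles of these options concrete, here is a minimal CPU-only sketch in standard C++17. It is not ParLib's implementation or API: `Matrix`, `RandomMatrix`, and `MulByTranspose` are hypothetical names, and the way `parallelism` caps the number of threads that pull from `n_workers` task units is an assumption about how the two flags interact.

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <random>
#include <thread>
#include <vector>

// Hypothetical row-major dense matrix (not a ParLib type).
struct Matrix {
  std::size_t n_rows;
  std::size_t n_cols;
  std::vector<long long> data;
  Matrix(std::size_t r, std::size_t c) : n_rows(r), n_cols(c), data(r * c, 0) {}
  long long& at(std::size_t i, std::size_t j) { return data[i * n_cols + j]; }
  long long at(std::size_t i, std::size_t j) const { return data[i * n_cols + j]; }
};

// Fill an n_rows x n_cols matrix with integers drawn uniformly from [lb, ub].
Matrix RandomMatrix(std::size_t n_rows, std::size_t n_cols, int lb, int ub,
                    unsigned seed = 42) {
  Matrix m(n_rows, n_cols);
  std::mt19937 gen(seed);
  std::uniform_int_distribution<int> dist(lb, ub);
  for (auto& v : m.data) v = dist(gen);
  return m;
}

// Compute C = A * A^T, so only the shape of A is needed.
// Assumption: the rows of C are split into n_workers contiguous task units,
// and at most `parallelism` threads pull task units until none remain.
Matrix MulByTranspose(const Matrix& a, std::size_t parallelism,
                      std::size_t n_workers) {
  Matrix c(a.n_rows, a.n_rows);
  n_workers = std::max<std::size_t>(1, n_workers);
  parallelism = std::max<std::size_t>(1, std::min(parallelism, n_workers));
  std::atomic<std::size_t> next_unit{0};

  auto run = [&]() {
    for (;;) {
      const std::size_t u = next_unit.fetch_add(1);
      if (u >= n_workers) return;
      // Row range [begin, end) owned by task unit u; may be empty.
      const std::size_t begin = u * a.n_rows / n_workers;
      const std::size_t end = (u + 1) * a.n_rows / n_workers;
      for (std::size_t i = begin; i < end; ++i) {
        for (std::size_t j = 0; j < a.n_rows; ++j) {
          long long sum = 0;
          // Row j of A is column j of A^T.
          for (std::size_t k = 0; k < a.n_cols; ++k) {
            sum += a.at(i, k) * a.at(j, k);
          }
          c.at(i, j) = sum;
        }
      }
    }
  };

  std::vector<std::thread> pool;
  for (std::size_t p = 0; p < parallelism; ++p) pool.emplace_back(run);
  for (auto& t : pool) t.join();
  return c;
}
```

Under these assumptions, `MulByTranspose(RandomMatrix(5000, 2000, 0, 100), 80, 256)` would mirror the CPU command above. Task units write disjoint rows of the result, so no locking is needed beyond the atomic counter.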
Applications that must be compiled by nvcc should have file names ending in _gpu.cpp (e.g., apps/client/matrix_mul_gpu.cpp).
This project originates from several parallel systems developed at the Shenzhen Institute of Computer Science (SICS):
Thanks to their authors: Shuhao Liu (SICS), Xiaoke Zhu (Beihang University), and Yang Liu (Beihang University).