██╗ ██╗████████╗███╗ ███╗██╗ ██╗███████╗███╗ ███╗ █████╗ ████████╗██╗ ██╗
██║ ██║╚══██╔══╝████╗ ████║██║ ██║██╔════╝████╗ ████║██╔══██╗╚══██╔══╝██║ ██║
██║ ██║ ██║ ██╔████╔██║██║ ██║███████╗██╔████╔██║███████║ ██║ ███████║
██║ ██║ ██║ ██║╚██╔╝██║██║ ██║╚════██║██║╚██╔╝██║██╔══██║ ██║ ██╔══██║
███████╗██║ ██║ ██║ ╚═╝ ██║╚██████╔╝███████║██║ ╚═╝ ██║██║ ██║ ██║ ██║ ██║
╚══════╝╚═╝ ╚═╝ ╚═╝ ╚═╝ ╚═════╝ ╚══════╝╚═╝ ╚═╝╚═╝ ╚═╝ ╚═╝ ╚═╝ ╚═╝
░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░
- Compiler: tested with `g++-14` and above, with `--std=c++23`
  - uses `<print>`, `<span>`, and `<ranges>`.
- MPI: OpenMPI or equivalent
  - currently under development, not intended for use.
- Configure CMake: `cmake -B build -DTESTING=ON -DCMAKE_BUILD_TYPE={Release|RelWithDebInfo|Debug}`
  - Default `CMAKE_BUILD_TYPE` = `RelWithDebInfo`
  - Default `TESTING` = `OFF`
- Build and Compile: `cmake --build build`
- Run Benchmarks: `./run_bench_mm <min_pow_2_size> <max_pow_2_size> <n_iterations>`
  - default config: `./run_bench_mm 2 9 100`
- Matrix Multiplication
  - Naive implementation
  - Transpose before multiplication for cache-friendly behaviour
  - Strassen's Algorithm
- Statistics
  - Compute basic measures of central tendency
- Benchmarking and Timing
  - Using `chrono::steady_clock`
- Testing
  - Implemented with GoogleTest, which CMake fetches during the build.
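
To illustrate the difference between the first two multiplication variants listed above, here is a minimal sketch. It assumes square, row-major matrices stored in a flat `std::vector<double>`; the names `Matrix`, `multiply_naive`, and `multiply_transposed` are hypothetical and are not taken from the project's sources.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical names for illustration only (not the project's API).
// A matrix is n x n, stored row-major in a flat vector.
using Matrix = std::vector<double>;

// Naive triple loop: the inner loop reads B down a column,
// so consecutive accesses are n elements apart in memory.
Matrix multiply_naive(const Matrix& a, const Matrix& b, std::size_t n) {
    Matrix c(n * n, 0.0);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            double sum = 0.0;
            for (std::size_t k = 0; k < n; ++k) {
                sum += a[i * n + k] * b[k * n + j];
            }
            c[i * n + j] = sum;
        }
    }
    return c;
}

// Transpose B once up front so the inner loop reads both
// operands contiguously, which tends to be friendlier to the cache.
Matrix multiply_transposed(const Matrix& a, const Matrix& b, std::size_t n) {
    Matrix bt(n * n);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            bt[j * n + i] = b[i * n + j];
        }
    }

    Matrix c(n * n, 0.0);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            double sum = 0.0;
            for (std::size_t k = 0; k < n; ++k) {
                sum += a[i * n + k] * bt[j * n + k];
            }
            c[i * n + j] = sum;
        }
    }
    return c;
}
```

Both functions compute the same product; the transposed variant pays an extra O(n²) pass and a temporary buffer in exchange for sequential reads in the O(n³) loop.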
- Hardware: AMD 5800X3D, 4.1GHz, 32GB DDR4
- Compiler: GCC 15
- Flags: `-std=c++23 -O2 -g`
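
As a rough sketch of how a benchmark of this kind might collect timings with `std::chrono::steady_clock` and reduce them to basic measures of central tendency, consider the following. The helper names `time_runs`, `mean`, and `median` are assumptions made for illustration and may not match the project's code.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helpers for illustration only (not the project's API).

// Run `fn` n_iterations times and return each elapsed time in microseconds.
// steady_clock is monotonic, so system clock adjustments cannot skew a sample.
template <typename Fn>
std::vector<double> time_runs(Fn&& fn, std::size_t n_iterations) {
    std::vector<double> samples;
    samples.reserve(n_iterations);
    for (std::size_t i = 0; i < n_iterations; ++i) {
        const auto start = std::chrono::steady_clock::now();
        fn();
        const auto stop = std::chrono::steady_clock::now();
        samples.push_back(
            std::chrono::duration<double, std::micro>(stop - start).count());
    }
    return samples;
}

// Arithmetic mean; returns 0 for an empty sample set.
double mean(const std::vector<double>& xs) {
    if (xs.empty()) return 0.0;
    return std::accumulate(xs.begin(), xs.end(), 0.0) /
           static_cast<double>(xs.size());
}

// Median; takes a copy because sorting reorders the samples.
double median(std::vector<double> xs) {
    if (xs.empty()) return 0.0;
    std::sort(xs.begin(), xs.end());
    const std::size_t mid = xs.size() / 2;
    return xs.size() % 2 == 1 ? xs[mid] : 0.5 * (xs[mid - 1] + xs[mid]);
}
```

Reporting the median alongside the mean is a common choice for timing data, since a few slow outlier runs pull the mean upward but leave the median largely unaffected.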
- Matrix algebra methods
- More robust benchmarking using RDTSC
- Parallel MPI implementations
- SIMD implementations
- CUDA implementations
Dr. Abhimanyu Bhadauria, Ph.D
![Double Precision Matrix Multiplication [FP64]](/abhimanyu232/LitmusMath/raw/master/data/benchmarks/images/Matrix_Multiplication_%5BFP64%5D.png)
![Single Precision Matrix Multiplication [FP32]](/abhimanyu232/LitmusMath/raw/master/data/benchmarks/images/Matrix_Multiplication_%5BFP32%5D.png)