Distributed Communication-Optimal Matrix-Matrix Multiplication Algorithm
linear-algebra mpi cuda scalapack matrix-multiplication gpu-acceleration rocm matmul communication-optimal pdgemm
Updated Dec 4, 2025 - C++