cuSPARSELt - `cusparseLtMatmul`

Description

This sample demonstrates how to use the cuSPARSELt library and its `cusparseLtMatmul` API to multiply a structured sparse matrix by a dense matrix on NVIDIA Sparse Tensor Cores. The structured matrix has a 50% sparsity ratio and is stored in compressed form. The sample also demonstrates batched computation, Split-K, the ReLU activation function, and bias.
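To make the 50% structured sparsity requirement concrete, here is a minimal host-side sketch. It assumes the common 2:4 pattern (at most two nonzeros in every aligned group of four consecutive elements along a row) and a row-major layout. The function name `is_two_four_sparse` is hypothetical and not part of cuSPARSELt, and `float` is used only to keep the illustration simple.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a cuSPARSELt API): returns true if every aligned
// group of four consecutive elements in each row holds at most two nonzeros.
// `matrix` is row-major with rows x cols elements; cols must be a multiple of 4.
bool is_two_four_sparse(const std::vector<float>& matrix,
                        std::size_t rows, std::size_t cols) {
    assert(cols % 4 == 0 && matrix.size() == rows * cols);
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < cols; c += 4) {
            int nonzeros = 0;
            for (std::size_t k = 0; k < 4; ++k) {
                if (matrix[r * cols + c + k] != 0.0f) ++nonzeros;
            }
            if (nonzeros > 2) return false;  // group violates the 2:4 pattern
        }
    }
    return true;
}
```

A check like this can be useful on the host before handing a matrix to the library, since a matrix that does not follow the expected pattern cannot be compressed without changing the result.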

cusparseLt Documentation

C_i = ReLU(A_i * B_i + C_i + bias)

where A_i is a structured matrix, B_i and C_i are dense matrices, and i is the batch index
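The operation above can be sketched as a plain CPU reference for a single batch entry, which may help when validating GPU results. This is an illustration under stated assumptions, not the sample's own code: matrices are row-major, the scaling factors are both 1 as in the formula, and the bias is assumed to be a vector with one entry per output row. The name `reference_matmul_relu_bias` is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference for one batch entry: C = ReLU(A * B + C + bias).
// A is m x k, B is k x n, C is m x n, all row-major.
// Assumption: bias has m entries and bias[i] is added to every element of row i.
void reference_matmul_relu_bias(const std::vector<float>& A,
                                const std::vector<float>& B,
                                std::vector<float>& C,
                                const std::vector<float>& bias,
                                std::size_t m, std::size_t n, std::size_t k) {
    assert(A.size() == m * k && B.size() == k * n);
    assert(C.size() == m * n && bias.size() == m);
    for (std::size_t i = 0; i < m; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            float acc = C[i * n + j] + bias[i];   // existing C plus row bias
            for (std::size_t p = 0; p < k; ++p) {
                acc += A[i * k + p] * B[p * n + j];
            }
            C[i * n + j] = std::max(acc, 0.0f);   // ReLU
        }
    }
}
```

For a batched run, the same function would be applied to each (A_i, B_i, C_i) triple in turn.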

Building

Linux

make CUSPARSELT_PATH=<cusparseLt_path> CUDA_TOOLKIT_PATH=<cuda_toolkit_path>

or, alternatively, with CMake:

mkdir build
cd build
cmake -DCUSPARSELT_PATH=<cusparseLt_path> -DCMAKE_CUDA_COMPILER=<nvcc_path> ..
make

Support

Supported SM Architectures: SM 8.0, SM 8.6, SM 8.9, SM 9.0
Supported OSes: Linux, Windows
Supported CPU Architectures: x86_64, arm64
Supported Compilers: gcc, clang, Microsoft MSVC, NVIDIA HPC SDK nvc
Language: C++14

Prerequisites

CUDA 12.0 toolkit (or above) and compatible driver (see CUDA Driver Release Notes).
cusparseLt 0.6.1 or above
CMake 3.18 or above
