Commit 2446158 ("debug"), 1 parent: 1f228fe
File tree: 1 file changed (+12, −0)


chapter_accelerator/Programming_Methods.md

@@ -63,6 +63,18 @@ and $\alpha$ and $\beta$ are parameters provided by users.
 
 ### High-level Computation Operators {#sec-accelerator-use-cublas}
 
+Using an operator acceleration library directly is the most
+straightforward method. NVIDIA offers two operator libraries: cuBLAS
+and cuDNN. cuBLAS provides an interface for leveraging Tensor Cores to
+accelerate GEMM operations, while cuDNN offers an interface for
+accelerating neural network operations. To utilize Tensor Cores for
+GEMM via cuBLAS, we can call the function `cublasGemmEx`, whose
+signature is shown in Code
+[\[lst:cublasGemmEx\]](#lst:cublasGemmEx){reference-type="ref"
+reference="lst:cublasGemmEx"}.
+
+    [caption={Signature of cublasGemmEx}, label={lst:cublasGemmEx}]
+    cublasStatus_t cublasGemmEx(cublasHandle_t handle, cublasOperation_t transa, cublasOperation_t transb, int m, int n, int k, const void *alpha, const void *A, cudaDataType_t Atype, int lda, const void *B, cudaDataType_t Btype, int ldb, const void *beta, void *C, cudaDataType_t Ctype, int ldc, cublasComputeType_t computeType, cublasGemmAlgo_t algo)
 
 [^1]: available at
     <https://docs.nvidia.com/cuda/inline-ptx-assembly/index.html>
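As orientation for the listing added in this commit: below is a minimal, hypothetical sketch of calling `cublasGemmEx` with FP16 inputs and FP32 accumulation, a combination that lets cuBLAS dispatch the GEMM to Tensor Cores. The matrix sizes, uninitialized buffers, and error handling are illustrative assumptions, not part of the committed text.

```c
/* Hypothetical sketch: mixed-precision GEMM via cublasGemmEx.
   FP16 inputs, FP32 output and compute type; sizes are illustrative. */
#include <stdio.h>
#include <cuda_runtime.h>
#include <cuda_fp16.h>
#include <cublas_v2.h>

int main(void) {
    const int m = 256, n = 256, k = 256;
    const float alpha = 1.0f, beta = 0.0f;

    /* Device buffers; a real program would fill A and B via cudaMemcpy. */
    __half *A, *B;
    float *C;
    cudaMalloc((void **)&A, (size_t)m * k * sizeof(__half));
    cudaMalloc((void **)&B, (size_t)k * n * sizeof(__half));
    cudaMalloc((void **)&C, (size_t)m * n * sizeof(float));

    cublasHandle_t handle;
    cublasCreate(&handle);

    /* FP16 inputs (CUDA_R_16F) with FP32 accumulation (CUBLAS_COMPUTE_32F)
       allow cuBLAS to route the GEMM to Tensor Cores. cuBLAS assumes
       column-major storage, so the leading dimensions are m, k, and m. */
    cublasStatus_t status = cublasGemmEx(
        handle, CUBLAS_OP_N, CUBLAS_OP_N,
        m, n, k,
        &alpha,
        A, CUDA_R_16F, m,
        B, CUDA_R_16F, k,
        &beta,
        C, CUDA_R_32F, m,
        CUBLAS_COMPUTE_32F,
        CUBLAS_GEMM_DEFAULT);
    if (status != CUBLAS_STATUS_SUCCESS)
        fprintf(stderr, "cublasGemmEx failed: %d\n", (int)status);

    cudaDeviceSynchronize();
    cublasDestroy(handle);
    cudaFree(A);
    cudaFree(B);
    cudaFree(C);
    return 0;
}
```

With `CUBLAS_COMPUTE_32F` and `CUBLAS_GEMM_DEFAULT`, recent cuBLAS releases select a Tensor Core kernel automatically when the data types and problem shape permit; the sketch would be built with `nvcc` and linked against `-lcublas`.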
