README.md: 4 additions & 4 deletions
@@ -7,7 +7,7 @@ MATLAB Tensor Core models
This repository provides accurate tensor core models written in MATLAB. It also includes parts of the model validation data which is used to refine the models as shown in [1].
-The [models](models/) directory contains the MATLAB models of tensor core in different GPUs, all of which are build on the parameterised model in [Generic_BFMA_TC.m](models/tools/Generic_BFMA_TC.m). For example the [B200TC.m](models/B200TC.m) models the General Matrix Multiply (GEMM) based on the accurate model of a tensor core in the NVIDIA Blackwell B200 GPUs. In the current version of the toolbox, the models take matrices and input and output floating-point formats as inputs and multiply the matrices by using a recursive summation algorithm to accummulate the results of several tensor core invocations.
+The [models](models/) directory contains the MATLAB models of tensor cores of several NVIDIA GPUs, all of which are built on the parameterised model in [Generic_BFMA_TC.m](models/tools/Generic_BFMA_TC.m). For example, [B200TC.m](models/B200TC.m) models the General Matrix Multiply (GEMM) based on an accurate model of a tensor core in the NVIDIA Blackwell B200 GPUs. In the current version of the toolbox, the models take matrices and input and output floating-point formats as inputs and multiply the matrices using a recursive summation algorithm to accumulate the results of several tensor core invocations.
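The accumulation scheme described in that paragraph can be pictured with the following minimal MATLAB sketch. It is not the toolbox code: the block size `k_block` and the helper `tc_gemm_block` (standing in for a single tensor core invocation) are hypothetical names used only for illustration.

```matlab
% Illustration only: split a GEMM along the inner dimension and accumulate
% the partial products of several (modelled) tensor core invocations by
% recursive summation, i.e. one addition to the running result per block.
% tc_gemm_block is a hypothetical stand-in for one tensor core call.
function C = gemm_by_recursive_summation(A, B, k_block)
    [m, K] = size(A);
    n = size(B, 2);
    C = zeros(m, n);                       % accumulator in the output format
    for k = 1:k_block:K
        cols = k:min(k + k_block - 1, K);  % one tensor-core-sized slice
        C = C + tc_gemm_block(A(:, cols), B(cols, :));
    end
end
```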
The initial analysis of the behaviour of GPU tensor cores is performed with the code available at [IEEE_HPEC2025_block_FMA_tests](https://github.com/faiziktk/IEEE_HPEC2025_block_FMA_tests).
It is based on the generalised testing methodology [2] which determines the following features of hardware computing mixed-precision inner products:
@@ -32,7 +32,7 @@ The [experiments](experiments/) directory contains various experiments with some
## Example: Using in-built models
The following example rounds two matrices to fp16 and multiplies them using the model of the B200 tensor core.
-Note that B200TC compute the GEMM and alpha and beta scale factors are set to 1.
+Note that B200TC computes the GEMM, with alpha and beta scale factors set to 1.
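A rough sketch of what such a call might look like is given below. The argument list of `B200TC` and the use of the `chop` rounding function are assumptions made for illustration, not the documented interface; see [B200TC.m](models/B200TC.m) for the actual signature.

```matlab
% Hypothetical usage sketch -- the B200TC argument list is assumed,
% not taken from the toolbox documentation.
A = rand(64);
B = rand(64);

% Round the operands to fp16 before the call (here with the chop function,
% https://github.com/higham/chop, as one possible rounding tool).
opts.format = 'fp16';
Ah = chop(A, opts);
Bh = chop(B, opts);

% Multiply with the B200 tensor core model; the GEMM is computed with
% alpha = beta = 1 as noted above. 'fp16'/'fp32' denote the assumed
% input and output formats.
C = B200TC(Ah, Bh, 'fp16', 'fp32');
```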