Commit c3c2b12

committed
debug
1 parent 357b68e commit c3c2b12

1 file changed

+0
-11
lines changed
chapter_accelerator/Performance_Optimization_Methods.md

Lines changed: 0 additions & 11 deletions
@@ -93,15 +93,4 @@ used to compute the entire matrix $C$.
 Eigen is used to generate data and compute the GEMM result on the CPU.
 In addition, error computing and time profiling code are implemented for
 the GPU computing result. For details, see
-[first_attempt.cu](https://github.com/openmlsys/openmlsys-cuda/blob/main/first_attempt.cu).
-After the program is compiled and executed, output results are as
-follows:
 
-Average time: 48.961 ms
-Max error: 0.000092
-
-The peak GPU throughput can be approximated by using the following
-formula: 2 $\times$ Frequency $\times$ Number of single-precision
-compute units. The number of single-precision compute units equals the
-number of SMs in the GPU multiplied by the number of single-precision
-compute units in each SM. The results are as follows:
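
The peak-throughput formula in the removed paragraph can be sketched as a small host-side helper. This is only an illustration under stated assumptions: the function names `peak_flops` and `gemm_flops` are hypothetical and do not come from `first_attempt.cu`. The factor of 2 is assumed to count one fused multiply-add per compute unit per cycle as two floating-point operations, and the GEMM cost is taken as the usual $2MNK$ operations. The matrix dimensions and device parameters are left as arguments because this hunk does not state them.

```cpp
#include <cassert>
#include <cstdint>

// Approximate peak single-precision throughput in FLOP/s:
//   2 x frequency x (number of SMs x FP32 compute units per SM).
// Assumption: the factor 2 counts a fused multiply-add as two operations.
double peak_flops(double clock_hz, int num_sms, int fp32_units_per_sm) {
    assert(clock_hz > 0.0 && num_sms > 0 && fp32_units_per_sm > 0);
    const double total_units =
        static_cast<double>(num_sms) * static_cast<double>(fp32_units_per_sm);
    return 2.0 * clock_hz * total_units;
}

// Achieved throughput in FLOP/s of an M x N x K GEMM that took
// elapsed_ms milliseconds, counting 2*M*N*K floating-point operations.
double gemm_flops(std::int64_t m, std::int64_t n, std::int64_t k,
                  double elapsed_ms) {
    assert(m > 0 && n > 0 && k > 0 && elapsed_ms > 0.0);
    const double ops = 2.0 * static_cast<double>(m) *
                       static_cast<double>(n) * static_cast<double>(k);
    return ops / (elapsed_ms * 1e-3);  // milliseconds to seconds
}
```

Dividing `gemm_flops(...)` for a measured average time by `peak_flops(...)` for the target device gives the fraction of peak that a kernel reaches. Take the clock rate, SM count, and units per SM from the device's specification rather than from this sketch.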