@@ -14,11 +14,57 @@ Benchmarks for development.
 
 ** Default flags:**
 
+** Version: v0.9**
+```
+Setup
+-------
+VERSION: v0.9
+TARGET: CUDNN
+CORES: 16
+EPOCHS: 5
+-- C++ flags: -fopenmp
+-- C++ flags (release): -O3 -march=native -mtune=native -Ofast -msse -mfpmath=sse -ffast-math -ftree-vectorize
+-- C++ flags (debug): -O0 -g
+
+Training/Evaluation:
+--------------------
+Epoch 1
+Batch 300 softmax4 ( loss[softmax_cross_entropy]=0.2405 metric[categorical_accuracy]=0.9279 ) -- 0.0049 secs/batch
+1.4780 secs/epoch
+Epoch 2
+Batch 300 softmax4 ( loss[softmax_cross_entropy]=0.0799 metric[categorical_accuracy]=0.9760 ) -- 0.0043 secs/batch
+1.2766 secs/epoch
+Epoch 3
+Batch 300 softmax4 ( loss[softmax_cross_entropy]=0.0507 metric[categorical_accuracy]=0.9841 ) -- 0.0041 secs/batch
+1.2286 secs/epoch
+
+Memory:
+--------
+PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
+4516 salvaca+ 20 0 12,3g 1,8g 544700 R 350,0 11,7 0:13.12 cifar_conv
+
+GPU Memory:
+Wed Feb 17 12:56:16 2021
++-----------------------------------------------------------------------------+
+| NVIDIA-SMI 460.32.03 Driver Version: 460.32.03 CUDA Version: 11.2 |
+|-------------------------------+----------------------+----------------------+
+| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
+| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
+| | | MIG M. |
+|===============================+======================+======================|
+| 0 GeForce GTX 1070 On | 00000000:09:00.0 On | N/A |
+| 49% 67C P2 100W / 190W | 1939MiB / 8118MiB | 83% Default |
+| | | N/A |
++-------------------------------+----------------------+----------------------+
+
+```
+
+** Version: v0.7**
 ```
 Setup
 -------
 VERSION: v0.7
-TARGET: CPU
+TARGET: GPU
 CORES: 16
 EPOCHS: 1
 C++ flags (release): -O3
@@ -47,13 +93,45 @@ GPU Memory:
 | 0 GeForce GTX 1070 Off | 00000000:09:00.0 On | N/A |
 | 54% 71C P2 76W / 190W | 1188MiB / 8118MiB | 71% Default |
 +-------------------------------+----------------------+----------------------+
-
 ```
 
 #### CPU only
 
 ** Default flags:**
 
+
+** Version: v0.9**
+```
+Setup
+-------
+VERSION: v0.7
+TARGET: CPU
+CORES: 16
+EPOCHS: 5
+-- C++ flags: -fopenmp
+-- C++ flags (release): -O3 -march=native -mtune=native -Ofast -msse -mfpmath=sse -ffast-math -ftree-vectorize
+-- C++ flags (debug): -O0 -g
+
+Training/Evaluation:
+--------------------
+Epoch 1
+Batch 300 softmax4 ( loss[softmax_cross_entropy]=0.2374 metric[categorical_accuracy]=0.9297 ) -- 0.0406 secs/batch
+12.1930 secs/epoch
+Epoch 2
+Batch 300 softmax4 ( loss[softmax_cross_entropy]=0.0790 metric[categorical_accuracy]=0.9764 ) -- 0.0337 secs/batch
+10.1122 secs/epoch
+Epoch 3
+Batch 300 softmax4 ( loss[softmax_cross_entropy]=0.0515 metric[categorical_accuracy]=0.9839 ) -- 0.0485 secs/batch
+14.5351 secs/epoch
+
+Memory:
+--------
+PID USER PRI NI VIRT RES S CPU% MEM% TIME+ Command
+309406 salvaca+ 20 0 872076 318928 13228 R 1444 1,9 11:58.60 mnist_mlp
+
+```
+
+** Version: v0.7**
 ```
 Setup
 -------
@@ -179,6 +257,49 @@ Process finished with exit code 139 (interrupted by signal 11: SIGSEGV)
 
 ** Default flags:**
 
+** Version: v0.9**
+```
+Setup
+-------
+VERSION: v0.9
+TARGET: CUDNN
+CORES: 16
+EPOCHS: 1
+-- C++ flags: -fopenmp
+-- C++ flags (release): -O3 -march=native -mtune=native -Ofast -msse -mfpmath=sse -ffast-math -ftree-vectorize
+-- C++ flags (debug): -O0 -g
+
+Training/Evaluation:
+--------------------
+5 epochs of 500 batches of size 100
+Epoch 1
+Batch 500 softmax6 ( loss[softmax_cross_entropy]=1.6524 metric[categorical_accuracy]=0.3853 ) -- 0.0074 secs/batch
+3.6942 secs/epoch
+Epoch 2
+Batch 500 softmax6 ( loss[softmax_cross_entropy]=1.1562 metric[categorical_accuracy]=0.5863 ) -- 0.0069 secs/batch
+3.4294 secs/epoch
+Epoch 3
+Batch 500 softmax6 ( loss[softmax_cross_entropy]=0.9170 metric[categorical_accuracy]=0.6756 ) -- 0.0067 secs/batch
+3.3702 secs/epoch
+
+Memory:
+--------
+PID USER PRI NI VIRT RES S CPU% MEM% TIME+ Command
+12366 salvaca+ 20 0 11,137g 1,011g 191004 R 1562 6,5 0:28.25 cifar_conv
+
+GPU Memory:
++-----------------------------------------------------------------------------+
+| NVIDIA-SMI 440.82 Driver Version: 440.82 CUDA Version: 10.2 |
+|-------------------------------+----------------------+----------------------+
+| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
+| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
+|===============================+======================+======================|
+| 0 GeForce GTX 1070 Off | 00000000:09:00.0 On | N/A |
+| 58% 77C P2 88W / 190W | 1369MiB / 8118MiB | 94% Default |
++-------------------------------+----------------------+----------------------+
+```
+
+** Version: v0.7**
 ```
 Setup
 -------
@@ -213,15 +334,49 @@ GPU Memory:
 | 0 GeForce GTX 1070 Off | 00000000:09:00.0 On | N/A |
 | 58% 77C P2 88W / 190W | 1369MiB / 8118MiB | 94% Default |
 +-------------------------------+----------------------+----------------------+
-
-
 ```
 
 
 #### CPU only
 
 ** Default flags:**
 
+** Version: v0.9**
+```
+Setup
+-------
+VERSION: v0.7
+TARGET: CPU
+CORES: 16
+EPOCHS: 3
+-- C++ flags: -fopenmp
+-- C++ flags (release): -O3 -march=native -mtune=native -Ofast -msse -mfpmath=sse -ffast-math -ftree-vectorize
+-- C++ flags (debug): -O0 -g
+
+Training/Evaluation:
+--------------------
+3 epochs of 500 batches of size 100
+Epoch 1
+Batch 500 softmax6 ( loss[softmax_cross_entropy]=1.6671 metric[categorical_accuracy]=0.3816 ) -- 0.1481 secs/batch
+74.0601 secs/epoch
+Epoch 2
+Batch 500 softmax6 ( loss[softmax_cross_entropy]=1.1738 metric[categorical_accuracy]=0.5817 ) -- 0.1678 secs/batch
+83.8880 secs/epoch
+Epoch 3
+Batch 500 softmax6 ( loss[softmax_cross_entropy]=0.9311 metric[categorical_accuracy]=0.6737 ) -- 0.1466 secs/batch
+73.3027 secs/epoch
+Evaluate with batch size 100
+Batch 100 softmax6 ( loss[softmax_cross_entropy]=0.9291 metric[categorical_accuracy]=0.6796 ) --
+
+Memory:
+--------
+PID USER PRI NI VIRT RES S CPU% MEM% TIME+ Command
+335469 salvaca+ 20 0 10,4g 1,4g 127808 R 1406 9,2 32:51.72 cifar_conv
+
+```
+
+** Version: v0.7**
+
 ```
 Setup
 -------