Commit 70ba612

Repository for E1180 results
1 parent b6aac63 commit 70ba612

10 files changed: 553 additions, 0 deletions

Code/matmul/E1180/matmul.out.1

Lines changed: 102 additions & 0 deletions
@@ -0,0 +1,102 @@
+=========================================================================================================================
+Time to run matmul_8x8x8_col_row 100000000 times = 0.769876 seconds (1.330084e+11 flops)
+Time to run matmul_8x8x8_col_col 100000000 times = 0.890074 seconds (1.150467e+11 flops)
+Time to run matmul_8x8x8_col_row_just_loads 100000000 times = 0.398886 seconds (2.567147e+11 flops)
+Time to run matmul_8x8x8_col_row_with_loads 100000000 times = 0.776930 seconds (1.318008e+11 flops)
+Time to run matmul_8x8x8_col_col_with_loads 100000000 times = 0.957839 seconds (1.069073e+11 flops)
+Time to run matmul_8x8x8_col_row_with_loads_and_stores 100000000 times = 1.532985 seconds (6.679780e+10 flops)
+Time to run matmul_8x8x8_col_col_with_loads_and_stores 100000000 times = 1.828421 seconds (5.600461e+10 flops)
+Time to run matmul_8x8x16_col_row_with_loads_and_stores 100000000 times = 2.342064 seconds (8.744423e+10 flops)
+Time to run matmul_8x8x24_col_row_with_loads_and_stores 100000000 times = 3.108504 seconds (9.882569e+10 flops)
+Time to run matmul_8x8x32_col_row_with_loads_and_stores 100000000 times = 3.872548 seconds (1.057701e+11 flops)
+Time to run matmul_8x8x40_col_row_with_loads_and_stores 100000000 times = 4.644594 seconds (1.102357e+11 flops)
+Time to run matmul_8x8x48_col_row_with_loads_and_stores 100000000 times = 5.396578 seconds (1.138499e+11 flops)
+Time to run matmul_8x8x56_col_row_with_loads_and_stores 100000000 times = 6.150837 seconds (1.165370e+11 flops)
+Time to run matmul_8x8x64_col_row_with_loads_and_stores 100000000 times = 6.908037 seconds (1.185865e+11 flops)
+Time to run matmul_8x8x64_col_col_with_loads_and_stores 100000000 times = 8.524576 seconds (9.609862e+10 flops)
+Time to run matmul_8x8x64_col_col_with_loads_and_stores_store_B 100000000 times = 8.815192 seconds (9.293048e+10 flops)
+Time to run matmul_16x8x64_col_col_with_loads_and_stores 100000000 times = 15.768924 seconds (1.039006e+11 flops)
+Time to run matmul_24x8x64_col_col_with_loads_and_stores 100000000 times = 23.329507 seconds (1.053430e+11 flops)
+Time to run matmul_32x8x64_col_col_with_loads_and_stores 100000000 times = 30.236062 seconds (1.083739e+11 flops)
+Time to run matmul_40x8x64_col_col_with_loads_and_stores 100000000 times = 37.139175 seconds (1.102879e+11 flops)
+Time to run matmul_48x8x64_col_col_with_loads_and_stores 100000000 times = 44.042653 seconds (1.116009e+11 flops)
+Time to run matmul_56x8x64_col_col_with_loads_and_stores 100000000 times = 50.933473 seconds (1.125861e+11 flops)
+Time to run matmul_64x8x64_col_col_with_loads_and_stores 100000000 times = 57.852084 seconds (1.132820e+11 flops)
+
+Performance counter stats for 'numactl -C 8 ./matmul':
+
+316,780.18 msec task-clock:u # 0.999 CPUs utilized
+0 context-switches:u # 0.000 /sec
+0 cpu-migrations:u # 0.000 /sec
+11,873 page-faults:u # 37.480 /sec
+3,756,244,221,231 instructions:u # 2.77 insn per cycle (38.45%)
+1,357,548,589,797 cycles:u # 4.285 GHz (46.15%)
+6,281,174,371 branches:u # 19.828 M/sec (46.16%)
+5,360,488 branch-misses:u # 0.09% of all branches (46.15%)
+1,151,249,509,740 L1-dcache-loads:u # 3.634 G/sec (38.47%)
+7,050,668 L1-dcache-load-misses:u # 0.00% of all L1-dcache accesses (15.39%)
+7,049,902 LLC-loads:u # 22.255 K/sec (15.39%)
+3,763 LLC-load-misses:u # 0.05% of all LL-cache accesses (15.38%)
+474,687,354,193 L1-icache-loads:u # 1.498 G/sec (23.08%)
+646,003 L1-icache-load-misses:u # 0.00% of all L1-icache accesses (30.76%)
+20,516 dTLB-load-misses:u (23.07%)
+2,990 iTLB-load-misses:u (30.76%)
+65,107,925 L1-dcache-prefetches:u # 205.530 K/sec (30.76%)
+
+317.006601119 seconds time elapsed
+
+316.704260000 seconds user
+0.069967000 seconds sys
+
+
+=========================================================================================================================
+Time to run matmul_8x8x8_col_row 100000000 times = 0.768971 seconds (1.331650e+11 flops)
+Time to run matmul_8x8x8_col_col 100000000 times = 0.890030 seconds (1.150523e+11 flops)
+Time to run matmul_8x8x8_col_row_just_loads 100000000 times = 0.399246 seconds (2.564838e+11 flops)
+Time to run matmul_8x8x8_col_row_with_loads 100000000 times = 0.778262 seconds (1.315752e+11 flops)
+Time to run matmul_8x8x8_col_col_with_loads 100000000 times = 0.958104 seconds (1.068777e+11 flops)
+Time to run matmul_8x8x8_col_row_with_loads_and_stores 100000000 times = 1.532526 seconds (6.681778e+10 flops)
+Time to run matmul_8x8x8_col_col_with_loads_and_stores 100000000 times = 1.827643 seconds (5.602845e+10 flops)
+Time to run matmul_8x8x16_col_row_with_loads_and_stores 100000000 times = 2.340351 seconds (8.750825e+10 flops)
+Time to run matmul_8x8x24_col_row_with_loads_and_stores 100000000 times = 3.108394 seconds (9.882917e+10 flops)
+Time to run matmul_8x8x32_col_row_with_loads_and_stores 100000000 times = 3.871464 seconds (1.057998e+11 flops)
+Time to run matmul_8x8x40_col_row_with_loads_and_stores 100000000 times = 4.635839 seconds (1.104439e+11 flops)
+Time to run matmul_8x8x48_col_row_with_loads_and_stores 100000000 times = 5.391231 seconds (1.139628e+11 flops)
+Time to run matmul_8x8x56_col_row_with_loads_and_stores 100000000 times = 6.147851 seconds (1.165936e+11 flops)
+Time to run matmul_8x8x64_col_row_with_loads_and_stores 100000000 times = 6.903446 seconds (1.186654e+11 flops)
+Time to run matmul_8x8x64_col_col_with_loads_and_stores 100000000 times = 8.516832 seconds (9.618600e+10 flops)
+Time to run matmul_8x8x64_col_col_with_loads_and_stores_store_B 100000000 times = 8.817721 seconds (9.290383e+10 flops)
+Time to run matmul_16x8x64_col_col_with_loads_and_stores 100000000 times = 15.757389 seconds (1.039766e+11 flops)
+Time to run matmul_24x8x64_col_col_with_loads_and_stores 100000000 times = 23.322153 seconds (1.053762e+11 flops)
+Time to run matmul_32x8x64_col_col_with_loads_and_stores 100000000 times = 30.221118 seconds (1.084275e+11 flops)
+Time to run matmul_40x8x64_col_col_with_loads_and_stores 100000000 times = 37.130963 seconds (1.103122e+11 flops)
+Time to run matmul_48x8x64_col_col_with_loads_and_stores 100000000 times = 44.022626 seconds (1.116517e+11 flops)
+Time to run matmul_56x8x64_col_col_with_loads_and_stores 100000000 times = 50.917068 seconds (1.126224e+11 flops)
+Time to run matmul_64x8x64_col_col_with_loads_and_stores 100000000 times = 57.819558 seconds (1.133457e+11 flops)
+
+Performance counter stats for 'numactl -C 8 ./matmul':
+
+316,754.96 msec task-clock:u # 0.999 CPUs utilized
+0 context-switches:u # 0.000 /sec
+0 cpu-migrations:u # 0.000 /sec
+11,881 page-faults:u # 37.508 /sec
+3,759,143,257,580 instructions:u # 2.77 insn per cycle (38.46%)
+1,357,809,970,649 cycles:u # 4.287 GHz (46.16%)
+6,558,107,463 branches:u # 20.704 M/sec (46.16%)
+5,149,489 branch-misses:u # 0.08% of all branches (46.17%)
+1,151,758,469,873 L1-dcache-loads:u # 3.636 G/sec (38.48%)
+6,402,139 L1-dcache-load-misses:u # 0.00% of all L1-dcache accesses (15.37%)
+6,470,866 LLC-loads:u # 20.429 K/sec (15.37%)
+6,179 LLC-load-misses:u # 0.10% of all LL-cache accesses (15.39%)
+474,965,018,980 L1-icache-loads:u # 1.499 G/sec (23.08%)
+569,979 L1-icache-load-misses:u # 0.00% of all L1-icache accesses (30.77%)
+19,478 dTLB-load-misses:u (23.08%)
+3,454 iTLB-load-misses:u (30.77%)
+65,201,429 L1-dcache-prefetches:u # 205.842 K/sec (30.77%)
+
+317.017656934 seconds time elapsed
+
+316.700865000 seconds user
+0.049959000 seconds sys
+
+
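
The figure in parentheses on each timing line is labelled "flops", but the values are consistent with a rate (floating-point operations per second) rather than a count. A minimal Python sketch that checks this against one line of the log follows. The regular expression and the function names are illustrative and not taken from the repository, and the operation count of 2 times the product of the three dimensions in the kernel name is an assumption that happens to reproduce the logged numbers.

```python
import math
import re

# Hypothetical parser for one timing line of matmul.out.*; the pattern is
# inferred from the log text, not taken from the benchmark source.
LINE_RE = re.compile(
    r"Time to run matmul_(\d+)x(\d+)x(\d+)_\S+ (\d+) times = "
    r"([\d.]+) seconds \(([\d.e+]+) flops\)"
)

def parse_line(line):
    """Return (dims, iterations, seconds, reported_rate) for one log line."""
    m = LINE_RE.search(line)
    if m is None:
        raise ValueError(f"not a timing line: {line!r}")
    d1, d2, d3, iters = (int(g) for g in m.groups()[:4])
    return (d1, d2, d3), iters, float(m.group(5)), float(m.group(6))

def flop_rate(dims, iterations, seconds, workers=1):
    """Assumed model: 2 flops (multiply and add) per inner-product term."""
    d1, d2, d3 = dims
    return workers * 2 * d1 * d2 * d3 * iterations / seconds

line = ("Time to run matmul_8x8x64_col_row_with_loads_and_stores "
        "100000000 times = 6.908037 seconds (1.185865e+11 flops)")
dims, iters, secs, reported = parse_line(line)
computed = flop_rate(dims, iters, secs)
print(f"computed {computed:.6e} FLOP/s, reported {reported:.6e}")
print("match:", math.isclose(computed, reported, rel_tol=1e-5))
```

The `workers` parameter is included because the figures in the other output files only match this model when the per-kernel rate is multiplied by a worker count.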

Code/matmul/E1180/matmul.out.120

Lines changed: 51 additions & 0 deletions
@@ -0,0 +1,51 @@
+=========================================================================================================================
+Time to run matmul_8x8x8_col_row 100000000 times = 0.778376 seconds (1.578671e+13 flops)
+Time to run matmul_8x8x8_col_col 100000000 times = 0.894057 seconds (1.374409e+13 flops)
+Time to run matmul_8x8x8_col_row_just_loads 100000000 times = 0.457109 seconds (2.688201e+13 flops)
+Time to run matmul_8x8x8_col_row_with_loads 100000000 times = 0.781024 seconds (1.573318e+13 flops)
+Time to run matmul_8x8x8_col_col_with_loads 100000000 times = 0.962016 seconds (1.277318e+13 flops)
+Time to run matmul_8x8x8_col_row_with_loads_and_stores 100000000 times = 1.537506 seconds (7.992165e+12 flops)
+Time to run matmul_8x8x8_col_col_with_loads_and_stores 100000000 times = 1.833195 seconds (6.703052e+12 flops)
+Time to run matmul_8x8x16_col_row_with_loads_and_stores 100000000 times = 2.404619 seconds (1.022033e+13 flops)
+Time to run matmul_8x8x24_col_row_with_loads_and_stores 100000000 times = 3.172877 seconds (1.161848e+13 flops)
+Time to run matmul_8x8x32_col_row_with_loads_and_stores 100000000 times = 3.936924 seconds (1.248487e+13 flops)
+Time to run matmul_8x8x40_col_row_with_loads_and_stores 100000000 times = 4.697589 seconds (1.307905e+13 flops)
+Time to run matmul_8x8x48_col_row_with_loads_and_stores 100000000 times = 5.453975 seconds (1.351821e+13 flops)
+Time to run matmul_8x8x56_col_row_with_loads_and_stores 100000000 times = 6.266083 seconds (1.372724e+13 flops)
+Time to run matmul_8x8x64_col_row_with_loads_and_stores 100000000 times = 7.023742 seconds (1.399596e+13 flops)
+Time to run matmul_8x8x64_col_col_with_loads_and_stores 100000000 times = 8.690283 seconds (1.131194e+13 flops)
+Time to run matmul_8x8x64_col_col_with_loads_and_stores_store_B 100000000 times = 8.971681 seconds (1.095714e+13 flops)
+Time to run matmul_16x8x64_col_col_with_loads_and_stores 100000000 times = 16.033676 seconds (1.226219e+13 flops)
+Time to run matmul_24x8x64_col_col_with_loads_and_stores 100000000 times = 23.689740 seconds (1.244893e+13 flops)
+Time to run matmul_32x8x64_col_col_with_loads_and_stores 100000000 times = 30.740767 seconds (1.279135e+13 flops)
+Time to run matmul_40x8x64_col_col_with_loads_and_stores 100000000 times = 37.766459 seconds (1.301472e+13 flops)
+Time to run matmul_48x8x64_col_col_with_loads_and_stores 100000000 times = 46.561308 seconds (1.266769e+13 flops)
+Time to run matmul_56x8x64_col_col_with_loads_and_stores 100000000 times = 51.786833 seconds (1.328770e+13 flops)
+Time to run matmul_64x8x64_col_col_with_loads_and_stores 100000000 times = 58.801122 seconds (1.337444e+13 flops)
+
+Performance counter stats for './matmul':
+
+38,021,501.78 msec task-clock:u # 87.172 CPUs utilized
+0 context-switches:u # 0.000 /sec
+0 cpu-migrations:u # 0.000 /sec
+1,415,730 page-faults:u # 37.235 /sec
+451,037,351,562,395 instructions:u # 2.77 insn per cycle (38.46%)
+162,937,455,899,278 cycles:u # 4.285 GHz (46.15%)
+769,144,218,797 branches:u # 20.229 M/sec (46.15%)
+612,165,963 branch-misses:u # 0.08% of all branches (46.15%)
+138,224,398,438,706 L1-dcache-loads:u # 3.635 G/sec (38.46%)
+1,016,585,945 L1-dcache-load-misses:u # 0.00% of all L1-dcache accesses (15.39%)
+1,020,965,293 LLC-loads:u # 26.852 K/sec (15.39%)
+323,803 LLC-load-misses:u # 0.03% of all LL-cache accesses (15.39%)
+57,008,794,964,777 L1-icache-loads:u # 1.499 G/sec (23.08%)
+61,254,025 L1-icache-load-misses:u # 0.00% of all L1-icache accesses (30.78%)
+1,591,669 dTLB-load-misses:u (23.08%)
+29,150 iTLB-load-misses:u (30.77%)
+8,280,140,931 L1-dcache-prefetches:u # 217.775 K/sec (30.77%)
+
+436.165930350 seconds time elapsed
+
+38013.123558000 seconds user
+8.860359000 seconds sys
+
+
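
The aggregate rates in this file can be compared with the single-core run in matmul.out.1 to estimate how well the kernels scale. The sketch below is a minimal illustration. It assumes that the file suffix (`.120`) is the number of concurrent workers, which the commit itself does not state, and the helper name `scaling_efficiency` is hypothetical. The rates are copied from the logs (first run of matmul.out.1, and matmul.out.120).

```python
# Rates in FLOP/s as printed in the logs.
# Assumption: matmul.out.120 was produced with 120 concurrent workers.
WORKERS = 120

single = {
    "8x8x8_col_row": 1.330084e11,
    "8x8x64_col_row_with_loads_and_stores": 1.185865e11,
    "64x8x64_col_col_with_loads_and_stores": 1.132820e11,
}
aggregate = {
    "8x8x8_col_row": 1.578671e13,
    "8x8x64_col_row_with_loads_and_stores": 1.399596e13,
    "64x8x64_col_col_with_loads_and_stores": 1.337444e13,
}

def scaling_efficiency(aggregate_rate, single_rate, workers):
    """Fraction of ideal linear scaling achieved by the multi-worker run."""
    return aggregate_rate / (workers * single_rate)

efficiency = {
    name: scaling_efficiency(aggregate[name], single[name], WORKERS)
    for name in single
}
for name, value in efficiency.items():
    print(f"{name}: {value:.1%} of linear scaling")
```

Under that assumption, the three kernels sampled here retain roughly 98 to 99 percent of the single-core rate per worker.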

Code/matmul/E1180/matmul.out.15

Lines changed: 51 additions & 0 deletions
@@ -0,0 +1,51 @@
+=========================================================================================================================
+Time to run matmul_8x8x8_col_row 100000000 times = 0.771544 seconds (1.990812e+12 flops)
+Time to run matmul_8x8x8_col_col 100000000 times = 0.890801 seconds (1.724290e+12 flops)
+Time to run matmul_8x8x8_col_row_just_loads 100000000 times = 0.398800 seconds (3.851551e+12 flops)
+Time to run matmul_8x8x8_col_row_with_loads 100000000 times = 0.779375 seconds (1.970811e+12 flops)
+Time to run matmul_8x8x8_col_col_with_loads 100000000 times = 0.957893 seconds (1.603520e+12 flops)
+Time to run matmul_8x8x8_col_row_with_loads_and_stores 100000000 times = 1.530970 seconds (1.003285e+12 flops)
+Time to run matmul_8x8x8_col_col_with_loads_and_stores 100000000 times = 1.826280 seconds (8.410538e+11 flops)
+Time to run matmul_8x8x16_col_row_with_loads_and_stores 100000000 times = 2.342330 seconds (1.311515e+12 flops)
+Time to run matmul_8x8x24_col_row_with_loads_and_stores 100000000 times = 3.106781 seconds (1.483207e+12 flops)
+Time to run matmul_8x8x32_col_row_with_loads_and_stores 100000000 times = 3.869740 seconds (1.587704e+12 flops)
+Time to run matmul_8x8x40_col_row_with_loads_and_stores 100000000 times = 4.629212 seconds (1.659030e+12 flops)
+Time to run matmul_8x8x48_col_row_with_loads_and_stores 100000000 times = 5.387103 seconds (1.710752e+12 flops)
+Time to run matmul_8x8x56_col_row_with_loads_and_stores 100000000 times = 6.143676 seconds (1.750092e+12 flops)
+Time to run matmul_8x8x64_col_row_with_loads_and_stores 100000000 times = 6.897782 seconds (1.781442e+12 flops)
+Time to run matmul_8x8x64_col_col_with_loads_and_stores 100000000 times = 8.508306 seconds (1.444236e+12 flops)
+Time to run matmul_8x8x64_col_col_with_loads_and_stores_store_B 100000000 times = 8.811261 seconds (1.394579e+12 flops)
+Time to run matmul_16x8x64_col_col_with_loads_and_stores 100000000 times = 15.748019 seconds (1.560577e+12 flops)
+Time to run matmul_24x8x64_col_col_with_loads_and_stores 100000000 times = 23.296696 seconds (1.582370e+12 flops)
+Time to run matmul_32x8x64_col_col_with_loads_and_stores 100000000 times = 30.198025 seconds (1.627656e+12 flops)
+Time to run matmul_40x8x64_col_col_with_loads_and_stores 100000000 times = 37.096264 seconds (1.656231e+12 flops)
+Time to run matmul_48x8x64_col_col_with_loads_and_stores 100000000 times = 43.993099 seconds (1.675899e+12 flops)
+Time to run matmul_56x8x64_col_col_with_loads_and_stores 100000000 times = 50.886929 seconds (1.690336e+12 flops)
+Time to run matmul_64x8x64_col_col_with_loads_and_stores 100000000 times = 57.776226 seconds (1.701461e+12 flops)
+
+Performance counter stats for './matmul':
+
+4,744,636.70 msec task-clock:u # 14.386 CPUs utilized
+0 context-switches:u # 0.000 /sec
+0 cpu-migrations:u # 0.000 /sec
+177,208 page-faults:u # 37.349 /sec
+56,378,278,387,563 instructions:u # 2.77 insn per cycle (38.47%)
+20,361,157,429,805 cycles:u # 4.291 GHz (46.16%)
+96,622,391,433 branches:u # 20.365 M/sec (46.16%)
+76,534,441 branch-misses:u # 0.08% of all branches (46.16%)
+17,273,553,752,378 L1-dcache-loads:u # 3.641 G/sec (38.46%)
+115,300,141 L1-dcache-load-misses:u # 0.00% of all L1-dcache accesses (15.39%)
+115,700,561 LLC-loads:u # 24.386 K/sec (15.38%)
+48,145 LLC-load-misses:u # 0.04% of all LL-cache accesses (15.38%)
+7,125,751,792,313 L1-icache-loads:u # 1.502 G/sec (23.07%)
+7,450,214 L1-icache-load-misses:u # 0.00% of all L1-icache accesses (30.77%)
+205,645 dTLB-load-misses:u (23.08%)
+8,551 iTLB-load-misses:u (30.78%)
+1,007,155,040 L1-dcache-prefetches:u # 212.272 K/sec (30.78%)
+
+329.808970036 seconds time elapsed
+
+4743.821554000 seconds user
+0.809931000 seconds sys
+
+
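
The derived columns that `perf stat` prints after the `#` can be reproduced from the raw counters, which is a useful sanity check when comparing runs. The following Python sketch recomputes three of them for this file. The helper `parse_count` is hypothetical, and the input values are copied from the counter block of matmul.out.15. The percentages in parentheses in the perf output indicate that each event was only counted for part of the run and scaled up, so the counts, and anything derived from them, are estimates.

```python
def parse_count(text):
    """Convert a perf-style number such as '4,744,636.70' to a float."""
    return float(text.replace(",", ""))

# Raw values copied from the matmul.out.15 counter block.
task_clock_ms = parse_count("4,744,636.70")
instructions = parse_count("56,378,278,387,563")
cycles = parse_count("20,361,157,429,805")
elapsed_s = 329.808970036

task_clock_s = task_clock_ms / 1000.0

ipc = instructions / cycles               # "insn per cycle"
cpus_utilized = task_clock_s / elapsed_s  # "CPUs utilized"
ghz = cycles / task_clock_s / 1e9         # average clock while running

print(f"IPC           = {ipc:.2f}")
print(f"CPUs utilized = {cpus_utilized:.3f}")
print(f"Clock         = {ghz:.3f} GHz")
```

The same arithmetic applies to the other output files by substituting their counter values.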
