Commit e2ca8e1

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into threadpool_for_io
2 parents 92818ba + 129859e commit e2ca8e1

416 files changed: +12383 -3328 lines changed


.gitignore

Lines changed: 0 additions & 9 deletions
@@ -25,12 +25,3 @@ third_party/
2525

2626
# clion workspace.
2727
cmake-build-*
28-
29-
# generated while compiling
30-
paddle/pybind/pybind.h
31-
CMakeFiles
32-
cmake_install.cmake
33-
paddle/.timestamp
34-
python/paddlepaddle.egg-info/
35-
paddle/fluid/pybind/pybind.h
36-
python/paddle/version.py

benchmark/cluster/README.md

Lines changed: 133 additions & 15 deletions
@@ -36,23 +36,83 @@
3636
- Trainer Count: 100
3737
- Metrics: mini-batch / sec
3838

39-
| Batch Size | 32 | 64 | 128 | 256 |
40-
| -- | -- | -- | -- | -- |
41-
| PaddlePaddle Fluid | - | - | - | - |
42-
| PaddlePaddle v2 | - | - | - | - |
43-
| TensorFlow | - | - | - | - |
39+
40+
<table>
41+
<thead>
42+
<tr>
43+
<th>Batch Size </th>
44+
<th> 32</th>
45+
<th>64</th>
46+
<th>128 </th>
47+
<th>256</th>
48+
</tr>
49+
</thead>
50+
<tbody>
51+
<tr>
52+
<td> PaddlePaddle Fluid</td>
53+
<td>-</td>
54+
<td>- </td>
55+
<td>- </td>
56+
<td>- </td>
57+
</tr>
58+
<tr>
59+
<td>PaddlePaddle v2 </td>
60+
<td>- </td>
61+
<td>- </td>
62+
<td>- </td>
63+
<td>- </td>
64+
</tr>
65+
<tr>
66+
<td>TensorFlow </td>
67+
<td>- </td>
68+
<td>- </td>
69+
<td>- </td>
70+
<td>- </td>
71+
</tr>
72+
</tbody>
73+
</table>
4474

4575
### Measure the Performance for Different PServer Count
4676

4777
- Trainer Count: 100
4878
- Batch Size: 64
4979
- Metrics: mini-batch / sec
5080

51-
| PServer Count | 10 | 20 | 40 | 60 |
52-
| -- | -- | -- | -- | -- |
53-
| PaddlePaddle Fluid | - | - | - | - |
54-
| PaddlePaddle v2 | - | - | - | - |
55-
| TensorFlow | - | - | - | - |
81+
82+
<table>
83+
<thead>
84+
<tr>
85+
<th>PServer Count </th>
86+
<th>10</th>
87+
<th>20</th>
88+
<th>40 </th>
89+
<th>60</th>
90+
</tr>
91+
</thead>
92+
<tbody>
93+
<tr>
94+
<td> PaddlePaddle Fluid</td>
95+
<td>-</td>
96+
<td>- </td>
97+
<td>- </td>
98+
<td>- </td>
99+
</tr>
100+
<tr>
101+
<td>PaddlePaddle v2 </td>
102+
<td>- </td>
103+
<td>- </td>
104+
<td>- </td>
105+
<td>- </td>
106+
</tr>
107+
<tr>
108+
<td>TensorFlow </td>
109+
<td>- </td>
110+
<td>- </td>
111+
<td>- </td>
112+
<td>- </td>
113+
</tr>
114+
</tbody>
115+
</table>
56116

57117
### Measure Parallel Efficiency By Increasing Trainer Count
58118

@@ -67,11 +127,69 @@ The parallel efficiency is:
67127

68128
$E = \frac{S}{N}$
69129

70-
| Trainer Counter | 1 | 10 | 20 | 30 | 40 | 50 | 60 | 70 | 80 | 90 | 100 |
71-
| -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- |
72-
| PaddlePaddle Fluid | - | - | - | - | - | - | - | - | - | - | - |
73-
| PaddlePaddle v2 | - | - | - | - | - | - | - | - | - | - | - | - |
74-
| TensorFlow | - | - | - | - | - | - | - | - | - | - | - | - | - |
130+
<table>
131+
<thead>
132+
<tr>
133+
<th>Trainer Count </th>
134+
<th>1</th>
135+
<th>10</th>
136+
<th>20 </th>
137+
<th>30</th>
138+
<th>40</th>
139+
<th>50</th>
140+
<th>60 </th>
141+
<th>70</th>
142+
<th>80</th>
143+
<th>90</th>
144+
<th>100 </th>
145+
</tr>
146+
</thead>
147+
<tbody>
148+
<tr>
149+
<td> PaddlePaddle Fluid</td>
150+
<td>-</td>
151+
<td>- </td>
152+
<td>- </td>
153+
<td>- </td>
154+
<td>-</td>
155+
<td>- </td>
156+
<td>- </td>
157+
<td>- </td>
158+
<td>-</td>
159+
<td>- </td>
160+
<td>- </td>
161+
</tr>
162+
<tr>
163+
<td>PaddlePaddle v2 </td>
164+
<td>- </td>
165+
<td>- </td>
166+
<td>- </td>
167+
<td>- </td>
168+
<td>-</td>
169+
<td>- </td>
170+
<td>- </td>
171+
<td>- </td>
172+
<td>-</td>
173+
<td>- </td>
174+
<td>- </td>
175+
</tr>
176+
<tr>
177+
<td>TensorFlow </td>
178+
<td>- </td>
179+
<td>- </td>
180+
<td>- </td>
181+
<td>- </td>
182+
<td>-</td>
183+
<td>- </td>
184+
<td>- </td>
185+
<td>- </td>
186+
<td>-</td>
187+
<td>- </td>
188+
<td>- </td>
189+
</tr>
190+
</tbody>
191+
</table>
192+
75193

76194
## Reproduce the benchmark
77195

benchmark/cluster/vgg16/README.md

Lines changed: 139 additions & 21 deletions
@@ -16,48 +16,166 @@ Setting environment variable: `MKL_NUM_THREADS=1`.
1616

1717
- Metrics: samples / sec
1818

19-
| Batch Size | 32 | 64 | 128 | 256 |
20-
| -- | -- | -- | -- | -- |
21-
| PaddlePaddle Fluid | 15.44 | 16.32 | 16.74 | 16.79 |
22-
| PaddlePaddle v2 | 15.97 | 17.04 | 17.60 | 17.83 |
23-
| TensorFlow | 9.09 | 9.10 | 9.24 | 8.66 |
19+
<table>
20+
<thead>
21+
<tr>
22+
<th>Batch Size </th>
23+
<th> 32</th>
24+
<th>64</th>
25+
<th>128 </th>
26+
<th>256</th>
27+
</tr>
28+
</thead>
29+
<tbody>
30+
<tr>
31+
<td> PaddlePaddle Fluid</td>
32+
<td> 15.44 </td>
33+
<td> 16.32 </td>
34+
<td> 16.74 </td>
35+
<td> 16.79 </td>
36+
</tr>
37+
<tr>
38+
<td>PaddlePaddle v2 </td>
39+
<td> 15.97 </td>
40+
<td> 17.04 </td>
41+
<td> 17.60 </td>
42+
<td> 17.83 </td>
43+
</tr>
44+
<tr>
45+
<td>TensorFlow </td>
46+
<td> 9.09 </td>
47+
<td> 9.10 </td>
48+
<td> 9.24 </td>
49+
<td> 8.66 </td>
50+
</tr>
51+
</tbody>
52+
</table>
53+
2454

2555
### Different Batch Size
2656

2757
- PServer Count: 10
2858
- Trainer Count: 20
2959
- Metrics: samples / sec
3060

31-
| Batch Size | 32 | 64 | 128 | 256 |
32-
| -- | -- | -- | -- | -- |
33-
| PaddlePaddle Fluid | 190.20 | 222.15 | 247.40 | 258.18 |
34-
| PaddlePaddle v2 | 170.96 | 233.71 | 256.14 | 329.23 |
35-
| TensorFlow | - | - | - | - |
36-
61+
<table>
62+
<thead>
63+
<tr>
64+
<th>Batch Size </th>
65+
<th> 32</th>
66+
<th>64</th>
67+
<th>128 </th>
68+
<th>256</th>
69+
</tr>
70+
</thead>
71+
<tbody>
72+
<tr>
73+
<td> PaddlePaddle Fluid</td>
74+
<td> 190.20 </td>
75+
<td> 222.15 </td>
76+
<td> 247.40 </td>
77+
<td> 258.18 </td>
78+
</tr>
79+
<tr>
80+
<td>PaddlePaddle v2 </td>
81+
<td> 170.96 </td>
82+
<td> 233.71 </td>
83+
<td> 256.14 </td>
84+
<td> 329.23 </td>
85+
</tr>
86+
<tr>
87+
<td>TensorFlow </td>
88+
<td> - </td>
89+
<td> - </td>
90+
<td> - </td>
91+
<td> - </td>
92+
</tr>
93+
</tbody>
94+
</table>
3795

3896
### Accelerate Rate
3997

4098
- PServer Count: 20
4199
- Batch Size: 128
42100
- Metrics: samples / sec
43101

44-
| Trainer Count | 20 | 40 | 80 | 100 |
45-
| -- | -- | -- | -- | -- |
46-
| PaddlePaddle Fluid | 263.29 (78.64%) | 518.80 (77.47%) | 836.26 (62.44%) | 1019.29 (60.89%) |
47-
| PaddlePaddle v2 (need more tests) | 326.85 (92.85%) | 534.58 (75.93%) | 853.30 (60.60%) | 1041.99 (59.20%) |
48-
| TensorFlow | - | - | - | - |
102+
<table>
103+
<thead>
104+
<tr>
105+
<th>Trainer Count </th>
106+
<th>20</th>
107+
<th>40</th>
108+
<th>80</th>
109+
<th>100</th>
110+
</tr>
111+
</thead>
112+
<tbody>
113+
<tr>
114+
<td> PaddlePaddle Fluid</td>
115+
<td> 263.29 (78.64%) </td>
116+
<td> 518.80 (77.47%) </td>
117+
<td> 836.26 (62.44%) </td>
118+
<td> 1019.29 (60.89%) </td>
119+
</tr>
120+
<tr>
121+
<td>PaddlePaddle v2 (need more tests) </td>
122+
<td> 326.85 (92.85%) </td>
123+
<td> 534.58 (75.93%) </td>
124+
<td> 853.30 (60.60%) </td>
125+
<td> 1041.99 (59.20%) </td>
126+
</tr>
127+
<tr>
128+
<td>TensorFlow </td>
129+
<td> - </td>
130+
<td> - </td>
131+
<td> - </td>
132+
<td> - </td>
133+
</tr>
134+
</tbody>
135+
</table>
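
The percentages in the Accelerate Rate table can be cross-checked with a short script. This is a minimal sketch, not part of the commit. It assumes the parallel efficiency is throughput divided by (trainer count × single-trainer throughput), and that the single-trainer baseline is the batch-size-128 column of the first table in this README (16.74 samples / sec for Fluid, 17.60 for v2). The function name `parallel_efficiency` and the two dictionaries are illustrative.

```python
# Sketch: recompute the parallel-efficiency percentages from the table above.
# Assumption: the baseline is the single-trainer throughput at batch size 128,
# read from the first table of this README (not stated explicitly in the diff).

def parallel_efficiency(throughput, trainers, baseline):
    """Return E = S / N, with speedup S = throughput / baseline and N = trainers."""
    speedup = throughput / baseline
    return speedup / trainers

# Assumed single-trainer baselines (samples / sec, batch size 128).
BASELINE = {"PaddlePaddle Fluid": 16.74, "PaddlePaddle v2": 17.60}

# Measured throughput (samples / sec) keyed by trainer count, from the table.
MEASURED = {
    "PaddlePaddle Fluid": {20: 263.29, 40: 518.80, 80: 836.26, 100: 1019.29},
    "PaddlePaddle v2": {20: 326.85, 40: 534.58, 80: 853.30, 100: 1041.99},
}

for framework, runs in MEASURED.items():
    for trainers, throughput in runs.items():
        e = parallel_efficiency(throughput, trainers, BASELINE[framework])
        print(f"{framework}, {trainers} trainers: {e:.2%}")
```

Under these assumptions the computed values agree with the table to within about 0.01 percentage points, which suggests the table's percentages were derived this way.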
136+
49137

50138
### Different PServer Count
51139

52140
- Trainer Count: 60
53141
- Batch Size: 128
54142
- Metrics: samples / sec
55143

56-
| PServer Count | 3 | 6 |10 | 20 |
57-
| -- | -- | -- | -- | -- |
58-
| PaddlePaddle Fluid(should fix in next PR) | 589.1 | 592.6 | 656.4 | 655.8 |
59-
| PaddlePaddle v2 | 593.4 | 791.3 | 729.7 | 821.7 |
60-
| TensorFlow | - | - | - | - |
144+
<table>
145+
<thead>
146+
<tr>
147+
<th>PServer Count </th>
148+
<th>3</th>
149+
<th>6</th>
150+
<th>10</th>
151+
<th>20</th>
152+
</tr>
153+
</thead>
154+
<tbody>
155+
<tr>
156+
<td> PaddlePaddle Fluid (should fix in next PR) </td>
157+
<td> 589.1 </td>
158+
<td> 592.6 </td>
159+
<td> 656.4 </td>
160+
<td> 655.8 </td>
161+
</tr>
162+
<tr>
163+
<td>PaddlePaddle v2 (need more tests) </td>
164+
<td> 593.4 </td>
165+
<td> 791.3 </td>
166+
<td> 729.7 </td>
167+
<td> 821.7 </td>
168+
</tr>
169+
<tr>
170+
<td>TensorFlow </td>
171+
<td> - </td>
172+
<td> - </td>
173+
<td> - </td>
174+
<td> - </td>
175+
</tr>
176+
</tbody>
177+
</table>
178+
61179

62180
*The performance gap between Fluid and v2 comes from network interference.*
63181
