Commit f3dcd00

Merge branch 'develop' of upstream into ctc_edit_distance_dev

Author: Yibing Liu
2 parents a1935b2 + 6cff3c9, commit f3dcd00

338 files changed, +11687 -2336 lines changed

CMakeLists.txt

Lines changed: 4 additions & 2 deletions
@@ -20,8 +20,10 @@ set(PADDLE_BINARY_DIR ${CMAKE_CURRENT_BINARY_DIR})
 include(system)
 
 project(paddle CXX C Go)
-message(STATUS "CXX compiler: " ${CMAKE_CXX_COMPILER} ", version: " ${CMAKE_CXX_COMPILER_VERSION})
-message(STATUS "C compiler: " ${CMAKE_C_COMPILER} ", version: " ${CMAKE_C_COMPILER_VERSION})
+message(STATUS "CXX compiler: ${CMAKE_CXX_COMPILER}, version: "
+        "${CMAKE_CXX_COMPILER_ID} ${CMAKE_CXX_COMPILER_VERSION}")
+message(STATUS "C compiler: ${CMAKE_C_COMPILER}, version: "
+        "${CMAKE_C_COMPILER_ID} ${CMAKE_C_COMPILER_VERSION}")
 
 find_package(Sphinx)
 if(NOT CMAKE_CROSSCOMPILING)

benchmark/IntelOptimizedPaddle.md

Lines changed: 23 additions & 9 deletions
@@ -7,11 +7,11 @@ Machine:
 
 System: CentOS release 6.3 (Final), Docker 1.12.1.
 
-PaddlePaddle: (TODO: will rerun after 0.11.0)
-- paddlepaddle/paddle:latest (for MKLML and MKL-DNN)
+PaddlePaddle:
+- paddlepaddle/paddle:0.11.0 (for MKLML and MKL-DNN)
 - MKL-DNN tag v0.11
 - MKLML 2018.0.1.20171007
-- paddlepaddle/paddle:latest-openblas (for OpenBLAS)
+- paddlepaddle/paddle:0.11.0-openblas (for OpenBLAS)
 - OpenBLAS v0.2.20
 
 On each machine, we will test and compare the performance of training on single node using MKL-DNN / MKLML / OpenBLAS respectively.
@@ -56,43 +56,57 @@ Input image size - 3 * 224 * 224, Time: images/second
 
 <img src="figs/googlenet-cpu-train.png" width="500">
 
-- Alexnet
+- AlexNet
 
 | BatchSize | 64 | 128 | 256 |
 |--------------|--------| ------ | -------|
-| OpenBLAS | 2.13 | 2.45 | 2.68 |
+| OpenBLAS | 45.62 | 72.79 | 107.22 |
 | MKLML | 66.37 | 105.60 | 144.04 |
 | MKL-DNN | 399.00 | 498.94 | 626.53 |
 
-chart TBD
+<img src="figs/alexnet-cpu-train.png" width="500">
 
 #### Inference
 Test on batch size 1, 2, 4, 8, 16 on Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz
 - VGG-19
 
 | BatchSize | 1 | 2 | 4 | 8 | 16 |
 |-----------|-------|-------|-------|-------|-------|
-| OpenBLAS | 1.07 | 1.08 | 1.06 | 0.88 | 0.65 |
+| OpenBLAS | 1.10 | 1.96 | 3.62 | 3.63 | 2.25 |
 | MKLML | 5.58 | 9.80 | 15.15 | 21.21 | 28.67 |
 | MKL-DNN | 75.07 | 88.64 | 82.58 | 92.29 | 96.75 |
 
+<img src="figs/vgg-cpu-infer.png" width="500">
+
 - ResNet-50
 
 | BatchSize | 1 | 2 | 4 | 8 | 16 |
 |-----------|-------|--------|--------|--------|--------|
-| OpenBLAS | 3.35 | 3.19 | 3.09 | 2.55 | 1.96 |
+| OpenBLAS | 3.31 | 6.72 | 11.59 | 13.17 | 9.27 |
 | MKLML | 6.33 | 12.02 | 22.88 | 40.53 | 63.09 |
 | MKL-DNN | 107.83| 148.84 | 177.78 | 189.35 | 217.69 |
 
+<img src="figs/resnet-cpu-infer.png" width="500">
 
 - GoogLeNet
 
 | BatchSize | 1 | 2 | 4 | 8 | 16 |
 |-----------|--------|--------|--------|--------|--------|
-| OpenBLAS | 12.04 | 11.31 | 10.00 | 9.07 | 4.34 |
+| OpenBLAS | 12.06 | 23.56 | 34.48 | 36.45 | 23.12 |
 | MKLML | 22.74 | 41.56 | 81.22 | 133.47 | 210.53 |
 | MKL-DNN | 175.10 | 272.92 | 450.70 | 512.00 | 600.94 |
 
+<img src="figs/googlenet-cpu-infer.png" width="500">
+
+- AlexNet
+
+| BatchSize | 1 | 2 | 4 | 8 | 16 |
+|-----------|--------|--------|--------|--------|--------|
+| OpenBLAS | 3.53 | 6.23 | 15.04 | 26.06 | 31.62 |
+| MKLML | 21.32 | 36.55 | 73.06 | 131.15 | 192.77 |
+| MKL-DNN | 442.91 | 656.41 | 719.10 | 847.68 | 850.51 |
+
+<img src="figs/alexnet-cpu-infer.png" width="500">
 
 ### Laptop
 TBD
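
To put the updated AlexNet inference rows in perspective, the relative throughput of each backend against OpenBLAS can be computed directly from the table. The following is a minimal sketch, not part of the commit: the dictionary layout and the `speedup` helper are hypothetical names introduced here, and the numbers are copied from the added AlexNet inference table (images/second).

```python
# Throughput in images/second, copied from the AlexNet inference table
# added in this commit. The data layout is an assumption for illustration.
batch_sizes = [1, 2, 4, 8, 16]
alexnet_infer = {
    "OpenBLAS": [3.53, 6.23, 15.04, 26.06, 31.62],
    "MKLML": [21.32, 36.55, 73.06, 131.15, 192.77],
    "MKL-DNN": [442.91, 656.41, 719.10, 847.68, 850.51],
}


def speedup(table, backend, baseline="OpenBLAS"):
    """Return the per-batch-size throughput ratio backend / baseline.

    Hypothetical helper; ratios are rounded to one decimal place.
    """
    return [round(b / a, 1) for a, b in zip(table[baseline], table[backend])]


if __name__ == "__main__":
    for name in ("MKLML", "MKL-DNN"):
        # Map each batch size to its speedup over the OpenBLAS baseline.
        print(name, dict(zip(batch_sizes, speedup(alexnet_infer, name))))
```

The same helper applies to the VGG-19, ResNet-50, and GoogLeNet tables if their rows are entered in the same shape.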

benchmark/figs/alexnet-cpu-infer.png

15.1 KB

benchmark/figs/alexnet-cpu-train.png

15.6 KB

(file name not shown)

14.1 KB

(file name not shown)

996 Bytes

benchmark/figs/resnet-cpu-infer.png

13.7 KB

benchmark/figs/resnet-cpu-train.png

-2.2 KB

benchmark/figs/vgg-cpu-infer.png

13.7 KB

benchmark/figs/vgg-cpu-train.png

-1.17 KB
