Commit dd2be3d

Merge branch 'develop' into crf
2 parents 86fd6b6 + f122a5d commit dd2be3d

251 files changed

+9600
-2378
lines changed

.gitignore

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -28,3 +28,4 @@ cmake_install.cmake
2828
paddle/.timestamp
2929
python/paddlepaddle.egg-info/
3030
paddle/pybind/pybind.h
31+
python/paddle/v2/framework/tests/tmp/*

CONTRIBUTING.md

Lines changed: 157 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1 +1,157 @@
1-
./doc/howto/dev/contribute_to_paddle_en.md
1+
# Contribute Code
2+
3+
We sincerely appreciate your contribution. This document explains our workflow and work style.
4+
5+
## Workflow
6+
7+
PaddlePaddle uses this [Git branching model](http://nvie.com/posts/a-successful-git-branching-model/). The following steps describe the usual contribution process.
8+
9+
1. Fork
10+
11+
Our development community has been growing fast; it doesn't make sense for everyone to write into the official repo. So, please file Pull Requests from your fork. To make a fork, just head over to the GitHub page and click the ["Fork" button](https://help.github.com/articles/fork-a-repo/).
12+
13+
1. Clone
14+
15+
To make a copy of your fork on your local computer, please run
16+
17+
```bash
18+
git clone https://github.com/your-github-account/paddle
19+
cd paddle
20+
```
21+
22+
1. Create the local feature branch
23+
24+
For daily work such as adding a new feature or fixing a bug, please create a feature branch before coding:
25+
26+
```bash
27+
git checkout -b my-cool-stuff
28+
```
29+
30+
1. Commit
31+
32+
Before issuing your first `git commit` command, please install [`pre-commit`](http://pre-commit.com/) by running the following commands:
33+
34+
```bash
35+
pip install pre-commit
36+
pre-commit install
37+
```
38+
39+
Our pre-commit configuration requires clang-format 3.8 for auto-formatting C/C++ code and yapf for Python.
40+
41+
Once installed, `pre-commit` checks the style of code and documentation in every commit. You will see something like the following when you run `git commit`:
42+
43+
```
44+
➜ git commit
45+
CRLF end-lines remover...............................(no files to check)Skipped
46+
yapf.................................................(no files to check)Skipped
47+
Check for added large files..............................................Passed
48+
Check for merge conflicts................................................Passed
49+
Check for broken symlinks................................................Passed
50+
Detect Private Key...................................(no files to check)Skipped
51+
Fix End of Files.....................................(no files to check)Skipped
52+
clang-formater.......................................(no files to check)Skipped
53+
[my-cool-stuff c703c041] add test file
54+
1 file changed, 0 insertions(+), 0 deletions(-)
55+
create mode 100644 233
56+
```
57+
58+
1. Build and test
59+
60+
You can build PaddlePaddle natively on Linux and Mac OS X. However, to unify the build environment and make debugging easier, we recommend [using Docker](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/howto/dev/build_en.md).
61+
62+
1. Keep pulling
63+
64+
An experienced Git user pulls from the official repo often -- daily or even hourly, so they notice conflicts with others' work early, and it's easier to resolve smaller conflicts.
65+
66+
```bash
67+
git remote add upstream https://github.com/PaddlePaddle/Paddle
68+
git pull upstream develop
69+
```
70+
71+
1. Push and file a pull request
72+
73+
You can "push" your local work into your forked repo:
74+
75+
```bash
76+
git push origin my-cool-stuff
77+
```
78+
79+
The push allows you to create a pull request, requesting owners of this [official repo](https://github.com/PaddlePaddle/Paddle) to pull your change into the official one.
80+
81+
To create a pull request, please follow [these steps](https://help.github.com/articles/creating-a-pull-request/).
82+
83+
If your change fixes an issue, please write ["Fixes <issue-URL>"](https://help.github.com/articles/closing-issues-using-keywords/) in the description section of your pull request. GitHub will close the issue when the owners merge your pull request.
84+
85+
Please remember to specify some reviewers for your pull request. If you don't know who the right ones are, please follow GitHub's recommendation.
86+
87+
88+
1. Delete local and remote branches
89+
90+
To keep your local workspace and your fork clean, you might want to remove merged branches:
91+
92+
```bash
93+
git push origin :my-cool-stuff
94+
git checkout develop
95+
git pull upstream develop
96+
git branch -d my-cool-stuff
97+
```
98+
99+
### Code Review
100+
101+
- Please feel free to ping your reviewers by sending them the URL of your pull request via IM or email. Please do this after your pull request passes the CI.
102+
103+
- Please respond to every reviewer comment. If you follow the comment, please write "Done"; otherwise, please give a reason.
104+
105+
- If you don't want your reviewers to get overwhelmed by email notifications, you might reply to their comments [in a batch](https://help.github.com/articles/reviewing-proposed-changes-in-a-pull-request/).
106+
107+
- Reduce unnecessary commits. Some developers commit often. We recommend folding a sequence of small changes into one commit by running `git commit --amend` instead of `git commit`.
108+
109+
110+
## Coding Standard
111+
112+
### Code Style
113+
114+
Our C/C++ code follows the [Google style guide](http://google.github.io/styleguide/cppguide.html).
115+
116+
Our Python code follows the [PEP8 style guide](https://www.python.org/dev/peps/pep-0008/).
117+
118+
Our build process helps to check the code style. In [`build.sh`](https://github.com/PaddlePaddle/Paddle/blob/b84e8226514b8bb4405c3c28e54aa5077193d179/paddle/scripts/docker/build.sh#L42), the entry point of our [builder Docker image](https://github.com/PaddlePaddle/Paddle/blob/b84e8226514b8bb4405c3c28e54aa5077193d179/Dockerfile#L88), the CMake argument `WITH_STYLE_CHECK` is set to `ON` by default. This flag is on
119+
120+
Please install pre-commit, which automatically reformats the changes to C/C++ and Python code whenever you run `git commit`. To check the whole codebase, you can run the command `pre-commit run -a`, as in the [`check_style.sh` file](https://github.com/PaddlePaddle/Paddle/blob/b84e8226514b8bb4405c3c28e54aa5077193d179/paddle/scripts/travis/check_style.sh#L30), which is invoked by [our Travis CI configuration](https://github.com/PaddlePaddle/Paddle/blob/b84e8226514b8bb4405c3c28e54aa5077193d179/.travis.yml#L43).
121+
122+
### Unit Tests
123+
124+
Please remember to add related unit tests.
125+
126+
- For C/C++ code, please follow the [`google-test` Primer](https://github.com/google/googletest/blob/master/googletest/docs/Primer.md).
127+
128+
- For Python code, please use [Python's standard `unittest` package](http://pythontesting.net/framework/unittest/unittest-introduction/).
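
As a rough illustration of the Python bullet, here is a minimal `unittest` sketch. The function `scale` and the test case name are hypothetical stand-ins for whatever code a change adds; they are not part of PaddlePaddle.

```python
import unittest


def scale(values, factor):
    """Hypothetical function under test: multiply every element by factor."""
    return [v * factor for v in values]


class TestScale(unittest.TestCase):
    def test_scales_each_element(self):
        self.assertEqual(scale([1, 2, 3], 2), [2, 4, 6])

    def test_empty_input(self):
        self.assertEqual(scale([], 5), [])
```

Saved in a test file, a case like this can be run with `python -m unittest <module-name>`.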
129+
130+
131+
### Writing Logs
132+
133+
We use [glog](https://github.com/google/glog) for logging in our C/C++ code.
134+
135+
For general information, please use `LOG`. For debug information, please use [`VLOG`](http://htmlpreview.github.io/?https://github.com/google/glog/blob/master/doc/glog.html#verbose). The reasoning is explained [here](https://groups.google.com/a/chromium.org/d/msg/chromium-dev/3NDNd1KzXeY/AZKMMx37fdQJ).
136+
137+
`VLOG` requires a *verbose level* parameter. For example:
138+
139+
```c++
140+
VLOG(3) << "Operator FC is taking " << num_inputs << " inputs.";
141+
```
142+
143+
When we run a PaddlePaddle application or test, we can specify a verbose threshold. For example:
144+
145+
```bash
146+
GLOG_vmodule=buddy_allocator=2 \
147+
GLOG_v=10 \
148+
python \
149+
../python/paddle/v2/framework/tests/test_recurrent_op.py
150+
```
151+
152+
This enables VLOG messages at verbose levels 0 to 2 from `buddy_allocator.{h,cc}`, because `GLOG_vmodule` overrides the global level for that module, and at levels 0 to 10 from all other files, so you will see the example VLOG message above, which is at level 3. This suggests logging high-level messages at lower verbose levels, so that they are more likely to be displayed. When coding in C++, please follow this verbose level convention:
153+
154+
- verbose level 1: [framework](https://github.com/PaddlePaddle/Paddle/tree/develop/paddle/framework)
155+
- verbose level 3: [operators](https://github.com/PaddlePaddle/Paddle/tree/develop/paddle/operators)
156+
- verbose level 5: [memory](https://github.com/PaddlePaddle/Paddle/tree/develop/paddle/memory), [platform](https://github.com/PaddlePaddle/Paddle/tree/develop/paddle/platform)
157+
- verbose level 7: [math](https://github.com/PaddlePaddle/Paddle/tree/develop/paddle/math)
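
To make the interaction between `GLOG_v` and `GLOG_vmodule` concrete, here is a simplified Python model of the decision. It is a sketch and not glog's implementation. It assumes glog's documented behavior that a `vmodule` pattern is matched against the file's base name without extension, and that a matching entry overrides the global level. The function name `vlog_is_on` and the file paths are illustrative only.

```python
import fnmatch
import os


def vlog_is_on(source_file, level, v=0, vmodule=None):
    """Return True if VLOG(level) in source_file would be emitted.

    Simplified model (an assumption, not glog's code): the module name is the
    file's base name without extension, and the first matching vmodule
    pattern overrides the global threshold v.
    """
    module = os.path.splitext(os.path.basename(source_file))[0]
    for pattern, module_level in (vmodule or {}).items():
        if fnmatch.fnmatchcase(module, pattern):
            return level <= module_level
    return level <= v


# Settings equivalent to GLOG_vmodule=buddy_allocator=2 GLOG_v=10.
settings = {"v": 10, "vmodule": {"buddy_allocator": 2}}
print(vlog_is_on("paddle/memory/detail/buddy_allocator.cc", 3, **settings))  # False
print(vlog_is_on("paddle/operators/fc_op.cc", 3, **settings))  # True
```

Under this model, a level 3 message from an operator file passes the global threshold of 10, while the same level inside `buddy_allocator` is suppressed by the per-module limit of 2.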

doc/design/model_format.md

Lines changed: 13 additions & 15 deletions
Original file line numberDiff line numberDiff line change
@@ -12,24 +12,22 @@ The topology is saved as a plain text in a detailed self-contain protobuf file.
1212

1313
The parameters are saved as a binary file. A protobuf message has a [64M size limit](https://developers.google.com/protocol-buffers/docs/reference/cpp/google.protobuf.io.coded_stream#CodedInputStream.SetTotalBytesLimit.details). We ran a [benchmark experiment](https://github.com/PaddlePaddle/Paddle/pull/4610), which shows that protobuf is not suitable for this task.
1414

15-
As a result, we design a particular format for tensor serialization. By default, an arbitrary tensor in Paddle is a [LoDTensor](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/lod_tensor.md), and has a description information proto of [LoDTensorDesc](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/framework.proto#L99). We save the DescProto as the byte string header. It contains all the necessary information, such as the `dims`, the `name` of the tensor, and the `LoD` information in [LoDTensor](https://github.com/PaddlePaddle/Paddle/blob/1c0a4c901c9fc881d120249c703b15d1c50dae7d/paddle/framework/lod_tensor.md). A tensor stores values in a continuous memory buffer. For speed we dump the raw memory to disk and save it as the byte string content. So, the binary format of one tensor is,
16-
17-
|HeaderLength|ContentLength|**LoDTensorDesc**|**TensorValue**|
15+
As a result, we designed a dedicated format for tensor serialization. By default, every tensor in Paddle is a [LoDTensor](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/lod_tensor.md) and is described by a [LoDTensorDesc](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/framework.proto#L99) proto. We save the DescProto as the byte string header. It contains all the necessary information, such as the `dims` and the `LoD` information in [LoDTensor](https://github.com/PaddlePaddle/Paddle/blob/1c0a4c901c9fc881d120249c703b15d1c50dae7d/paddle/framework/lod_tensor.md). A tensor stores its values in a contiguous memory buffer. For speed, we dump the raw memory to disk and save it as the byte string content. The binary format of one tensor is as follows.
1816

1917
The table below shows a tensor's byte view in detail. Note that all the signed values are written in the little-endian format.
2018

21-
```text
22-
[offset] [type] [description]
23-
0004 4 bytes integer HeaderLength, the length of LoDTensorDesc
24-
0008 4 bytes integer ContentLength, the length of LodTensor Buffer
25-
0009 1 bytes char TensorDesc
26-
00010 1 bytes char TensorDesc
27-
...
28-
00100 1 bytes char TensorValue
29-
00101 1 bytes char TensorValue
30-
00102 1 bytes char TensorValue ..
31-
...
32-
```
| field name | type | description |
| --- | --- | --- |
| version | uint32_t | Version of the saved file. Currently always 0. |
| tensor desc length | uint32_t | Length of the TensorDesc (protobuf message) in bytes. |
| tensor desc | void* | TensorDesc protobuf binary message. |
| tensor data | void* | Tensor's data in binary format. The length of `tensor_data` is determined by `TensorDesc.dims()` and `TensorDesc.data_type()`. |
| lod_level | uint64_t | Level of LoD. |
| length of lod[0] | uint64_t | [Optional] Length of lod[0] in bytes. |
| data of lod[0] | uint64_t* | [Optional] lod[0].data(). |
| ... | ... | ... |
29+
30+
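
The field order in the table above can be sketched with Python's `struct` module. This is an illustration under stated assumptions, not Paddle's serializer: `desc_bytes` is an opaque stand-in for the serialized TensorDesc protobuf message, every integer field is assumed to be little-endian, and the reader is handed `data_size` directly instead of deriving it from `dims` and `data_type`. The names `pack_tensor` and `unpack_tensor` are hypothetical.

```python
import struct


def pack_tensor(desc_bytes, data_bytes, lod, version=0):
    """Serialize one tensor in the field order of the table.

    desc_bytes stands in for the serialized TensorDesc message; lod is a
    list of levels, each a list of unsigned 64-bit offsets.
    """
    out = struct.pack("<II", version, len(desc_bytes))
    out += desc_bytes + data_bytes
    out += struct.pack("<Q", len(lod))  # lod_level
    for level in lod:
        out += struct.pack("<Q", 8 * len(level))  # length of this level in bytes
        out += struct.pack("<%dQ" % len(level), *level)
    return out


def unpack_tensor(buf, data_size):
    """Inverse of pack_tensor; data_size would normally come from the desc."""
    version, desc_len = struct.unpack_from("<II", buf, 0)
    pos = 8
    desc_bytes = buf[pos:pos + desc_len]
    pos += desc_len
    data_bytes = buf[pos:pos + data_size]
    pos += data_size
    (lod_level,) = struct.unpack_from("<Q", buf, pos)
    pos += 8
    lod = []
    for _ in range(lod_level):
        (nbytes,) = struct.unpack_from("<Q", buf, pos)
        pos += 8
        lod.append(list(struct.unpack_from("<%dQ" % (nbytes // 8), buf, pos)))
        pos += nbytes
    return version, desc_bytes, data_bytes, lod


data = struct.pack("<4f", 1.0, 2.0, 3.0, 4.0)  # four float32 values
blob = pack_tensor(b"desc", data, [[0, 2, 4]])
assert unpack_tensor(blob, len(data)) == (0, b"desc", data, [[0, 2, 4]])
```

Because the tensor data has no length prefix of its own, a reader has to parse the desc first to know how many bytes to consume before the LoD section.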
3331

3432
## Summary
3533

Lines changed: 16 additions & 19 deletions
Original file line numberDiff line numberDiff line change
@@ -1,39 +1,36 @@
11
# Building the PaddlePaddle Library for Raspberry Pi
22

3-
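On a Raspberry Pi system, users can log in to the Raspberry Pi via ssh or similar means and build the PaddlePaddle library for the Raspberry Pi platform directly, as described in the [Build PaddlePaddle from Source](http://www.paddlepaddle.org/doc_cn/getstarted/build_and_install/cmake/build_from_source_cn.html) documentation.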
3+
There are generally two ways to build the Raspberry Pi version:
44

5-
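Users can also cross-compile on a development platform they are familiar with. Using Linux x86-64 as an example, this document describes the method and steps for cross-compiling PaddlePaddle for the Raspberry Pi platform.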
5+
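1. Log in to the Raspberry Pi system via ssh or similar means and build there. For the required development tools and third-party libraries, refer to [`/Dockerfile`](https://github.com/PaddlePaddle/Paddle/blob/develop/Dockerfile).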
66

7-
## Preparing the Cross-Compilation Environment
7+
1. The other way is cross-compilation. This document describes the method and steps for cross-compiling PaddlePaddle for the Raspberry Pi platform on Linux/x64.
88

9-
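To cross-compile PaddlePaddle from source, users need to prepare a cross-compilation environment in advance. The C/C++ cross-compilation toolchain for the Raspberry Pi platform can be downloaded from [GitHub](https://github.com/raspberrypi/tools), or obtained with the following command: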
9+
## Installing the Cross-Compiler
10+
11+
Clone the following GitHub repo:
1012

1113
```bash
1214
git clone https://github.com/raspberrypi/tools.git
1315
```
1416

15-
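This GitHub repository contains several prebuilt compilation tools for different platforms. If the host is a Linux x86-64 environment, use the tools under `arm-bcm2708/gcc-linaro-arm-linux-gnueabihf-raspbian-x64`; the compiler used is arm-linux-gnueabihf-gcc 4.8.3.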
16-
17-
Note that this toolchain requires the system glibc to be version 2.14 or later.
17+
The cross-compiler arm-linux-gnueabihf-gcc 4.8.3 can then be found in the `./tools/arm-bcm2708/gcc-linaro-arm-linux-gnueabihf-raspbian-x64` directory. Running this toolchain requires a Linux x64 machine with glibc 2.14 or later.
1818

1919
## Configuring Cross-Compilation Parameters
2020

21-
CMake supports cross-compilation; see [cmake-toolchains](https://cmake.org/cmake/help/v3.0/manual/cmake-toolchains.7.html#cross-compiling). To simplify the CMake configuration, PaddlePaddle provides a toolchain configuration file for cross-compilation, [cmake/cross_compiling/raspberry_pi.cmake](https://github.com/PaddlePaddle/Paddle/blob/develop/cmake/cross_compiling/raspberry_pi.cmake), which supplies default settings for the compiler and compilation parameters.
21+
CMake [supports cross-compilation](https://cmake.org/cmake/help/v3.0/manual/cmake-toolchains.7.html#cross-compiling). The configuration for PaddlePaddle for Raspberry Pi is in [cmake/cross_compiling/raspberry_pi.cmake](https://github.com/PaddlePaddle/Paddle/blob/develop/cmake/cross_compiling/raspberry_pi.cmake).
2222

2323
When cross-compiling the Raspberry Pi version of the PaddlePaddle library, some parameters must be configured:
2424

25-
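- `CMAKE_SYSTEM_NAME`, the target platform of the CMake build; it must be set to `RPi`. Only after `CMAKE_SYSTEM_NAME=RPi` is set does PaddlePaddle's CMake system treat the build as a cross-compilation for Raspberry Pi and automatically build the host version of the protoc executable, the target version of the protobuf library, and the target version of the OpenBLAS library.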
26-
27-
Optional configuration parameters for the Raspberry Pi platform:
25+
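- `CMAKE_SYSTEM_NAME`: the target platform of the CMake build; it must be set to `RPi`. Only after `CMAKE_SYSTEM_NAME=RPi` is set does PaddlePaddle's CMake system treat the build as a cross-compilation for Raspberry Pi and automatically build the host version of the protoc executable, the target version of the protobuf library, and the target version of the OpenBLAS library.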
2826

29-
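- `RPI_TOOLCHAIN`, the absolute path of the toolchain, or its path relative to the build directory. PaddlePaddle's CMake system uses this value to set the cross-compilers automatically; otherwise, users must set them manually when running cmake. No default value.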
30-
- `RPI_ARM_NEON`, whether to use NEON instructions. It must currently be set to `ON`; the default value is `ON`.
27+
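- `RPI_TOOLCHAIN`: the absolute path of the toolchain, or its path relative to the build directory. PaddlePaddle's CMake system uses this value to set the cross-compilers automatically; otherwise, users must set them manually when running cmake. No default value.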
3128

32-
Other configuration parameters:
29+
- `RPI_ARM_NEON`: whether to use NEON instructions. It must currently be set to `ON`; the default value is `ON`.
3330

3431
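- `HOST_C/CXX_COMPILER`, the host C/C++ compiler. It is needed when building the host version of the protoc executable and the target version of the OpenBLAS library. It defaults to the value of the environment variable `CC`; if `CC` is not set, it defaults to the `cc` compiler.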
3532

36-
The cmake arguments are as follows:
33+
A commonly used CMake configuration is as follows:
3734

3835
```
3936
cmake -DCMAKE_SYSTEM_NAME=RPi \
@@ -47,7 +44,9 @@ cmake -DCMAKE_SYSTEM_NAME=RPi \
4744
..
4845
```
4946

50-
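Users can also set other build parameters as needed. For example, to minimize the size of the generated library, set `CMAKE_BUILD_TYPE` to `MinSizeRel`; for the fastest execution speed, set `CMAKE_BUILD_TYPE` to `Release`. The build can also be influenced by manually setting `CMAKE_C/CXX_FLAGS_MINSIZEREL/RELEASE`.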
47+
Here, `WITH_C_API=ON` indicates that the inference library should be built.
48+
49+
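Users can also set other build parameters as needed. For example, to minimize the size of the generated library, set `CMAKE_BUILD_TYPE` to `MinSizeRel`; for the fastest execution speed, set `CMAKE_BUILD_TYPE` to `Release`.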
5150

5251
## Building and Installing
5352

@@ -60,6 +59,4 @@ make install
6059

6160
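Note: If you have previously built the PaddlePaddle library for another platform in the source directory, first delete the `third_party` and `build` directories with `rm -rf`, so that all third-party dependencies and the PaddlePaddle code are rebuilt for the new CMake configuration.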
6261

63-
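After the install command finishes, because `WITH_C_API` was set to `ON` in the cmake configuration in the previous step, the `your/path/to/install` directory will contain `include` and `lib` directories; `include` contains the C-API header files, and `lib` contains a Raspberry Pi version of the library.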
64-
65-
For more build configuration options, see the [Build PaddlePaddle from Source](http://www.paddlepaddle.org/doc_cn/getstarted/build_and_install/cmake/build_from_source_cn.html) documentation.
62+
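After the install command finishes, the `your/path/to/install` directory will contain `include` and `lib` directories; `include` contains the C-API header files, and `lib` contains a Raspberry Pi version of the library.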
