Commit 331bfd9

Author: gx_wind
Merge remote-tracking branch 'upstream/develop' into develop
2 parents: f382aa7 + 3388e52

File tree: 148 files changed, +2066 −5831 lines


.copyright.hook

Lines changed: 4 additions & 4 deletions

``` diff
@@ -49,12 +49,12 @@ def generate_copyright(template, lang='C'):
         LANG_COMMENT_MARK = "//"
 
     lines = template.split(NEW_LINE_MARK)
-    ans = LANG_COMMENT_MARK + COPYRIGHT_HEADER + NEW_LINE_MARK
+    ans = LANG_COMMENT_MARK + " " + COPYRIGHT_HEADER + NEW_LINE_MARK
     for lino, line in enumerate(lines):
         if lino == 0 or lino == 1 or lino == len(lines) - 1: continue
-        ans += LANG_COMMENT_MARK + line + NEW_LINE_MARK
+        ans += LANG_COMMENT_MARK + " " + line + NEW_LINE_MARK
 
-    return ans
+    return ans + "\n"
 
 
 def lang_type(filename):
@@ -90,7 +90,7 @@ def main(argv=None):
     retv = 0
     for filename in args.filenames:
         first_line = io.open(filename).readline()
-        if "Copyright" in first_line: continue
+        if "COPYRIGHT" in first_line.upper() : continue
         original_contents = io.open(filename).read()
         new_contents = generate_copyright(
             COPYRIGHT, lang_type(filename)) + original_contents
```
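
For illustration only, here is a small sketch (not part of the diff) of what the updated hook now produces: a space after the comment marker, a trailing blank line after the generated header, and a case-insensitive check on the file's first line. The names mirror those in the hook; the `COPYRIGHT` template value below is a stand-in.

``` python
# Sketch of the updated hook's behavior; COPYRIGHT here is a stand-in template.
LANG_COMMENT_MARK = "#"   # "//" would be used for C/C++/CUDA sources
NEW_LINE_MARK = "\n"
COPYRIGHT_HEADER = "Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserved."
COPYRIGHT = "header\nheader\nLicensed under the Apache License, Version 2.0\nfooter"

def generate_copyright(template):
    lines = template.split(NEW_LINE_MARK)
    # a space now follows the comment marker ("# Copyright ..." instead of "#Copyright ...")
    ans = LANG_COMMENT_MARK + " " + COPYRIGHT_HEADER + NEW_LINE_MARK
    for lino, line in enumerate(lines):
        if lino == 0 or lino == 1 or lino == len(lines) - 1:
            continue
        ans += LANG_COMMENT_MARK + " " + line + NEW_LINE_MARK
    return ans + "\n"  # trailing newline separates the header from the code body

# the first-line check is now case-insensitive, so an existing header in any casing is skipped
first_line = "# copyright 2016 some earlier header"
already_has_header = "COPYRIGHT" in first_line.upper()  # True
print(generate_copyright(COPYRIGHT))
```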

CODE_OF_CONDUCT.md

Lines changed: 46 additions & 0 deletions

# Contributor Covenant Code of Conduct

## Our Pledge

In the interest of fostering an open and welcoming environment, we as contributors and maintainers pledge to making participation in our project and our community a harassment-free experience for everyone, regardless of age, body size, disability, ethnicity, gender identity and expression, level of experience, nationality, personal appearance, race, religion, or sexual identity and orientation.

## Our Standards

Examples of behavior that contributes to creating a positive environment include:

* Using welcoming and inclusive language
* Being respectful of differing viewpoints and experiences
* Gracefully accepting constructive criticism
* Focusing on what is best for the community
* Showing empathy towards other community members

Examples of unacceptable behavior by participants include:

* The use of sexualized language or imagery and unwelcome sexual attention or advances
* Trolling, insulting/derogatory comments, and personal or political attacks
* Public or private harassment
* Publishing others' private information, such as a physical or electronic address, without explicit permission
* Other conduct which could reasonably be considered inappropriate in a professional setting

## Our Responsibilities

Project maintainers are responsible for clarifying the standards of acceptable behavior and are expected to take appropriate and fair corrective action in response to any instances of unacceptable behavior.

Project maintainers have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct, or to ban temporarily or permanently any contributor for other behaviors that they deem inappropriate, threatening, offensive, or harmful.

## Scope

This Code of Conduct applies both within project spaces and in public spaces when an individual is representing the project or its community. Examples of representing a project or community include using an official project e-mail address, posting via an official social media account, or acting as an appointed representative at an online or offline event. Representation of a project may be further defined and clarified by project maintainers.

## Enforcement

Instances of abusive, harassing, or otherwise unacceptable behavior may be reported by contacting the project team at [email protected]. The project team will review and investigate all complaints, and will respond in a way that it deems appropriate to the circumstances. The project team is obligated to maintain confidentiality with regard to the reporter of an incident. Further details of specific enforcement policies may be posted separately.

Project maintainers who do not follow or enforce the Code of Conduct in good faith may face temporary or permanent repercussions as determined by other members of the project's leadership.

## Attribution

This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4, available at [http://contributor-covenant.org/version/1/4][version]

[homepage]: http://contributor-covenant.org
[version]: http://contributor-covenant.org/version/1/4/

CODE_OF_CONDUCT_cn.md

Lines changed: 50 additions & 0 deletions

# Contributor Covenant

## Our Pledge

In the interest of fostering an open, transparent, and welcoming environment, we as contributors and maintainers pledge that participation in our project and our community will be a harassment-free experience for everyone, regardless of age, race, ethnicity, gender identity and expression, body size, disability, level of experience, nationality, personal appearance, religion, or sexual orientation.

## Our Standards

Examples of behavior that helps create a positive environment include:
* Using welcoming and inclusive language
* Being respectful of differing viewpoints and experiences
* Gracefully accepting constructive criticism
* Focusing on what is best for the community
* Being kind towards other community members

Examples of unacceptable behavior by participants include:
* The use of sexualized language or imagery, and unwelcome sexual harassment
* Trolling or baiting behavior, insulting/derogatory comments, and personal or political attacks
* Public or private harassment
* Publishing others' personal information, such as a physical or electronic address, without permission
* Other conduct which could reasonably be considered inappropriate or a breach of professional ethics

## Our Responsibilities

Project maintainers are responsible for interpreting the standards of "acceptable behavior" and for taking appropriate and fair corrective action in response to any instances of unacceptable behavior that have occurred.

Project maintainers have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that conflict with this Code of Conduct, and may temporarily or permanently ban any contributor whose behavior they deem inappropriate, threatening, offensive, or harmful.

## Scope

This Code of Conduct applies both within project platforms and in public platforms when an individual is representing the project or its community.

Examples of representing a project or community include using an official project e-mail address, posting via an official social media account, or acting as a designated representative at an online or offline event.

How the project is represented may be further defined and clarified by its maintainers.

## Enforcement

Abusive, harassing, or otherwise unacceptable behavior may be reported by contacting the project team at [email protected].

All complaints that the maintenance team deems necessary and appropriate will be reviewed and investigated, and responded to accordingly. The project team is obligated to keep the reporter of an incident confidential. Further details of specific enforcement policies may be published separately.

Project maintainers who do not follow or enforce this Code of Conduct in good faith may, by decision of the project leadership or other members, have their status temporarily or permanently revoked.

## Attribution

This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4, available at https://www.contributor-covenant.org/zh-tw/version/1/4/code-of-conduct.html

[homepage]: https://www.contributor-covenant.org

benchmark/tensorflow/image/googlenet_multi_gpu.py

Lines changed: 13 additions & 0 deletions

``` diff
@@ -1,3 +1,16 @@
+# Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserve.
+#
+#Licensed under the Apache License, Version 2.0 (the "License");
+#you may not use this file except in compliance with the License.
+#You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
 from six.moves import xrange  # pylint: disable=redefined-builtin
 from datetime import datetime
 import math
```

doc/api/v2/fluid/layers.rst

Lines changed: 11 additions & 0 deletions

``` diff
@@ -364,6 +364,12 @@ split
 .. autofunction:: paddle.v2.fluid.layers.split
     :noindex:
 
+
+matmul
+------
+.. autofunction:: paddle.v2.fluid.layers.matmul
+    :noindex:
+
 logsigmoid
 ----------
 .. autofunction:: paddle.v2.fluid.layers.logsigmoid
@@ -493,3 +499,8 @@ swish
 ------
 .. autofunction:: paddle.v2.fluid.layers.swish
     :noindex:
+
+l2_normalize
+------------
+.. autofunction:: paddle.v2.fluid.layers.l2_normalize
+    :noindex:
```
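
For orientation, a minimal usage sketch of the two newly documented layers. This is not taken from the docs themselves; the basic signatures `matmul(x, y)` and `l2_normalize(x, axis)` are assumptions based on the autofunction entries above, so consult the generated API reference for the authoritative argument lists.

``` python
import paddle.v2.fluid as fluid

# two batched inputs; fluid.layers.data prepends the batch dimension automatically
x = fluid.layers.data(name='x', shape=[2, 3], dtype='float32')
y = fluid.layers.data(name='y', shape=[3, 4], dtype='float32')

# matrix product of the two inputs (assumed to broadcast over the batch dimension)
xy = fluid.layers.matmul(x, y)

# L2-normalize x along axis 1 (the `axis` argument name is an assumption)
x_normed = fluid.layers.l2_normalize(x=x, axis=1)
```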

doc/api/v2/fluid/nets.rst

Lines changed: 6 additions & 0 deletions

``` diff
@@ -25,3 +25,9 @@ glu
 .. autofunction:: paddle.v2.fluid.nets.glu
     :noindex:
 
+
+dot_product_attention
+---------------------
+.. autofunction:: paddle.v2.fluid.nets.dot_product_attention
+    :noindex:
+
```

doc/getstarted/concepts/src/infer.py

Lines changed: 13 additions & 0 deletions

``` diff
@@ -1,3 +1,16 @@
+# Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserve.
+#
+#Licensed under the Apache License, Version 2.0 (the "License");
+#you may not use this file except in compliance with the License.
+#You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
 import paddle.v2 as paddle
 import numpy as np
 
```

doc/howto/usage/capi/organization_of_the_inputs_cn.md

Lines changed: 1 addition & 1 deletion

``` diff
@@ -19,7 +19,7 @@
 
 ### Basic usage concepts
 
-- Inside PaddlePaddle, the inputs/outputs of a computation layer in a neural network are organized as an `Argument` structure. If the network has multiple inputs or multiple inputs, each input/input has its own `Argument`.
+- Inside PaddlePaddle, the inputs/outputs of a computation layer in a neural network are organized as an `Argument` structure. If the network has multiple inputs or multiple outputs, each input/output has its own `Argument`.
 - An `Argument` does not actually "store" data; rather, it organizes the input/output information coherently.
 - Internally, `Argument` stores the actual data in an `IVector` (the one-dimensional integer array mentioned above) and a `Matrix` (the two-dimensional floating-point matrix mentioned above); `Sequence Start Positions` (explained in detail below) describes the sequence information of the input/output.
 
```
Lines changed: 138 additions & 0 deletions

# Fluid Distributed Training

## Introduction

In this article, we explain how to configure and run distributed training jobs with PaddlePaddle Fluid on a bare-metal cluster.

## Preparations

### Get your cluster ready

Prepare the computer nodes in your cluster. The nodes can be of any specification that runs PaddlePaddle, and each must have a unique IP address assigned to it. Make sure they can communicate with each other.

### Have PaddlePaddle installed

PaddlePaddle must be installed on all nodes. If your nodes have GPU cards, be sure to properly install the drivers and CUDA libraries.

The PaddlePaddle build and installation guide can be found [here](http://www.paddlepaddle.org/docs/develop/documentation/en/getstarted/build_and_install/index_en.html).

### Update the training script

#### Non-cluster training script

Let's take the first chapter of [Deep Learning 101](http://www.paddlepaddle.org/docs/develop/book/01.fit_a_line/index.html), "fit a line", as an example.

The non-cluster version of this demo with the Fluid API is as follows:

``` python
import paddle.v2 as paddle
import paddle.v2.fluid as fluid

x = fluid.layers.data(name='x', shape=[13], dtype='float32')
y_predict = fluid.layers.fc(input=x, size=1, act=None)
y = fluid.layers.data(name='y', shape=[1], dtype='float32')

cost = fluid.layers.square_error_cost(input=y_predict, label=y)
avg_cost = fluid.layers.mean(x=cost)

sgd_optimizer = fluid.optimizer.SGD(learning_rate=0.001)
sgd_optimizer.minimize(avg_cost)

BATCH_SIZE = 20

train_reader = paddle.batch(
    paddle.reader.shuffle(
        paddle.dataset.uci_housing.train(), buf_size=500),
    batch_size=BATCH_SIZE)

place = fluid.CPUPlace()
feeder = fluid.DataFeeder(place=place, feed_list=[x, y])
exe = fluid.Executor(place)

exe.run(fluid.default_startup_program())

PASS_NUM = 100
for pass_id in range(PASS_NUM):
    fluid.io.save_persistables(exe, "./fit_a_line.model/")
    fluid.io.load_persistables(exe, "./fit_a_line.model/")
    for data in train_reader():
        avg_loss_value, = exe.run(fluid.default_main_program(),
                                  feed=feeder.feed(data),
                                  fetch_list=[avg_cost])

        if avg_loss_value[0] < 10.0:
            exit(0)  # if avg cost less than 10.0, we think our code is good.
exit(1)
```

We created a simple fully connected neural network training program and handed it to the Fluid executor to run for 100 passes.

Now let's try to convert it into a distributed version that runs in a cluster.

#### Introducing the parameter server

As you can see from the non-cluster version of the training script, there is only one role in it: the trainer, which does the computing as well as holding the parameters. In cluster training, since multiple trainers are working on the same task, they need one centralized place to hold and distribute parameters. This centralized place is called the Parameter Server in PaddlePaddle.

![parameter server architecture](src/trainer.png)

The parameter server in Fluid not only holds parameters but is also assigned part of the program. Trainers communicate with parameter servers via send/receive OPs. For more technical detail, please refer to this [document](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/design/dist_refactor/distributed_architecture.md).

Now we need to create programs for both the trainers and the parameter servers. The question is: how?

#### Slice the program

Fluid provides a tool called the "Distribute Transpiler" that automatically converts a non-cluster program into a cluster program.

The idea behind this tool is to find the optimize OPs and gradient parameters, slice the program into two pieces, and connect the pieces with send/receive OPs.

The optimize OPs and gradient parameters can be obtained from the return values of the optimizer's minimize function.

To put it all together:

``` python
... # define the program, cost, and create the sgd optimizer

optimize_ops, params_grads = sgd_optimizer.minimize(avg_cost)  # get optimize OPs and gradient parameters

t = fluid.DistributeTranspiler()  # create transpiler instance
# slice the program into 2 pieces with optimize_ops and the gradient parameter list,
# as well as pserver_endpoints (a comma-separated list of IP:PORT) and the number of trainers
t.transpile(optimize_ops, params_grads, pservers=pserver_endpoints, trainers=2)

... # create executor

# in the pserver, run this
exe.run(fluid.default_startup_program())
# current_endpoint here means the current pserver IP:PORT you wish to run on
exe.run(t.get_pserver_program(current_endpoint, optimize_ops))

# in the trainer, run this
... # define data reader
exe.run(fluid.default_startup_program())
for pass_id in range(100):
    for data in train_reader():
        exe.run(t.get_trainer_program())
```

### E2E demo

The complete demo can be found [here](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/v2/fluid/tests/book_distribute/notest_dist_fit_a_line.py). On the parameter server node, run this on the command line:

``` bash
PSERVERS=192.168.1.2:6174 SERVER_ENDPOINT=192.168.1.2:6174 TRAINING_ROLE=PSERVER python notest_dist_fit_a_line.py
```

*Please note we assume that your parameter server runs at 192.168.1.2:6174.*

Wait until the prompt `Server listening on 192.168.1.2:6174` appears.

Then, on each of your 2 trainer nodes, run this:

``` bash
PSERVERS=192.168.1.2:6174 SERVER_ENDPOINT=192.168.1.2:6174 TRAINING_ROLE=TRAINER python notest_dist_fit_a_line.py
```

*The reason you need to run this command on 2 nodes is that the script sets the trainer count to 2. You can change this setting on line 50.*
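
As a rough sketch of how these environment variables typically drive the script (an assumption about the demo's structure, not an excerpt of it), a training program can branch on `TRAINING_ROLE` after transpilation, reusing the `optimize_ops`, `params_grads`, and `train_reader` defined in the snippets above:

``` python
import os

import paddle.v2.fluid as fluid

# environment variables passed on the command line above (values are examples)
pserver_endpoints = os.getenv("PSERVERS")        # e.g. "192.168.1.2:6174"
current_endpoint = os.getenv("SERVER_ENDPOINT")  # this node's own IP:PORT
training_role = os.getenv("TRAINING_ROLE", "TRAINER")

t = fluid.DistributeTranspiler()
t.transpile(optimize_ops, params_grads, pservers=pserver_endpoints, trainers=2)

exe = fluid.Executor(fluid.CPUPlace())
exe.run(fluid.default_startup_program())

if training_role == "PSERVER":
    # parameter server: block here and serve parameter updates to the trainers
    exe.run(t.get_pserver_program(current_endpoint, optimize_ops))
else:
    # trainer: run the sliced trainer program over the training data
    for pass_id in range(100):
        for data in train_reader():
            exe.run(t.get_trainer_program())
```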

Now you have 2 trainers and 1 parameter server up and running.
