
Commit 232b6fc

Merge pull request #9633 from weixing02/img
Upload fluid image sources to github
2 parents bc8f436 + d988b9a commit 232b6fc

80 files changed: +463 additions, -110 deletions

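Most hunks in this commit follow the same pattern: a relative image reference (`./images/...`, `../images/...`, or `src/...`) is replaced with an absolute URL under `doc/fluid/images` on GitHub; the remaining hunks are trailing-whitespace cleanups, which is why some removed and added lines look identical. As a rough illustration of how such a bulk rewrite could be scripted, here is a minimal Python sketch; the regex, the flattened target directory, and the file layout are assumptions for illustration, not the tooling actually used for this PR.

```python
import re
from pathlib import Path

# Hypothetical helper mirroring the pattern of the hunks in this commit:
# rewrite relative image references in the fluid design docs to absolute
# GitHub URLs under doc/fluid/images. The regex and the assumption that all
# images end up flat in one directory are illustrative only.
BASE_URL = "https://github.com/PaddlePaddle/Paddle/tree/develop/doc/fluid/images"
IMG_REF = re.compile(r'(?:\./images/|\.\./images/|src/)([\w@./-]+\.(?:png|gif|jpg))')

def rewrite(md_path: Path) -> None:
    text = md_path.read_text()
    # Keep only the file name so every reference points into doc/fluid/images.
    new_text = IMG_REF.sub(lambda m: f"{BASE_URL}/{Path(m.group(1)).name}", text)
    if new_text != text:
        md_path.write_text(new_text)

if __name__ == "__main__":
    for md in Path("doc/fluid").rglob("*.md"):
        rewrite(md)
```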

doc/fluid/design/algorithm/parameter_average.md

Lines changed: 3 additions & 1 deletion
@@ -7,7 +7,9 @@ Polyak and Juditsky (1992) showed that the test performance of simple average of
 
 Hence, to accelerate the speed of Stochastic Gradient Descent, Averaged Stochastic Gradient Descent (ASGD) was proposed in Polyak and Juditsky (1992). For ASGD, the running average of parameters obtained by SGD, is used as the estimator for <img src="./images/theta_star.gif"/><br/> . The averaging is done as follows:
 
-![](./images/asgd.gif)
+<p align="center">
+<img src="https://github.com/PaddlePaddle/Paddle/tree/develop/doc/fluid/images/asgd.gif"><br />
+</p>
 
 We propose averaging for any optimizer similar to how ASGD performs it, as mentioned above.
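The averaging that asgd.gif depicts is just the running mean of the SGD iterates, theta_bar_t = theta_bar_{t-1} + (theta_t - theta_bar_{t-1}) / t. A small NumPy sketch of that update (the toy loss and learning rate are illustrative assumptions):

```python
import numpy as np

def asgd_running_average(theta_bar, theta_t, t):
    """Running mean of SGD iterates: equivalent to (1/t) * sum_i theta_i."""
    return theta_bar + (theta_t - theta_bar) / t

# Toy usage: average the iterates of plain SGD on f(theta) = ||theta||^2.
theta = np.array([5.0, -3.0])    # current SGD parameters (illustrative)
theta_bar = theta.copy()         # the average starts at the first iterate
for t in range(2, 101):
    grad = 2.0 * theta           # gradient of ||theta||^2
    theta = theta - 0.01 * grad  # one SGD step
    theta_bar = asgd_running_average(theta_bar, theta, t)
```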

doc/fluid/design/concurrent/channel.md

Lines changed: 14 additions & 14 deletions
@@ -2,7 +2,7 @@
 
 ## Introduction
 
-A Channel is a data structure that allows for synchronous interprocess
+A Channel is a data structure that allows for synchronous interprocess
 communication via message passing. It is a fundemental component of CSP
 (communicating sequential processes), and allows for users to pass data
 between threads without having to worry about synchronization.
@@ -18,7 +18,7 @@ Creates a new channel that takes in variables of a specific dtype.
 
 - **fluid.make_channel(dtype, capacity=0)**
 - **dtype**: The data type of variables being sent/received through channel
-- **capacity**: The capacity of the channel. A capacity of 0 represents
+- **capacity**: The capacity of the channel. A capacity of 0 represents
 an unbuffered channel. Capacity > 0 represents a buffered channel
 
 ```
@@ -40,8 +40,8 @@ fluid.channel_close(ch)
 
 ### Send data to a channel
 
-Sends a variable to a channel. Currently, variables of dtype `LoDTensor`,
-`LoDRankTable`, `LoDTensorArray`, `SelectedRows`, `ReaderHolder`, and
+Sends a variable to a channel. Currently, variables of dtype `LoDTensor`,
+`LoDRankTable`, `LoDTensorArray`, `SelectedRows`, `ReaderHolder`, and
 `ChannelHolder` are supported.
 
 By default, the data of the Variable is moved from the sender to the receiver,
@@ -52,7 +52,7 @@ however the user can optionally copy the data before performing the send.
 - **variable**: The variable to send to the channel
 - **is_copy**: If set to True, channel_send will perform a variable assign
 to copy the source variable to a new variable to be sent.
-
+
 ```
 ch = fluid.make_channel(dtype=core.VarDesc.VarType.LOD_TENSOR)
 var = fill_constant(shape=[1],dtype=core.VarDesc.VarType.INT32, value=100)
@@ -68,7 +68,7 @@ receiving variable.
 - **channel**: The channel to receive the variable from
 - **return_variable**: The destination variable used to store the data of the
 variable received from the channel
-
+
 ```
 ch = fluid.make_channel(dtype=core.VarDesc.VarType.LOD_TENSOR)
 var = fill_constant(shape=[1],dtype=core.VarDesc.VarType.INT32, value=-1)
@@ -84,9 +84,9 @@ internal queues, locks, and conditional variables.
 ### QueueMessage
 
 QueueMessage encapsulates the state of the channel send/receive operation to be
-put in the **sendq/recvq**. It contains a condition variable used to lock the
+put in the **sendq/recvq**. It contains a condition variable used to lock the
 thread (when there are no available sends/receives). In addition, it contains
-a callback function to notify a thread when the QueueMessage is being
+a callback function to notify a thread when the QueueMessage is being
 processed by the channel.
 
 ### Queues
@@ -108,21 +108,21 @@ channel_recv operation will put a new QueueMessage on the recvq and block the
 current thread under two conditions:
 1. The channel is buffered and there is no data on the buff_
 2. The channel is unbuffered and does not have a sender
-
+
 ### State diagram
 
 #### Channel Send
 
 <p align="center">
-<img src="./images/channel_send.png"/><br/>
+<img src="https://github.com/PaddlePaddle/Paddle/tree/develop/doc/fluid/images/channel_send.png"/><br/>
 </p>
-
+
 #### Channel Receive
 
 <p align="center">
-<img src="./images/channel_recv.png"/><br/>
+<img src="https://github.com/PaddlePaddle/Paddle/tree/develop/doc/fluid/images/channel_recv.png"/><br/>
 </p>
-
+
 ## Limitations and Considerations
 
 ### Variable Copy
@@ -135,5 +135,5 @@ be sent before it is sent.
 
 Please note that this is acheived by adding an **assign** operator and creating
 a temporary variable that is sent in place of the original variable. Please
-note that **assign** operator has limited support for only certain variables
+note that **assign** operator has limited support for only certain variables
 datatypes.
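For convenience, the individual calls shown in the hunks above can be stitched into one end-to-end sketch. The import paths below are assumptions, and the CSP/channel API described in this design doc was experimental, so treat this as pseudocode that follows the doc rather than a verified public API.

```python
# Sketch only: pieced together from the snippets in channel.md above.
import paddle.fluid as fluid
import paddle.fluid.core as core
from paddle.fluid.layers import fill_constant

ch = fluid.make_channel(dtype=core.VarDesc.VarType.LOD_TENSOR, capacity=0)

# Send a constant tensor through the (unbuffered) channel, copying it first.
var = fill_constant(shape=[1], dtype=core.VarDesc.VarType.INT32, value=100)
fluid.channel_send(ch, var, is_copy=True)

# Receive it into a destination variable.
result = fill_constant(shape=[1], dtype=core.VarDesc.VarType.INT32, value=-1)
fluid.channel_recv(ch, result)

fluid.channel_close(ch)
```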

doc/fluid/design/concurrent/select_op.md

Lines changed: 21 additions & 21 deletions
@@ -2,13 +2,13 @@
 
 ## Introduction
 
-In golang, the [**select**](https://golang.org/ref/spec#Select_statements)
-statement lets a goroutine wait on multiple communication operations at the
-same time. The **select** blocks until one of its cases can run, then
-executes the case. If multiple cases are ready to run, then one case is
+In golang, the [**select**](https://golang.org/ref/spec#Select_statements)
+statement lets a goroutine wait on multiple communication operations at the
+same time. The **select** blocks until one of its cases can run, then
+executes the case. If multiple cases are ready to run, then one case is
 choosen at random to be executed.
 
-With the introduction of CSP for Paddle, we mimic this behavior by
+With the introduction of CSP for Paddle, we mimic this behavior by
 creating a ***select_op***.
 
 ## How to use it
@@ -17,11 +17,11 @@ The **select_op** is available as a c++ operator. However most users
 will prefer to use the much simplier Python API.
 
 - **fluid.Select()**: Creates a select operator and adds it to the current
-block within the main program. Also creates a sub block and adds it to the
-main program. This sub block is used to hold all variables and operators
+block within the main program. Also creates a sub block and adds it to the
+main program. This sub block is used to hold all variables and operators
 used by the case statements.
-
-Within the select block, users can add cases by
+
+Within the select block, users can add cases by
 calling **select.case** or **select.default** method.
 
 - **fluid.Select.case(channel_action, channel, result_variable)**: Represents
@@ -37,13 +37,13 @@ execute.
 ```
 ch1 = fluid.make_channel(dtype=core.VarDesc.VarType.LOD_TENSOR)
 quit_ch = fluid.make_channel(dtype=core.VarDesc.VarType.LOD_TENSOR)
-
+
 x = fill_constant(shape=[1], dtype=core.VarDesc.VarType.INT32, value=0)
 y = fill_constant(shape=[1], dtype=core.VarDesc.VarType.INT32, value=1)
-
+
 while_cond = fill_constant(shape=[1], dtype=core.VarDesc.VarType.BOOL, value=True)
 while_op = While(cond=while_cond)
-
+
 with while_op.block():
     with fluid.Select() as select:
         with select.case(fluid.channel_send, channel, x):
@@ -99,17 +99,17 @@ blocks {
 }
 }
 // Create "select" operator.
-// inputs:
+// inputs:
 // X: All input variables used by operators within the select block
 // case_to_execute: Variable filled in by select_op when it determines
 // which case to execute.
 //
 // outputs:
-// Out: All output variables referenced by operators within select block.
-//
+// Out: All output variables referenced by operators within select block.
+//
 // attrs:
 // sub_block: The block id containing the select "cases"
-// cases: Serialized list of all cases in the select op.
+// cases: Serialized list of all cases in the select op.
 // Each case is serialized as: '<index>,<type>,<channel>,<value>'
 // where type is 0 for default, 1 for send, and 2 for receive.
 // No channel and values are needed for default cases.
@@ -150,7 +150,7 @@ into **X**. It will also create a temp variable called **case_to_execute**. Th
 filled in by the select_op after it has completed processing the case statements.
 
 If there are no available cases to execute (ie: all cases are blocked on channel operations, and
-there is no default statement), then the select_op will block the current thread. The thread will
+there is no default statement), then the select_op will block the current thread. The thread will
 unblock once there is a channel operation affecting one of the case statements, at which point, the
 **select_op** will set the **case_to_execute** variable to the index of the case to execute.
 
@@ -247,17 +247,17 @@ blocks {
 
 ```
 
-Cases are represented by a **conditional_block operator**, whose's condition is set as the output of
-equal(**case_to_execute**, **case_index**). Since each case index is unique in this sub-block,
+Cases are represented by a **conditional_block operator**, whose's condition is set as the output of
+equal(**case_to_execute**, **case_index**). Since each case index is unique in this sub-block,
 only one case will be executed.
 
 ### select_op flow
 
 <p align="center">
-<img src="./images/select_op_workflow.png"/><br/>
+<img src="https://github.com/PaddlePaddle/Paddle/tree/develop/doc/fluid/images/select_op_workflow.png"/><br/>
 </p>
 
-The select algorithm is inspired by golang's select routine. Please refer to
+The select algorithm is inspired by golang's select routine. Please refer to
 http://www.tapirgames.com/blog/golang-concurrent-select-implementation for more information.
 
 ## Backward Pass
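Putting the pieces from the select_op.md hunks above together, a hedged sketch of the Python-side usage; the imports and the case bodies are assumptions, and the select/CSP API was a design-stage feature, so this is illustrative pseudocode rather than a verified public API.

```python
# Sketch assembled from the select_op.md snippets above.
import paddle.fluid as fluid
import paddle.fluid.core as core
from paddle.fluid.layers import fill_constant, While

ch1 = fluid.make_channel(dtype=core.VarDesc.VarType.LOD_TENSOR)
quit_ch = fluid.make_channel(dtype=core.VarDesc.VarType.LOD_TENSOR)

x = fill_constant(shape=[1], dtype=core.VarDesc.VarType.INT32, value=0)
y = fill_constant(shape=[1], dtype=core.VarDesc.VarType.INT32, value=1)

while_cond = fill_constant(shape=[1], dtype=core.VarDesc.VarType.BOOL, value=True)
while_op = While(cond=while_cond)

with while_op.block():
    with fluid.Select() as select:
        # Ready when some other thread can receive from ch1.
        with select.case(fluid.channel_send, ch1, x):
            pass  # e.g. advance x and y here
        # Ready when a value arrives on quit_ch.
        with select.case(fluid.channel_recv, quit_ch, y):
            pass  # e.g. exit the surrounding while loop
```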

doc/fluid/design/dist_train/distributed_architecture.md

Lines changed: 5 additions & 5 deletions
@@ -40,11 +40,11 @@ computation is only specified in Python code which sits outside of PaddlePaddle,
 
 Similar to how a compiler uses an intermediate representation (IR) so that the programmer does not need to manually optimize their code for most of the cases, we can have an intermediate representation in PaddlePaddle as well. The compiler optimizes the IR as follows:
 
-<img src="src/compiler.png"/>
+<img src="https://github.com/PaddlePaddle/Paddle/tree/develop/doc/fluid/images/compiler.png"/>
 
 PaddlePaddle can support model parallelism by converting the IR so that the user no longer needs to manually perform the computation and operations in the Python component:
 
-<img src="src/paddle-compile.png"/>
+<img src="https://github.com/PaddlePaddle/Paddle/tree/develop/doc/fluid/images/paddle-compile.png"/>
 
 The IR for PaddlePaddle after refactoring is called a `Block`, it specifies the computation dependency graph and the variables used in the computation.
 
@@ -60,7 +60,7 @@ For a detailed explanation, refer to this document -
 
 The revamped distributed training architecture can address the above discussed limitations. Below is the illustration of how it does so:
 
-<img src="src/distributed_architecture.png"/>
+<img src="https://github.com/PaddlePaddle/Paddle/tree/develop/doc/fluid/images/distributed_architecture.png"/>
 
 The major components are: *Python API*, *Distribute Transpiler* and *Remote Executor*.
 
@@ -152,7 +152,7 @@ for data in train_reader():
 `JobDesc` object describe the distributed job resource specification to run on
 Cluster environment.
 
-<img src="src/remote_executor.png" width="500" align="center" />
+<img src="https://github.com/PaddlePaddle/Paddle/tree/develop/doc/fluid/images/remote_executor.png" width="500" align="center" />
 
 `RemoteExecutor.run` sends the `ProgramDesc` and
 [TrainingJob](https://github.com/PaddlePaddle/cloud/blob/unreleased-tpr/doc/autoscale/README.md#training-job-resource)
@@ -171,7 +171,7 @@ In the future, a more general placement algorithm should be implemented, which m
 
 The local training architecture will be the same as the distributed training architecture, the difference is that everything runs locally, and there is just one PaddlePaddle runtime:
 
-<img src="src/local_architecture.png"/>
+<img src="https://github.com/PaddlePaddle/Paddle/tree/develop/doc/fluid/images/local_architecture.png"/>
 
 
 ### Training Data

doc/fluid/design/dist_train/multi_cpu.md

Lines changed: 2 additions & 2 deletions
@@ -8,11 +8,11 @@ Op graph to a multi-CPU Op graph, and run `ParallelDo` Op to run the graph.
 
 ## Transpiler
 
-<img src="src/multi-threads/[email protected]" width="300">
+<img src="https://github.com/PaddlePaddle/Paddle/tree/develop/doc/fluid/images/[email protected]" width="300">
 
 After converted:
 
-<img src="src/multi-threads/[email protected]" width="1000">
+<img src="https://github.com/PaddlePaddle/Paddle/tree/develop/doc/fluid/images/[email protected]" width="1000">
 
 ## Implement

doc/fluid/design/dist_train/parameter_server.md

Lines changed: 3 additions & 3 deletions
@@ -41,11 +41,11 @@ We will need these OPs: *Send*, *Recv*, *Enqueue*, *Dequeue*.
 Below is an example of converting the user defined graph to the
 subgraphs for the trainer and the parameter server:
 
-<img src="src/local-graph.png" width="300"/>
+<img src="https://github.com/PaddlePaddle/Paddle/tree/develop/doc/fluid/images/local-graph.png" width="300"/>
 
 After converting:
 
-<img src="src/dist-graph.png" width="700"/>
+<img src="https://github.com/PaddlePaddle/Paddle/tree/develop/doc/fluid/images/dist-graph.png" width="700"/>
 
 1. The parameter variable W and its optimizer program are placed on the parameter server.
 1. Operators are added to the program.
@@ -69,7 +69,7 @@ In Fluid, we introduce [SelectedRows](../selected_rows.md) to represent a list o
 non-zero gradient data. So when we do parameter optimization both locally and remotely,
 we only need to send those non-zero rows to the optimizer operators:
 
-<img src="src/sparse_update.png" width="700" />
+<img src="https://github.com/PaddlePaddle/Paddle/tree/develop/doc/fluid/images/sparse_update.png" width="700" />
 
 ### Benefits
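The sparse-update hunk above is about sending only the non-zero rows of a gradient to the parameter server. A small NumPy sketch of that idea; the (rows, values) pair mirrors the SelectedRows description, but the concrete layout here is illustrative, not Fluid's actual data structure.

```python
import numpy as np

def to_selected_rows(dense_grad):
    """Keep only the non-zero rows of a gradient, SelectedRows-style.

    Returns (rows, values): the indices of the non-zero rows and their data,
    which is all that would need to be sent to the parameter server.
    """
    rows = np.flatnonzero(np.any(dense_grad != 0, axis=1))
    return rows, dense_grad[rows]

# Toy embedding gradient: only rows 1 and 3 were touched in this mini-batch.
grad = np.zeros((5, 4))
grad[1] = 0.5
grad[3] = -1.0
rows, values = to_selected_rows(grad)   # rows == [1, 3], values.shape == (2, 4)
```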

doc/fluid/design/dynamic_rnn/rnn.md

Lines changed: 4 additions & 4 deletions
@@ -5,7 +5,7 @@ This document describes the RNN (Recurrent Neural Network) operator and how it i
 ## RNN Algorithm Implementation
 
 <p align="center">
-<img src="./rnn.jpg"/>
+<img src="https://github.com/PaddlePaddle/Paddle/tree/develop/doc/fluid/images/rnn.jpg"/>
 </p>
 
 The above diagram shows an RNN unrolled into a full network.
@@ -22,7 +22,7 @@ There are several important concepts here:
 There could be local variables defined in each step-net. PaddlePaddle runtime realizes these variables in *step-scopes* which are created for each step.
 
 <p align="center">
-<img src="./rnn.png"/><br/>
+<img src="https://github.com/PaddlePaddle/Paddle/tree/develop/doc/fluid/images/rnn.png"/><br/>
 Figure 2 illustrates the RNN's data flow
 </p>
 
@@ -93,7 +93,7 @@ For example, we could have a 2-level RNN, where the top level corresponds to par
 The following figure illustrates feeding in text into the lower level, one sentence at a step, and the feeding in step outputs to the top level. The final top level output is about the whole text.
 
 <p align="center">
-<img src="./2_level_rnn.png"/>
+<img src="https://github.com/PaddlePaddle/Paddle/tree/develop/doc/fluid/images/2_level_rnn.png"/>
 </p>
 
 ```python
@@ -149,5 +149,5 @@ If the `output_all_steps` is set to False, it will only output the final time st
 
 
 <p align="center">
-<img src="./rnn_2level_data.png"/>
+<img src="https://github.com/PaddlePaddle/Paddle/tree/develop/doc/fluid/images/rnn_2level_data.png"/>
 </p>

doc/fluid/design/modules/batch_norm_op.md

Lines changed: 11 additions & 11 deletions
@@ -2,7 +2,7 @@
 
 ## What is batch normalization
 
-Batch normalization is a frequently-used method in deep network training. It adjusts the mean and variance of a layer's output, and make the data distribution easier for next layer's training.
+Batch normalization is a frequently-used method in deep network training. It adjusts the mean and variance of a layer's output, and make the data distribution easier for next layer's training.
 
 The principle of batch normalization can be summarized into a simple function:
 
@@ -66,21 +66,21 @@ As most C++ operators do, `batch_norm_op` is defined by inputs, outputs, attribu
 
 The following graph showes the training computational process of `batch_norm_op`:
 
-<img src="../images/batch_norm_op_kernel.png" width="800"/>
+<img src="https://github.com/PaddlePaddle/Paddle/tree/develop/doc/fluid/images/batch_norm_op_kernel.png" width="800"/>
 
 cudnn provides APIs to finish the whole series of computation, we can use them in our GPU kernel.
 
 ### Python
 
 `batch_norm_op` is warpped as a layer in Python:
 
-```python
-def batch_norm_layer(net,
+```python
+def batch_norm_layer(net,
                      input,
-                     output,
-                     scale,
-                     bias,
-                     use_global_est = False,
+                     output,
+                     scale,
+                     bias,
+                     use_global_est = False,
                      epsilon = 1e-6,
                      momentum = 0.99):
     mean_cache = scope.new_var(name = 'estimated_mean', trainable = False)
@@ -119,15 +119,15 @@ for pass_id in range(PASS_NUM):
     if pass_id % 100 == 0:
         net.infer(test_image) # run inferencing model
     # ...
-```
+```
 
 `is_infer` is an attribute. Once an operator is created, its attributes can not be changed. It suggests us that we shall maintain two `batch_norm_op` in the model, one's `is_infer` is `True`(we call it `infer_batch_norm_op`) and the other one's is `False`(we call it `train_batch_norm_op`). They share all parameters and variables, but be placed in two different branches. That is to say, if a network contains a `batch_norm_op`, it will fork into two branches, one go through `train_batch_norm_op` and the other one go through `infer_batch_norm_op`:
 
 <div align=center>
-<img src="../images/batch_norm_fork.png" width="500"/>
+<img src="https://github.com/PaddlePaddle/Paddle/tree/develop/doc/fluid/images/batch_norm_fork.png" width="500"/>
 </div>
 
-Just like what is shown in the above graph, the net forks before `batch_norm_op` and will never merge again. All the operators after `batch_norm_op` will duplicate.
+Just like what is shown in the above graph, the net forks before `batch_norm_op` and will never merge again. All the operators after `batch_norm_op` will duplicate.
 
 When the net runs in training mode, the end of the left branch will be set as the running target, so the dependency tracking process will ignore right branch automatically. When the net runs in inferencing mode, the process is reversed.
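As a companion to the batch_norm_op hunks above: the "simple function" the doc refers to is the usual normalize-then-scale-and-shift computation. A NumPy sketch of the training-time forward pass follows; the epsilon and momentum defaults are taken from the batch_norm_layer signature shown in the diff, everything else is illustrative.

```python
import numpy as np

def batch_norm_forward(x, scale, bias, est_mean, est_var,
                       epsilon=1e-6, momentum=0.99):
    """Training-time batch norm on a (batch, channels) input.

    Normalizes with the mini-batch statistics, applies scale and bias, and
    updates the running (estimated) mean/variance used at inference time.
    """
    mu = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mu) / np.sqrt(var + epsilon)
    y = scale * x_hat + bias
    est_mean = momentum * est_mean + (1.0 - momentum) * mu
    est_var = momentum * est_var + (1.0 - momentum) * var
    return y, est_mean, est_var

# Toy usage on a batch of 8 examples with 3 channels.
x = np.random.randn(8, 3)
y, m, v = batch_norm_forward(x, scale=np.ones(3), bias=np.zeros(3),
                             est_mean=np.zeros(3), est_var=np.ones(3))
```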
