Both deep learning systems and programming languages help users describe computation procedures. These systems use various representations of computation:

- Caffe, Torch, and Paddle: sequences of layers.
- TensorFlow, Caffe2, Mxnet: graphs of operators.
- PaddlePaddle: nested blocks, like C++ and Java programs.

## Block in Programming Languages and Deep Learning

In programming languages, a block is a pair of curly braces that includes local variable definitions and a sequence of instructions or operators.

Blocks work with control flow structures like `if`, `else`, and `for`, which have equivalents in deep learning:

A key difference is that a C++ program describes a one-pass computation, whereas a deep learning program describes both the forward and the backward passes.

## Stack Frames and the Scope Hierarchy

The existence of the backward pass makes the execution of a block in PaddlePaddle different from that in traditional programs:

| programming languages | PaddlePaddle |
|-----------------------|--------------|
| push at entering block | push at entering block |
| pop at leaving block | destroy when minibatch completes |

1. In traditional programs:

- When the execution enters a block, the runtime pushes a frame onto the stack, where it realizes local variables.
- When the execution leaves the block, the runtime pops the frame.

1. In PaddlePaddle:

- When the execution enters a block, PaddlePaddle adds a new scope, where it realizes variables.
- PaddlePaddle doesn't pop a scope after the execution of the block because variables therein are used by the backward pass. So it has a stack forest known as a *scope hierarchy*.
- The height of the highest tree is the maximum depth of nested blocks.
- After the processing of a minibatch, PaddlePaddle destroys the scope hierarchy.
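
The scope hierarchy can be sketched in a few lines of plain Python; `Scope`, `find_var`, and the variable names here are illustrative, not PaddlePaddle's API:

```python
class Scope:
    """A scope realizes variables and keeps a link to its parent scope."""
    def __init__(self, parent=None):
        self.vars = {}
        self.parent = parent

    def find_var(self, name):
        # look up locally first, then climb toward the root scope
        if name in self.vars:
            return self.vars[name]
        return self.parent.find_var(name) if self.parent else None

# entering nested blocks pushes new scopes, but nothing is popped afterwards,
# so the backward pass can still read every intermediate variable
root = Scope()
root.vars["x"] = 1.0
inner = Scope(parent=root)   # entering a nested block
inner.vars["h"] = 0.5

assert inner.find_var("x") == 1.0   # resolved through the parent link
assert inner.find_var("h") == 0.5

# only after the whole minibatch is processed is the hierarchy destroyed
del inner, root
```

Because scopes are never popped, gradients computed during the backward pass can reach any intermediate variable through these parent links.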
## Use Blocks in C++ and PaddlePaddle Programs

```python
with ie.false_block():
    ...  # branch body elided in this excerpt
o1, o2 = ie(cond)
```

In both examples, the left branch computes `x+y` and `softmax(x+y)`, the right branch computes `fc(x)` and `x+1`.
The difference is that variables in the C++ program contain scalar values, whereas those in the PaddlePaddle programs are mini-batches of instances.
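
Conceptually, the PaddlePaddle branch operates on the sub-batch of instances selected by `cond`. A minimal sketch of that partition-and-merge behavior in plain Python, with stand-in branch functions (not the `softmax`/`fc` layers above):

```python
def ifelse(cond, batch, true_fn, false_fn):
    """Split a mini-batch by cond, run each branch on its own partition,
    and merge the results back into the original instance order."""
    out = [None] * len(batch)
    true_idx = [i for i, c in enumerate(cond) if c]
    false_idx = [i for i, c in enumerate(cond) if not c]
    for i, r in zip(true_idx, true_fn([batch[i] for i in true_idx])):
        out[i] = r
    for i, r in zip(false_idx, false_fn([batch[i] for i in false_idx])):
        out[i] = r
    return out

x = [10, 20, 30]
cond = [v > 15 for v in x]                            # [False, True, True]
y = ifelse(cond, x,
           true_fn=lambda xs: [v + 1 for v in xs],    # stand-in true block
           false_fn=lambda xs: [v - 1 for v in xs])   # stand-in false block
assert y == [9, 21, 31]
```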
### Blocks with `for` and `RNNOp`

The following is the RNN model in PaddlePaddle from the [RNN design doc](./rnn.md):

```python
x = sequence([10, 20, 30])   # shape=[None, 1]
U = var(0.375, param=true)   # shape=[1]

# (definitions of m and W are elided in this excerpt)
rnn = pd.rnn()
with rnn.step():
    h = rnn.memory(init=m)
    h_prev = rnn.previous_memory(h)
    a = layer.fc(W, x)
    b = layer.fc(U, h_prev)
    s = pd.add(a, b)
    act = pd.sigmoid(s)
    rnn.update_memory(h, act)
```
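
The step block above computes `h_t = sigmoid(fc(W, x) + fc(U, h_prev))`. With scalar weights the recurrence can be unrolled in plain Python; `run_rnn` and the value `W = 0.5` are illustrative, while `U = 0.375` comes from the snippet:

```python
import math

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

def run_rnn(xs, h0, W, U):
    """Unroll h_t = sigmoid(W * x_t + U * h_prev) over a sequence."""
    h = h0
    acts = []
    for x in xs:
        h = sigmoid(W * x + U * h)   # one execution of the step block
        acts.append(h)
    return acts

acts = run_rnn([10, 20, 30], h0=0.0, W=0.5, U=0.375)
assert len(acts) == 3
assert all(0.0 < a < 1.0 for a in acts)   # sigmoid keeps activations in (0, 1)
```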
## Compilation and Execution
Like TensorFlow, a PaddlePaddle program is written in Python. The first part describes a neural network as a protobuf message, and the rest executes the message for training or inference.
The generation of this protobuf message is similar to how a compiler generates a binary executable file. The execution of the message is similar to how the OS executes the binary file.
## The "Binary Executable File Format"

The RNN operator in the above example is serialized into a protobuf message of type `OpDesc`:

```
OpDesc {
  inputs = {0}        // the index of x in vars of BlockDesc above
  outputs = {5, 3}    // indices of act and hidden_out in vars of BlockDesc above
  attrs {
    "memories" : {1}  // the index of h
    "step_net" : <above step net>
  }
}
```

This `OpDesc` value is in the `ops` field of the `BlockDesc` value representing the global block.

During the generation of the Protobuf message, the Block should store VarDesc (the Protobuf message which describes Variable) and OpDesc (the Protobuf message which describes Operator).
VarDesc in a block should have its own name scope to avoid local variables affecting the parent block's name scope. A child block's name scope should inherit the parent's so that an OpDesc in the child block can reference a VarDesc stored in the parent block. For example:

```python
a = pd.Variable(shape=[20, 20])
b = pd.fc(a, params=["fc.w", "fc.b"])
rnn = pd.create_rnn()
with rnn.stepnet():
    x = a.as_step_input()
    # reuse fc's parameter
    fc_without_b = pd.get_variable("fc.w")
    rnn.output(fc_without_b)
out = rnn()
```

The method `pd.get_variable` retrieves a Variable by name. The Variable may be stored in a parent block but retrieved in a child block, so a block should have a variable scope that supports inheritance.
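
Such an inheriting variable scope can be sketched with Python's `collections.ChainMap`; the `VarDesc(...)` strings below are placeholders for the real protobuf messages:

```python
from collections import ChainMap

# the parent block's name scope holds the fc parameters
parent_scope = {"fc.w": "VarDesc(fc.w)", "fc.b": "VarDesc(fc.b)"}

# the child block sees its own names first, then falls back to the parent's
child_scope = ChainMap({"x": "VarDesc(x)"}, parent_scope)

assert child_scope["x"] == "VarDesc(x)"        # local to the child block
assert child_scope["fc.w"] == "VarDesc(fc.w)"  # inherited from the parent block
assert "fc.b" in child_scope
```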
In compiler design, the symbol table is a data structure created and maintained by compilers to store information about the occurrence of various entities such as variable names, function names, classes, etc.
To store the definition of variables and operators, we define a C++ class `SymbolTable`, like the one used in compilers.
`SymbolTable` can do the following:
- store the definitions (some names and attributes) of variables and operators,
- verify if a variable was declared,
- make it possible to implement type checking (offer Protobuf message pointers to `InferShape` handlers).
```c++
class SymbolTable {
 public:
  OpDesc* NewOp(const string& name="");

  // TODO determine whether name is generated by python or C++.
  // Currently assume that a unique name will be generated by C++ if the
  // argument name is left default.
  VarDesc* NewVar(const string& name="");

  // find a VarDesc by name, if recursive is true, find parent's SymbolTable
  // recursively.
  // this interface is introduced to support InferShape, find protobuf messages
  // of variables and operators, pass pointers into InferShape.
  //
  // NOTE maybe some C++ classes such as VarDescBuilder and OpDescBuilder should
  // be proposed and embedded into pybind to enable python operation on C++ pointers.
  VarDesc* FindVar(const string& name, bool recursive=true);

  // ...
};
```
Block inherits from OperatorBase, which has a Run method.
Block's Run method will run its operators sequentially.

There is another important interface called `Eval`, which takes some arguments called targets, generates a minimal graph that treats the targets as end points, and creates a new Block for it. After `Run`, `Eval` fetches the latest values of the targets and returns them.
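
That minimal-graph construction can be sketched as a backward sweep over the operator list; the `(name, inputs, outputs)` tuples below are a stand-in for the real `OpDesc` messages:

```python
def prune(ops, targets):
    """Keep only the operators needed to compute the target variables.

    ops: list of (op_name, inputs, outputs) tuples, in execution order.
    Returns the minimal sub-list, preserving the original order.
    """
    needed = set(targets)
    kept = []
    # walk backward: an op is kept iff it produces something still needed
    for name, inputs, outputs in reversed(ops):
        if needed & set(outputs):
            kept.append((name, inputs, outputs))
            needed |= set(inputs)
    kept.reverse()
    return kept

ops = [
    ("fc",      ["x", "W"], ["a"]),
    ("add",     ["a", "b"], ["s"]),
    ("sigmoid", ["s"],      ["act"]),
    ("softmax", ["a"],      ["probs"]),   # not needed for target "act"
]
minimal = prune(ops, targets=["act"])
assert [op[0] for op in minimal] == ["fc", "add", "sigmoid"]
```

Walking the list in reverse keeps an operator only if it produces something still needed, so the returned sub-list is sufficient to evaluate the targets.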
The definition of Eval is as follows:
```c++
// clean a block description by targets using the corresponding dependency graph.
// return a new BlockDesc with minimal number of operators.
// NOTE: The return type is not a Block but the block's description so that this can be
// distributed to a cluster.
// ...
```