
Commit 663f035

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_upsample_layer

2 parents 4747b2c + 1a4b0d6

28 files changed: +910 −276 lines

cmake/generic.cmake

Lines changed: 6 additions & 0 deletions
@@ -587,6 +587,9 @@ function(grpc_library TARGET_NAME)
   get_filename_component(PROTO_WE ${grpc_library_PROTO} NAME_WE)
   get_filename_component(PROTO_PATH ${ABS_PROTO} PATH)

+  # FIXME(putcn): the following line is supposed to generate *.pb.h and .cc,
+  # but somehow it didn't. Lines 602 to 604 patch this. Leaving this here
+  # for now to enable dist CI.
   protobuf_generate_cpp(grpc_proto_srcs grpc_proto_hdrs "${ABS_PROTO}")
   set(grpc_grpc_srcs "${CMAKE_CURRENT_BINARY_DIR}/${PROTO_WE}.grpc.pb.cc")
   set(grpc_grpc_hdrs "${CMAKE_CURRENT_BINARY_DIR}/${PROTO_WE}.grpc.pb.h")
@@ -597,6 +600,9 @@ function(grpc_library TARGET_NAME)
   COMMAND ${PROTOBUF_PROTOC_EXECUTABLE}
   ARGS --grpc_out "${CMAKE_CURRENT_BINARY_DIR}" -I "${PROTO_PATH}"
        --plugin=protoc-gen-grpc="${GRPC_CPP_PLUGIN}" "${ABS_PROTO}"
+  COMMAND ${PROTOBUF_PROTOC_EXECUTABLE}
+  ARGS --cpp_out "${CMAKE_CURRENT_BINARY_DIR}" -I "${PROTO_PATH}"
+       "${ABS_PROTO}"
   DEPENDS "${ABS_PROTO}" ${PROTOBUF_PROTOC_EXECUTABLE} extern_grpc)

   # FIXME(typhoonzero): grpc generated code do not generate virtual-dtor, mark it

doc/fluid/design/concepts/cpp_data_feeding.md

Lines changed: 38 additions & 5 deletions
@@ -113,7 +113,7 @@ To solve this problem, we introduce `ReaderHolder` as a wrapper. It acts as an e

 To create and invoke readers, some new ops are introduced:

-### CreateReaderOp
+### Operators That Create Readers

 Each reader has its creation op. File readers' creation ops have no input and yield the created file reader as its output. Decorated readers' creation ops take the underlying readers as inputs and then yield new decorated readers.

@@ -153,19 +153,52 @@ double_buffer_reader = create_double_buffer_op(batch_reader)
 The forwarding ops of the corresponding `main_program` would be like this:

 ```
-while_op {
+not_completed = true
+pass_count = 0
+while_op(not_completed) {
     has_next = has_next_op(double_buffer_reader)
     if_else_op(has_next) {
         batch_data = read_op(double_buffer_reader)
         ... (subsequent training ops)
     } else {
         reset_op(double_buffer_reader)
+        increase_op(pass_count)
+        not_completed = less_than_op(pass_count, required_pass_num)
     }
 }
 ```

-Two important considerations for these programs are as follows:
+A few important considerations for these programs are as follows:

-1. The multiple\_reader is the batch\_reader's underlying reader, and the batch\_reader is the double\_buffer\_reader's underlying reader. `read_op`, `has_next_op` and other reader related ops will only invoke the top-most reader. In this case, it's the double\_buffer\_reader.
+1. `not_completed`, `pass_count` and other variables shown above are all Fluid Variables.

-2. All readers exist in both `startup_program` and `main_program`. And they are persistable.
+2. The multiple\_reader is the batch\_reader's underlying reader, and the batch\_reader is the double\_buffer\_reader's underlying reader. `read_op`, `has_next_op` and other reader-related ops only invoke the top-most reader; in this case, the double\_buffer\_reader.
+
+3. All readers exist in both `startup_program` and `main_program`, and they are persistable.
+
+### Simplify Configuration by MultiPassReader
+
+The Program configuration mentioned above is complicated. Users need to be very familiar with the concepts of Program and Block to avoid mistakes in their code. To make C++ readers more friendly to new users, we introduce `MultiPassReader`.
+
+`MultiPassReader` is a decorated reader. A multi-pass reader is used to continuously yield data for several training passes. It takes the number of passes to run as one of its attributes (`pass_num`) and maintains a counter that records how many passes it has completed. Each time its underlying reader reaches EOF, the multi-pass reader checks whether it has completed the given number of passes. If not, the underlying reader is re-initialized and a new pass starts automatically. Until the whole training completes, `MultiPassReader`'s `HasNext()` always returns `true`.
+
+With `MultiPassReader`, the startup program would be like this:
+
+```
+multiple_reader = open_files_op(...)
+batch_reader = create_batch_reader_op(multiple_reader)
+multi_pass_reader = create_multi_pass_reader_op(batch_reader)
+double_buffer_reader = create_double_buffer_op(multi_pass_reader)
+... (other initializers)
+```
+
+The forwarding part of the corresponding `main_program` would be like this:
+
+```
+not_completed = true
+while_op(not_completed) {
+    batch_data = read_op(double_buffer_reader)
+    ... (subsequent training ops)
+    not_completed = has_next_op(double_buffer_reader)
+}
+```
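To make the multi-pass mechanism described above concrete, here is a minimal C++ sketch of a multi-pass decorated reader. It is an illustration only, not the actual Paddle implementation; the `ReaderBase` interface, its element type, and the `ReInit()` method are assumptions made for the example:

```
#include <cstddef>
#include <vector>

// Assumed minimal reader interface, for illustration only.
class ReaderBase {
 public:
  virtual void ReadNext(std::vector<float>* out) = 0;
  virtual bool HasNext() const = 0;
  virtual void ReInit() = 0;  // restart the underlying data source
  virtual ~ReaderBase() = default;
};

// A decorated reader that replays its underlying reader for pass_num passes.
class MultiPassReader : public ReaderBase {
 public:
  MultiPassReader(ReaderBase* reader, size_t pass_num)
      : reader_(reader), pass_num_(pass_num), pass_count_(0) {}

  void ReadNext(std::vector<float>* out) override { reader_->ReadNext(out); }

  bool HasNext() const override {
    if (reader_->HasNext()) return true;
    // The underlying reader hit EOF: start a new pass if any remain.
    ++pass_count_;
    if (pass_count_ < pass_num_) {
      reader_->ReInit();
      return true;
    }
    return false;  // all passes completed
  }

  void ReInit() override {
    pass_count_ = 0;
    reader_->ReInit();
  }

 private:
  ReaderBase* reader_;
  size_t pass_num_;
  mutable size_t pass_count_;  // mutable because HasNext() is const
};
```

Because `HasNext()` transparently re-initializes the underlying reader between passes, the `main_program` above only needs to loop until `HasNext()` returns `false`.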
Lines changed: 139 additions & 0 deletions
@@ -0,0 +1,139 @@
# Channel Design

## Introduction

A Channel is a data structure that allows for synchronous interprocess
communication via message passing. It is a fundamental component of CSP
(communicating sequential processes), and allows users to pass data
between threads without having to worry about synchronization.

## How to use it

Paddle offers Python APIs to open and close channels, along with sending
and receiving data to/from a channel.

### Create a channel

Creates a new channel that takes in variables of a specific dtype.

- **fluid.make_channel(dtype, capacity=0)**
  - **dtype**: The data type of variables being sent/received through the channel
  - **capacity**: The capacity of the channel. A capacity of 0 represents
    an unbuffered channel; a capacity > 0 represents a buffered channel

```
ch = fluid.make_channel(dtype=core.VarDesc.VarType.LOD_TENSOR, capacity=10)
```
### Close a channel

Closes a channel. Any pending senders and receivers will be awoken during
this time. Receivers can still receive from a closed channel, but senders
are not allowed to send any additional data to the channel (Paddle will
raise an exception if users try to send to a closed channel).

- **fluid.channel_close(channel)**

```
fluid.channel_close(ch)
```

### Send data to a channel

Sends a variable to a channel. Currently, variables of dtype `LoDTensor`,
`LoDRankTable`, `LoDTensorArray`, `SelectedRows`, `ReaderHolder`, and
`ChannelHolder` are supported.

By default, the data of the variable is moved from the sender to the receiver;
however, the user can optionally copy the data before performing the send.

- **channel_send(channel, variable, is_copy=False)**
  - **channel**: The channel to send the variable to
  - **variable**: The variable to send to the channel
  - **is_copy**: If set to True, channel_send will perform a variable assign
    to copy the source variable to a new variable to be sent.

```
ch = fluid.make_channel(dtype=core.VarDesc.VarType.LOD_TENSOR)
var = fill_constant(shape=[1], dtype=core.VarDesc.VarType.INT32, value=100)
fluid.channel_send(ch, var, is_copy=True)
```

### Receive data from a channel

Receives a variable from a channel. The data of the variable is moved to the
receiving variable.

- **channel_recv(channel, return_variable)**
  - **channel**: The channel to receive the variable from
  - **return_variable**: The destination variable used to store the data of the
    variable received from the channel

```
ch = fluid.make_channel(dtype=core.VarDesc.VarType.LOD_TENSOR)
var = fill_constant(shape=[1], dtype=core.VarDesc.VarType.INT32, value=-1)
fluid.channel_recv(ch, var)
```
## How it Works

Channels provide a simple interface for different threads to share data.
To support the synchronization requirements, a channel utilizes a series of
internal queues, locks, and condition variables.

### QueueMessage

QueueMessage encapsulates the state of a channel send/receive operation to be
put in the **sendq/recvq**. It contains a condition variable used to block the
thread (when there are no available sends/receives). In addition, it contains
a callback function to notify a thread when the QueueMessage is being
processed by the channel.
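As a rough sketch of this idea (the member names and the callback signature below are assumptions for illustration, not the actual Paddle definition), a QueueMessage might look like:

```
#include <condition_variable>
#include <functional>
#include <mutex>

// Illustrative only: the state parked on sendq/recvq for a blocked thread.
struct QueueMessage {
  std::condition_variable cond;    // used to block/wake the thread
  bool completed = false;          // set once the channel handles this message
  std::function<void()> callback;  // optional notification hook

  // Block the calling thread until the channel marks this message done.
  void Wait(std::unique_lock<std::mutex>& lock) {
    cond.wait(lock, [this] { return completed; });
  }

  // Called by the channel, with the channel lock held, to wake the thread.
  void Notify() {
    completed = true;
    if (callback) callback();
    cond.notify_one();
  }
};
```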
### Queues

- **buff_**: This queue holds the data buffer in a buffered channel. Its
  capacity is set to the capacity of the channel. This data buffer is not
  used in an unbuffered channel.

- **sendq**: This queue holds the QueueMessage of any pending senders of a
  channel. When a thread performs a channel_send operation on the channel, the
  channel_send operation will put a new QueueMessage on the sendq and block the
  current thread under two conditions:
  1. The channel is buffered and is full
  2. The channel is unbuffered and does not have a receiver

- **recvq**: This queue holds the QueueMessage of any pending receivers of a
  channel. When a thread performs a channel_recv operation on the channel, the
  channel_recv operation will put a new QueueMessage on the recvq and block the
  current thread under two conditions:
  1. The channel is buffered and there is no data in buff_
  2. The channel is unbuffered and does not have a sender

A condensed sketch of this blocking logic is given below.
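The following self-contained C++ sketch illustrates the blocking conditions above for a channel of ints. It is an approximation for illustration only: the real design parks QueueMessages on sendq/recvq and completes an unbuffered send only after the rendezvous, whereas this sketch uses bare condition variables and lets the sender return once it deposits its item.

```
#include <condition_variable>
#include <deque>
#include <mutex>

// Illustrative only; not Paddle's actual channel implementation.
class IntChannel {
 public:
  explicit IntChannel(size_t capacity) : cap_(capacity) {}

  void Send(int item) {
    std::unique_lock<std::mutex> lock(mu_);
    // Block while (1) a buffered channel is full, or (2) an unbuffered
    // channel has no waiting receiver.
    send_cv_.wait(lock, [&] {
      return cap_ > 0 ? buff_.size() < cap_ : waiting_receivers_ > 0;
    });
    buff_.push_back(item);
    recv_cv_.notify_one();  // wake a receiver, if any
  }

  int Recv() {
    std::unique_lock<std::mutex> lock(mu_);
    ++waiting_receivers_;
    send_cv_.notify_one();  // an unbuffered sender may now proceed
    // Block while there is no data available (conditions (1) and (2) of recvq).
    recv_cv_.wait(lock, [&] { return !buff_.empty(); });
    --waiting_receivers_;
    int item = buff_.front();
    buff_.pop_front();
    send_cv_.notify_one();  // buffer space freed for a blocked sender
    return item;
  }

 private:
  std::mutex mu_;
  std::condition_variable send_cv_, recv_cv_;
  std::deque<int> buff_;   // data buffer; a one-slot handoff when cap_ == 0
  size_t cap_;
  size_t waiting_receivers_ = 0;
};
```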
### State diagram

#### Channel Send

<p align="center">
<img src="./images/channel_send.png"/><br/>
</p>

#### Channel Receive

<p align="center">
<img src="./images/channel_recv.png"/><br/>
</p>
## Limitations and Considerations

### Variable Copy

In Golang, variables in channels are copied from the sender to the receiver.
In Paddle, the data from our variables are **moved** from sender to receiver.
As a result, these variables should not be used after they are sent. We
provide the `is_copy` flag in the channel_send method to allow users to copy
the variable before it is sent.

Please note that this is achieved by adding an **assign** operator that
creates a temporary variable, which is sent in place of the original
variable. Also note that the **assign** operator supports only certain
variable datatypes.
Two binary image files (133 KB and 83.6 KB) — the channel send/receive diagrams referenced above — were also added.

paddle/fluid/framework/CMakeLists.txt

Lines changed: 1 addition & 1 deletion
@@ -100,7 +100,7 @@ cc_test(init_test SRCS init_test.cc DEPS init)
 cc_test(op_kernel_type_test SRCS op_kernel_type_test.cc DEPS place device_context framework_proto)
 cc_test(cow_ptr_tests SRCS details/cow_ptr_test.cc)

-cc_test(channel_test SRCS channel_test.cc)
+# cc_test(channel_test SRCS channel_test.cc)
 cc_test(tuple_test SRCS tuple_test.cc )
 cc_test(concurrency_test SRCS concurrency_test.cc DEPS go_op channel_close_op channel_create_op
     channel_send_op channel_recv_op sum_op select_op elementwise_add_op compare_op

paddle/fluid/framework/block_desc.cc

Lines changed: 43 additions & 6 deletions
@@ -147,15 +147,52 @@ void BlockDesc::RemoveOp(size_t s, size_t e) {
   if (ops_.begin() + s == ops_.end() || ops_.begin() + e == ops_.end()) {
     return;
   }
+  auto get_vars = [](std::deque<std::unique_ptr<OpDesc>>::iterator &op,
+                     std::vector<std::string> &v) {
+    auto in_names = (*op)->InputArgumentNames();
+    v.insert(v.end(), in_names.begin(), in_names.end());
+    auto out_names = (*op)->OutputArgumentNames();
+    v.insert(v.end(), out_names.begin(), out_names.end());
+    std::sort(v.begin(), v.end());
+    auto last = std::unique(v.begin(), v.end());
+    v.erase(last, v.end());
+  };
   need_update_ = true;
-  for (auto it = ops_.begin() + s; it != ops_.begin() + e; it++) {
-    auto names = (*it)->InputArgumentNames();
-    for (auto n : names) {
-      // TODO(typhoonzero): delete vars if no other op use it.
-      VLOG(3) << "deleting var " << n;
+
+  for (size_t i = s; i < e; i++) {
+    // Since ops are removed one by one, each iteration removes the first op
+    // remaining in the range.
+    auto op = ops_.begin() + s;
+
+    // Collect the input and output variables of the op being deleted.
+    std::vector<std::string> cur_vars;
+    get_vars(op, cur_vars);
+
+    // Remove the current op.
+    ops_.erase(ops_.begin() + s);
+
+    // Collect the input and output variables of all remaining ops.
+    std::vector<std::string> other_vars;
+    for (auto it = ops_.begin(); it != ops_.end(); it++) {
+      get_vars(it, other_vars);
+    }
+
+    // Variables that should be deleted:
+    // delete_vars = cur_vars - (cur_vars ∩ other_vars)
+    std::vector<std::string> delete_vars;
+    std::set_difference(cur_vars.begin(), cur_vars.end(), other_vars.begin(),
+                        other_vars.end(),
+                        std::inserter(delete_vars, delete_vars.end()));
+    // Remove the variables.
+    for (size_t i = 0; i < delete_vars.size(); i++) {
+      auto name = delete_vars[i];
+      auto it = vars_.find(name);
+      PADDLE_ENFORCE(it != vars_.end(),
+                     "%s is not in variable list, it should not be deleted",
+                     name);
+      vars_.erase(it);
+      VLOG(3) << "deleting variable " << name;
     }
   }
-  ops_.erase(ops_.begin() + s, ops_.begin() + e);
 }

 std::vector<OpDesc *> BlockDesc::AllOps() const {
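The core of the new logic is a sorted set difference. A minimal standalone C++ example (not Paddle code; the variable names are hypothetical) of how `std::set_difference` selects the variables used only by the removed op:

```
#include <algorithm>
#include <iostream>
#include <iterator>
#include <string>
#include <vector>

int main() {
  // Variables used by the op being removed (sorted, unique).
  std::vector<std::string> cur_vars = {"a", "b", "tmp"};
  // Variables still used by the remaining ops (sorted, unique).
  std::vector<std::string> other_vars = {"a", "c"};

  // delete_vars = cur_vars - (cur_vars ∩ other_vars)
  std::vector<std::string> delete_vars;
  std::set_difference(cur_vars.begin(), cur_vars.end(),
                      other_vars.begin(), other_vars.end(),
                      std::inserter(delete_vars, delete_vars.end()));

  for (const auto &name : delete_vars) {
    std::cout << "deleting variable " << name << "\n";  // prints: b, tmp
  }
  return 0;
}
```

Note that both input ranges must be sorted, which is why the `get_vars` lambda above sorts and de-duplicates the collected names.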

paddle/fluid/framework/block_desc.h

Lines changed: 5 additions & 0 deletions
@@ -89,6 +89,11 @@ class BlockDesc {

   OpDesc *InsertOp(size_t index);

+  /*
+   * Remove ops in the range [s, e) together with their input/output variables.
+   * Note that if an input or output variable is also used by another op,
+   * it is retained.
+   */
   void RemoveOp(size_t s, size_t e);

   std::vector<OpDesc *> AllOps() const;

paddle/fluid/operators/activation_op.cc

Lines changed: 36 additions & 0 deletions
@@ -260,6 +260,36 @@ Floor Activation Operator.
 }
 };

+class CosOpMaker : public framework::OpProtoAndCheckerMaker {
+ public:
+  CosOpMaker(OpProto *proto, OpAttrChecker *op_checker)
+      : framework::OpProtoAndCheckerMaker(proto, op_checker) {
+    AddInput("X", "Input of Cosine operator");
+    AddOutput("Out", "Output of Cosine operator");
+    AddComment(R"DOC(
+Cosine Activation Operator.
+
+$out = cos(x)$
+
+)DOC");
+  }
+};
+
+class SinOpMaker : public framework::OpProtoAndCheckerMaker {
+ public:
+  SinOpMaker(OpProto *proto, OpAttrChecker *op_checker)
+      : framework::OpProtoAndCheckerMaker(proto, op_checker) {
+    AddInput("X", "Input of Sine operator");
+    AddOutput("Out", "Output of Sine operator");
+    AddComment(R"DOC(
+Sine Activation Operator.
+
+$out = sin(x)$
+
+)DOC");
+  }
+};
+
 class RoundOpMaker : public framework::OpProtoAndCheckerMaker {
  public:
   RoundOpMaker(OpProto *proto, OpAttrChecker *op_checker)
@@ -561,6 +591,12 @@ REGISTER_OP(ceil, ops::ActivationOp, ops::CeilOpMaker, ceil_grad,
 REGISTER_OP(floor, ops::ActivationOp, ops::FloorOpMaker, floor_grad,
             ops::ActivationOpGrad);

+REGISTER_OP(cos, ops::ActivationOp, ops::CosOpMaker, cos_grad,
+            ops::ActivationOpGrad);
+
+REGISTER_OP(sin, ops::ActivationOp, ops::SinOpMaker, sin_grad,
+            ops::ActivationOpGrad);
+
 REGISTER_OP(round, ops::ActivationOp, ops::RoundOpMaker, round_grad,
             ops::ActivationOpGrad);
