Commit 5a3d136

chengduo authored
Merge pull request #5951 from chengduoZH/fix_conv_doc
fix conv and conv_trans op doc
2 parents 1b6dcc2 + c339e1b commit 5a3d136

5 files changed, +104 -82 lines changed

paddle/operators/conv_op.cc

Lines changed: 36 additions & 25 deletions
@@ -97,7 +97,7 @@ Conv2DOpMaker::Conv2DOpMaker(framework::OpProto* proto,
  .SetDefault({0, 0});
  AddAttr<int>(
  "groups",
- "(int default:1), the group size of convolution operator. "
+ "(int default:1), the groups number of the convolution operator. "
  "According to grouped convolution in Alex Krizhevsky's Deep CNN paper: "
  "when group=2, the first half of the filters is only connected to the "
  "first half of the input channels, while the second half of the filters "
@@ -112,23 +112,29 @@ Conv2DOpMaker::Conv2DOpMaker(framework::OpProto* proto,
  Convolution Operator.

  The convolution operation calculates the output based on the input, filter
- and strides, paddings, groups, dilations parameters. The size of each dimension of the
+ and strides, paddings, dilations, groups parameters. The size of each dimension of the
  parameters is checked in the infer-shape.
- Input(Input, Filter) and output(Output) are in NCHW format. Where N is batch
+ Input(Input) and Output(Output) are in NCHW format. Where N is batch
  size, C is the number of channels, H is the height of the feature, and W is
- the width of the feature. Parameters(ksize, strides, paddings, dilations) are two elements.
- These two elements represent height and width, respectively.
+ the width of the feature.
+ Filters(Input) is MCHW format. Where M is the number of output image channels, C is
+ the number of input image channels, H is the height of the filter, and W
+ is the width of the filter.
+ Parameters(strides, paddings, dilations) are two elements. These two elements represent
+ height and width, respectively.
  The input(X) size and output(Out) size may be different.

  Example:
  Input:
- Input shape: (N, C_in, H_in, W_in)
- Filter shape: (C_out, C_in, H_f, W_f)
+ Input shape: $(N, C_{in}, H_{in}, W_{in})$
+ Filter shape: $(C_{out}, C_{in}, H_f, W_f)$
  Output:
- Output shape: (N, C_out, H_out, W_out)
- where
- H_out = (H_in + 2 * paddings[0] - (dilations[0]*(filter_size[0] - 1) + 1)) / strides[0] + 1;
- W_out = (W_in + 2 * paddings[1] - (dilations[1]*(filter_size[1] - 1) + 1)) / strides[1] + 1;
+ Output shape: $(N, C_{out}, H_{out}, W_{out})$
+ Where
+ $$
+ H_{out}= \frac{(H_{in} + 2 * paddings[0] - (dilations[0] * (H_f - 1) + 1))}{strides[0]}+ 1 \\
+ W_{out}= \frac{(W_{in} + 2 * paddings[1] - (dilations[1] * (W_f - 1) + 1))}{strides[1]}+ 1
+ $$
  )DOC");
  }

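Reviewer note: the output-size formula in the updated Conv2D doc string can be checked with a small sketch. `ConvOutputSize` is a hypothetical helper written for this note, not a function quoted from the Paddle source. It assumes the fraction in the doc string means integer division, which the doc string itself does not state.

```cpp
#include <cassert>

// Hypothetical helper: output extent along one spatial axis for a dilated
// convolution, following the formula in the doc string above.
// Assumption: the quotient is an integer division.
int ConvOutputSize(int in, int filter, int dilation, int padding, int stride) {
  assert(stride > 0 && dilation > 0 && filter > 0 && padding >= 0);
  // A dilated filter spans dilation * (filter - 1) + 1 input positions.
  const int effective_filter = dilation * (filter - 1) + 1;
  assert(in + 2 * padding >= effective_filter);
  return (in + 2 * padding - effective_filter) / stride + 1;
}
```

With `dilation == 1` the effective filter extent equals `filter`, so the expression reduces to `(in + 2 * padding - filter) / stride + 1`.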

@@ -165,7 +171,7 @@ Conv3DOpMaker::Conv3DOpMaker(framework::OpProto* proto,
  .SetDefault({0, 0, 0});
  AddAttr<int>(
  "groups",
- "(int default:1), the group size of convolution operator. "
+ "(int default:1), the groups number of the convolution operator. "
  "According to grouped convolution in Alex Krizhevsky's Deep CNN paper: "
  "when group=2, the first half of the filters is only connected to the "
  "first half of the input channels, while the second half of the filters "
@@ -174,32 +180,37 @@ Conv3DOpMaker::Conv3DOpMaker(framework::OpProto* proto,
  AddAttr<std::vector<int>>("dilations",
  "(vector<int> default:{1, 1, 1}), the "
  "dilations(d_dilation, h_dilation, w_dilation) of "
- "convolution operator. Currently, conv3d doesn't "
- "support dilation.")
+ "convolution operator.")
  .SetDefault({1, 1, 1});

  AddComment(R"DOC(
  Convolution3D Operator.

  The convolution operation calculates the output based on the input, filter
- and strides, paddings, groups parameters. The size of each dimension of the
+ and strides, paddings, dilations, groups parameters. The size of each dimension of the
  parameters is checked in the infer-shape.
- Input(Input, Filter) and output(Output) are in NCDHW format. Where N is batch
+ Input(Input) and output(Output) are in NCDHW format, where N is batch
  size, C is the number of channels,D is the depth of the feature, H is the height of
- the feature, and W is the width of the feature. Parameters(ksize, strides, paddings)
- are three elements. These three elements represent depth, height and width, respectively.
+ the feature, and W is the width of the feature.
+ Filters(Input) is MCDHW format, where M is the number of output image channels,
+ C is the number of input image channels, D is the depth of the filter,
+ H is the height of the filter, and W is the width of the filter.
+ Parameters(strides, paddings, dilations) are three elements. These three elements
+ represent depth, height and width, respectively.
  The input(X) size and output(Out) size may be different.

  Example:
  Input:
- Input shape: (N, C_in, D_in, H_in, W_in)
- Filter shape: (C_out, C_in, D_f, H_f, W_f)
+ Input shape: $(N, C_{in}, D_{in}, H_{in}, W_{in})$
+ Filter shape: $(C_{out}, C_{in}, D_f, H_f, W_f)$
  Output:
- Output shape: (N, C_out, D_out, H_out, W_out)
- where
- D_out = (D_in - filter_size[0] + 2 * paddings[0]) / strides[0] + 1;
- H_out = (H_in - filter_size[1] + 2 * paddings[1]) / strides[1] + 1;
- W_out = (W_in - filter_size[2] + 2 * paddings[2]) / strides[2] + 1;
+ Output shape: $(N, C_{out}, D_{out}, H_{out}, W_{out})$
+ Where
+ $$
+ D_{out}= \frac{(D_{in} + 2 * paddings[0] - (dilations[0] * (D_f - 1) + 1))}{ strides[0]}+ 1 \\
+ H_{out}= \frac{(H_{in} + 2 * paddings[1] - (dilations[1] * (H_f - 1) + 1))}{ strides[1]}+ 1 \\
+ W_{out}= \frac{(W_{in} + 2 * paddings[2] - (dilations[2] * (W_f - 1) + 1))}{ strides[2]}+ 1
+ $$
  )DOC");
  }

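Reviewer note: the Conv3D doc string now describes a full NCDHW output shape derived from an NCDHW input and an MCDHW filter. The sketch below assembles that shape. `Conv3DOutputShape`, `Shape5`, and `Triple` are hypothetical names introduced for illustration. The sketch assumes `groups == 1`, so the filter's second dimension equals the input channel count, and it assumes integer division in the size formula.

```cpp
#include <array>
#include <cassert>

using Shape5 = std::array<int, 5>;  // input: N, C, D, H, W; filter: M, C, D, H, W
using Triple = std::array<int, 3>;  // depth, height, width

// Hypothetical helper: output shape of a 3D convolution per the doc string.
// Assumptions: groups == 1 and integer division in the size formula.
Shape5 Conv3DOutputShape(const Shape5& input, const Shape5& filter,
                         const Triple& strides, const Triple& paddings,
                         const Triple& dilations) {
  assert(input[1] == filter[1]);  // C_in must match when groups == 1
  Shape5 out{};
  out[0] = input[0];   // N is unchanged
  out[1] = filter[0];  // C_out is the filter's leading dimension M
  for (int i = 0; i < 3; ++i) {
    assert(strides[i] > 0 && dilations[i] > 0);
    const int effective = dilations[i] * (filter[2 + i] - 1) + 1;
    out[2 + i] = (input[2 + i] + 2 * paddings[i] - effective) / strides[i] + 1;
  }
  return out;
}
```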

paddle/operators/conv_transpose_op.cc

Lines changed: 47 additions & 35 deletions
@@ -39,7 +39,7 @@ void ConvTransposeOp::InferShape(framework::InferShapeContext* ctx) const {
  "ConvTransposeOp input dimension and strides dimension should "
  "be consistent.");
  PADDLE_ENFORCE_EQ(paddings.size(), strides.size(),
- "ConvTransposeOp paddings dimension and Conv strides "
+ "ConvTransposeOp paddings dimension and strides "
  "dimension should be the same.");
  PADDLE_ENFORCE_EQ(in_dims[1], filter_dims[0],
  "In ConvTransposeOp, The input channel should be the same "
@@ -62,13 +62,14 @@ Conv2DTransposeOpMaker::Conv2DTransposeOpMaker(
  "The format of input tensor is NCHW. Where N is batch size, C is the "
  "number of input channels, H is the height of the feature, and "
  "W is the width of the feature.");
- AddInput("Filter",
- "(Tensor) The filter tensor of convolution transpose operator. "
- "The format of the filter tensor is CMHW, where C is the number of "
- "output image channels, M is the number of input image channels, "
- "H is the height of the filter, and W is the width of the filter. "
- "We enforce groups number == 1 and padding == 0 in "
- "the convolution transpose scenario.");
+ AddInput(
+ "Filter",
+ "(Tensor) The filter tensor of convolution transpose operator. "
+ "The format of the filter tensor is MCHW, where M is the number of "
+ "input feature channels, C is the number of "
+ "output feature channels,"
+ "H is the height of the filter, and W is the width of the filter. "
+ "We enforce groups number == 1 in the convolution transpose scenario.");
  AddOutput("Output",
  "(Tensor) The output tensor of convolution transpose operator. "
  "The format of output tensor is also NCHW.");
@@ -88,21 +89,26 @@ Convolution2D Transpose Operator.
  The convolution transpose operation calculates the output based on the input, filter
  and strides, paddings, groups parameters. The size of each dimension of the
  parameters is checked in the infer-shape.
-
- Input(Input, Filter) and output(Output) are in NCHW format. Where N is batch
- size, C is the number of channels, H is the height of the feature, and
- W is the width of the feature. Parameters(ksize, strides, paddings) are two elements.
- These two elements represent height and width, respectively.
+ Input(Input) and output(Output) are in NCHW format. Where N is batchsize, C is the
+ number of channels, H is the height of the feature, and W is the width of the feature.
+ Filter(Input) is in MCHW format. Where M is the number of input feature channels,
+ C is the number of output feature channels, H is the height of the filter,
+ and W is the width of the filter.
+ Parameters(strides, paddings) are two elements. These two elements represent height
+ and width, respectively.
  The input(X) size and output(Out) size may be different.
+
  Example:
  Input:
- Input shape: (N, C_in, H_in, W_in)
- Filter shape: (C_in, C_out, H_f, W_f)
+ Input shape: $(N, C_{in}, H_{in}, W_{in})$
+ Filter shape: $(C_{in}, C_{out}, H_f, W_f)$
  Output:
- Output shape: (N, C_out, H_out, W_out)
- where
- H_out = (H_in - 1) * strides[0] - 2 * paddings[0] + H_f;
- W_out = (W_in - 1) * strides[1] - 2 * paddings[1] + W_f;
+ Output shape: $(N, C_{out}, H_{out}, W_{out})$
+ Where
+ $$
+ H_{out} = (H_{in} - 1) * strides[0] - 2 * paddings[0] + H_f \\
+ W_{out} = (W_{in} - 1) * strides[1] - 2 * paddings[1] + W_f
+ $$
  )DOC");
  }

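Reviewer note: the transposed-convolution size formula in the doc string above is a plain linear expression, which a short sketch makes easy to sanity-check. `ConvTransposeOutputSize` is a hypothetical helper for this note, not code from the commit.

```cpp
#include <cassert>

// Hypothetical helper: output extent along one spatial axis for a transposed
// convolution, following the doc string formula
//   out = (in - 1) * stride - 2 * padding + filter.
int ConvTransposeOutputSize(int in, int filter, int padding, int stride) {
  assert(in > 0 && filter > 0 && stride > 0 && padding >= 0);
  return (in - 1) * stride - 2 * padding + filter;
}
```

For example, an input extent of 3 with a filter extent of 3, stride 2, and no padding gives 7. Applying the forward convolution size formula to 7 with the same filter, stride, and padding gives `(7 - 3) / 2 + 1 = 3`, the original extent.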

@@ -117,8 +123,9 @@ Conv3DTransposeOpMaker::Conv3DTransposeOpMaker(
  "W is the width of the feature.");
  AddInput("Filter",
  "(Tensor) The filter tensor of convolution transpose operator."
- "The format of the filter tensor is CMDHW, where C is the number of "
- "output image channels, M is the number of input image channels, D "
+ "The format of the filter tensor is MCDHW, where M is the number of "
+ "input feature channels, C is the number of "
+ "output feature channels, D "
  "is the depth of the filter, H is the height of the filter, and "
  "W is the width of the filter."
  "We enforce groups number == 1 and padding == 0 in "
@@ -144,23 +151,28 @@ Convolution3D Transpose Operator.
  The convolution transpose operation calculates the output based on the input, filter
  and strides, paddings, groups parameters. The size of each dimension of the
  parameters is checked in the infer-shape.
-
- Input(Input, Filter) and output(Output) are in NCDHW format. Where N is batch
- size, C is the number of channels, D is the depth of the feature,
- H is the height of the feature, and W is the width of the feature.
- Parameters(ksize, strides, paddings) are three elements.
- These three elements represent depth, height and width, respectively.
+ Input(Input) and output(Output) are in NCDHW format. Where N is batch size, C is the
+ number of channels, D is the depth of the feature, H is the height of the feature,
+ and W is the width of the feature.
+ Filter(Input) is in MCDHW format. Where M is the number of input feature channels,
+ C is the number of output feature channels, D is the depth of the filter,H is the
+ height of the filter, and W is the width of the filter.
+ Parameters(strides, paddings) are three elements. These three elements represent
+ depth, height and width, respectively.
  The input(X) size and output(Out) size may be different.
- Example:
+
+ Example:
  Input:
- Input shape: (N, C_in, D_in, H_in, W_in)
- Filter shape: (C_in, C_out, D_f, H_f, W_f)
+ Input shape: $(N, C_{in}, D_{in}, H_{in}, W_{in})$
+ Filter shape: $(C_{in}, C_{out}, D_f, H_f, W_f)$
  Output:
- Output shape: (N, C_out, D_out, H_out, W_out)
- where
- D_out = (D_in - 1) * strides[0] - 2 * paddings[0] + D_f;
- H_out = (H_in - 1) * strides[1] - 2 * paddings[1] + H_f;
- W_out = (W_in - 1) * strides[2] - 2 * paddings[2] + W_f;
+ Output shape: $(N, C_{out}, D_{out}, H_{out}, W_{out})$
+ Where
+ $$
+ D_{out} = (D_{in} - 1) * strides[0] - 2 * paddings[0] + D_f \\
+ H_{out} = (H_{in} - 1) * strides[1] - 2 * paddings[1] + H_f \\
+ W_{out} = (W_{in} - 1) * strides[2] - 2 * paddings[2] + W_f
+ $$
  )DOC");
  }

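Reviewer note: the doc string names the transposed-convolution filter layout as MCDHW, with M the input feature channels and C the output feature channels. As an illustration of what such a layout implies for element addressing, the sketch below computes a flat offset. `FilterDims` and `FilterOffset` are hypothetical names, and the sketch assumes a contiguous row-major buffer, which the diff does not state.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical description of an MCDHW filter's extents.
struct FilterDims {
  std::size_t m, c, d, h, w;
};

// Hypothetical helper: flat offset of element (m, c, d, h, w).
// Assumption: contiguous row-major storage, so W varies fastest and M slowest.
std::size_t FilterOffset(const FilterDims& dims, std::size_t m, std::size_t c,
                         std::size_t d, std::size_t h, std::size_t w) {
  assert(m < dims.m && c < dims.c && d < dims.d && h < dims.h && w < dims.w);
  return (((m * dims.c + c) * dims.d + d) * dims.h + h) * dims.w + w;
}
```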

paddle/operators/conv_transpose_op.h

Lines changed: 0 additions & 1 deletion
@@ -63,7 +63,6 @@ class GemmConvTransposeKernel : public framework::OpKernel<T> {

  std::vector<int> strides = context.Attr<std::vector<int>>("strides");
  std::vector<int> paddings = context.Attr<std::vector<int>>("paddings");
- // TODO(Zhuoyuan): Paddings can be added in future.
  // groups will alway be disabled in conv2dtranspose.

  const int batch_size = static_cast<int>(input->dims()[0]);

paddle/operators/pool_op.cc

Lines changed: 12 additions & 12 deletions
@@ -105,7 +105,7 @@ Pool2dOpMaker::Pool2dOpMaker(framework::OpProto *proto,
  // TypedAttrChecker don't support vector type.)
  AddAttr<std::vector<int>>(
  "paddings",
- "(vector<int>, defalut {0,0}), paddings(height, width) of pooling "
+ "(vector<int>, default {0,0}), paddings(height, width) of pooling "
  "operator."
  "If global_pooling = true, paddings and ksize will be ignored.")
  .SetDefault({0, 0}); // TODO(Chengduo): Add checker. (Currently,
@@ -122,15 +122,15 @@ Parameters(ksize, strides, paddings) are two elements.
  These two elements represent height and width, respectively.
  The input(X) size and output(Out) size may be different.

- Example:
+ Example:
  Input:
  X shape: $(N, C, H_{in}, W_{in})$
  Output:
  Out shape: $(N, C, H_{out}, W_{out})$
- where
+ Where
  $$
- H_{out} = (H_{in} - ksize[0] + 2 * paddings[0]) / strides[0] + 1 \\
- W_{out} = (W_{in} - ksize[1] + 2 * paddings[1]) / strides[1] + 1
+ H_{out} = \frac{(H_{in} - ksize[0] + 2 * paddings[0])}{strides[0]} + 1 \\
+ W_{out} = \frac{(W_{in} - ksize[1] + 2 * paddings[1])}{strides[1]} + 1
  $$

  )DOC");
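Reviewer note: the pooling size formula in the doc string above can be exercised with a small sketch. `PoolOutputSize` is a hypothetical helper for this note. It assumes the fraction means integer division, which the doc string leaves implicit.

```cpp
#include <cassert>

// Hypothetical helper: output extent along one spatial axis for pooling,
// following the doc string formula
//   out = (in - ksize + 2 * padding) / stride + 1.
// Assumption: the quotient is an integer division.
int PoolOutputSize(int in, int ksize, int padding, int stride) {
  assert(ksize > 0 && stride > 0 && padding >= 0);
  assert(in + 2 * padding >= ksize);
  return (in - ksize + 2 * padding) / stride + 1;
}
```

A 2-wide window with stride 2 and no padding halves an even extent, for instance 32 becomes 16.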
@@ -177,7 +177,7 @@ Pool3dOpMaker::Pool3dOpMaker(framework::OpProto *proto,
  // TypedAttrChecker don't support vector type.)
  AddAttr<std::vector<int>>(
  "paddings",
- "(vector<int>, defalut {0,0,0}), paddings(depth, height, "
+ "(vector<int>, default {0,0,0}), paddings(depth, height, "
  "width) of pooling operator. "
  "If global_pooling = true, ksize and paddings will be ignored.")
  .SetDefault({0, 0, 0}); // TODO(Chengduo): Add checker. (Currently,
@@ -199,12 +199,12 @@ width, respectively. The input(X) size and output(Out) size may be different.
  X shape: $(N, C, D_{in}, H_{in}, W_{in})$
  Output:
  Out shape: $(N, C, D_{out}, H_{out}, W_{out})$
- where
- $$
- D_{out} = (D_{in} - ksize[0] + 2 * paddings[0]) / strides[0] + 1 \\
- H_{out} = (H_{in} - ksize[1] + 2 * paddings[1]) / strides[1] + 1 \\
- W_{out} = (W_{in} - ksize[2] + 2 * paddings[2]) / strides[2] + 1
- $$
+ Where
+ $$
+ D_{out} = \frac{(D_{in} - ksize[0] + 2 * paddings[0])}{strides[0]} + 1 \\
+ H_{out} = \frac{(H_{in} - ksize[1] + 2 * paddings[1])}{strides[1]} + 1 \\
+ W_{out} = \frac{(W_{in} - ksize[2] + 2 * paddings[2])}{strides[2]} + 1
+ $$

  )DOC");
  }

paddle/operators/pool_with_index_op.cc

Lines changed: 9 additions & 9 deletions
@@ -142,7 +142,7 @@ class MaxPool2dWithIndexOpMaker : public framework::OpProtoAndCheckerMaker {
  // TypedAttrChecker don't support vector type.)
  AddAttr<std::vector<int>>(
  "paddings",
- "(vector<int>, defalut:{0, 0}), paddings(height, width) of pooling "
+ "(vector<int>, default:{0, 0}), paddings(height, width) of pooling "
  "operator. "
  "If global_pooling = true, paddings and will be ignored.")
  .SetDefault({0, 0}); // TODO(Chengduo): Add checker. (Currently,
@@ -166,10 +166,10 @@ The input(X) size and output(Out, Mask) size may be different.
  Output:
  Out shape: $(N, C, H_{out}, W_{out})$
  Mask shape: $(N, C, H_{out}, W_{out})$
- where
+ Where
  $$
- H_{out} = (H_{in} - ksize[0] + 2 * paddings[0]) / strides[0] + 1 \\
- W_{out} = (W_{in} - ksize[1] + 2 * paddings[1]) / strides[1] + 1
+ H_{out} = \frac{(H_{in} - ksize[0] + 2 * paddings[0])}{strides[0]} + 1 \\
+ W_{out} = \frac{(W_{in} - ksize[1] + 2 * paddings[1])}{strides[1]} + 1
  $$

  )DOC");
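Reviewer note: the doc string gives Out and Mask the same shape. A one-dimensional sketch shows why that is natural if Mask holds one index per output element. `MaxPool1dWithIndex` and `PoolResult` are hypothetical names. The sketch assumes Mask records the input position of each window's maximum and that padded positions are simply skipped; neither detail is stated in this diff.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical result type: pooled values plus the input index of each maximum.
struct PoolResult {
  std::vector<float> out;
  std::vector<int> mask;
};

// Hypothetical 1-D max pooling with index.
// Assumptions: mask stores the input position of the maximum, and positions
// that fall in the padding are ignored rather than treated as values.
PoolResult MaxPool1dWithIndex(const std::vector<float>& x, int ksize,
                              int stride, int padding) {
  assert(!x.empty() && ksize > 0 && stride > 0);
  assert(padding >= 0 && padding < ksize);
  const int in = static_cast<int>(x.size());
  assert(in + 2 * padding >= ksize);
  const int out_size = (in - ksize + 2 * padding) / stride + 1;
  PoolResult result;
  for (int o = 0; o < out_size; ++o) {
    // Clip the window to the valid input range.
    const int begin = std::max(o * stride - padding, 0);
    const int end = std::min(o * stride - padding + ksize, in);
    int best = begin;
    for (int i = begin + 1; i < end; ++i) {
      if (x[i] > x[best]) best = i;
    }
    result.out.push_back(x[best]);
    result.mask.push_back(best);
  }
  return result;
}
```

Each output element gets exactly one mask entry, so both vectors have length `out_size`, matching the identical Out and Mask shapes in the doc string.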
@@ -220,7 +220,7 @@ class MaxPool3dWithIndexOpMaker : public framework::OpProtoAndCheckerMaker {
  // TypedAttrChecker don't support vector type.)
  AddAttr<std::vector<int>>(
  "paddings",
- "(vector, defalut {0,0,0}), paddings(depth, "
+ "(vector, default {0,0,0}), paddings(depth, "
  "height, width) of pooling operator. "
  "If global_pooling = true, paddings and ksize will be ignored.")
  .SetDefault({0, 0, 0}); // TODO(Chengduo): Add checker. (Currently,
@@ -244,11 +244,11 @@ The input(X) size and output(Out, Mask) size may be different.
  Output:
  Out shape: $(N, C, D_{out}, H_{out}, W_{out})$
  Mask shape: $(N, C, D_{out}, H_{out}, W_{out})$
- where
+ Where
  $$
- D_{out} = (D_{in} - ksize[0] + 2 * paddings[0]) / strides[0] + 1 \\
- H_{out} = (H_{in} - ksize[1] + 2 * paddings[1]) / strides[1] + 1 \\
- W_{out} = (W_{in} - ksize[2] + 2 * paddings[2]) / strides[2] + 1
+ D_{out} = \frac{(D_{in} - ksize[0] + 2 * paddings[0])}{strides[0]} + 1 \\
+ H_{out} = \frac{(H_{in} - ksize[1] + 2 * paddings[1])}{strides[1]} + 1 \\
+ W_{out} = \frac{(W_{in} - ksize[2] + 2 * paddings[2])}{strides[2]} + 1
  $$

  )DOC");
