Commit 0e73967

author
ranqiu
committed
Update the annotations of layers.py
1 parent 7d343fc commit 0e73967

File tree

1 file changed

+117
-104
lines changed
  • python/paddle/trainer_config_helpers

python/paddle/trainer_config_helpers/layers.py

Lines changed: 117 additions & 104 deletions
Original file line number | Diff line number | Diff line change
@@ -5135,12 +5135,19 @@ def block_expand_layer(input,
51355135
@layer_support()
51365136
def maxout_layer(input, groups, num_channels=None, name=None, layer_attr=None):
51375137
"""
5138-
A layer to do max out on conv layer output.
5139-
- Input: output of a conv layer.
5140-
- Output: feature map size same as input. Channel is (input channel) / groups.
5138+
A layer to do max out on convolutional layer output.
5139+
- Input: the output of a convolutional layer.
5140+
- Output: feature map size same as the input's, and its channel number is
5141+
(input channel) / groups.
51415142
51425143
So groups should be larger than 1, and the num of channels should be able
5143-
to devided by groups.
5144+
to be devided by groups.
5145+
5146+
Reference:
5147+
Maxout Networks
5148+
http://www.jmlr.org/proceedings/papers/v28/goodfellow13.pdf
5149+
Multi-digit Number Recognition from Street View Imagery using Deep Convolutional Neural Networks
5150+
https://arxiv.org/pdf/1312.6082v4.pdf
51445151
51455152
.. math::
51465153
y_{si+j} = \max_k x_{gsi + sk + j}
@@ -5150,12 +5157,6 @@ def maxout_layer(input, groups, num_channels=None, name=None, layer_attr=None):
51505157
0 \le j < s
51515158
0 \le k < groups
51525159
5153-
Please refer to Paper:
5154-
- Maxout Networks: http://www.jmlr.org/proceedings/papers/v28/goodfellow13.pdf
5155-
- Multi-digit Number Recognition from Street View \
5156-
Imagery using Deep Convolutional Neural Networks: \
5157-
https://arxiv.org/pdf/1312.6082v4.pdf
5158-
51595160
The simple usage is:
51605161
51615162
.. code-block:: python
@@ -5166,14 +5167,16 @@ def maxout_layer(input, groups, num_channels=None, name=None, layer_attr=None):
51665167
51675168
:param input: The input of this layer.
51685169
:type input: LayerOutput
5169-
:param num_channels: The channel number of input layer. If None will be set
5170-
automatically from previous output.
5171-
:type num_channels: int | None
5170+
:param num_channels: The number of input channels. If the parameter is not set or
5171+
set to None, its actual value will be automatically set to
5172+
the channels number of the input.
5173+
:type num_channels: int
51725174
:param groups: The group number of input layer.
51735175
:type groups: int
51745176
:param name: The name of this layer. It is optional.
5175-
:type name: None | basestring.
5176-
:param layer_attr: Extra Layer attribute.
5177+
:type name: basestring
5178+
:param layer_attr: The extra layer attribute. See ExtraLayerAttribute for
5179+
details.
51775180
:type layer_attr: ExtraLayerAttribute
51785181
:return: LayerOutput object.
51795182
:rtype: LayerOutput
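
The index formula in the maxout_layer docstring above can be sketched in plain Python. This is only an illustration and not PaddlePaddle's implementation: `maxout_flat` is a hypothetical helper, and it assumes that `s` is the per-channel feature map size (input length divided by `num_channels`) and that `i` ranges over the output channels, neither of which is shown in this hunk.

```python
def maxout_flat(x, num_channels, groups):
    """Apply y[s*i + j] = max_k x[g*s*i + s*k + j] to a flat list x.

    Hypothetical helper for illustration only (not a PaddlePaddle API).
    Assumes s = len(x) // num_channels and 0 <= i < num_channels // groups.
    """
    if groups <= 1:
        raise ValueError("groups should be larger than 1")
    if num_channels % groups != 0:
        raise ValueError("num_channels must be divisible by groups")
    if len(x) % num_channels != 0:
        raise ValueError("len(x) must be a multiple of num_channels")

    s = len(x) // num_channels            # assumed per-channel feature map size
    out_channels = num_channels // groups  # channel number of the output
    y = [None] * (out_channels * s)
    for i in range(out_channels):
        for j in range(s):
            # Max over the `groups` consecutive input channels feeding output channel i.
            y[s * i + j] = max(x[groups * s * i + s * k + j] for k in range(groups))
    return y


# 4 input channels with 2 values each, reduced with groups=2 to 2 output channels.
result = maxout_flat([0, 1, 2, 3, 4, 5, 6, 7], num_channels=4, groups=2)
```

With these inputs the feature map size stays at 2 per channel while the channel count drops from 4 to 2, matching the "(input channel) / groups" statement in the docstring.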
@@ -5205,20 +5208,20 @@ def ctc_layer(input,
52055208
layer_attr=None):
52065209
"""
52075210
Connectionist Temporal Classification (CTC) is designed for temporal
5208-
classication task. That is, for sequence labeling problems where the
5211+
classication task. e.g. sequence labeling problems where the
52095212
alignment between the inputs and the target labels is unknown.
52105213
5211-
More details can be found by referring to `Connectionist Temporal
5212-
Classification: Labelling Unsegmented Sequence Data with Recurrent
5213-
Neural Networks <http://machinelearning.wustl.edu/mlpapers/paper_files/
5214-
icml2006_GravesFGS06.pdf>`_
5214+
Reference:
5215+
Connectionist Temporal Classification: Labelling Unsegmented Sequence Data
5216+
with Recurrent Neural Networks
5217+
http://machinelearning.wustl.edu/mlpapers/paper_files/icml2006_GravesFGS06.pdf
52155218
52165219
Note:
5217-
Considering the 'blank' label needed by CTC, you need to use
5218-
(num_classes + 1) as the input size. num_classes is the category number.
5219-
And the 'blank' is the last category index. So the size of 'input' layer, such as
5220-
fc_layer with softmax activation, should be num_classes + 1. The size of ctc_layer
5221-
should also be num_classes + 1.
5220+
Considering the 'blank' label needed by CTC, you need to use (num_classes + 1)
5221+
as the size of the input, where num_classes is the category number.
5222+
And the 'blank' is the last category index. So the size of 'input' layer (e.g.
5223+
fc_layer with softmax activation) should be (num_classes + 1). The size of
5224+
ctc_layer should also be (num_classes + 1).
52225225
52235226
The example usage is:
52245227
@@ -5231,16 +5234,17 @@ def ctc_layer(input,
52315234
52325235
:param input: The input of this layer.
52335236
:type input: LayerOutput
5234-
:param label: The data layer of label with variable length.
5237+
:param label: The input label.
52355238
:type label: LayerOutput
5236-
:param size: category numbers + 1.
5239+
:param size: The dimension of this layer, which must be equal to (category number + 1).
52375240
:type size: int
52385241
:param name: The name of this layer. It is optional.
5239-
:type name: basestring | None
5240-
:param norm_by_times: Whether to normalization by times. False by default.
5242+
:type name: basestring
5243+
:param norm_by_times: Whether to do normalization by times. False is the default.
52415244
:type norm_by_times: bool
5242-
:param layer_attr: Extra Layer config.
5243-
:type layer_attr: ExtraLayerAttribute | None
5245+
:param layer_attr: The extra layer attribute. See ExtraLayerAttribute for
5246+
details.
5247+
:type layer_attr: ExtraLayerAttribute
52445248
:return: LayerOutput object.
52455249
:rtype: LayerOutput
52465250
"""
@@ -5281,20 +5285,19 @@ def warp_ctc_layer(input,
52815285
building process, PaddlePaddle will clone the source codes, build and
52825286
install it to :code:`third_party/install/warpctc` directory.
52835287
5284-
More details of CTC can be found by referring to `Connectionist Temporal
5285-
Classification: Labelling Unsegmented Sequence Data with Recurrent
5286-
Neural Networks <http://machinelearning.wustl.edu/mlpapers/paper_files/
5287-
icml2006_GravesFGS06.pdf>`_.
5288+
Reference:
5289+
Connectionist Temporal Classification: Labelling Unsegmented Sequence Data
5290+
with Recurrent Neural Networks
5291+
http://machinelearning.wustl.edu/mlpapers/paper_files/icml2006_GravesFGS06.pdf
52885292
52895293
Note:
5290-
- Let num_classes represent the category number. Considering the 'blank'
5291-
label needed by CTC, you need to use (num_classes + 1) as the input size.
5292-
Thus, the size of both warp_ctc layer and 'input' layer should be set to
5293-
num_classes + 1.
5294+
- Let num_classes represents the category number. Considering the 'blank'
5295+
label needed by CTC, you need to use (num_classes + 1) as the size of
5296+
warp_ctc layer.
52945297
- You can set 'blank' to any value ranged in [0, num_classes], which
5295-
should be consistent as that used in your labels.
5298+
should be consistent with those used in your labels.
52965299
- As a native 'softmax' activation is interated to the warp-ctc library,
5297-
'linear' activation is expected instead in the 'input' layer.
5300+
'linear' activation is expected to be used instead in the 'input' layer.
52985301
52995302
The example usage is:
53005303
@@ -5308,18 +5311,19 @@ def warp_ctc_layer(input,
53085311
53095312
:param input: The input of this layer.
53105313
:type input: LayerOutput
5311-
:param label: The data layer of label with variable length.
5314+
:param label: The input label.
53125315
:type label: LayerOutput
5313-
:param size: category numbers + 1.
5316+
:param size: The dimension of this layer, which must be equal to (category number + 1).
53145317
:type size: int
53155318
:param name: The name of this layer. It is optional.
5316-
:type name: basestring | None
5317-
:param blank: the 'blank' label used in ctc
5319+
:type name: basestring
5320+
:param blank: The 'blank' label used in ctc.
53185321
:type blank: int
5319-
:param norm_by_times: Whether to normalization by times. False by default.
5322+
:param norm_by_times: Whether to do normalization by times. False is the default.
53205323
:type norm_by_times: bool
5321-
:param layer_attr: Extra Layer config.
5322-
:type layer_attr: ExtraLayerAttribute | None
5324+
:param layer_attr: The extra layer attribute. See ExtraLayerAttribute for
5325+
details.
5326+
:type layer_attr: ExtraLayerAttribute
53235327
:return: LayerOutput object.
53245328
:rtype: LayerOutput
53255329
"""
@@ -5365,23 +5369,25 @@ def crf_layer(input,
53655369
label=label,
53665370
size=label_dim)
53675371
5368-
:param input: The first input layer is the feature.
5372+
:param input: The first input layer.
53695373
:type input: LayerOutput
5370-
:param label: The second input layer is label.
5374+
:param label: The input label.
53715375
:type label: LayerOutput
53725376
:param size: The category number.
53735377
:type size: int
5374-
:param weight: The third layer is "weight" of each sample, which is an
5375-
optional argument.
5378+
:param weight: The scale of the cost of each sample. It is optional.
53765379
:type weight: LayerOutput
5377-
:param param_attr: Parameter attribute. None means default attribute
5380+
:param param_attr: The parameter attribute. See ParameterAttribute for
5381+
details.
53785382
:type param_attr: ParameterAttribute
53795383
:param name: The name of this layer. It is optional.
5380-
:type name: None | basestring
5381-
:param coeff: The coefficient affects the gradient in the backward.
5384+
:type name: basestring
5385+
:param coeff: The weight of the gradient in the back propagation.
5386+
1.0 is the default.
53825387
:type coeff: float
5383-
:param layer_attr: Extra Layer config.
5384-
:type layer_attr: ExtraLayerAttribute | None
5388+
:param layer_attr: The extra layer attribute. See ExtraLayerAttribute for
5389+
details.
5390+
:type layer_attr: ExtraLayerAttribute
53855391
:return: LayerOutput object.
53865392
:rtype: LayerOutput
53875393
"""
@@ -5427,9 +5433,9 @@ def crf_decoding_layer(input,
54275433
"""
54285434
A layer for calculating the decoding sequence of sequential conditional
54295435
random field model. The decoding sequence is stored in output.ids.
5430-
If a second input is provided, it is treated as the ground-truth label, and
5431-
this layer will also calculate error. output.value[i] is 1 for incorrect
5432-
decoding or 0 for correct decoding.
5436+
If the input 'label' is provided, it is treated as the ground-truth label, and
5437+
this layer will also calculate error. output.value[i] is 1 for an incorrect
5438+
decoding and 0 for the correct.
54335439
54345440
The example usage is:
54355441
@@ -5440,16 +5446,18 @@ def crf_decoding_layer(input,
54405446
54415447
:param input: The first input layer.
54425448
:type input: LayerOutput
5443-
:param size: size of this layer.
5449+
:param size: The dimension of this layer.
54445450
:type size: int
5445-
:param label: None or ground-truth label.
5446-
:type label: LayerOutput or None
5447-
:param param_attr: Parameter attribute. None means default attribute
5451+
:param label: The input label.
5452+
:type label: LayerOutput | None
5453+
:param param_attr: The parameter attribute. See ParameterAttribute for
5454+
details.
54485455
:type param_attr: ParameterAttribute
54495456
:param name: The name of this layer. It is optional.
5450-
:type name: None | basestring
5451-
:param layer_attr: Extra Layer config.
5452-
:type layer_attr: ExtraLayerAttribute | None
5457+
:type name: basestring
5458+
:param layer_attr: The extra layer attribute. See ExtraLayerAttribute for
5459+
details.
5460+
:type layer_attr: ExtraLayerAttribute
54535461
:return: LayerOutput object.
54545462
:rtype: LayerOutput
54555463
"""
@@ -5494,8 +5502,10 @@ def nce_layer(input,
54945502
layer_attr=None):
54955503
"""
54965504
Noise-contrastive estimation.
5497-
Implements the method in the following paper:
5498-
A fast and simple algorithm for training neural probabilistic language models.
5505+
5506+
Reference:
5507+
A fast and simple algorithm for training neural probabilistic language models.
5508+
http://www.icml.cc/2012/papers/855.pdf
54995509
55005510
The example usage is:
55015511
@@ -5507,31 +5517,33 @@ def nce_layer(input,
55075517
55085518
:param name: The name of this layer. It is optional.
55095519
:type name: basestring
5510-
:param input: The input layers. It could be a LayerOutput of list/tuple of LayerOutput.
5520+
:param input: The first input of this layer.
55115521
:type input: LayerOutput | list | tuple | collections.Sequence
5512-
:param label: label layer
5522+
:param label: The input label.
55135523
:type label: LayerOutput
5514-
:param weight: weight layer, can be None(default)
5524+
:param weight: The scale of the cost. It is optional.
55155525
:type weight: LayerOutput
5516-
:param num_classes: number of classes.
5526+
:param num_classes: The number of classes.
55175527
:type num_classes: int
55185528
:param act: Activation type. SigmoidActivation is the default.
55195529
:type act: BaseActivation
5520-
:param param_attr: The Parameter Attribute|list.
5530+
:param param_attr: The parameter attribute. See ParameterAttribute for
5531+
details.
55215532
:type param_attr: ParameterAttribute
5522-
:param num_neg_samples: number of negative samples. Default is 10.
5533+
:param num_neg_samples: The number of negative samples. 10 is the default.
55235534
:type num_neg_samples: int
5524-
:param neg_distribution: The distribution for generating the random negative labels.
5525-
A uniform distribution will be used if not provided.
5526-
If not None, its length must be equal to num_classes.
5535+
:param neg_distribution: The probability distribution for generating the random negative
5536+
labels. If this parameter is not set, a uniform distribution will
5537+
be used. If not None, its length must be equal to num_classes.
55275538
:type neg_distribution: list | tuple | collections.Sequence | None
55285539
:param bias_attr: The bias attribute. If the parameter is set to False or an object
55295540
whose type is not ParameterAttribute, no bias is defined. If the
55305541
parameter is set to True, the bias is initialized to zero.
55315542
:type bias_attr: ParameterAttribute | None | bool | Any
5532-
:param layer_attr: Extra Layer Attribute.
5543+
:param layer_attr: The extra layer attribute. See ExtraLayerAttribute for
5544+
details.
55335545
:type layer_attr: ExtraLayerAttribute
5534-
:return: layer name.
5546+
:return: LayerOutput object.
55355547
:rtype: LayerOutput
55365548
"""
55375549
if isinstance(input, LayerOutput):
@@ -5605,11 +5617,11 @@ def rank_cost(left,
56055617
coeff=1.0,
56065618
layer_attr=None):
56075619
"""
5608-
A cost Layer for learning to rank using gradient descent. Details can refer
5609-
to `papers <http://research.microsoft.com/en-us/um/people/cburges/papers/
5610-
ICML_ranking.pdf>`_.
5611-
This layer contains at least three inputs. The weight is an optional
5612-
argument, which affects the cost.
5620+
A cost Layer for learning to rank using gradient descent.
5621+
5622+
Reference:
5623+
Learning to Rank using Gradient Descent
5624+
http://research.microsoft.com/en-us/um/people/cburges/papers/ICML_ranking.pdf
56135625
56145626
.. math::
56155627
@@ -5640,14 +5652,15 @@ def rank_cost(left,
56405652
:type right: LayerOutput
56415653
:param label: Label is 1 or 0, means positive order and reverse order.
56425654
:type label: LayerOutput
5643-
:param weight: The weight affects the cost, namely the scale of cost.
5644-
It is an optional argument.
5655+
:param weight: The scale of cost. It is optional.
56455656
:type weight: LayerOutput
56465657
:param name: The name of this layer. It is optional.
5647-
:type name: None | basestring
5648-
:param coeff: The coefficient affects the gradient in the backward.
5658+
:type name: basestring
5659+
:param coeff: The weight of the gradient in the back propagation.
5660+
1.0 is the default.
56495661
:type coeff: float
5650-
:param layer_attr: Extra Layer Attribute.
5662+
:param layer_attr: The extra layer attribute. See ExtraLayerAttribute for
5663+
details.
56515664
:type layer_attr: ExtraLayerAttribute
56525665
:return: LayerOutput object.
56535666
:rtype: LayerOutput
@@ -5692,25 +5705,25 @@ def lambda_cost(input,
56925705
NDCG_num=8,
56935706
max_sort_size=-1)
56945707
5695-
:param input: Samples of the same query should be loaded as sequence.
5708+
:param input: The first input of this layer, which is often a document
5709+
samples list of the same query and whose type must be sequence.
56965710
:type input: LayerOutput
5697-
:param score: The 2nd input. Score of each sample.
5711+
:param score: The scores of the samples.
56985712
:type input: LayerOutput
56995713
:param NDCG_num: The size of NDCG (Normalized Discounted Cumulative Gain),
57005714
e.g., 5 for NDCG@5. It must be less than or equal to the
5701-
minimum size of lists.
5715+
minimum size of the list.
57025716
:type NDCG_num: int
5703-
:param max_sort_size: The size of partial sorting in calculating gradient.
5704-
If max_sort_size = -1, then for each list, the
5705-
algorithm will sort the entire list to get gradient.
5706-
In other cases, max_sort_size must be greater than or
5707-
equal to NDCG_num. And if max_sort_size is greater
5708-
than the size of a list, the algorithm will sort the
5709-
entire list of get gradient.
5717+
:param max_sort_size: The size of partial sorting in calculating gradient. If
5718+
max_sort_size is equal to -1 or greater than the number
5719+
of the samples in the list, then the algorithm will sort
5720+
the entire list to compute the gradient. In other cases,
5721+
max_sort_size must be greater than or equal to NDCG_num.
57105722
:type max_sort_size: int
57115723
:param name: The name of this layer. It is optional.
5712-
:type name: None | basestring
5713-
:param layer_attr: Extra Layer Attribute.
5724+
:type name: basestring
5725+
:param layer_attr: The extra layer attribute. See ExtraLayerAttribute for
5726+
details.
57145727
:type layer_attr: ExtraLayerAttribute
57155728
:return: LayerOutput object.
57165729
:rtype: LayerOutput
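
The max_sort_size rule described in the lambda_cost docstring above can be read as the following small check. This is a sketch under stated assumptions, not code from layers.py: `effective_sort_size` and its `list_size` argument are hypothetical names introduced here, and the function only mirrors the wording of the docstring.

```python
def effective_sort_size(max_sort_size, ndcg_num, list_size):
    """Return how many samples of one list would be sorted.

    Hypothetical helper that restates the docstring rules:
    - NDCG_num must not exceed the size of the list;
    - max_sort_size == -1, or larger than the list, means sort everything;
    - otherwise max_sort_size must be >= NDCG_num.
    """
    if ndcg_num > list_size:
        raise ValueError("NDCG_num must be <= the size of the list")
    if max_sort_size == -1 or max_sort_size > list_size:
        return list_size  # the entire list is sorted
    if max_sort_size < ndcg_num:
        raise ValueError("max_sort_size must be -1 or >= NDCG_num")
    return max_sort_size  # partial sort of the top max_sort_size samples


full = effective_sort_size(-1, ndcg_num=8, list_size=20)     # whole list
partial = effective_sort_size(10, ndcg_num=8, list_size=20)  # top 10 only
```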
@@ -6830,8 +6843,8 @@ def img_conv3d_layer(input,
68306843
parameter is set to True, the bias is initialized to zero.
68316844
:type bias_attr: ParameterAttribute | None | bool | Any
68326845
:param num_channels: The number of input channels. If the parameter is not set or
6833-
set to None, its actual value will be automatically set to
6834-
the channels number of the input .
6846+
set to None, its actual value will be automatically set to
6847+
the channels number of the input.
68356848
:type num_channels: int
68366849
:param param_attr: The parameter attribute of the convolution. See ParameterAttribute for
68376850
details.
