@@ -2243,19 +2243,28 @@ def layer_norm(input,
     Args:
         input(Variable): The input tensor variable.
         scale(bool): Whether to learn the adaptive gain :math:`g` after
-            normalization.
+            normalization. Default True.
         shift(bool): Whether to learn the adaptive bias :math:`b` after
-            normalization.
-        begin_norm_axis(bool ): The normalization will be performed along
+            normalization. Default True.
+        begin_norm_axis(int): The normalization will be performed along
             dimensions from :attr:`begin_norm_axis` to :attr:`rank(input)`.
+            Default 1.
         epsilon(float): The small value added to the variance to prevent
-            division by zero.
+            division by zero. Default 1e-05.
         param_attr(ParamAttr|None): The parameter attribute for the learnable
-            gain :math:`g`.
+            gain :math:`g`. If :attr:`scale` is False, :attr:`param_attr` is
+            omitted. If :attr:`scale` is True and :attr:`param_attr` is None,
+            a default :code:`ParamAttr` will be added as the scale. The
+            :attr:`param_attr` is initialized to 1 if it is added. Default None.
         bias_attr(ParamAttr|None): The parameter attribute for the learnable
-            bias :math:`b`.
+            bias :math:`b`. If :attr:`shift` is False, :attr:`bias_attr` is
+            omitted. If :attr:`shift` is True and :attr:`bias_attr` is None,
+            a default :code:`ParamAttr` will be added as the bias. The
+            :attr:`bias_attr` is initialized to 0 if it is added. Default None.
         act(str): Activation to be applied to the output of layer normalization.
-        name (str): The name of this layer. It is optional.
+            Default None.
+        name(str): The name of this layer. It is optional. Default None, in
+            which case a unique name is generated automatically.
 
     Returns:
         ${y_comment}
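The defaults documented above (`begin_norm_axis=1`, `epsilon=1e-05`, scale initialized to 1, bias initialized to 0) can be sanity-checked against a minimal NumPy sketch of the computation the docstring describes. This is an illustrative reference only, not part of the patch; the name `layer_norm_ref` is hypothetical.

```python
import numpy as np

def layer_norm_ref(x, begin_norm_axis=1, epsilon=1e-5, scale=True, shift=True):
    # Statistics are computed over dims [begin_norm_axis, rank(x)),
    # matching the docstring's description of begin_norm_axis.
    axes = tuple(range(begin_norm_axis, x.ndim))
    mean = x.mean(axis=axes, keepdims=True)
    var = x.var(axis=axes, keepdims=True)
    y = (x - mean) / np.sqrt(var + epsilon)
    if scale:
        # Default gain initialization is 1 (per param_attr's documented default).
        g = np.ones(x.shape[begin_norm_axis:], dtype=x.dtype)
        y = y * g
    if shift:
        # Default bias initialization is 0 (per bias_attr's documented default).
        b = np.zeros(x.shape[begin_norm_axis:], dtype=x.dtype)
        y = y + b
    return y

x = np.random.rand(2, 3, 4).astype("float32")
y = layer_norm_ref(x)
# Each sample (dims 1..2) is normalized to roughly zero mean, unit variance.
```

With the default initializations, the gain and bias leave the normalized values unchanged, so each sample's statistics are approximately standard.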