@@ -2302,19 +2302,28 @@ def layer_norm(input,
     Args:
         input(Variable): The input tensor variable.
         scale(bool): Whether to learn the adaptive gain :math:`g` after
-            normalization.
+            normalization. Default True.
         shift(bool): Whether to learn the adaptive bias :math:`b` after
-            normalization.
-        begin_norm_axis(bool): The normalization will be performed along
+            normalization. Default True.
+        begin_norm_axis(int): The normalization will be performed along
             dimensions from :attr:`begin_norm_axis` to :attr:`rank(input)`.
+            Default 1.
         epsilon(float): The small value added to the variance to prevent
-            division by zero.
+            division by zero. Default 1e-05.
         param_attr(ParamAttr|None): The parameter attribute for the learnable
-            gain :math:`g`.
+            gain :math:`g`. If :attr:`scale` is False, :attr:`param_attr` is
+            omitted. If :attr:`scale` is True and :attr:`param_attr` is None,
+            a default :code:`ParamAttr` would be added as scale. The
+            :attr:`param_attr` is initialized as 1 if it is added. Default None.
         bias_attr(ParamAttr|None): The parameter attribute for the learnable
-            bias :math:`b`.
+            bias :math:`b`. If :attr:`shift` is False, :attr:`bias_attr` is
+            omitted. If :attr:`shift` is True and :attr:`bias_attr` is None,
+            a default :code:`ParamAttr` would be added as bias. The
+            :attr:`bias_attr` is initialized as 0 if it is added. Default None.
         act(str): Activation to be applied to the output of layer normalization.
-        name (str): The name of this layer. It is optional.
+            Default None.
+        name(str): The name of this layer. It is optional. Default None, and a
+            unique name would be generated automatically.

     Returns:
         ${y_comment}
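For reference, the semantics documented above (normalize over dimensions `begin_norm_axis` to `rank(input)`, gain initialized to 1, bias to 0, epsilon guarding the variance) can be sketched with a minimal NumPy implementation. This is a hypothetical helper for illustration, not Paddle's actual kernel:

```python
import numpy as np

def layer_norm_ref(x, begin_norm_axis=1, epsilon=1e-5, scale=None, shift=None):
    """Reference layer normalization over dims [begin_norm_axis, rank(x))."""
    axes = tuple(range(begin_norm_axis, x.ndim))
    mean = x.mean(axis=axes, keepdims=True)
    var = x.var(axis=axes, keepdims=True)
    # epsilon prevents division by zero when the variance is ~0
    y = (x - mean) / np.sqrt(var + epsilon)
    if scale is not None:  # adaptive gain g (default-initialized to 1 in Paddle)
        y = y * scale
    if shift is not None:  # adaptive bias b (default-initialized to 0 in Paddle)
        y = y + shift
    return y

x = np.random.rand(4, 8).astype(np.float32)
y = layer_norm_ref(x)
# with begin_norm_axis=1, each sample (row) ends up with mean ~0 and variance ~1
```

With the defaults (`scale=None`, `shift=None`) this reduces to plain standardization per sample; passing arrays for `scale`/`shift` mirrors the learnable `g` and `b` parameters controlled by `param_attr` and `bias_attr`.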