@@ -2317,19 +2317,28 @@ def layer_norm(input,
     Args:
         input(Variable): The input tensor variable.
         scale(bool): Whether to learn the adaptive gain :math:`g` after
-            normalization.
+            normalization. Default True.
         shift(bool): Whether to learn the adaptive bias :math:`b` after
-            normalization.
-        begin_norm_axis(bool): The normalization will be performed along
+            normalization. Default True.
+        begin_norm_axis(int): The normalization will be performed along
             dimensions from :attr:`begin_norm_axis` to :attr:`rank(input)`.
+            Default 1.
         epsilon(float): The small value added to the variance to prevent
-            division by zero.
+            division by zero. Default 1e-05.
         param_attr(ParamAttr|None): The parameter attribute for the learnable
-            gain :math:`g`.
+            gain :math:`g`. If :attr:`scale` is False, :attr:`param_attr` is
+            omitted. If :attr:`scale` is True and :attr:`param_attr` is None,
+            a default :code:`ParamAttr` will be added as scale. The
+            :attr:`param_attr` is initialized as 1 if it is added. Default None.
         bias_attr(ParamAttr|None): The parameter attribute for the learnable
-            bias :math:`b`.
+            bias :math:`b`. If :attr:`shift` is False, :attr:`bias_attr` is
+            omitted. If :attr:`shift` is True and :attr:`bias_attr` is None,
+            a default :code:`ParamAttr` will be added as bias. The
+            :attr:`bias_attr` is initialized as 0 if it is added. Default None.
         act(str): Activation to be applied to the output of layer normalization.
-        name (str): The name of this layer. It is optional.
+            Default None.
+        name(str): The name of this layer. It is optional. Default None, and a
+            unique name will be generated automatically.

     Returns:
         ${y_comment}
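For readers skimming the diff, the semantics the updated docstring describes can be sketched as a plain NumPy reference. This is a hypothetical helper for illustration, not part of Paddle's API: statistics are taken over dimensions :code:`[begin_norm_axis, rank(input))`, and the optional gain :code:`g` (default-initialized to 1) and bias :code:`b` (default-initialized to 0) stand in for the parameters created via :code:`param_attr` and :code:`bias_attr`.

```python
import numpy as np

def layer_norm_ref(x, begin_norm_axis=1, epsilon=1e-05, g=None, b=None):
    """Reference sketch of layer normalization as documented above.

    Normalizes over dimensions [begin_norm_axis, rank(x)), then applies
    the optional learnable gain ``g`` and bias ``b``.
    """
    # Flatten the normalized dimensions so mean/variance cover all of them.
    flat = x.reshape(x.shape[:begin_norm_axis] + (-1,))
    mean = flat.mean(axis=-1, keepdims=True)
    var = flat.var(axis=-1, keepdims=True)
    # epsilon keeps the division well-defined when the variance is ~0.
    out = ((flat - mean) / np.sqrt(var + epsilon)).reshape(x.shape)
    if g is not None:   # adaptive gain (scale=True); initialized to 1
        out = out * g
    if b is not None:   # adaptive bias (shift=True); initialized to 0
        out = out + b
    return out
```

With :code:`begin_norm_axis=1`, every sample in the batch is normalized over all of its remaining dimensions, which is the default behavior the docstring now states explicitly.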