@@ -2316,19 +2316,28 @@ def layer_norm(input,
     Args:
         input(Variable): The input tensor variable.
         scale(bool): Whether to learn the adaptive gain :math:`g` after
-            normalization.
+            normalization. Default True.
         shift(bool): Whether to learn the adaptive bias :math:`b` after
-            normalization.
-        begin_norm_axis(bool): The normalization will be performed along
+            normalization. Default True.
+        begin_norm_axis(int): The normalization will be performed along
             dimensions from :attr:`begin_norm_axis` to :attr:`rank(input)`.
+            Default 1.
         epsilon(float): The small value added to the variance to prevent
-            division by zero.
+            division by zero. Default 1e-05.
         param_attr(ParamAttr|None): The parameter attribute for the learnable
-            gain :math:`g`.
+            gain :math:`g`. If :attr:`scale` is False, :attr:`param_attr` is
+            omitted. If :attr:`scale` is True and :attr:`param_attr` is None,
+            a default :code:`ParamAttr` would be added as scale. The
+            :attr:`param_attr` is initialized as 1 if it is added. Default None.
         bias_attr(ParamAttr|None): The parameter attribute for the learnable
-            bias :math:`b`.
+            bias :math:`b`. If :attr:`shift` is False, :attr:`bias_attr` is
+            omitted. If :attr:`shift` is True and :attr:`bias_attr` is None,
+            a default :code:`ParamAttr` would be added as bias. The
+            :attr:`bias_attr` is initialized as 0 if it is added. Default None.
         act(str): Activation to be applied to the output of layer normalization.
-        name (str): The name of this layer. It is optional.
+            Default None.
+        name(str): The name of this layer. It is optional. Default None, and a
+            unique name would be generated automatically.

     Returns:
         ${y_comment}
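The semantics the updated docstring describes can be sketched in plain NumPy: normalize over the dimensions from `begin_norm_axis` to `rank(input)`, add `epsilon` to the variance, then apply the optional learned gain :math:`g` (initialized to 1) and bias :math:`b` (initialized to 0). Note `layer_norm_ref` below is an illustrative reference, not PaddlePaddle's actual implementation or API.

```python
import numpy as np

def layer_norm_ref(x, begin_norm_axis=1, epsilon=1e-05, g=None, b=None):
    """Reference semantics: normalize over dims [begin_norm_axis, rank)."""
    axes = tuple(range(begin_norm_axis, x.ndim))
    mean = x.mean(axis=axes, keepdims=True)
    var = x.var(axis=axes, keepdims=True)
    y = (x - mean) / np.sqrt(var + epsilon)
    if g is not None:  # adaptive gain, corresponds to scale=True (default init 1)
        y = y * g
    if b is not None:  # adaptive bias, corresponds to shift=True (default init 0)
        y = y + b
    return y

x = np.random.randn(4, 8, 16).astype("float32")
y = layer_norm_ref(x, begin_norm_axis=1)
# with begin_norm_axis=1, each sample along the first axis is normalized
# to approximately zero mean and unit variance
```

With the default `begin_norm_axis=1`, each row of a 2-D input (or each sample of a higher-rank input) is normalized independently; setting it higher normalizes over fewer trailing dimensions.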