@@ -1313,21 +1313,24 @@ def sequence_softmax(input, param_attr=None, bias_attr=None, use_cudnn=True):
 def softmax(input, param_attr=None, bias_attr=None, use_cudnn=True, name=None):
     """
-    The input of the softmax layer is a 2-D tensor with shape N x K (N is the
-    batch_size, K is the dimension of input feature). The output tensor has the
-    same shape as the input tensor.
+    The input of the softmax operator is a tensor of any rank. The output tensor
+    has the same shape as the input.

-    For each row of the input tensor, the softmax operator squashes the
-    K-dimensional vector of arbitrary real values to a K-dimensional vector of real
-    values in the range [0, 1] that add up to 1.
+    The input tensor is first logically flattened to a 2-D matrix. The matrix's
+    second dimension (row length) is the same as the last dimension of the input
+    tensor, and the first dimension (column length) is the product of all other
+    dimensions of the input tensor. For each row of the matrix, the softmax
+    operator squashes the K-dimensional vector of arbitrary real values (K is the
+    width of the matrix, i.e. the size of the input tensor's last dimension) to a
+    K-dimensional vector of real values in the range [0, 1] that add up to 1.

     It computes the exponential of the given dimension and the sum of exponential
     values of all the other dimensions in the K-dimensional vector input.
     Then the ratio of the exponential of the given dimension and the sum of
     exponential values of all the other dimensions is the output of the softmax
     operator.

-    For each row :math:`i` and each column :math:`j` in Input(X), we have:
+    For each row :math:`i` and each column :math:`j` in the matrix, we have:

     .. math::
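The flatten-to-2D behavior described in the new docstring can be sketched in plain NumPy. This is an illustrative reimplementation under the semantics stated above, not the Fluid operator itself; the function name `softmax_flatten` is hypothetical.

```python
import numpy as np

def softmax_flatten(x):
    """Sketch of the flatten-to-2D softmax described above (hypothetical helper).

    The input is logically flattened to a 2-D matrix whose row length K equals
    the last dimension of ``x``; softmax is applied row-wise, and the result
    is reshaped back to the original shape.
    """
    orig_shape = x.shape
    # Flatten to (product of other dims, K).
    mat = x.reshape(-1, orig_shape[-1])
    # Subtract the row max before exponentiating for numerical stability.
    shifted = mat - mat.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    # Each row becomes non-negative values summing to 1.
    out = exp / exp.sum(axis=1, keepdims=True)
    return out.reshape(orig_shape)

x = np.random.randn(2, 3, 4)
y = softmax_flatten(x)
# Output keeps the input shape; every slice along the last dimension sums to 1.
assert y.shape == x.shape
assert np.allclose(y.sum(axis=-1), 1.0)
```

The max-subtraction does not change the result (it cancels in the ratio) but prevents overflow in `np.exp` for large inputs.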