Commit 2c471db

Merge pull request #5884 from lcy-seso/fix_latex
fix LaTeX syntax in three operators' comments.
2 parents e1b2651 + 8ba62a5 commit 2c471db

File tree: 3 files changed, +28 −27 lines


paddle/operators/linear_chain_crf_op.cc

Lines changed: 23 additions & 22 deletions
@@ -32,19 +32,19 @@ class LinearChainCRFOpMaker : public framework::OpProtoAndCheckerMaker {
          "[(D + 2) x D]. The learnable parameter for the linear_chain_crf "
          "operator. See more details in the operator's comments.");
     AddInput("Label",
-             "(LoDTensor, default LoDTensor<int>) A LoDTensor with shape "
+             "(LoDTensor, default LoDTensor<int64_t>) A LoDTensor with shape "
              "[N x 1], where N is the total element number in a mini-batch. "
              "The ground truth.");
     AddOutput(
         "Alpha",
         "(Tensor, default Tensor<float>) A 2-D Tensor with shape [N x D]. "
-        "The forward vectors for the entire batch. Denote it as \f$\alpha\f$. "
-        "\f$\alpha$\f is a memo table used to calculate the normalization "
-        "factor in CRF. \f$\alpha[k, v]$\f stores the unnormalized "
+        "The forward vectors for the entire batch. Denote it as $\alpha$. "
+        "$\alpha$ is a memo table used to calculate the normalization "
+        "factor in CRF. $\alpha[k, v]$ stores the unnormalized "
         "probabilites of all possible unfinished sequences of tags that end at "
-        "position \f$k$\f with tag \f$v$\f. For each \f$k$\f, "
-        "\f$\alpha[k, v]$\f is a vector of length \f$D$\f with a component for "
-        "each tag value \f$v$\f. This vector is called a forward vecotr and "
+        "position $k$ with tag $v$. For each $k$, "
+        "$\alpha[k, v]$ is a vector of length $D$ with a component for "
+        "each tag value $v$. This vector is called a forward vecotr and "
         "will also be used in backward computations.")
         .AsIntermediate();
     AddOutput(
@@ -73,9 +73,9 @@ LinearChainCRF Operator.
 
 Conditional Random Field defines an undirected probabilistic graph with nodes
 denoting random variables and edges denoting dependencies between these
-variables. CRF learns the conditional probability \f$P(Y|X)\f$, where
-\f$X = (x_1, x_2, ... , x_n)\f$ are structured inputs and
-\f$Y = (y_1, y_2, ... , y_n)\f$ are labels for the inputs.
+variables. CRF learns the conditional probability $P(Y|X)$, where
+$X = (x_1, x_2, ... , x_n)$ are structured inputs and
+$Y = (y_1, y_2, ... , y_n)$ are labels for the inputs.
 
 Linear chain CRF is a special case of CRF that is useful for sequence labeling
 task. Sequence labeling tasks do not assume a lot of conditional
@@ -88,21 +88,22 @@ CRF. Please refer to http://www.cs.columbia.edu/~mcollins/fb.pdf and
 http://cseweb.ucsd.edu/~elkan/250Bwinter2012/loglinearCRFs.pdf for details.
 
 Equation:
-1. Denote Input(Emission) to this operator as \f$x\f$ here.
+1. Denote Input(Emission) to this operator as $x$ here.
 2. The first D values of Input(Transition) to this operator are for starting
-weights, denoted as \f$a\f$ here.
+weights, denoted as $a$ here.
 3. The next D values of Input(Transition) of this operator are for ending
-weights, denoted as \f$b\f$ here.
+weights, denoted as $b$ here.
 4. The remaning values of Input(Transition) are for transition weights,
-denoted as \f$w\f$ here.
-5. Denote Input(Label) as \f$s\f$ here.
-
-The probability of a sequence \f$s\f$ of length \f$L\f$ is defined as:
-\f$P(s) = (1/Z) \exp(a_{s_1} + b_{s_L}
-+ \sum_{l=1}^L x_{s_l}
-+ \sum_{l=2}^L w_{s_{l-1},s_l})\f$
-where \f$Z\f$ is a normalization value so that the sum of \f$P(s)\f$ over
-all possible sequences is \f$1\f$, and \f$x\f$ is the emission feature weight
+denoted as $w$ here.
+5. Denote Input(Label) as $s$ here.
+
+The probability of a sequence $s$ of length $L$ is defined as:
+$$P(s) = (1/Z) \exp(a_{s_1} + b_{s_L}
++ \sum_{l=1}^L x_{s_l}
++ \sum_{l=2}^L w_{s_{l-1},s_l})$$
+
+where $Z$ is a normalization value so that the sum of $P(s)$ over
+all possible sequences is 1, and $x$ is the emission feature weight
 to the linear chain CRF.
 
 Finally, the linear chain CRF operator outputs the logarithm of the conditional
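The corrected equation can be sanity-checked numerically. The sketch below is not Paddle code; it is a minimal brute-force Python implementation of the doc-string formula, with hypothetical names (`score`, `sequence_probability`) and the normalizer $Z$ computed by enumerating every possible tag sequence rather than by the operator's forward-vector recursion:

```python
import math
from itertools import product

# Unnormalized log-score of the doc-string formula:
#   a_{s_1} + b_{s_L} + sum_l x[l][s_l] + sum_l w[s_{l-1}][s_l]
def score(x, a, b, w, s):
    total = a[s[0]] + b[s[-1]]
    total += sum(x[l][s[l]] for l in range(len(s)))
    total += sum(w[s[l - 1]][s[l]] for l in range(1, len(s)))
    return total

def sequence_probability(x, a, b, w, s):
    D, L = len(a), len(x)
    # Z sums exp(score) over all D**L tag sequences (brute force, small D and L only).
    Z = sum(math.exp(score(x, a, b, w, t)) for t in product(range(D), repeat=L))
    return math.exp(score(x, a, b, w, s)) / Z
```

With this definition of $Z$, summing `sequence_probability` over every possible tag sequence yields exactly 1, which is the normalization property the corrected comment states.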

paddle/operators/softmax_op.cc

Lines changed: 1 addition & 1 deletion
@@ -59,7 +59,7 @@ Then the ratio of the exponential of the given dimension and the sum of
 exponential values of all the other dimensions is the output of the softmax
 operator.
 
-For each row `i` and each column `j` in input X, we have:
+For each row $i$ and each column $j$ in Input(X), we have:
 $$Y[i, j] = \frac{\exp(X[i, j])}{\sum_j(exp(X[i, j])}$$
 
 )DOC");
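The softmax formula in the doc string can be exercised with a short sketch. This is not Paddle's kernel, just an illustrative row-wise Python implementation of $Y[i, j] = \exp(X[i, j]) / \sum_j \exp(X[i, j])$ (the max-subtraction is a standard numerical-stability trick, not part of the formula):

```python
import math

# Row-wise softmax matching the doc-string formula (illustrative only).
def softmax_rows(X):
    out = []
    for row in X:
        m = max(row)  # subtract the row max so exp() cannot overflow
        exps = [math.exp(v - m) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out
```

Each output row sums to 1, and a uniform input row maps to a uniform distribution.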

paddle/operators/softmax_with_cross_entropy_op.cc

Lines changed: 4 additions & 4 deletions
@@ -67,15 +67,15 @@ The equation is as follows:
 
 1) Hard label (one-hot label, so every sample has exactly one class)
 
-$$Loss_j = \f$ -\text{Logit}_{Label_j} +
+$$Loss_j = -\text{Logit}_{Label_j} +
 \log\left(\sum_{i=0}^{K}\exp(\text{Logit}_i)\right),
-j = 1, ..., K $\f$$
+j = 1,..., K$$
 
 2) Soft label (each sample can have a distribution over all classes)
 
-$$Loss_j = \f$ -\sum_{i=0}^{K}\text{Label}_i\left(\text{Logit}_i -
+$$Loss_j = -\sum_{i=0}^{K}\text{Label}_i \left(\text{Logit}_i -
 \log\left(\sum_{i=0}^{K}\exp(\text{Logit}_i)\right)\right),
-j = 1,...,K$$
+j = 1,...,K$$
 
 )DOC");
 }
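The two corrected loss formulas are related: the hard-label loss is the soft-label loss evaluated at a one-hot label distribution. The sketch below is not Paddle code; it is a minimal Python check of that relationship, using hypothetical helper names (`log_sum_exp`, `hard_label_loss`, `soft_label_loss`):

```python
import math

# log(sum_i exp(logit_i)), computed stably by factoring out the max.
def log_sum_exp(logits):
    m = max(logits)
    return m + math.log(sum(math.exp(v - m) for v in logits))

# Hard label: Loss = -Logit_{label} + log(sum_i exp(Logit_i))
def hard_label_loss(logits, label):
    return -logits[label] + log_sum_exp(logits)

# Soft label: Loss = -sum_i Label_i * (Logit_i - log(sum_k exp(Logit_k)))
def soft_label_loss(logits, soft_label):
    lse = log_sum_exp(logits)
    return -sum(p * (v - lse) for p, v in zip(soft_label, logits))
```

Feeding a one-hot distribution into `soft_label_loss` reproduces `hard_label_loss` exactly, which is why the operator can expose both modes with one underlying equation.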
