@@ -32,19 +32,19 @@ class LinearChainCRFOpMaker : public framework::OpProtoAndCheckerMaker {
  "[(D + 2) x D]. The learnable parameter for the linear_chain_crf "
  "operator. See more details in the operator's comments.");
  AddInput("Label",
- "(LoDTensor, default LoDTensor<int>) A LoDTensor with shape "
+ "(LoDTensor, default LoDTensor<int64_t>) A LoDTensor with shape "
  "[N x 1], where N is the total element number in a mini-batch. "
  "The ground truth.");
  AddOutput(
  "Alpha",
  "(Tensor, default Tensor<float>) A 2-D Tensor with shape [N x D]. "
- "The forward vectors for the entire batch. Denote it as \f$\alpha\f$. "
- "\f$\alpha$\f is a memo table used to calculate the normalization "
- "factor in CRF. \f$\alpha[k, v]$\f stores the unnormalized "
+ "The forward vectors for the entire batch. Denote it as $\alpha$. "
+ "$\alpha$ is a memo table used to calculate the normalization "
+ "factor in CRF. $\alpha[k, v]$ stores the unnormalized "
  "probabilites of all possible unfinished sequences of tags that end at "
- "position \f$k$\f with tag \f$v$\f. For each \f$k$\f, "
- "\f$\alpha[k, v]$\f is a vector of length \f$D$\f with a component for "
- "each tag value \f$v$\f. This vector is called a forward vecotr and "
+ "position $k$ with tag $v$. For each $k$, "
+ "$\alpha[k, v]$ is a vector of length $D$ with a component for "
+ "each tag value $v$. This vector is called a forward vecotr and "
  "will also be used in backward computations.")
  .AsIntermediate();
  AddOutput(
@@ -73,9 +73,9 @@ LinearChainCRF Operator.

  Conditional Random Field defines an undirected probabilistic graph with nodes
  denoting random variables and edges denoting dependencies between these
- variables. CRF learns the conditional probability \f$P(Y|X)\f$, where
- \f$X = (x_1, x_2, ... , x_n)\f$ are structured inputs and
- \f$Y = (y_1, y_2, ... , y_n)\f$ are labels for the inputs.
+ variables. CRF learns the conditional probability $P(Y|X)$, where
+ $X = (x_1, x_2, ... , x_n)$ are structured inputs and
+ $Y = (y_1, y_2, ... , y_n)$ are labels for the inputs.

  Linear chain CRF is a special case of CRF that is useful for sequence labeling
  task. Sequence labeling tasks do not assume a lot of conditional
@@ -88,21 +88,22 @@ CRF. Please refer to http://www.cs.columbia.edu/~mcollins/fb.pdf and
  http://cseweb.ucsd.edu/~elkan/250Bwinter2012/loglinearCRFs.pdf for details.

  Equation:
- 1. Denote Input(Emission) to this operator as \f$x\f$ here.
+ 1. Denote Input(Emission) to this operator as $x$ here.
  2. The first D values of Input(Transition) to this operator are for starting
- weights, denoted as \f$a\f$ here.
+ weights, denoted as $a$ here.
  3. The next D values of Input(Transition) of this operator are for ending
- weights, denoted as \f$b\f$ here.
+ weights, denoted as $b$ here.
  4. The remaning values of Input(Transition) are for transition weights,
- denoted as \f$w\f$ here.
- 5. Denote Input(Label) as \f$s\f$ here.
-
- The probability of a sequence \f$s\f$ of length \f$L\f$ is defined as:
- \f$P(s) = (1/Z) \exp(a_{s_1} + b_{s_L}
- + \sum_{l=1}^L x_{s_l}
- + \sum_{l=2}^L w_{s_{l-1},s_l})\f$
- where \f$Z\f$ is a normalization value so that the sum of \f$P(s)\f$ over
- all possible sequences is \f$1\f$, and \f$x\f$ is the emission feature weight
+ denoted as $w$ here.
+ 5. Denote Input(Label) as $s$ here.
+
+ The probability of a sequence $s$ of length $L$ is defined as:
+ $$P(s) = (1/Z) \exp(a_{s_1} + b_{s_L}
+ + \sum_{l=1}^L x_{s_l}
+ + \sum_{l=2}^L w_{s_{l-1},s_l})$$
+
+ where $Z$ is a normalization value so that the sum of $P(s)$ over
+ all possible sequences is 1, and $x$ is the emission feature weight
  to the linear chain CRF.

  Finally, the linear chain CRF operator outputs the logarithm of the conditional
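For readers of the doc strings in this diff, the definition of $P(s)$ and the role of the $\alpha$ table can be sketched in a few lines of C++. This is a minimal illustration for a single sequence and not the operator's implementation. The names `Matrix`, `SequenceScore`, `LogPartition` and `LogLikelihood` are hypothetical. It assumes the emission is an [L x D] matrix indexed as `x[l][tag]`, that the transition matrix follows the [(D + 2) x D] layout described above (row 0 holds $a$, row 1 holds $b$, the remaining D rows hold $w$), and it omits the numerical stabilization a production kernel would need.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper type for this sketch; not part of the operator.
using Matrix = std::vector<std::vector<double>>;

// Unnormalized log-score of one tag sequence s:
//   a_{s_1} + b_{s_L} + sum_l x_{l, s_l} + sum_{l >= 2} w_{s_{l-1}, s_l}
// Assumed layout of `transition`: row 0 = a, row 1 = b, rows 2.. = w.
double SequenceScore(const Matrix& x, const Matrix& transition,
                     const std::vector<int>& s) {
  assert(!s.empty() && s.size() == x.size());
  const std::size_t L = s.size();
  double score = transition[0][s[0]] + transition[1][s[L - 1]];
  for (std::size_t l = 0; l < L; ++l) {
    score += x[l][s[l]];
    if (l > 0) score += transition[2 + s[l - 1]][s[l]];
  }
  return score;
}

// Forward recursion. alpha[k][v] holds the unnormalized probability mass of
// all partial tag sequences that end at position k with tag v. Returns log Z.
// No rescaling is done here, so long sequences may overflow or underflow.
double LogPartition(const Matrix& x, const Matrix& transition) {
  const std::size_t L = x.size();
  const std::size_t D = x[0].size();
  Matrix alpha(L, std::vector<double>(D, 0.0));
  for (std::size_t v = 0; v < D; ++v) {
    alpha[0][v] = std::exp(transition[0][v] + x[0][v]);
  }
  for (std::size_t k = 1; k < L; ++k) {
    for (std::size_t v = 0; v < D; ++v) {
      double sum = 0.0;
      for (std::size_t u = 0; u < D; ++u) {
        sum += alpha[k - 1][u] * std::exp(transition[2 + u][v]);
      }
      alpha[k][v] = sum * std::exp(x[k][v]);
    }
  }
  double z = 0.0;
  for (std::size_t v = 0; v < D; ++v) {
    z += alpha[L - 1][v] * std::exp(transition[1][v]);
  }
  return std::log(z);
}

// log P(s) = score(s) - log Z.
double LogLikelihood(const Matrix& x, const Matrix& transition,
                     const std::vector<int>& s) {
  return SequenceScore(x, transition, s) - LogPartition(x, transition);
}
```

Calling `LogLikelihood(x, transition, s)` with an [L x D] emission matrix, a [(D + 2) x D] transition matrix and a tag sequence of length L gives $\log P(s)$. Summing `exp` of that value over all $D^L$ tag sequences should give 1, which is a convenient sanity check on small inputs.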