@@ -181,7 +181,7 @@ class LSTMOpMaker : public framework::OpProtoAndCheckerMaker {
AddComment(R"DOC(
Long-Short Term Memory (LSTM) Operator.
- The default implementation is diagonal/peephole connection
+ The default implementation is diagonal/peephole connection
(https://arxiv.org/pdf/1402.1128.pdf), the formula is as follows:
$$
@@ -198,27 +198,27 @@ c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c_t} \\
h_t = o_t \odot act_h(c_t)
$$
- where the W terms denote weight matrices (e.g. \f $W_{xi}\f $ is the matrix
- of weights from the input gate to the input), \f $W_{ic}, W_{fc}, W_{oc}\f $
+ where the W terms denote weight matrices (e.g. $W_{xi}$ is the matrix
+ of weights from the input gate to the input), $W_{ic}, W_{fc}, W_{oc}$
are diagonal weight matrices for peephole connections. In our implementation,
we use vectors to represent these diagonal weight matrices. The b terms
- denote bias vectors (\f $b_i\f $ is the input gate bias vector), \f $\sigma\f $
+ denote bias vectors ($b_i$ is the input gate bias vector), $\sigma$
is the non-linear activation, such as the logistic sigmoid function, and
- \f $i, f, o\f $ and \f$c\f $ are the input gate, forget gate, output gate,
+ $i, f, o$ and $c$ are the input gate, forget gate, output gate,
and cell activation vectors, respectively, all of which have the same size as
- the cell output activation vector \f$h\f $.
+ the cell output activation vector $h$.
- The \f $\odot\f $ is the element-wise product of the vectors. \f $act_g\f $ and \f $act_h\f $
+ The $\odot$ is the element-wise product of the vectors. $act_g$ and $act_h$
are the cell input and cell output activation functions and `tanh` is usually
- used for them. \f $\tilde{c_t}\f $ is also called candidate hidden state,
+ used for them. $\tilde{c_t}$ is also called candidate hidden state,
which is computed based on the current input and the previous hidden state.
- Set `use_peepholes` False to disable peephole connection
- (http://www.bioinf.jku.at/publications/older/2604.pdf). The formula
- is omitted here .
+ Set `use_peepholes` to False to disable peephole connections. The formula
+ is omitted here; please refer to the paper
+ http://www.bioinf.jku.at/publications/older/2604.pdf for details.
- Note that these \f $W_{xi}x_{t}, W_{xf}x_{t}, W_{xc}x_{t}, W_{xo}x_{t}\f $
- operations on the input \f $x_{t}\f $ are NOT included in this operator.
+ Note that these $W_{xi}x_{t}, W_{xf}x_{t}, W_{xc}x_{t}, W_{xo}x_{t}$
+ operations on the input $x_{t}$ are NOT included in this operator.
Users can choose to use a fully-connected operator before the LSTM operator.
)DOC");
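
For readers of this doc comment, one time step of the peephole formulation can be sketched as follows. This is a minimal illustration, not the operator's actual kernel. The hunk elides the gate equations, so the sketch assumes the forms given in the cited paper (https://arxiv.org/pdf/1402.1128.pdf). The names `GatePreact`, `Peephole`, `LstmState`, and `LstmStep` are hypothetical. The pre-activations are assumed to already contain the input projection (which the doc comment says is computed outside the operator), the recurrent projection, and the bias. The gate activation is fixed to the logistic sigmoid and `act_g`/`act_h` to `tanh`.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;

// Hypothetical: per-gate pre-activations, i.e. W_x* x_t + W_h* h_{t-1} + b_*,
// assumed to be computed before this step (e.g. by fully-connected layers).
struct GatePreact {
  Vec i, f, c, o;
};

// Diagonal peephole matrices W_ic, W_fc, W_oc stored as vectors,
// as the doc comment describes.
struct Peephole {
  Vec ic, fc, oc;
};

struct LstmState {
  Vec c, h;
};

inline double Sigmoid(double x) { return 1.0 / (1.0 + std::exp(-x)); }

// One LSTM time step; every operation is element-wise.
LstmState LstmStep(const GatePreact& z, const Peephole& w, const Vec& c_prev,
                   bool use_peepholes) {
  const std::size_t n = c_prev.size();
  LstmState s{Vec(n), Vec(n)};
  for (std::size_t k = 0; k < n; ++k) {
    // Input and forget gates peek at the previous cell state c_{t-1}.
    const double pi = use_peepholes ? w.ic[k] * c_prev[k] : 0.0;
    const double pf = use_peepholes ? w.fc[k] * c_prev[k] : 0.0;
    const double i = Sigmoid(z.i[k] + pi);
    const double f = Sigmoid(z.f[k] + pf);
    const double c_tilde = std::tanh(z.c[k]);  // candidate, act_g = tanh
    s.c[k] = f * c_prev[k] + i * c_tilde;      // c_t = f_t . c_{t-1} + i_t . c~_t
    // Output gate peeks at the updated cell state c_t.
    const double po = use_peepholes ? w.oc[k] * s.c[k] : 0.0;
    const double o = Sigmoid(z.o[k] + po);
    s.h[k] = o * std::tanh(s.c[k]);            // h_t = o_t . act_h(c_t)
  }
  return s;
}
```

Passing `use_peepholes = false` drops the three peephole terms. Under the same assumptions, that corresponds to the variant the doc comment says is obtained by disabling peephole connections.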