     'batch_norm', 'beam_search_decode', 'conv2d_transpose', 'sequence_expand',
     'lstm_unit', 'reduce_sum', 'reduce_mean', 'reduce_max', 'reduce_min',
     'sequence_first_step', 'sequence_last_step', 'dropout', 'split',
-    'l2_normalize', 'matmul', 'warpctc'
+    'l2_normalize', 'matmul', 'warpctc', 'sequence_reshape'
 ]
@@ -213,33 +213,33 @@ def dynamic_lstm(input,
     (https://arxiv.org/pdf/1402.1128.pdf), the formula is as follows:
 
     .. math::
-
-        i_t & = \sigma(W_{ix}x_{t} + W_{ih}h_{t-1} + W_{ic}c_{t-1} + b_i)
 
-        f_t & = \sigma(W_{fx}x_{t} + W_{fh}h_{t-1} + W_{fc}c_{t-1} + b_f)
+        i_t & = \sigma(W_{ix}x_{t} + W_{ih}h_{t-1} + W_{ic}c_{t-1} + b_i)
 
-        \\tilde{c_t} & = act_g(W_{cx}x_t + W_{ch}h_{t-1} + b_c)
+        f_t & = \sigma(W_{fx}x_{t} + W_{fh}h_{t-1} + W_{fc}c_{t-1} + b_f)
 
-        o_t & = \sigma(W_{ox}x_{t} + W_{oh}h_{t-1} + W_{oc}c_t + b_o)
+        \\tilde{c_t} & = act_g(W_{cx}x_t + W_{ch}h_{t-1} + b_c)
 
-        c_t & = f_t \odot c_{t-1} + i_t \odot \\tilde{c_t}
+        o_t & = \sigma(W_{ox}x_{t} + W_{oh}h_{t-1} + W_{oc}c_t + b_o)
+
+        c_t & = f_t \odot c_{t-1} + i_t \odot \\tilde{c_t}
 
         h_t & = o_t \odot act_h(c_t)
 
-    where the :math:`W` terms denote weight matrices (e.g. :math:`W_{xi}` is
+    where the :math:`W` terms denote weight matrices (e.g. :math:`W_{xi}` is
     the matrix of weights from the input gate to the input), :math:`W_{ic}, \
-    W_{fc}, W_{oc}` are diagonal weight matrices for peephole connections. In
-    our implementation, we use vectors to reprenset these diagonal weight
-    matrices. The :math:`b` terms denote bias vectors (:math:`b_i` is the input
-    gate bias vector), :math:`\sigma` is the non-line activations, such as
-    logistic sigmoid function, and :math:`i, f, o` and :math:`c` are the input
-    gate, forget gate, output gate, and cell activation vectors, respectively,
+    W_{fc}, W_{oc}` are diagonal weight matrices for peephole connections. In
+    our implementation, we use vectors to represent these diagonal weight
+    matrices. The :math:`b` terms denote bias vectors (:math:`b_i` is the input
+    gate bias vector), :math:`\sigma` is the non-linear activation, such as the
+    logistic sigmoid function, and :math:`i, f, o` and :math:`c` are the input
+    gate, forget gate, output gate, and cell activation vectors, respectively,
     all of which have the same size as the cell output activation vector :math:`h`.
 
-    The :math:`\odot` is the element-wise product of the vectors. :math:`act_g`
-    and :math:`act_h` are the cell input and cell output activation functions
-    and `tanh` is usually used for them. :math:`\\tilde{c_t}` is also called
-    candidate hidden state, which is computed based on the current input and
+    The :math:`\odot` is the element-wise product of the vectors. :math:`act_g`
+    and :math:`act_h` are the cell input and cell output activation functions,
+    and `tanh` is usually used for them. :math:`\\tilde{c_t}` is also called the
+    candidate hidden state, which is computed based on the current input and
     the previous hidden state.
 
     Set `use_peepholes` to `False` to disable peephole connection. The formula
@@ -251,38 +251,38 @@ def dynamic_lstm(input,
     Users can choose to use fully-connect layer before LSTM layer.
 
     Args:
-        input(Variable): The input of dynamic_lstm layer, which supports
-                         variable-time length input sequence. The underlying
-                         tensor in this Variable is a matrix with shape
-                         (T X 4D), where T is the total time steps in this
+        input(Variable): The input of dynamic_lstm layer, which supports
+                         variable-time length input sequence. The underlying
+                         tensor in this Variable is a matrix with shape
+                         (T X 4D), where T is the total time steps in this
                          mini-batch, D is the hidden size.
         size(int): 4 * hidden size.
-        param_attr(ParamAttr): The parameter attribute for the learnable
-                               hidden-hidden weights.
+        param_attr(ParamAttr): The parameter attribute for the learnable
+                               hidden-hidden weights.
 
-                               - The shape is (D x 4D), where D is the hidden
-                                 size.
+                               - The shape is (D x 4D), where D is the hidden
+                                 size.
                                - Weights = {:math:`W_{ch}, W_{ih}, \
                                  W_{fh}, W_{oh}`}
         bias_attr(ParamAttr): The bias attribute for the learnable bias
-                              weights, which contains two parts, input-hidden
-                              bias weights and peephole connections weights if
-                              setting `use_peepholes` to `True`.
+                              weights, which contains two parts: input-hidden
+                              bias weights and peephole connection weights if
+                              setting `use_peepholes` to `True`.
 
-                              1. `use_peepholes = False`
-                                 - The shape is (1 x 4D).
+                              1. `use_peepholes = False`
+                                 - The shape is (1 x 4D).
                                  - Biases = {:math:`b_c, b_i, b_f, b_o`}.
-                              2. `use_peepholes = True`
-                                 - The shape is (1 x 7D).
+                              2. `use_peepholes = True`
+                                 - The shape is (1 x 7D).
                                  - Biases = {:math:`b_c, b_i, b_f, b_o, W_{ic}, \
                                    W_{fc}, W_{oc}`}.
-        use_peepholes(bool): Whether to enable diagonal/peephole connections,
+        use_peepholes(bool): Whether to enable diagonal/peephole connections,
                              default `True`.
         is_reverse(bool): Whether to compute reversed LSTM, default `False`.
-        gate_activation(str): The activation for input gate, forget gate and
-                              output gate. Choices = ["sigmoid", "tanh", "relu",
+        gate_activation(str): The activation for input gate, forget gate and
+                              output gate. Choices = ["sigmoid", "tanh", "relu",
                               "identity"], default "sigmoid".
-        cell_activation(str): The activation for cell output. Choices = ["sigmoid",
+        cell_activation(str): The activation for cell output. Choices = ["sigmoid",
                               "tanh", "relu", "identity"], default "tanh".
         candidate_activation(str): The activation for candidate hidden state.
                                    Choices = ["sigmoid", "tanh", "relu", "identity"],
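
As a reference for reviewers, the peephole formulas in the docstring above can be sketched in NumPy. This is an illustrative sketch only: the gate ordering `[i, f, c~, o]`, the `tanh` defaults for `act_g`/`act_h`, and all parameter names are assumptions for the sketch, not necessarily the layout `dynamic_lstm` uses internally.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W_x, W_h, w_ic, w_fc, w_oc, b):
    """One peephole-LSTM step following the docstring formulas.

    W_x: (D_in, 4D) input-to-hidden weights, gate order [i, f, c~, o] (assumed)
    W_h: (D, 4D) hidden-to-hidden weights
    w_ic, w_fc, w_oc: (D,) peephole vectors (diagonal matrices stored as vectors)
    b: (4D,) gate biases
    """
    z = x_t @ W_x + h_prev @ W_h + b
    z_i, z_f, z_c, z_o = np.split(z, 4, axis=-1)
    i_t = sigmoid(z_i + w_ic * c_prev)      # input gate, peeks at c_{t-1}
    f_t = sigmoid(z_f + w_fc * c_prev)      # forget gate, peeks at c_{t-1}
    c_tilde = np.tanh(z_c)                  # candidate cell state, act_g = tanh
    c_t = f_t * c_prev + i_t * c_tilde      # new cell state
    o_t = sigmoid(z_o + w_oc * c_t)         # output gate, peeks at c_t
    h_t = o_t * np.tanh(c_t)                # hidden state, act_h = tanh
    return h_t, c_t
```

Setting `use_peepholes=False` corresponds to zeroing `w_ic`, `w_fc`, `w_oc` in this sketch.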
@@ -1914,3 +1914,57 @@ def warpctc(input, label, blank=0, norm_by_times=False, **kwargs):
         attrs={'blank': blank,
                'norm_by_times': norm_by_times})
     return loss_out
+
+
+def sequence_reshape(input, new_dim):
+    """
+    **Sequence Reshape Layer**
+
+    This layer will rearrange the input sequences. The new dimension is set by
+    the user. The length of each sequence is computed from the original length,
+    the original dimension and the new dimension. The following example
+    illustrates the function of this layer:
+
+    .. code-block:: text
+
+        x is a LoDTensor:
+            x.lod  = [[0, 2, 6]]
+            x.data = [[1, 2], [3, 4],
+                      [5, 6], [7, 8], [9, 10], [11, 12]]
+            x.dims = [6, 2]
+
+        set new_dim = 4
+
+        then out is a LoDTensor:
+            out.lod  = [[0, 1, 3]]
+            out.data = [[1, 2, 3, 4],
+                        [5, 6, 7, 8], [9, 10, 11, 12]]
+            out.dims = [3, 4]
+
+    Currently, only 1-level LoDTensor is supported. Please make sure that
+    (original length * original dimension) is divisible by the new dimension
+    for each sequence.
+
+    Args:
+        input (Variable): (LoDTensor, default: LoDTensor<float>), a 2-D LoDTensor
+                          with shape [N, M], where M is the dimension.
+        new_dim (int): New dimension which the input LoDTensor is reshaped to.
+
+    Returns:
+        Variable: Reshaped LoDTensor according to new dimension.
+
+    Examples:
+        .. code-block:: python
+
+            x = fluid.layers.data(name='x', shape=[5, 20],
+                                  dtype='float32', lod_level=1)
+            x_reshaped = fluid.layers.sequence_reshape(input=x, new_dim=10)
+    """
+    helper = LayerHelper('sequence_reshape', **locals())
+    out = helper.create_tmp_variable(helper.input_dtype())
+    helper.append_op(
+        type='sequence_reshape',
+        inputs={'X': [input]},
+        outputs={'Out': [out]},
+        attrs={'new_dim': new_dim})
+    return out
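
The LoD arithmetic documented in the new layer can be checked against a small pure-Python reference. The helper name `sequence_reshape_ref` is hypothetical; this only mirrors the documented semantics on nested lists, not the actual operator kernel.

```python
def sequence_reshape_ref(data, lod, new_dim):
    """Reference semantics of sequence_reshape (illustrative only).

    data: list of rows, each of length old_dim
    lod: level-0 offsets, e.g. [0, 2, 6]
    Returns (new_data, new_lod) with rows of length new_dim.
    """
    old_dim = len(data[0])
    # Flatten all rows, then regroup into rows of new_dim; sequence
    # boundaries are rescaled so each sequence keeps its total element count.
    flat = [v for row in data for v in row]
    assert len(flat) % new_dim == 0, "total elements must divide evenly"
    new_data = [flat[i:i + new_dim] for i in range(0, len(flat), new_dim)]
    new_lod = []
    for off in lod:
        n_elems = off * old_dim
        assert n_elems % new_dim == 0, "each sequence must divide evenly"
        new_lod.append(n_elems // new_dim)
    return new_data, new_lod
```

Running it on the docstring's example (`lod = [0, 2, 6]`, `old_dim = 2`, `new_dim = 4`) reproduces `out.lod = [[0, 1, 3]]` and `out.dims = [3, 4]`.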