Hi,
I am wondering: is there a way to use this across multiple LSTMs fed into one model, in the Hierarchical Attention Network style implemented in this blog post? https://richliao.github.io/supervised/classification/2016/12/26/textclassifier-HATN/
This currently works for a single LSTM layer by passing in shape=(None, word_vector_dimension), but how can I make it work for both the word-level LSTM and the sentence-level LSTM? The Hierarchical Attention Network uses one LSTM at the word level to encode the words of each sentence, with an attention layer determining which words in the sentence are important, and then a second LSTM with another attention layer at the document level determining which sentences are important out of all the sentences in a document. I can't get your code to work for both levels: when I try to use shape=(None, None) for the input at the "review" level (multiple sentences in one document), I get
AsTensorError: ('Cannot convert (-1, None) to TensorType', <class 'tuple'>)
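If it helps diagnose this, I can trigger what looks like the same exception directly under the Theano backend. This is only my guess at where it originates, namely some reshape being handed the unknown dimension as a literal None:

import theano.tensor as T

x = T.tensor3('x')
# Passing a Python None as a target dimension raises
# AsTensorError: ('Cannot convert (-1, None) to TensorType', <class 'tuple'>)
y = T.reshape(x, (-1, None))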
For reference, here is my current code:
from keras.layers import Input, Dense, GRU, Bidirectional, TimeDistributed
from keras.models import Model
# AttLayer is the attention layer from this repo

# Word-level encoder: variable-length sentences of 300-d word vectors
sentence_input = Input(shape=(None, 300))
l_lstm = Bidirectional(GRU(100, return_sequences=True))(sentence_input)
l_dense = TimeDistributed(Dense(200))(l_lstm)
l_att = AttLayer()(l_dense)
sentEncoder = Model(sentence_input, l_att)

# Document-level encoder: 7 sentences per review, variable sentence length
#review_input = Input(shape=(MAX_SENTS, MAX_SENT_LENGTH), dtype='int32')
review_input = Input(shape=(7, None), dtype='int32')
review_encoder = TimeDistributed(sentEncoder)(review_input)
l_lstm_sent = Bidirectional(GRU(100, return_sequences=True))(review_encoder)
l_dense_sent = TimeDistributed(Dense(200))(l_lstm_sent)
l_att_sent = AttLayer()(l_dense_sent)
preds = Dense(2, activation='softmax')(l_att_sent)
model = Model(review_input, preds)
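For comparison, a fully padded version along the lines of the commented-out line above does build for me. This is only a sketch under two assumptions: the MAX_SENTS / MAX_SENT_LENGTH padding targets are hypothetical values, and pre-embedded 300-d word vectors are fed in directly, so the review input is a 4-D float tensor rather than the int32 index matrix used in the blog post (which embeds inside sentEncoder):

from keras.layers import Input, Dense, GRU, Bidirectional, TimeDistributed
from keras.models import Model

MAX_SENTS = 7         # hypothetical padding target: sentences per review
MAX_SENT_LENGTH = 50  # hypothetical padding target: words per sentence

# Word-level encoder over fixed-length sentences of 300-d word vectors
sentence_input = Input(shape=(MAX_SENT_LENGTH, 300))
l_lstm = Bidirectional(GRU(100, return_sequences=True))(sentence_input)
l_dense = TimeDistributed(Dense(200))(l_lstm)
l_att = AttLayer()(l_dense)
sentEncoder = Model(sentence_input, l_att)

# Document-level encoder: sentEncoder applied to each sentence in turn
review_input = Input(shape=(MAX_SENTS, MAX_SENT_LENGTH, 300))
review_encoder = TimeDistributed(sentEncoder)(review_input)
l_lstm_sent = Bidirectional(GRU(100, return_sequences=True))(review_encoder)
l_dense_sent = TimeDistributed(Dense(200))(l_lstm_sent)
l_att_sent = AttLayer()(l_dense_sent)
preds = Dense(2, activation='softmax')(l_att_sent)
model = Model(review_input, preds)

What I would like is to avoid that padding at the sentence level, so any pointer on making the second None dimension work would be much appreciated.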