Using this in a Hierarchical Attention Network for NLP #1

@cidetraq

Hi,
Is there a way to use this layer across multiple LSTMs that feed into one model, in the style of the Hierarchical Attention Network implemented in this blog post? https://richliao.github.io/supervised/classification/2016/12/26/textclassifier-HATN/

This currently works for a single LSTM layer when I pass in shape=(None, word_vector_dimension), but how can I make it work for both the word-level LSTM and the sentence-level LSTM? The Hierarchical Attention Network uses one LSTM at the word level to encode the words in a sentence, with attention determining which words in the sentence are important. A second attention layer at the document level then determines which sentences are important out of all the sentences in the document. I can't get your code to work at both levels: when I use shape=(None, None) for the input of the "review" level (which compares multiple sentences in one document), I get

AsTensorError: ('Cannot convert (-1, None) to TensorType', <class 'tuple'>)
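To make the two-level structure described above concrete, here is a minimal pure-Python sketch of attention pooling applied first over words and then over sentences. It is an illustration under simplifying assumptions, not the blog post's or this repository's implementation: the recurrent encoders and the dense projection are left out, and the names `attention_pool`, `encode_document`, `word_context` and `sentence_context` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(vectors, context):
    """Average `vectors`, weighting each by softmax(dot(vector, context))."""
    scores = [sum(a * b for a, b in zip(v, context)) for v in vectors]
    weights = softmax(scores)
    dim = len(vectors[0])
    pooled = [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]
    return pooled, weights


def encode_document(document, word_context, sentence_context):
    """Pool words into sentence vectors, then sentences into one document vector."""
    sentence_vectors = [attention_pool(sentence, word_context)[0]
                        for sentence in document]
    return attention_pool(sentence_vectors, sentence_context)


# Sentences may differ in length; the pooling itself does not care.
document = [
    [[1.0, 0.0], [0.0, 1.0]],  # sentence with two word vectors
    [[1.0, 1.0]],              # sentence with one word vector
]
doc_vector, sentence_weights = encode_document(document, [0.0, 0.0], [0.0, 0.0])
```

The sketch shows that the attention step itself works for any number of words or sentences. Any requirement for a known size at the review level therefore has to come from how the surrounding layers handle shapes, not from the attention computation.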

For reference, here is my current code:

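from keras.layers import Input, Dense, GRU, Bidirectional, TimeDistributed
from keras.models import Model
# AttLayer is the attention layer from this repository (its import is not shown here).

# Word level: encode one sentence of 300-dimensional word vectors into one vector.
sentence_input = Input(shape=(None, 300))
l_lstm = Bidirectional(GRU(100, return_sequences=True))(sentence_input)
l_dense = TimeDistributed(Dense(200))(l_lstm)
l_att = AttLayer()(l_dense)
sentEncoder = Model(sentence_input, l_att)

# Review level: apply the sentence encoder to each of the 7 sentences in a review.
#review_input = Input(shape=(MAX_SENTS,MAX_SENT_LENGTH), dtype='int32')
review_input = Input(shape=(7, None), dtype='int32')
review_encoder = TimeDistributed(sentEncoder)(review_input)
l_lstm_sent = Bidirectional(GRU(100, return_sequences=True))(review_encoder)
l_dense_sent = TimeDistributed(Dense(200))(l_lstm_sent)
l_att_sent = AttLayer()(l_dense_sent)
preds = Dense(2, activation='softmax')(l_att_sent)
model = Model(review_input, preds)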
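The commented-out line above uses fixed MAX_SENTS and MAX_SENT_LENGTH dimensions. One possible workaround, offered only as a sketch and not as a confirmed fix for this error, is to pad every review to that fixed shape so that no dimension is left as None. The helper name `pad_document` and the padding id 0 are assumptions for illustration. The helper pads token ids; with precomputed 300-dimensional word vectors, the padding element would be a zero vector instead.

```python
def pad_document(document, max_sents, max_sent_length, pad_id=0):
    """Pad or truncate a list of token-id lists to (max_sents, max_sent_length)."""
    padded = []
    for sentence in document[:max_sents]:
        # Truncate long sentences, then right-pad short ones.
        row = list(sentence[:max_sent_length])
        row += [pad_id] * (max_sent_length - len(row))
        padded.append(row)
    # Add all-padding sentences until the review has max_sents rows.
    while len(padded) < max_sents:
        padded.append([pad_id] * max_sent_length)
    return padded


# A review with two sentences of different lengths, padded to 3 x 2.
review = pad_document([[5, 6, 7], [8]], max_sents=3, max_sent_length=2)
```

The trade-off is that long sentences are truncated and short ones carry padding. The padding would then need to be masked or otherwise ignored by the attention layer.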
