Hi John,
Thanks for sharing. I have a question about the word embeddings. Correct me if I'm wrong, but I noticed that the embedding matrix created here only contains words that appear in the training/test set. Wouldn't an embedding matrix covering the full GloVe vocabulary be better? For example, if in production we encounter a word that never appeared in the training/test set but is part of the GloVe vocabulary, we could still capture its meaning even though we never saw it during training. I think this would especially benefit sentiment classification problems with smaller training sets?
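To make the question concrete, here is a toy sketch contrasting the two choices. The three-word "GloVe" dictionary and the word "great" are made up for illustration; a real GloVe file has hundreds of thousands of entries:

```python
import numpy as np

# Toy stand-in for a GloVe file: word -> pretrained vector.
glove = {
    "good":  np.array([0.8, 0.1]),
    "bad":   np.array([-0.7, 0.2]),
    "great": np.array([0.9, 0.1]),  # in GloVe, but absent from training data
}

train_vocab = ["good", "bad"]  # words seen in the training/test set

# Option A: embedding matrix restricted to the training vocab.
index_a = {w: i for i, w in enumerate(train_vocab)}

# Option B: embedding matrix over the full GloVe vocab.
index_b = {w: i for i, w in enumerate(glove)}
matrix_b = np.stack([glove[w] for w in index_b])

word = "great"  # encountered only in production
print(word in index_a)  # False -> falls back to an unknown/OOV token
print(word in index_b)  # True  -> still gets its pretrained vector
```

Under Option A the production word is treated as unknown, while Option B keeps its pretrained vector at the cost of a much larger embedding matrix.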
Thanks!
Regards
Xiaohong