.. ---------------------------------------------------------------------------
.. Copyright 2017-2018 Intel Corporation
..
.. Licensed under the Apache License, Version 2.0 (the "License");
.. you may not use this file except in compliance with the License.
.. You may obtain a copy of the License at
..
..      http://www.apache.org/licenses/LICENSE-2.0
..
.. Unless required by applicable law or agreed to in writing, software
.. distributed under the License is distributed on an "AS IS" BASIS,
.. WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
.. See the License for the specific language governing permissions and
.. limitations under the License.
.. ---------------------------------------------------------------------------

Supervised Sentiment
####################

Overview
========

This is a set of example supervised models for sentiment analysis.
The broader goal of these models is to allow ensembling with other supervised or unsupervised models.

Files
=====

- **nlp_architect/models/supervised_sentiment.py**: Sentiment analysis models, currently an LSTM and a one-hot CNN
- **nlp_architect/data/amazon_reviews.py**: Code which downloads and processes the Amazon review datasets described below
- **nlp_architect/utils/ensembler.py**: Contains the ensembling algorithm(s)
- **example_ensemble.py**: An example of how the sentiment models can be trained and ensembled
- **optimize_example.py**: An example of using a hyperparameter optimizer with the simple LSTM model


Models
======
Two models are shown as classification examples. Additional models can be added as desired.

Bi-directional LSTM
-------------------
A simple bidirectional LSTM with one fully connected layer. The number of vocabulary features, the dense output size, and the document input length should be determined during the data preprocessing steps. The user can then set the size of the LSTM hidden layer and the recurrent dropout rate.

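As a rough illustration (the function and parameter names below are placeholders, not the repo's actual API), such a model might be sketched in Keras along these lines:

```python
from tensorflow.keras import Input, Model
from tensorflow.keras.layers import Bidirectional, Dense, Embedding, LSTM

def build_bilstm(vocab_size=10000, max_len=300, embed_dim=128,
                 lstm_units=64, recurrent_dropout=0.2, n_classes=3):
    # vocab_size and max_len come from the preprocessing step;
    # lstm_units and recurrent_dropout are the user-tunable knobs.
    inputs = Input(shape=(max_len,))
    x = Embedding(vocab_size, embed_dim)(inputs)
    x = Bidirectional(LSTM(lstm_units, recurrent_dropout=recurrent_dropout))(x)
    outputs = Dense(n_classes, activation="softmax")(x)  # the fully connected layer
    model = Model(inputs, outputs)
    model.compile(optimizer="adam", loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

The softmax output width matches the three sentiment classes produced by the data preprocessing described below.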
Temporal CNN
------------
As defined in "Text Understanding from Scratch" (Zhang and LeCun, 2015, https://arxiv.org/pdf/1502.01710v4.pdf), this model is a series of 1D CNNs with max-pooling and fully connected layers. The frame sizes may be either large or small.


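A hedged sketch of that architecture in Keras follows; the kernel sizes and pooling pattern follow the paper, but the exact defaults here (alphabet size, input length, class count) are illustrative assumptions, not this repo's configuration:

```python
from tensorflow.keras import Input, Model
from tensorflow.keras.layers import Conv1D, Dense, Dropout, Flatten, MaxPooling1D

def build_temporal_cnn(max_len=1014, alphabet_size=69, frame=256, n_classes=3):
    # Input is a one-hot character encoding, as in Zhang & LeCun (2015).
    # frame=256 is the "small" frame size; 1024 would be the "large" one.
    inputs = Input(shape=(max_len, alphabet_size))
    x = inputs
    # Six 1D convolutions; max-pooling after the first, second, and last.
    for kernel, pool in [(7, 3), (7, 3), (3, None), (3, None), (3, None), (3, 3)]:
        x = Conv1D(frame, kernel, activation="relu")(x)
        if pool:
            x = MaxPooling1D(pool)(x)
    x = Flatten()(x)
    x = Dense(1024, activation="relu")(x)
    x = Dropout(0.5)(x)
    outputs = Dense(n_classes, activation="softmax")(x)
    return Model(inputs, outputs)
```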
Datasets
========
The dataset in this example is the Amazon Reviews dataset, though other datasets can easily be substituted.
The Amazon review dataset(s) should be downloaded from http://jmcauley.ucsd.edu/data/amazon/. These are ``*.json.gzip`` files which should be unzipped. The terms and conditions of the dataset license apply; Intel does not grant any rights to the data files.
For best results, a medium-sized dataset should be chosen, though the algorithms will work on larger and smaller datasets as well. For experimentation we chose the Movies and TV reviews.
Only the "overall", "reviewText", and "summary" columns of the review dataset are retained. "overall" is the overall rating in stars; it is transformed into a sentiment label where 4-5 stars is a positive review, 3 is neutral, and 1-2 stars is a negative review.
The "summary" (title) of the review is concatenated with the review text and subsequently cleaned.

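The transformation described above amounts to a simple mapping; the sketch below is illustrative (the actual column handling lives in ``amazon_reviews.py``, and the cleaning step is omitted):

```python
def rating_to_sentiment(overall):
    """Map the "overall" star rating to a coarse sentiment label:
    4-5 stars -> positive, 3 -> neutral, 1-2 -> negative."""
    if overall >= 4:
        return "positive"
    if overall == 3:
        return "neutral"
    return "negative"

def build_review_text(summary, review_text):
    # The review title ("summary") is concatenated with the body
    # before cleaning; the cleaning itself is not shown here.
    return "{} {}".format(summary, review_text)
```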
The Amazon Review Dataset was published in the following papers:

- "Ups and Downs: Modeling the Visual Evolution of Fashion Trends with One-Class Collaborative Filtering", R. He and J. McAuley, WWW, 2016. http://cseweb.ucsd.edu/~jmcauley/pdfs/www16a.pdf
- "Image-Based Recommendations on Styles and Substitutes", J. McAuley, C. Targett, J. Shi, and A. van den Hengel, SIGIR, 2015. http://cseweb.ucsd.edu/~jmcauley/pdfs/sigir15.pdf


Running Modalities
==================

Ensemble Train/Test
-------------------
Currently, the pipeline shows a full train/test/ensemble cycle. The main pipeline can be run with the following command:

```
python example_ensemble.py --file_path ./reviews_Movies_and_TV.json/
```

At the conclusion of training, a final confusion matrix will be displayed.

Hyperparameter optimization
---------------------------
An example of hyperparameter optimization is given using the Python package ``hyperopt``, which uses a Tree-structured Parzen Estimator (TPE) to optimize the simple bi-LSTM algorithm. To run this example, use the following command:

```
python optimize_example.py --file_path ./reviews_Movies_and_TV.json/ --new_trials 50 --output_file ./data/optimize_output.pkl
```

The script will output the result of each trial attempt to the specified pickle file.