wicky818/MSDS453

MSDS453

Wine NLP Project

Contributors: Alexander Wang, Ruchi Kumar, and Wicky Woo

Introduction:

The number of wines to choose from keeps growing. Finding a wine that meets your standards is difficult and may take a lifetime of tasting. To better serve the average wine drinker, who does not have unlimited time and money to sample wines, this project focuses on clustering and classifying wines by type and flavor profile.

Reviews are subjective, and different reviewers will likely hold different opinions. Many reviews are also mislabeled or refer to the wrong item. The data set chosen consists of sommelier reviews of wines from across the world. Each review's description will be vectorized with TF-IDF, and the resulting vectors will be used to cluster the reviews. To give potential wine customers better insight, the reviews will be clustered and classified by wine type and country. A neural network can then be trained to predict the type of wine and the region from the description. With this, people who like certain types of wine will be able to find similar wines easily. The application is especially valuable when labels are missing, and it could be extended into a recommender system using semantic analysis and wine scoring.

Natural Language Processing (NLP) encompasses a large number of techniques for the computational processing of natural language, such as text or speech. The field grew out of linguistics as growing data and computing power made it possible to process this information automatically. The two main phases of NLP are preprocessing and algorithm development. Several experiments will show how varying the vectorization technique, topic modeling, and other hyperparameters affects results, which will allow an evaluation of how NLP algorithms perform under different conditions. Additionally, neural networks will be evaluated on their effectiveness and accuracy in predicting labels from the text of the wine reviews.
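
The vectorize-then-cluster step described above can be illustrated with a minimal sketch. It is not the project's actual code: the four toy reviews, the whitespace tokenizer, the helper names (`tfidf_vectors`, `kmeans`), and the choice of seed reviews are all assumptions made for illustration, and the clustering shown is a small cosine-based k-means written with the standard library only.

```python
import math
from collections import Counter

# Hypothetical toy reviews; the real data set is not reproduced here.
reviews = [
    "ripe cherry and plum with soft tannins",
    "dark cherry plum and firm tannins",
    "crisp citrus and green apple with bright acidity",
    "zesty citrus lemon and apple with crisp acidity",
]

def tokenize(text):
    # Assumption: lowercase whitespace splitting stands in for real preprocessing.
    return text.lower().split()

def tfidf_vectors(docs):
    """Return (vocabulary, unit-length TF-IDF vectors) for a list of strings."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for doc in tokenized for t in doc})
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(t for doc in tokenized for t in set(doc))
    # Smoothed inverse document frequency.
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1 for t in vocab}
    vectors = []
    for doc in tokenized:
        tf = Counter(doc)
        vec = [tf[t] * idf[t] for t in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vocab, vectors

def cosine(a, b):
    # Inputs are unit length, so the dot product is the cosine similarity.
    return sum(x * y for x, y in zip(a, b))

def kmeans(vectors, k, seeds, iters=10):
    """Tiny cosine-based k-means; `seeds` are indices of the initial centroids."""
    centroids = [vectors[i][:] for i in seeds]
    labels = [0] * len(vectors)
    for _ in range(iters):
        # Assign each vector to its most similar centroid.
        labels = [max(range(k), key=lambda c: cosine(v, centroids[c]))
                  for v in vectors]
        # Recompute each centroid as the normalized mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                mean = [sum(col) / len(members) for col in zip(*members)]
                norm = math.sqrt(sum(x * x for x in mean)) or 1.0
                centroids[c] = [x / norm for x in mean]
    return labels

vocab, vectors = tfidf_vectors(reviews)
labels = kmeans(vectors, k=2, seeds=[0, 2])
print(labels)
```

In this toy example the two red-fruit reviews share one label and the two citrus reviews share the other. Fixed seed indices keep the run deterministic; a real experiment would more likely use a library implementation with random or k-means++ initialization.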

Description:

This repo contains the source code for the unsupervised learning methods used to analyze wine reviews and for the supervised learning methods used to classify each wine's type and country of origin.
