NLP: Animal Mouse vs Computer Mouse

This project was implemented as a side project during my graduate program at the University of San Diego (USD).

Project Status: Completed

Project Description

This project classifies whether the word 'mouse' in a sentence refers to the animal or to the computer peripheral. The dataset consists of separate texts, each containing the word 'mouse'. The texts are first preprocessed with NLP techniques: tokenization, lemmatization, and stop-word removal. They are then converted into feature vectors using three vectorization (word embedding) approaches: Count Vectorizer, TF-IDF, and N-gram level. The feature vectors are fed into two classifiers, Logistic Regression and Naive Bayes, and the trained models are used to classify the validation texts. Finally, an ensemble model combines the three best-performing models to improve performance.
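The pipeline described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code. It makes several assumptions: the toy sentences are invented here and do not come from the Kaggle dataset; scikit-learn's built-in tokenizer and English stop-word list stand in for the NLTK preprocessing, and lemmatization is omitted; "N-gram level" is assumed to mean TF-IDF over unigrams and bigrams; and hard voting is assumed for the ensemble. The names `build_candidates`, `scores`, and `top3` are hypothetical.

```python
from sklearn.ensemble import VotingClassifier
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy data invented for illustration only (NOT the Kaggle dataset).
train_texts = [
    "the mouse ran across the barn floor and hid in the hay",
    "a cat chased the mouse through the field",
    "the mouse nibbled cheese in the kitchen at night",
    "an owl caught a small mouse in the forest",
    "click the left mouse button to open the file",
    "my wireless mouse needs a new battery and usb receiver",
    "move the mouse cursor over the icon on the screen",
    "the gaming mouse has a scroll wheel and extra buttons",
]
train_labels = ["animal"] * 4 + ["computer"] * 4

val_texts = [
    "the cat saw a mouse in the field",
    "a mouse hid in the hay at night",
    "plug the mouse into the usb port",
    "the mouse cursor froze on the screen",
]
val_labels = ["animal", "animal", "computer", "computer"]


def build_candidates():
    """Return six pipelines: 3 vectorizers x 2 classifiers, keyed by name."""
    vectorizers = {
        "count": lambda: CountVectorizer(stop_words="english"),
        "tfidf": lambda: TfidfVectorizer(stop_words="english"),
        # Assumption: "N-gram level" = TF-IDF over unigrams and bigrams.
        "ngram": lambda: TfidfVectorizer(stop_words="english", ngram_range=(1, 2)),
    }
    classifiers = {
        "logreg": lambda: LogisticRegression(max_iter=1000),
        "nb": lambda: MultinomialNB(),
    }
    return {
        f"{v_name}_{c_name}": make_pipeline(make_vec(), make_clf())
        for v_name, make_vec in vectorizers.items()
        for c_name, make_clf in classifiers.items()
    }


# Train every candidate and record its accuracy on the validation texts.
candidates = build_candidates()
scores = {}
for name, model in candidates.items():
    model.fit(train_texts, train_labels)
    scores[name] = model.score(val_texts, val_labels)

# Keep the three highest-scoring candidates and combine them by majority vote.
top3 = sorted(scores, key=scores.get, reverse=True)[:3]
ensemble = VotingClassifier(
    estimators=[(name, candidates[name]) for name in top3],
    voting="hard",  # assumption: majority vote over predicted labels
)
ensemble.fit(train_texts, train_labels)

predictions = ensemble.predict(val_texts)
print("validation accuracy per model:", scores)
print("ensemble members:", top3)
print("ensemble predictions:", list(predictions))
```

Wrapping each vectorizer and classifier in a single pipeline lets the ensemble accept raw text, so each member applies its own vectorization before voting. On a dataset this small the scores are not meaningful; the sketch only shows how the pieces fit together.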

Methods Used

NLP
Word Embedding
Text Classification
Machine Learning
Inferential Statistics

Technologies

Python
NLTK
NumPy
SciPy
Scikit-learn
Pandas
Matplotlib

Instructions:

Clone the repository, or download it as a zip file and extract it. Then open the Jupyter notebook file (.ipynb) in the extracted folder, using Jupyter on your local computer or in the cloud, and run each cell in order.

Data source:

https://www.kaggle.com/werty12121/animal-mouse-vs-computer-mouse-text-dataset#animal.csv
