This project is about using NLP and Machine Learning to detect whether the word 'mouse' in a sentence is referring to an animal mouse or a computer mouse.

Max-Sanii/NLP-Text-Classification

NLP: Animal Mouse vs Computer Mouse

This project was implemented as a side project during my graduate program at the University of San Diego (USD).

  • Project Status: Completed

Project Description

This project classifies whether the word 'mouse' in a sentence refers to the animal or to the computer device. The dataset consists of separate texts, each containing the word 'mouse'. The texts are first preprocessed with NLP techniques such as tokenization, lemmatization, and stop word removal. They are then converted into feature vectors using three different text vectorization approaches (Count Vectorizer, TF-IDF, and N-gram level). The feature vectors are fed into two classifiers (Logistic Regression and Naive Bayes), and the trained models are used to classify the validation texts. Finally, ensemble models combine the three best-performing models to improve performance.
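The pipeline described above can be sketched roughly as follows with scikit-learn. This is a minimal illustration, not the notebook's actual code: the toy sentences are hypothetical (not taken from the Kaggle dataset), the three vectorizer/classifier pairings are assumed rather than the ones that performed best, lemmatization is omitted, and scikit-learn's built-in English stop word list stands in for the NLTK preprocessing.

```python
from sklearn.ensemble import VotingClassifier
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical toy sentences, NOT taken from the Kaggle dataset.
texts = [
    "The mouse ran across the kitchen floor and hid behind the fridge.",
    "A field mouse eats seeds and builds a nest in the grass.",
    "The cat chased the mouse into a hole in the wall.",
    "Plug the wireless mouse into the USB port of your laptop.",
    "Click the left mouse button to select the file on the screen.",
    "My gaming mouse has an optical sensor and extra buttons.",
]
labels = ["animal", "animal", "animal", "computer", "computer", "computer"]

# Three assumed vectorizer + classifier pairings (count vectors, TF-IDF,
# and TF-IDF over unigrams and bigrams). Each pipeline tokenizes the raw
# text and drops English stop words before fitting the classifier.
candidates = [
    ("count_lr", make_pipeline(
        CountVectorizer(stop_words="english"),
        LogisticRegression(max_iter=1000))),
    ("tfidf_nb", make_pipeline(
        TfidfVectorizer(stop_words="english"),
        MultinomialNB())),
    ("ngram_lr", make_pipeline(
        TfidfVectorizer(stop_words="english", ngram_range=(1, 2)),
        LogisticRegression(max_iter=1000))),
]

# Hard voting: each of the three models casts one vote per text.
ensemble = VotingClassifier(estimators=candidates, voting="hard")
ensemble.fit(texts, labels)

new_texts = [
    "The mouse nibbled the cheese.",
    "Move the mouse cursor to the menu.",
]
predictions = ensemble.predict(new_texts)
print(list(predictions))
```

With real data, each candidate would first be scored on a held-out validation split, and only the three best-scoring models would be passed to the ensemble.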

Methods Used

  • NLP
  • Word Embedding
  • Text Classification
  • Machine Learning
  • Inferential Statistics

Technologies

  • Python
  • NLTK
  • NumPy
  • SciPy
  • Scikit-learn
  • Pandas
  • Matplotlib

Instructions:

Clone the repository, or download and extract the zip file. Then open the Jupyter notebook file (.ipynb) from the extracted folder, either in Jupyter on your local computer or in a cloud notebook environment, and run each cell in order.

Data source:

https://www.kaggle.com/werty12121/animal-mouse-vs-computer-mouse-text-dataset#animal.csv
