This project predicts a person's Myers-Briggs personality type using Natural Language Processing and Machine Learning techniques. The dataset is preprocessed to handle imbalanced data, and a Flask application is created for easy input and output.

MOAzeemKhan/MBTI-Personality-Prediction

MBTI-Personality-Prediction

The project aims to predict individuals' personality types using Natural Language Processing (NLP) techniques and Machine Learning (ML) algorithms. The models are trained and tested on the Myers-Briggs Type Indicator (MBTI) dataset, a collection of posts written by members of the PersonalityCafe forum, each labeled with the author's personality type under the MBTI framework.

The project report can be read here

Data Preprocessing

The dataset was preprocessed by performing the following steps:

  1. Converting all text to lowercase
  2. Removing URLs, mentions, special characters, and stop words
  3. Stemming and lemmatization
  4. Vectorizing the text using the Term Frequency-Inverse Document Frequency (TF-IDF) technique
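The steps above can be sketched with the standard library alone. This is a minimal illustration, not the project's actual code: the stop-word list is a tiny made-up sample, the function names `clean_post` and `tfidf` are hypothetical, the TF-IDF formula is the plain textbook variant (library implementations usually add smoothing and normalization), and stemming/lemmatization is left out because it normally relies on an NLP library.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"a", "an", "and", "the", "is", "to", "of", "in", "i", "it"}


def clean_post(text):
    """Lowercase, strip URLs/mentions/special characters, drop stop words."""
    text = text.lower()
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)  # URLs
    text = re.sub(r"@\w+", " ", text)                   # mentions
    text = re.sub(r"[^a-z\s]", " ", text)               # special characters and digits
    return [tok for tok in text.split() if tok not in STOP_WORDS]


def tfidf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


posts = [
    "I love quiet evenings! See https://example.com @friend",
    "Parties and loud evenings are the best",
]
tokens = [clean_post(p) for p in posts]
weights = tfidf(tokens)
```

A term that occurs in every post (here, "evenings") gets a weight of zero, while terms unique to one post get positive weights, which is why TF-IDF highlights the vocabulary that distinguishes one author from another.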

Handling Imbalanced Data

The MBTI dataset is imbalanced: some personality types have significantly fewer samples than others. To balance the data, undersampling, oversampling, and SMOTE (Synthetic Minority Over-sampling Technique) were used.
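As a rough illustration of the simplest of these techniques, the sketch below performs random oversampling: minority-class samples are duplicated at random until every class is as large as the biggest one. The function name `random_oversample` and the toy data are made up for this example. It is not SMOTE, which synthesizes new points by interpolating between neighbors in feature space and is usually taken from a dedicated library.

```python
import random
from collections import Counter, defaultdict


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples at random until all classes match the largest."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_label[label].append(sample)

    target = max(len(group) for group in by_label.values())
    out_samples, out_labels = [], []
    for label, group in by_label.items():
        # Draw (with replacement) as many extra samples as the class is missing.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for sample in group + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


# Toy data, invented for illustration only.
posts = ["p1", "p2", "p3", "p4", "p5"]
types = ["INFP", "INFP", "INFP", "ESTJ", "ESTJ"]
balanced_posts, balanced_types = random_oversample(posts, types)
print(Counter(balanced_types))
```

Resampling should be applied to the training split only; balancing the test split as well would distort the evaluation.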

Model Training

Six different models were trained on the preprocessed dataset:

  1. Linear SVC
  2. SVC
  3. KNN
  4. Random Forest
  5. Multinomial Naive Bayes
  6. Logistic Regression
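A comparison like this might be set up as follows, assuming scikit-learn (the README does not name the library, although the model names match scikit-learn's classes). The sketch trains three of the six models on a handful of invented posts; the data, the variable names, and the default hyperparameters are all placeholders, not the project's settings.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy data, invented for illustration only.
posts = [
    "quiet evenings reading books alone",
    "daydreaming about stories and ideas",
    "organizing the team meeting schedule",
    "leading the project and setting deadlines",
]
types = ["INFP", "INFP", "ESTJ", "ESTJ"]

candidates = {
    "Linear SVC": LinearSVC(),
    "Multinomial Naive Bayes": MultinomialNB(),
    "Logistic Regression": LogisticRegression(max_iter=1000),
}

fitted = {}
for name, clf in candidates.items():
    # Each pipeline vectorizes raw text with TF-IDF, then fits the classifier.
    model = make_pipeline(TfidfVectorizer(), clf)
    model.fit(posts, types)
    fitted[name] = model

predictions = {
    name: model.predict(["reading stories alone in the evenings"])[0]
    for name, model in fitted.items()
}
```

Wrapping the vectorizer and classifier in one pipeline keeps the TF-IDF vocabulary tied to the training data, so the same transformation is applied automatically at prediction time.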

GUI

A simple web-based graphical user interface (GUI) was built using Flask, which allows users to input a text sample and receive a predicted personality type based on the trained models.
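The general shape of such an app is sketched below. It is not the repository's `app.py`: the route, the form markup, and the `predict_type` function are hypothetical, and `predict_type` returns a fixed value where the real app would call a trained model.

```python
from flask import Flask, request

app = Flask(__name__)


def predict_type(text):
    """Hypothetical stand-in for the trained model; always returns a fixed type."""
    return "INFP"


FORM = """
<form method="post">
  <textarea name="text"></textarea>
  <button type="submit">Predict</button>
</form>
"""


@app.route("/", methods=["GET", "POST"])
def index():
    # On POST, read the submitted text and show the prediction under the form.
    if request.method == "POST":
        text = request.form.get("text", "")
        return FORM + f"<p>Predicted type: {predict_type(text)}</p>"
    return FORM
```

Calling `app.run()` at the end of the script, or using the `flask run` command, starts a local development server for trying the form in a browser.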

To run the application, first install the dependencies:

pip install -r requirements.txt

Then run the following command:

python app.py

Kaggle Notebook

A Kaggle notebook was created to provide a step-by-step guide for the project. It includes the code, visualizations, and explanations of the various techniques used.

Credits

This project was created by:

  1. Mohammed Azeem Khan
  2. Rishabh Kinhikar
  3. Omkar Iyer
  4. Achintya Kamath

The dataset used in this project was obtained from Kaggle and can be found here.

If you have any questions or feedback, feel free to open an issue or contact me at:

Email: mohdazeemkhan64@gmail.com

Thank you for checking out this project!

License

This project is licensed under the MIT License - see the LICENSE file for details.
