💬 TF-IDF Sentiment Analysis on Twitter Dataset

This project performs sentiment classification on Twitter data using TF-IDF vectorization and machine learning classifiers. The notebook covers data cleaning, vectorization, model training, and performance evaluation with classic models such as Logistic Regression and Naive Bayes.

📁 Project Structure

TFIDF-Sentiment-Analysis/
├── TF-IDFSentimentAnalysis.ipynb       # Main notebook
├── NLP TF-IDF Sentiment Analysis.pdf   # Project summary
├── sample_sentiment_dataset.csv        # 5K sample of original dataset
├── requirements.txt                    # Python dependencies
├── README.md                           # Project documentation
└── .gitignore                          # Git exclusion rules

🧠 Dataset

The original dataset is Sentiment140, available on Kaggle. It contains roughly 1.6M tweets, each labeled with a sentiment polarity (0 = Negative, 2 = Neutral, 4 = Positive).

⚠️ Due to GitHub file size limits, only a 5,000-row sample is included as sample_sentiment_dataset.csv for testing and demo purposes.

Each record includes:

Polarity (0/2/4)
Tweet ID
Date
Query
Username
Cleaned tweet text
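
As a minimal sketch of reading records in this layout, the snippet below parses rows with Python's standard csv module and attaches a readable sentiment label. The column names, the `read_tweets` helper, and the two demo rows are illustrative assumptions; the actual header and encoding of sample_sentiment_dataset.csv may differ, and the notebook may load the file differently.

```python
import csv
import io

# Column order follows the field list above. The names themselves are
# assumptions, not necessarily the header used in the sample file.
COLUMNS = ["polarity", "tweet_id", "date", "query", "username", "text"]
POLARITY_NAMES = {0: "negative", 2: "neutral", 4: "positive"}


def read_tweets(handle):
    """Yield one dict per row, with polarity as int plus a readable label."""
    for row in csv.DictReader(handle, fieldnames=COLUMNS):
        row["polarity"] = int(row["polarity"])
        row["sentiment"] = POLARITY_NAMES[row["polarity"]]
        yield row


# Two made-up rows stand in for the real file so the sketch runs anywhere.
demo = io.StringIO(
    '0,1,Mon Apr 06 22:19:45 PDT 2009,NO_QUERY,user_a,"so tired of this rain"\n'
    '4,2,Mon Apr 06 22:20:10 PDT 2009,NO_QUERY,user_b,"loving the new album"\n'
)
rows = list(read_tweets(demo))
print([(r["sentiment"], r["text"]) for r in rows])
```

To try it on the bundled sample, pass an open file handle instead of `demo`. If the sample file starts with a header row, skip that row or drop the `fieldnames` argument.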

🔍 Methods Used

Text cleaning (lowercasing, punctuation removal, stopword removal)
TF-IDF vectorization with TfidfVectorizer
Machine learning classifiers:
- Logistic Regression
- Naive Bayes
- SVM (optional)
Evaluation:
- Accuracy, precision, recall, F1-score
- Confusion matrix
- Visual plots
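
The steps above can be sketched end to end with scikit-learn. This is an illustration under stated assumptions, not the notebook's exact code: the `clean_tweet` helper and the toy tweets are made up, stopwords are removed with scikit-learn's built-in English list, and the model is scored on its own training data only to keep the example short.

```python
import string

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.pipeline import make_pipeline


def clean_tweet(text: str) -> str:
    """Lowercase, strip ASCII punctuation, and collapse whitespace."""
    text = text.lower().translate(str.maketrans("", "", string.punctuation))
    return " ".join(text.split())


# Made-up toy tweets (0 = negative, 4 = positive); not taken from the dataset.
texts = [
    "I love this phone, works great!",
    "Best concert ever, so happy",
    "What a wonderful sunny morning",
    "I hate waiting in traffic...",
    "Worst service ever, so angry",
    "What a terrible, awful movie",
]
labels = [4, 4, 4, 0, 0, 0]

# TF-IDF features feed a linear classifier; the custom preprocessor handles
# cleaning and the vectorizer drops English stopwords.
model = make_pipeline(
    TfidfVectorizer(preprocessor=clean_tweet, stop_words="english"),
    LogisticRegression(max_iter=1000),
)
model.fit(texts, labels)
preds = model.predict(texts)

# Rows and columns of the confusion matrix are ordered as [0, 4].
print(confusion_matrix(labels, preds, labels=[0, 4]))
print(
    classification_report(
        labels,
        preds,
        labels=[0, 4],
        target_names=["negative", "positive"],
        zero_division=0,
    )
)
```

For a real evaluation, hold out a test split (for example with `sklearn.model_selection.train_test_split`) and report metrics on that split. Swapping `LogisticRegression` for `sklearn.naive_bayes.MultinomialNB` gives the Naive Bayes variant.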

🚀 Getting Started

1. Clone the Repository

git clone https://github.com/your-username/TFIDF-Sentiment-Analysis.git
cd TFIDF-Sentiment-Analysis

2. Install Dependencies

pip install -r requirements.txt
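
As an optional sanity check after installing, a short script can confirm that the main libraries are importable. The import names below are assumptions about what requirements.txt contains, and `missing_packages` is a hypothetical helper, so adjust the list to match the actual file.

```python
from importlib.util import find_spec


def missing_packages(import_names):
    """Return the top-level import names that cannot be found."""
    return [name for name in import_names if find_spec(name) is None]


# Assumed import names; edit to match what requirements.txt really lists.
ASSUMED = ["sklearn", "pandas", "notebook"]
print("Missing:", missing_packages(ASSUMED) or "none")
```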

3. Run the Notebook

jupyter notebook TF-IDFSentimentAnalysis.ipynb

📄 License

See the LICENSE file in the repository.

👤 Author

Sanjana Shah
✨ Machine Learning & Generative AI Enthusiast
📫 Connect on LinkedIn | GitHub: @shahsanjanav

⭐ If you like this project, consider starring it on GitHub!
