Sentiment Analysis on Twitter Data
Project Description
This project aims to analyze sentiments in Twitter data using Natural Language Processing (NLP) and Machine Learning techniques. The dataset consists of tweets labeled with their respective sentiments (e.g., Positive, Negative, Neutral). The goal is to preprocess the text, train a machine learning model to classify sentiments, and predict the sentiment of new user-input statements.
Objectives
Data Preprocessing: Load and clean the dataset.Handle missing values and inconsistencies.Convert text data to lowercase and remove special characters.
Feature Extraction: Convert textual data into numerical format using the CountVectorizer.
Model Training & Evaluation: Train a Logistic Regression model for sentiment classification.Split data into training and testing sets.Evaluate the model using accuracy, precision, recall, and F1-score.
Prediction on New Data: Accept new user-input statements.Predict their sentiment using the trained model.
Expected Outcome
A trained sentiment analysis model capable of accurately classifying tweets into Positive, Negative, or Neutral sentiments.
A system that can take any new text input and predict its sentiment.