
Sentiment Analysis of Twitter feeds using Spark MLlib and Spark SQL, deployed on Azure Databricks and Azure SQL Database Warehouse


PankajB1997/Twitter_Sentiment_Analysis


Twitter Sentiment Analysis

CS4225 Project

Total tweets added: 31,130,257 (3.01 GB)

The dataset is consolidated from multiple sources, including an online Twitter archive, the Sentiment140 dataset on Kaggle, and around 40,000 tweets that we downloaded manually using the Twitter API.

To consolidate the entire dataset into one large CSV file, simply clone this repo and run python prepare_tweets_csv.py

To load the large CSV file in another script after running the Python script (the "rU" open mode is no longer accepted by recent Python versions, so pass the path directly):
import pandas
tweets_csv = pandas.read_csv("tweets_large_32M.csv", encoding="utf-8", header=None, index_col=None)
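Since the consolidated file is around 3 GB, loading it in one call may not fit in memory on every machine. A minimal sketch of reading it in pieces follows, using the `chunksize` parameter of `pandas.read_csv`. The function name `count_rows_in_chunks` and the default chunk size are illustrative assumptions, not part of this repo.

```python
import pandas


def count_rows_in_chunks(csv_path, chunk_rows=1_000_000):
    """Count the rows of a headerless CSV without loading it all at once.

    Hypothetical helper for illustration; chunk_rows is an arbitrary default.
    """
    total = 0
    # With chunksize set, read_csv returns an iterator of DataFrames,
    # each holding at most chunk_rows rows.
    reader = pandas.read_csv(
        csv_path,
        encoding="utf-8",
        header=None,
        index_col=None,
        chunksize=chunk_rows,
    )
    for chunk in reader:
        total += len(chunk)
    return total
```

Calling `count_rows_in_chunks("tweets_large_32M.csv")` would walk the whole file one chunk at a time. The same loop can be adapted to filter or sample each chunk instead of counting rows.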
