
tobias-mack/Big_Data_Covid_Social_Media_Analysis

 
 

Big_Data_Covid_Social_Media_Analysis

Analyzes how COVID-related social media posts correspond to worldwide COVID-19 testing and hospitalization data.

Reddit Dataset: https://socialgrep.com/datasets/the-reddit-covid-dataset

Twitter Dataset: https://www.kaggle.com/datasets/imoore/covid19-complete-twitter-dataset-daily-updates?select=dailies

Covid Tests Dataset: https://github.com/owid/covid-19-data/tree/master/public/data/testing

Hospitalizations Dataset: https://github.com/owid/covid-19-data/tree/master/public/data/hospitalizations


Dev Resources

Docker images: https://github.com/Marcel-Jan/docker-hadoop-spark


Notes

This is a repository for code that analyzes correlations between the Covid-19 Twitter dataset and world Covid statistics.

The First Analysis Approach directory contains code for the first attempt.

The Second Analysis Approach directory contains code for the second attempt. (The Spark code is in the README.)
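As a rough illustration of the kind of correlation this repository is concerned with, here is a minimal sketch in plain Python. It is not the code from either analysis directory. The function names (`daily_counts`, `pearson`, `correlate_posts_with_stat`) and the toy data are hypothetical, and the choice of Pearson correlation on daily post counts is an assumption made only for this example.

```python
from collections import Counter
from datetime import date
from math import sqrt


def daily_counts(post_dates):
    """Count how many posts fall on each calendar day."""
    return Counter(post_dates)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def correlate_posts_with_stat(post_dates, stat_by_day):
    """Correlate daily post counts with a daily statistic.

    Only days present in both inputs are compared.
    """
    counts = daily_counts(post_dates)
    shared_days = sorted(set(counts) & set(stat_by_day))
    xs = [counts[day] for day in shared_days]
    ys = [stat_by_day[day] for day in shared_days]
    return pearson(xs, ys)


# Toy data (hypothetical, not taken from any of the linked datasets).
posts = (
    [date(2020, 4, 1)] * 1
    + [date(2020, 4, 2)] * 2
    + [date(2020, 4, 3)] * 3
)
hospitalizations = {
    date(2020, 4, 1): 10,
    date(2020, 4, 2): 20,
    date(2020, 4, 3): 30,
    date(2020, 4, 4): 5,  # no posts on this day, so it is ignored
}

r = correlate_posts_with_stat(posts, hospitalizations)
print(f"Pearson r = {r:.3f}")
```

On real data the same idea would likely run on Spark or pandas rather than plain lists, and a coefficient near 1 or -1 would indicate association between the two series, not causation.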

Languages

  • Jupyter Notebook 69.7%
  • Java 30.3%