In this data analysis of 10,000+ Reddit depression posts, I used Python libraries to tabulate the following information from the dataset:
- Total number of posts
- Total number of unique authors
- Average post length (measured in word count)
- Date range of the dataset
- Top 20 most important words in the posts (the selftext column contains the post text)
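
As a rough illustration of how these figures could be computed, here is a minimal sketch that uses only the Python standard library. Only the selftext column name comes from this dataset description. The author and created_utc field names, the tiny stop-word list, and the use of raw frequency as the measure of word "importance" are assumptions made for the example, and the three sample posts are invented purely for demonstration.

```python
import re
from collections import Counter
from datetime import datetime, timezone

# Tiny illustrative stop-word list (an assumption); a real analysis
# would use a much fuller one.
STOP_WORDS = {"the", "and", "for", "that", "this", "with", "was",
              "but", "have", "not", "just", "are", "you"}


def summarize_posts(posts, top_n=20):
    """Summarize posts given as dicts.

    Assumed keys: 'author', 'selftext', 'created_utc' (epoch seconds).
    """
    word_counts = Counter()
    total_words = 0
    for post in posts:
        text = post.get("selftext") or ""
        # Post length is measured as a whitespace-separated word count.
        total_words += len(text.split())
        tokens = re.findall(r"[a-z']+", text.lower())
        # "Importance" is approximated by frequency after dropping
        # stop words and very short tokens.
        word_counts.update(
            t for t in tokens if t not in STOP_WORDS and len(t) > 2
        )

    timestamps = [
        datetime.fromtimestamp(float(p["created_utc"]), tz=timezone.utc)
        for p in posts
    ]
    return {
        "total_posts": len(posts),
        "unique_authors": len({p["author"] for p in posts}),
        "avg_post_length": total_words / len(posts) if posts else 0.0,
        "date_range": (min(timestamps), max(timestamps)) if timestamps else None,
        "top_words": word_counts.most_common(top_n),
    }


# Invented sample rows, for demonstration only.
sample_posts = [
    {"author": "user_a", "selftext": "I feel tired and alone", "created_utc": 1577836800},
    {"author": "user_b", "selftext": "Feeling tired again today", "created_utc": 1580515200},
    {"author": "user_a", "selftext": "Tired of feeling alone", "created_utc": 1583020800},
]
summary = summarize_posts(sample_posts)
```

With a real CSV export, the rows could be read with `csv.DictReader` and passed to `summarize_posts` as a list. A weighting such as TF-IDF could replace the raw counts if frequency alone favors generic words.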
After tabulating these results, I generated a word cloud that visualizes the top 20 most important words.
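
The word cloud step might look something like the sketch below, assuming the top words are available as (word, count) pairs and that the third-party `wordcloud` package is installed. The function names, image size, and output path are hypothetical choices for illustration, not necessarily what was used in this analysis.

```python
def to_frequencies(top_words, limit=20):
    """Keep the `limit` highest-count (word, count) pairs as a dict."""
    ranked = sorted(top_words, key=lambda pair: pair[1], reverse=True)
    return dict(ranked[:limit])


def render_wordcloud(top_words, out_path="wordcloud.png"):
    """Draw the given (word, count) pairs as a word cloud image file."""
    # Imported lazily so the rest of the analysis runs without the package.
    from wordcloud import WordCloud  # third-party: pip install wordcloud

    frequencies = to_frequencies(top_words)
    cloud = WordCloud(width=800, height=400, background_color="white")
    # Word size in the image scales with the supplied counts.
    cloud.generate_from_frequencies(frequencies)
    cloud.to_file(out_path)
    return out_path
```

Calling `render_wordcloud([("tired", 3), ("alone", 2)])` would write the image to `wordcloud.png` in the working directory.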