
A simple tool to scrape posts and comments from Reddit subreddits.

- ## What it does
-
- - Scrapes top posts and their comments from specified subreddits
- - Supports monthly or yearly time periods
- - Can limit the number of posts scraped per subreddit
- - Saves data in JSON format for easy analysis
-
## How it works

### The Smart Way to Scrape Reddit
@@ -88,8 +81,6 @@ Data is saved in JSON files under the `data/` directory, one file per subreddit.

### Output Format

- The scraper generates JSON files with a clean, structured format that's easy to work with. Here's what the output looks like:
-
```json
[
  {
@@ -121,28 +112,6 @@ The scraper generates JSON files with a clean, structured format that's easy to
]
```

- Key features of the output format:
-
- - **Posts**: Each post is represented as an object with:
-   - `post_body`: The title of the post
-   - `post_user`: The username of the post author
-   - `post_time`: ISO-formatted timestamp of when the post was created
-   - `comments`: Array of comments on the post
-
- - **Comments**: Each comment is represented as an object with:
-   - `body`: The text content of the comment
-   - `user`: The username of the comment author
-   - `time`: ISO-formatted timestamp of when the comment was created
-   - `replies`: Array of replies to the comment (nested comments)
-
- - **Nested Structure**: Comments can have replies, which can have their own replies, creating a tree structure that preserves the conversation flow
-
- This format makes it easy to:
- - Analyze post and comment content
- - Track user activity
- - Measure engagement over time
- - Import into data analysis tools
-
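The nested format described in the removed lines above can be walked recursively. The following is a minimal sketch in Python, assuming only the field names listed there (`post_body`, `post_user`, `post_time`, `comments`, `body`, `user`, `time`, `replies`). The helpers `count_comments` and `iter_comments` and the sample data are hypothetical illustrations, not part of the scraper.

```python
import json

# Hypothetical sample in the documented shape; all values are made up.
sample = json.loads("""
[
  {
    "post_body": "Example post title",
    "post_user": "example_user",
    "post_time": "2024-01-01T12:00:00",
    "comments": [
      {
        "body": "Top-level comment",
        "user": "commenter_a",
        "time": "2024-01-01T12:05:00",
        "replies": [
          {
            "body": "Nested reply",
            "user": "commenter_b",
            "time": "2024-01-01T12:10:00",
            "replies": []
          }
        ]
      }
    ]
  }
]
""")


def count_comments(comments):
    """Count comments plus all nested replies (hypothetical helper)."""
    return sum(1 + count_comments(c.get("replies", [])) for c in comments)


def iter_comments(comments, depth=0):
    """Yield (depth, comment) pairs in conversation order (hypothetical helper)."""
    for c in comments:
        yield depth, c
        yield from iter_comments(c.get("replies", []), depth + 1)


for post in sample:
    print(post["post_body"], count_comments(post["comments"]))
    for depth, c in iter_comments(post["comments"]):
        # Indent each comment by its depth in the reply tree.
        print("  " * (depth + 1) + f'{c["user"]}: {c["body"]}')
```

To run the same helpers on real output, the inline sample could be replaced with `json.load` on one of the per-subreddit files under `data/`; the exact file names are not shown in this diff.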
## Development

### Project Structure
@@ -166,22 +135,16 @@ reddit-scraper/

### Pre-commit Hooks

- This project uses pre-commit hooks to ensure code quality. To set them up:
-
```bash
pre-commit install
```

- The hooks will run automatically on commit, or you can run them manually:
-
```bash
pre-commit run --all-files
```

### Testing

- Run the tests with:
-
```bash
pytest
```