
Commit a293955

README upd
1 parent c7b0225 commit a293955

File tree

1 file changed: +0 -37 lines changed


README.md

Lines changed: 0 additions & 37 deletions
@@ -2,13 +2,6 @@
 
 A simple tool to scrape posts and comments from Reddit subreddits.
 
-## What it does
-
-- Scrapes top posts and their comments from specified subreddits
-- Supports monthly or yearly time periods
-- Can limit the number of posts scraped per subreddit
-- Saves data in JSON format for easy analysis
-
 ## How it works
 
 ### The Smart Way to Scrape Reddit
@@ -88,8 +81,6 @@ Data is saved in JSON files under the `data/` directory, one file per subreddit.
 
 ### Output Format
 
-The scraper generates JSON files with a clean, structured format that's easy to work with. Here's what the output looks like:
-
 ```json
 [
 {
@@ -121,28 +112,6 @@ The scraper generates JSON files with a clean, structured format that's easy to
 ]
 ```
 
-Key features of the output format:
-
-- **Posts**: Each post is represented as an object with:
-  - `post_body`: The title of the post
-  - `post_user`: The username of the post author
-  - `post_time`: ISO-formatted timestamp of when the post was created
-  - `comments`: Array of comments on the post
-
-- **Comments**: Each comment is represented as an object with:
-  - `body`: The text content of the comment
-  - `user`: The username of the comment author
-  - `time`: ISO-formatted timestamp of when the comment was created
-  - `replies`: Array of replies to the comment (nested comments)
-
-- **Nested Structure**: Comments can have replies, which can have their own replies, creating a tree structure that preserves the conversation flow
-
-This format makes it easy to:
-- Analyze post and comment content
-- Track user activity
-- Measure engagement over time
-- Import into data analysis tools
-
 ## Development
 
 ### Project Structure
@@ -166,22 +135,16 @@ reddit-scraper/
 
 ### Pre-commit Hooks
 
-This project uses pre-commit hooks to ensure code quality. To set them up:
-
 ```bash
 pre-commit install
 ```
 
-The hooks will run automatically on commit, or you can run them manually:
-
 ```bash
 pre-commit run --all-files
 ```
 
 ### Testing
 
-Run the tests with:
-
 ```bash
 pytest
 ```
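
For reference, the removed "Output Format" prose described each post as an object with `post_body`, `post_user`, `post_time`, and a `comments` array whose items carry `body`, `user`, `time`, and nested `replies`. A minimal sketch of walking that structure, assuming exactly those field names; the file path and helper names below are hypothetical and not part of the scraper:

```python
# Illustrative sketch only: reads one scraped JSON file and walks the nested
# post/comment structure described in the removed "Output Format" section.
import json
from pathlib import Path


def count_comments(comments):
    """Recursively count comments, following the nested `replies` arrays."""
    total = 0
    for comment in comments:
        total += 1 + count_comments(comment.get("replies", []))
    return total


def summarize(path):
    """Print a one-line summary per post from a scraped JSON file."""
    posts = json.loads(Path(path).read_text())
    for post in posts:
        n = count_comments(post.get("comments", []))
        print(f'{post["post_time"]}  u/{post["post_user"]}: {post["post_body"][:60]}')
        print(f'    {n} comments (including nested replies)')


if __name__ == "__main__":
    summarize("data/AskReddit.json")  # hypothetical output file under data/
```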
