@@ -86,6 +86,63 @@ reddit-scraper -d month -s subreddits.json -l 50

 Data is saved in JSON files under the `data/` directory, one file per subreddit.

+### Output Format
+
+The scraper writes each file as a JSON array of post objects, with comments nested under the post they belong to. A sample with two posts looks like this:
+
+```json
+[
+  {
+    "post_body": "This is the title of the post",
+    "post_user": "username123",
+    "post_time": "2023-04-15T14:30:45",
+    "comments": [
+      {
+        "body": "This is a top-level comment",
+        "user": "commenter456",
+        "time": "2023-04-15T15:20:10",
+        "replies": [
+          {
+            "body": "This is a reply to the comment",
+            "user": "replier789",
+            "time": "2023-04-15T16:05:30",
+            "replies": []
+          }
+        ]
+      }
+    ]
+  },
+  {
+    "post_body": "Another post title",
+    "post_user": "anotheruser",
+    "post_time": "2023-04-14T09:15:22",
+    "comments": []
+  }
+]
+```
+
+Key features of the output format:
+
+- **Posts**: Each post is represented as an object with:
+  - `post_body`: The title of the post
+  - `post_user`: The username of the post author
+  - `post_time`: ISO-formatted timestamp of when the post was created
+  - `comments`: Array of comments on the post
+
+- **Comments**: Each comment is represented as an object with:
+  - `body`: The text content of the comment
+  - `user`: The username of the comment author
+  - `time`: ISO-formatted timestamp of when the comment was created
+  - `replies`: Array of replies to the comment (nested comments)
+
+- **Nested Structure**: Comments can have replies, which can have their own replies, creating a tree structure that preserves the conversation flow
+
+This format makes it easy to:
+- Analyze post and comment content
+- Track user activity
+- Measure engagement over time
+- Import into data analysis tools
+
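The nested `replies` arrays documented in this hunk can be walked with a short recursive generator. The following is a minimal sketch and not part of the scraper: the `iter_comments` helper and the inline `SAMPLE` string are hypothetical, and the field names are taken from the sample output shown in the README text.

```python
import json
from datetime import datetime

# Illustrative data that mirrors the documented output format.
SAMPLE = """
[
  {
    "post_body": "This is the title of the post",
    "post_user": "username123",
    "post_time": "2023-04-15T14:30:45",
    "comments": [
      {
        "body": "This is a top-level comment",
        "user": "commenter456",
        "time": "2023-04-15T15:20:10",
        "replies": [
          {
            "body": "This is a reply to the comment",
            "user": "replier789",
            "time": "2023-04-15T16:05:30",
            "replies": []
          }
        ]
      }
    ]
  },
  {
    "post_body": "Another post title",
    "post_user": "anotheruser",
    "post_time": "2023-04-14T09:15:22",
    "comments": []
  }
]
"""


def iter_comments(comments, depth=0):
    """Yield (depth, comment) pairs for each comment and nested reply, depth-first."""
    for comment in comments:
        yield depth, comment
        # A missing "replies" key is treated as an empty list (an assumption).
        yield from iter_comments(comment.get("replies", []), depth + 1)


posts = json.loads(SAMPLE)
for post in posts:
    flat = list(iter_comments(post["comments"]))
    created = datetime.fromisoformat(post["post_time"])
    print(f'{post["post_user"]} at {created:%Y-%m-%d}: {len(flat)} comment(s)')
    for depth, comment in flat:
        # Indent each comment by its nesting level to show the thread shape.
        print("  " * (depth + 1) + f'{comment["user"]}: {comment["body"]}')
```

To read a real file, `json.loads(SAMPLE)` would be replaced with `json.load` on a file opened from the `data/` directory. The depth value is returned alongside each comment so that flattening the tree does not lose the conversation structure.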
 ## Development

 ### Project Structure