A Streamlit application that scrapes and analyzes Reddit posts related to business ideas, with a focus on AI and app development.
- Search for relevant subreddits based on your query
- Fetch and analyze posts from selected subreddits
- Sentiment analysis of posts and comments
- Generate detailed reports with insights
- Export data in CSV format
- Interactive visualizations
- Clone the repository:
git clone https://github.com/SukinShetty/redditscraper.git
cd redditscraper
- Install dependencies:
pip install -r requirements.txt
- Create a .env file with your Reddit API credentials:
REDDIT_CLIENT_ID=your_client_id
REDDIT_CLIENT_SECRET=your_client_secret
- Run the application:
python -m streamlit run simple_streamlit_app.py
- Enter your search query (e.g., "gen ai app ideas")
- Set the maximum number of subreddits and posts per subreddit
- Click "Generate Report" to start the analysis
- View the results in the Report, Visualizations, and Raw Data tabs
- Download the analysis as a text report or CSV file
If you encounter a 401 HTTP response:
- Double-check your Reddit API credentials in the .env file
- Make sure the client ID and secret are correct
- Verify that your Reddit account has the necessary permissions
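The credential checks above can be done outside the app. The following is a minimal sketch, not part of this repository: the helper names (`parse_env_text`, `missing_credentials`, `build_token_request`, `check_credentials`) and the user-agent string are hypothetical. It assumes the app is registered as a confidential client (for example a "script" app), so that Reddit's application-only `client_credentials` grant applies. It uses only the standard library and catches placeholder values before any request is sent.

```python
import base64
import urllib.error
import urllib.parse
import urllib.request
from pathlib import Path

TOKEN_URL = "https://www.reddit.com/api/v1/access_token"
REQUIRED_KEYS = ("REDDIT_CLIENT_ID", "REDDIT_CLIENT_SECRET")
PLACEHOLDERS = {"your_client_id", "your_client_secret"}


def parse_env_text(text):
    """Parse KEY=value lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Strip surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_credentials(values):
    """Return required keys that are absent, empty, or still placeholders."""
    return [
        key for key in REQUIRED_KEYS
        if not values.get(key) or values[key] in PLACEHOLDERS
    ]


def build_token_request(client_id, client_secret,
                        user_agent="credential-check/0.1"):
    """Build (but do not send) an application-only OAuth token request."""
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    body = urllib.parse.urlencode({"grant_type": "client_credentials"}).encode()
    return urllib.request.Request(
        TOKEN_URL,
        data=body,
        headers={"Authorization": f"Basic {basic}", "User-Agent": user_agent},
        method="POST",
    )


def check_credentials(client_id, client_secret):
    """Send the token request and return the HTTP status code."""
    request = build_token_request(client_id, client_secret)
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # 401 here points at the client ID or secret


if __name__ == "__main__":
    env_path = Path(".env")
    values = parse_env_text(env_path.read_text()) if env_path.exists() else {}
    problems = missing_credentials(values)
    if problems:
        print("Fix these entries in .env first:", ", ".join(problems))
    else:
        # Network call; only runs when both values look real.
        print("Reddit responded with HTTP",
              check_credentials(values["REDDIT_CLIENT_ID"],
                                values["REDDIT_CLIENT_SECRET"]))
```

If the script reports 401 with values copied straight from Reddit's app settings page, the usual suspects are a swapped ID and secret or stray whitespace in the .env file.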
If you encounter NLTK-related errors:
- The application will automatically download required NLTK data
- If you still encounter issues, manually download the required package:
import nltk
nltk.download('punkt')
- Add data visualizations (word clouds, trend graphs)
- Implement sentiment analysis
- Add competitor analysis features
- Create email reports for regular updates
- Add user authentication for the web interface
MIT
This tool is for educational purposes only. Be sure to comply with Reddit's API Terms of Service and rate limits when using this scraper.