This Python script is designed to scrape tweets containing the keyword "galamsey" from X.com (formerly Twitter). It utilizes undetected-chromedriver and Selenium to automate browser interactions, including logging in and scrolling through search results to collect a target number of tweets.
- Targeted Scraping: Collects tweets based on a specific keyword ("galamsey").
- Persistent Scrolling: Scrolls the search results page repeatedly so that X.com keeps loading additional tweets.
- Batch Processing: Fetches tweets in batches and pauses between batches to reduce the risk of rate limiting and give more content time to load.
- Login Automation: Logs in to X.com with securely managed credentials so that search results are accessible.
- Data Export: Saves scraped tweet data (username, content, timestamp, replies, retweets, likes, URL) to a CSV file.
- Brave Browser Support: Configured to run against the Brave browser instead of Chrome, with the aim of reducing the chance of bot detection.
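The scrolling, batching, and export features listed above could be sketched roughly as follows. This is a minimal, driver-agnostic illustration rather than the script's actual code: `collect_tweets`, `save_to_csv`, and every parameter name and default value (`batch_size`, `batch_pause`, `scroll_pause`, `max_idle_scrolls`) are hypothetical. The two callbacks stand in for whatever Selenium calls the script uses to read the visible tweets and to scroll the page.

```python
import csv
import time
from typing import Callable, Dict, Iterable, List

# Column order follows the Data Export feature; the key names are assumptions.
FIELDS = ["username", "content", "timestamp", "replies", "retweets", "likes", "url"]


def collect_tweets(
    read_visible: Callable[[], Iterable[Dict[str, str]]],
    scroll_down: Callable[[], None],
    target: int,
    batch_size: int = 50,        # assumed value
    batch_pause: float = 30.0,   # assumed value, seconds
    scroll_pause: float = 2.0,   # assumed value, seconds
    max_idle_scrolls: int = 5,   # give up after this many scrolls with nothing new
    sleep: Callable[[float], None] = time.sleep,
) -> List[Dict[str, str]]:
    """Scroll until `target` unique tweets are seen or no new ones appear."""
    seen: Dict[str, Dict[str, str]] = {}  # keyed by tweet URL to drop duplicates
    idle = 0
    next_pause_at = batch_size
    while idle < max_idle_scrolls:
        before = len(seen)
        for tweet in read_visible():
            seen.setdefault(tweet["url"], tweet)
        if len(seen) >= target:
            break
        # Count consecutive scrolls that produced no new tweets.
        idle = idle + 1 if len(seen) == before else 0
        if len(seen) >= next_pause_at:
            # A batch is complete: take a longer pause before continuing.
            sleep(batch_pause)
            next_pause_at = (len(seen) // batch_size + 1) * batch_size
        scroll_down()
        sleep(scroll_pause)
    return list(seen.values())[:target]


def save_to_csv(tweets: Iterable[Dict[str, str]], path: str) -> None:
    """Write the collected tweets to `path`, one row per tweet."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS, extrasaction="ignore")
        writer.writeheader()
        writer.writerows(tweets)
```

With Selenium, `scroll_down` might be something like `lambda: driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")`, and `read_visible` a function that parses the tweet elements currently in the page. Keying on the tweet URL keeps the count honest, since the same tweet is usually still visible after a scroll.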
Before you begin, ensure you have the following installed:
- Python 3.x: Download from python.org.
- Brave Browser: Download and install Brave from brave.com.
- Git: For cloning the repository.
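To check that the Brave prerequisite is satisfied, a small helper along the following lines can locate the Brave executable and hand it to undetected-chromedriver. This is a sketch under assumptions, not the script's own code: `find_brave` and `launch_brave` are hypothetical names, and the listed install paths are only the usual defaults, so they may differ on your machine.

```python
import os
import shutil
from typing import Iterable, Optional

# Typical default install locations (assumptions; adjust for your system).
DEFAULT_CANDIDATES = (
    "/usr/bin/brave-browser",
    "/Applications/Brave Browser.app/Contents/MacOS/Brave Browser",
    r"C:\Program Files\BraveSoftware\Brave-Browser\Application\brave.exe",
)


def find_brave(
    candidates: Iterable[str] = DEFAULT_CANDIDATES,
    names: Iterable[str] = ("brave-browser", "brave"),
) -> Optional[str]:
    """Return the path to a Brave executable, or None if none is found."""
    # First look on PATH, then fall back to the known install locations.
    for name in names:
        found = shutil.which(name)
        if found:
            return found
    for candidate in candidates:
        if os.path.isfile(candidate):
            return candidate
    return None


def launch_brave():
    """Start Brave through undetected-chromedriver (requires that package)."""
    import undetected_chromedriver as uc  # imported lazily; third-party

    brave_path = find_brave()
    if brave_path is None:
        raise FileNotFoundError("Brave was not found; install it from brave.com")
    # browser_executable_path points the driver at Brave instead of Chrome.
    return uc.Chrome(browser_executable_path=brave_path)
```

Calling `find_brave()` on its own is a quick way to confirm that the browser is installed before running the scraper.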
Follow these steps to get the scraper up and running on your local machine.
First, clone this repository to your local machine:
git clone https://github.com/Daemonlite/Python-selenium-twitter-galamsey-data.git
cd Python-selenium-twitter-galamsey-data
touch .env
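The `touch .env` command creates an empty `.env` file, presumably to hold the login credentials mentioned in the feature list. As a rough illustration of how such a file might be read, the sketch below parses `KEY=value` lines using only the standard library. The variable names `X_USERNAME` and `X_PASSWORD` and the helper names are assumptions, not taken from the repository, which may use a library such as python-dotenv instead.

```python
import os
from pathlib import Path
from typing import Dict, Tuple


def load_env_file(path: str = ".env") -> Dict[str, str]:
    """Parse simple KEY=value lines; blank lines and # comments are skipped."""
    values: Dict[str, str] = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values may themselves contain "=".
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_credentials(path: str = ".env") -> Tuple[str, str]:
    """Return (username, password); real environment variables take priority."""
    env = load_env_file(path)
    # X_USERNAME and X_PASSWORD are hypothetical variable names.
    username = os.environ.get("X_USERNAME") or env.get("X_USERNAME")
    password = os.environ.get("X_PASSWORD") or env.get("X_PASSWORD")
    if not username or not password:
        raise RuntimeError("Set X_USERNAME and X_PASSWORD in .env or the environment")
    return username, password
```

Under these assumptions, the `.env` file would contain two lines of the form `X_USERNAME=your_username` and `X_PASSWORD=your_password`. Keeping the file out of version control (for example, by listing it in `.gitignore`) avoids committing credentials.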