Web Scraper Pro

A beautiful web-based frontend for Selenium-powered web scraping, featuring website content extraction and Google image search capabilities.

Features

Website Content Scraper: Extract text, links, and images from any website
- Handle dynamic content with infinite scrolling support
- Customizable scroll attempts
- Screenshot capability
- JSON export of results
Google Images Scraper: Search and download images from Google
- Specify number of images to download
- Image gallery preview
- Batch download as ZIP
Modern, Responsive UI
- Beautiful gradient design
- Mobile-friendly interface
- Dark mode support
- Task history with local storage
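
The infinite scrolling support with customizable scroll attempts could look roughly like the sketch below. It is not taken from `selenium_scraper/script.py`. The function name `scroll_to_bottom` and the parameters `max_attempts` and `pause_seconds` are hypothetical. The only thing assumed about `driver` is that it offers `execute_script()`, as a Selenium WebDriver does.

```python
import time


def scroll_to_bottom(driver, max_attempts=5, pause_seconds=1.0):
    """Scroll until the page height stops growing or max_attempts is reached.

    `driver` is expected to behave like a Selenium WebDriver, i.e. to offer
    execute_script(). Returns the number of scrolls that loaded new content.
    """
    get_height = "return document.body.scrollHeight"
    last_height = driver.execute_script(get_height)
    productive_scrolls = 0

    for _ in range(max_attempts):
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause_seconds)  # give lazy-loaded content time to render
        new_height = driver.execute_script(get_height)
        if new_height == last_height:
            break  # nothing new appeared, so assume the end was reached
        last_height = new_height
        productive_scrolls += 1

    return productive_scrolls
```

Capping the loop at `max_attempts` keeps a truly endless feed from blocking the scraper, while the height comparison lets short pages finish early.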

Installation

Clone this repository:

```
git clone https://github.com/yourusername/web-scraper-pro.git
cd web-scraper-pro
```

Create a virtual environment and activate it:

```
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
```

Install dependencies:
```
pip install -r requirements.txt
```
Make sure Chrome/Chromium is installed on your system (required for Selenium).
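
One quick way to check for a browser is to look for a known executable name on `PATH`. The sketch below is not part of the project, and the candidate names are assumptions, since this README does not name a specific binary. On macOS and Windows, Chrome is often installed outside `PATH`, so a result of `None` does not prove the browser is missing.

```python
import shutil

# Common executable names; an assumption, not a list defined by the project.
CANDIDATE_BINARIES = (
    "google-chrome",
    "google-chrome-stable",
    "chromium",
    "chromium-browser",
    "chrome",
)


def find_browser(candidates=CANDIDATE_BINARIES):
    """Return the path of the first candidate found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


if __name__ == "__main__":
    found = find_browser()
    print(found or "No Chrome/Chromium executable found on PATH")
```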

Usage

Start the Flask server:
```
python app.py
```
Open your browser and navigate to:
```
http://localhost:5000
```
Use the Web Scraper Pro interface:
- Enter website URLs to scrape content
- Enter search queries to find and download images
- View and download results

Project Structure

```
web-scraper-pro/
├── app.py                 # Flask application
├── selenium_scraper/
│   └── script.py          # Selenium scraping logic
├── static/
│   ├── css/
│   │   └── styles.css     # Custom CSS styles
│   └── js/
│       └── script.js      # Frontend JavaScript
├── templates/
│   ├── index.html         # Main application page
│   ├── 404.html           # Error page
│   └── 500.html           # Error page
└── output/                # Storage for scraped data
    ├── images/            # Downloaded images
    ├── screenshots/       # Website screenshots
    └── debug/             # Debug information
```
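
The `output/` tree can be created ahead of time if it does not exist yet. This README does not say whether the application creates it on its own, so the sketch below is only an illustration. The helper name `ensure_output_dirs` is hypothetical; the directory names come from the tree above.

```python
from pathlib import Path

# Subdirectory names taken from the project structure listing.
OUTPUT_SUBDIRS = ("images", "screenshots", "debug")


def ensure_output_dirs(project_root="."):
    """Create output/ and its subdirectories if missing; return their paths."""
    output_dir = Path(project_root) / "output"
    created = {}
    for name in OUTPUT_SUBDIRS:
        path = output_dir / name
        # parents=True also creates output/ itself; exist_ok makes reruns safe
        path.mkdir(parents=True, exist_ok=True)
        created[name] = path
    return created
```

Calling `ensure_output_dirs()` from the repository root is safe to repeat, because existing directories are left untouched.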

Requirements

- Python 3.6+
- Flask
- Selenium
- BeautifulSoup4
- Chrome/Chromium browser

See requirements.txt for the complete list of dependencies.

How It Works

Website Scraping:
- The Flask application receives URLs to scrape
- Selenium WebDriver loads each page in Chrome
- Dynamic content is handled by scrolling and waiting
- BeautifulSoup extracts the content
- Results are saved as JSON and can be downloaded
Image Scraping:
- User enters a search query
- Selenium navigates to Google Images
- Images are clicked to get full-size versions
- Images are downloaded and displayed in the gallery
- Results can be downloaded individually or as a ZIP
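
The extraction and JSON steps of website scraping can be sketched as follows. The project uses BeautifulSoup for this step; the sketch swaps in the standard library's `html.parser` so that it runs without extra dependencies. The names `ContentExtractor` and `extract_content` and the JSON field names are assumptions, not the project's actual schema. In a real run the HTML would come from Selenium's `driver.page_source`.

```python
import json
from html.parser import HTMLParser
from urllib.parse import urljoin


class ContentExtractor(HTMLParser):
    """Collect visible text, link targets, and image sources from HTML."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.text_parts = []
        self.links = []
        self.images = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1
        elif tag == "a" and attrs.get("href"):
            # resolve relative links against the page URL
            self.links.append(urljoin(self.base_url, attrs["href"]))
        elif tag == "img" and attrs.get("src"):
            self.images.append(urljoin(self.base_url, attrs["src"]))

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.text_parts.append(data.strip())


def extract_content(html, base_url):
    """Return a JSON-serializable dict with the page's text, links, images."""
    parser = ContentExtractor(base_url)
    parser.feed(html)
    parser.close()
    return {
        "url": base_url,
        "text": " ".join(parser.text_parts),
        "links": parser.links,
        "images": parser.images,
    }


if __name__ == "__main__":
    sample = '<p>Hello <a href="/about">About</a></p><img src="logo.png">'
    result = extract_content(sample, "https://example.com/")
    print(json.dumps(result, indent=2))
```

Resolving links and image sources with `urljoin` keeps the exported JSON usable outside the page it came from, since relative paths alone would not be downloadable.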

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

- Selenium for browser automation
- Flask for the web framework
- Bootstrap for the UI components

Repository: arham2003/Web-Scraper-Pro
